Graduate Student Seminar

Modelling Complex Biologging Data with Hidden Markov Models

To join this seminar virtually: Please request Zoom connection details from

Abstract: Hidden Markov models (HMMs) are commonly used to identify latent processes from observed time series, but it is challenging to fit them to large and complex time series collected by modern sensors. Using data from threatened resident killer whales (Orcinus orca) off the western coast of Canada as a case study, we provide solutions to three common challenges faced when identifying latent behaviour from complicated biologging data. First, biologging time series often violate common assumptions of HMMs when collected at high frequencies. We thus propose a hierarchical approach that uses moving-window Fourier analysis to capture fine-scale dependence structures. Second, modern technology allows researchers to directly label the latent process of interest, but rare labels can have a negligible influence on parameter estimates. We introduce a weighted likelihood approach that increases the relative influence of labelled observations. Third, applying HMMs to large time series is computationally demanding, so we propose a novel EM algorithm that combines a partial E step with variance-reduced stochastic optimization within the M step. These solutions allow researchers to model biologging data with HMMs that are more interpretable, accurate, and efficient to fit than existing methods.

ESB 4192 / Zoom
Evan Sidrow, UBC Statistics PhD student

Two MSc student presentations (Charlotte Edgar & Graeme Kempf)

To join this seminar virtually: Please request Zoom connection details from

Presentation 1

Time: 11:00am – 11:30am

Speaker: Charlotte Edgar, UBC Statistics MSc student

Title: Cellwise Robust Covariance-Regularized Regression for High-Dimensional Data

Abstract: It is common to use regularization methods when dealing with high-dimensional regression problems. The scout family, developed by Witten and Tibshirani in 2009, is a class of covariance-regularized regression procedures suitable for prediction in high-dimensional settings. The scout procedure estimates the inverse covariance matrix through two log-likelihood maximization steps, each of which allows for regularization, and then uses the estimated inverse covariance matrix to obtain estimates of the regression coefficients. The aim of this project was to make the scout procedure robust to cellwise outliers. Cellwise outliers are common in high-dimensional datasets, and recent work has led to cellwise robust covariance estimates that could be used in the scout procedure. We assess the predictive performance of robust plug-in estimators and outlier detection methods. A regression method that is robust to cellwise outliers, encourages sparsity, and can be applied in high-dimensional settings would be valuable to many fields, such as genomics, and is an area of active research.

Presentation 2

Time: 11:30am – 12:00pm

Speaker: Graeme Kempf, UBC Statistics MSc student

Title: The impact of disease-modifying drugs for multiple sclerosis on hospitalizations and mortality in British Columbia: A retrospective study using an illness-death multi-state model

Abstract: The efficacy of disease-modifying drugs (DMDs) for multiple sclerosis was established in clinical trials that were short and excluded older individuals and individuals living with comorbidities. This has led to a lack of knowledge of the effects of chronic DMD use and the effects of DMDs on individuals who do not meet the traditional eligibility criteria for clinical trials. Multi-state models can advance the understanding of a disease beyond that offered by time-to-event models alone. The long-term, real-world efficacy of DMDs was explored by applying a multi-state model to administrative healthcare data. Multi-state techniques such as intensity-based analysis and pseudo-value regression were used to investigate whether exposure to any DMD is associated with fewer hospitalizations, shorter hospitalizations, and/or a reduction in the chance of dying inside or outside the hospital.

ESB 4192 / Zoom
Charlotte Edgar, UBC Statistics MSc student; Graeme Kempf, UBC Statistics MSc student