research

Recent and current projects in statistics education

To join this seminar virtually: Please request Zoom connection details from ea@stat.ubc.ca.

Abstract: The work of the Flexible Learning in Statistics Group ranges from conducting studies of important aspects of statistics education to developing and testing resources for difficult statistics concepts. In this seminar, students will present several recent projects: using student focus groups to assess Shiny apps, developing and testing interactive resources to improve understanding of Bayesian inference, enhancing Stat 251 labs by creating active learning material and introducing pre-lab quizzes, and conducting a study of the impact of exam question wording on the performance of students with English as an Additional Language (EAL). You’ll also hear about StatEngage, the ASDa-led project to guide students through the challenges of consulting.

Location
ESB 4192 / Zoom
Speaker
Rachel Lobay, Joey Hotz, Gian Carlo Di-Luvi, and Kenny Chiu on behalf of the Flexible Learning in Statistics Group

Causal Inference with Cocycles

To join this seminar virtually: Please request Zoom connection details from ea@stat.ubc.ca.

Abstract: Many interventions in causal inference can be represented as transformations of the variables of interest. Abstracting interventions in this way allows us to identify a local symmetry property exhibited by many causal models under interventions. Where present, this symmetry can be characterized by a type of map called a cocycle, an object that is central to dynamical systems theory. We show that such cocycles exist under general conditions and are sufficient to identify interventional distributions and, under suitable assumptions, counterfactual distributions. We use these results to derive cocycle-based estimators for causal estimands and show that they achieve semiparametric efficiency under standard conditions. Since entire families of distributions can share the same cocycle, these estimators can make causal inference robust to mis-specification by sidestepping superfluous modelling assumptions. We demonstrate both robustness and state-of-the-art performance in several simulations, and apply our method to estimate the effects of 401(k) pension plan eligibility on asset accumulation using a real dataset.

Joint work with Hugh Dance (UCL/Gatsby Unit): https://arxiv.org/abs/2405.13844

Location
ESB 4192 / Zoom
Speaker
Benjamin Bloem-Reddy, Assistant Professor, UBC Department of Statistics

Online Kernel-Based Mode Learning

To join this seminar virtually: Please request Zoom connection details from ea@stat.ubc.ca.

Abstract: Big data, characterized by exceptionally large sample sizes, often brings the challenge of outliers and heavy-tailed distributions. Robust and efficient estimation therefore calls for an online learning method that resists outliers without relying on historical data. In this talk, we introduce an online learning approach based on a kernel-based mode objective function, designed to handle outliers and heavy-tailed distributions in big data. The approach applies mode regression within an online learning framework that operates on data subsets, which enables historical information to be updated continuously using pertinent information extracted from each new data subset. We demonstrate that the resulting estimator is asymptotically equivalent to the mode estimator calculated using the entire dataset. Monte Carlo simulations and an empirical study illustrate the finite-sample performance of the proposed estimator.
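
As a rough illustration of what a kernel-based mode objective looks like in the simplest batch setting, the sketch below fits a linear mode regression with a Gaussian kernel using EM-style iteratively reweighted least squares. It is not the online estimator presented in the talk: the function name `modal_linear_regression`, the bandwidth, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def modal_linear_regression(X, y, bandwidth=1.0, n_iter=200, tol=1e-8):
    """Fit y ~ X @ beta by maximizing a Gaussian-kernel mode objective.

    Hypothetical helper for illustration: EM-style iteratively
    reweighted least squares, started from the OLS fit.
    """
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # OLS starting value
    for _ in range(n_iter):
        resid = y - X @ beta
        # Kernel weights: points far from the current fit get weight near zero.
        w = np.exp(-0.5 * (resid / bandwidth) ** 2)
        w /= w.sum()
        # Weighted least squares update.
        XtW = X.T * w
        beta_new = np.linalg.solve(XtW @ X, XtW @ y)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta


# Synthetic data (assumed for illustration): true line y = 1 + 2x,
# with roughly 10% of responses shifted upward to act as outliers.
rng = np.random.default_rng(0)
n = 500
x = rng.uniform(0.0, 5.0, size=n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + rng.normal(0.0, 1.0, size=n)
outliers = rng.random(n) < 0.10
y[outliers] += 20.0

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
beta_mode = modal_linear_regression(X, y, bandwidth=1.0)
print("OLS estimate: ", beta_ols)
print("Mode estimate:", beta_mode)
```

In this toy setup the shifted responses receive kernel weights close to zero, so the mode fit tracks the bulk of the data, whereas ordinary least squares is pulled toward the outliers. The bandwidth controls that trade-off and would need to be chosen with care in practice.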

Location
ESB 4192 / Zoom
Speaker
Tao Wang, Assistant Professor, Department of Economics / Department of Mathematics and Statistics, University of Victoria

Two MSc student presentations (Charlotte Edgar & Graeme Kempf)

To join this seminar virtually: Please request Zoom connection details from ea@stat.ubc.ca.

Presentation 1

Time: 11:00am – 11:30am

Speaker: Charlotte Edgar, UBC Statistics MSc student

Title: Cellwise Robust Covariance-Regularized Regression for High-Dimensional Data

Abstract: Regularization methods are commonly used for high-dimensional regression problems. The scout family, developed by Witten and Tibshirani in 2009, is a class of covariance-regularized regression procedures suitable for prediction in high-dimensional settings. The scout procedure estimates the inverse covariance matrix through two log-likelihood maximization steps, each of which allows for regularization, and then uses the estimated inverse covariance matrix to obtain estimates of the regression coefficients. The aim of this project was to make the scout procedure robust to cellwise outliers. Cellwise outliers are common in high-dimensional datasets, and recent work has led to cellwise robust covariance estimates that could be used in the scout procedure. We assess the predictive performance of robust plug-in estimators and outlier detection methods. A regression method that is robust to cellwise outliers, encourages sparsity, and can be applied in high-dimensional settings would be valuable to many fields, such as genomics, and is an area of active research.

Presentation 2

Time: 11:30am – 12:00pm

Speaker: Graeme Kempf, UBC Statistics MSc student

Title: The impact of disease-modifying drugs for multiple sclerosis on hospitalizations and mortality in British Columbia: A retrospective study using an illness-death multi-state model

Abstract: The efficacy of disease-modifying drugs (DMDs) for multiple sclerosis was established in clinical trials that were short and excluded older individuals and individuals living with comorbidities. As a result, little is known about the effects of chronic DMD use or the effects of DMDs on individuals who do not meet the traditional eligibility criteria for clinical trials. Multi-state models can advance the understanding of a disease beyond that offered by time-to-event models alone. The long-term, real-world efficacy of DMDs was explored by applying a multi-state model to administrative healthcare data. Multi-state techniques such as intensity-based analysis and pseudo-value regression were used to investigate whether exposure to any DMD is associated with fewer hospitalizations, shorter hospitalizations, and/or a reduction in the chance of dying inside or outside the hospital.

Location
ESB 4192 / Zoom
Speaker
Charlotte Edgar, UBC Statistics MSc student; Graeme Kempf, UBC Statistics MSc student