UBC Statistics Department Colloquium Series
UBC Statistics is launching a new Department Colloquium Series! These talks will be broad, accessible, and engaging — all are welcome!
Our inaugural talk will take place on Monday, March 16th. We’re excited to launch this new series by welcoming Dr. David Haziza, Professor in the Department of Mathematics and Statistics at the University of Ottawa.
Date: Monday, March 16, 2026
Time: 3–4 PM
Location: ESB 5104/5106
Title: A Debiased Machine Learning Single-Imputation Framework for Item Nonresponse in Surveys
Abstract: Machine learning methods are increasingly studied and used in National Statistical Offices, in particular to handle item nonresponse, where some survey respondents answer certain questions but leave others missing. In most surveys, item nonresponse affects key study variables, and imputation is routinely used to handle the resulting missing data. Standard parametric imputation methods can support rigorous inference when their modeling assumptions are approximately correct. However, when the imputation model is misspecified, the resulting inferences may be misleading. Machine learning offers a flexible alternative by learning complex relationships between variables from the data, which can reduce the risk of misspecification. At the same time, this flexibility introduces new challenges for survey inference, since modern learning algorithms may converge more slowly than classical parametric models and may not automatically deliver valid uncertainty quantification. In this talk, I will present a survey sampling extension of the double/debiased machine learning framework of Chernozhukov et al. (2018). The proposed approach combines machine learning-based imputation with design-based survey weighting and an orthogonalized estimating strategy, leading to root-$n$ consistent and asymptotically normal estimation of population means under realistic conditions. We also develop a consistent variance estimator, yielding asymptotically valid confidence intervals while allowing the use of a wide range of machine learning algorithms. I will briefly discuss aggregation procedures and conclude with simulation results illustrating the performance of the proposed methodology.
This colloquium series is sponsored in part by the Constance van Eeden Endowment.
Future talks in this colloquium series:
Tuesday, April 21
Speaker: Dr. Edward Kennedy (Carnegie Mellon University)
Time: 11 AM–12 PM
Location: ESB 5104/5106
Monday, June 8
Speaker: Dr. Stephanie Hicks (Johns Hopkins University)
Time: 3–4 PM
Location: ESB 5104/5106