education

Inclusive Approaches to Data Literacy

Building data literacy requires intentional design—both inside and outside the classroom. As statistics educators, we are uniquely positioned to help learners not only analyze data but also communicate, question, and connect with it in meaningful ways. This talk will explore several initiatives that promote inclusive and engaging approaches to developing data literacy across different educational levels.

First, I will discuss a range of outreach activities designed to introduce statistical thinking to elementary and secondary students through play, storytelling, and authentic data contexts. These activities—ranging from constructing visualizations using the Spotify API to exploring sampling methods with geodes and “dinosaur fossils”—have been implemented at events such as Florence Nightingale Day and Pursue STEM. Such initiatives align with calls to cultivate early data literacy and “real-world statistical reasoning” among pre-tertiary learners (Ben-Zvi & Garfield, 2004; Ridgway, 2016).

Second, I will highlight innovations in postsecondary statistics education, focusing on the integration of Universal Design for Learning (UDL) principles (CAST, 2018) in a large third-year course (STA304: Surveys, Sampling, and Observational Data). Through policies on flexible grading, grace periods, and generative AI, the course design supports diverse learners while maintaining academic rigor. Student feedback illustrates how flexibility can enhance motivation, equity, and engagement, findings that echo recent work on inclusive assessment and learning autonomy in statistics education (Engel, 2017).

Together, these projects demonstrate how flexibility, communication, and creativity can support inclusive data literacy education across age groups. By integrating outreach and UDL-informed teaching, we can expand access to data-driven inquiry and foster a more diverse and data-confident generation of learners.

To join this seminar virtually, please request Zoom connection details from ea@stat.ubc.ca. 

Discipline-Specific TA Training: A Scalable Model for Departments

Most universities offer centralized teaching development programs for Teaching Assistants (TAs), but discipline-specific initiatives – particularly in statistics and actuarial science – are often limited or informal. In response to this gap, the Department of Statistics and Actuarial Science at the University of Waterloo launched a comprehensive TA Program in 2023. This initiative encompasses all aspects of graduate teaching assistantships and includes the Foundations for University Teaching in Statistics and Actuarial Science certificate training program.

Developed in collaboration with the university’s Centre for Teaching Excellence, our program provides structured, sequential training tailored to the unique demands of statistics and actuarial science courses. It equips incoming and current graduate TAs with the skills needed to confidently and effectively fulfill their roles, including proctoring, grading, facilitating tutorials, and preparing and delivering lecture content.

In this talk, we will outline the state of our TA training prior to 2023, share the motivations behind the creation of our program, and describe its current structure. We will present data on TA participation, share feedback from past trainees, and discuss future directions for the program, including its potential adaptation by other departments and institutions.

To join this seminar virtually, please request Zoom connection details from ea@stat.ubc.ca. 

Ensembles in the Age of Overparameterization: Promises and Pathologies

To join this seminar virtually: please click here.
Abstract: Ensemble methods have historically used either high-bias base learners (e.g. through boosting) or high-variance base learners (e.g. through bagging). Modern neural networks cannot be understood through this classic bias-variance tradeoff, yet "deep ensembles" are pervasive in safety-critical and high-uncertainty application domains. This talk will cover surprising and counterintuitive phenomena that emerge when ensembling overparameterized base models like neural networks. While deep ensembles improve generalization in a simple and cost-effective manner, their accuracy and robustness are often outperformed by single (but larger) models. Furthermore, discouraging diversity amongst component models often improves the ensemble's predictive performance, counter to classic intuitions underpinning bagging and feature subsetting techniques. I will connect these empirical findings with new theoretical characterizations of overparameterized ensembles, and I will conclude with implications for uncertainty quantification, robustness, and decision making.
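
As background for the "classic intuitions" about diversity mentioned in the abstract, the sketch below checks a standard identity for averaged regressors under squared error: the ensemble's error equals the average member error minus the spread of the members around the ensemble mean. This is an illustration only and is not taken from the talk; the function name and the toy numbers are hypothetical.

```python
def ambiguity_decomposition(predictions, target):
    """Return (ensemble_error, avg_member_error, diversity) for one target.

    For squared error and a plain average of the members:
        ensemble_error == avg_member_error - diversity
    """
    m = len(predictions)
    ensemble = sum(predictions) / m  # averaged prediction
    # Mean squared error of the individual members.
    avg_member_error = sum((p - target) ** 2 for p in predictions) / m
    # Spread of the members around the ensemble prediction.
    diversity = sum((p - ensemble) ** 2 for p in predictions) / m
    ensemble_error = (ensemble - target) ** 2
    return ensemble_error, avg_member_error, diversity


# Hypothetical predictions from three models for a true value of 2.0.
ens_err, avg_err, div = ambiguity_decomposition([1.0, 2.0, 4.0], target=2.0)
print(ens_err, avg_err - div)  # the two quantities agree up to rounding
```

The identity holds for any fixed set of members, but it does not by itself imply that encouraging diversity helps: a training procedure that increases the diversity term may also increase the average member error.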

Modelling Complex Biologging Data with Hidden Markov Models

To join this seminar virtually: Please request Zoom connection details from ea@stat.ubc.ca.

Abstract: Hidden Markov models (HMMs) are commonly used to identify latent processes from observed time series, but it is challenging to fit them to large and complex time series collected by modern sensors. Using data from threatened resident killer whales (Orcinus orca) off the western coast of Canada as a case study, we provide solutions to three common challenges faced when identifying latent behaviour from complicated biologging data. First, biologging time series often violate common assumptions of HMMs when collected at high frequencies. We thus propose a hierarchical approach which utilizes moving-window Fourier analysis to capture fine-scale dependence structures. Second, modern technology allows researchers to directly label the latent process of interest, but rare labels can have a negligible influence on parameter estimates. We introduce a weighted likelihood approach that increases the relative influence of labelled observations. Third, applying HMMs to large time series is computationally demanding, so we propose a novel EM algorithm that combines a partial E step with variance-reduced stochastic optimization within the M step. These solutions allow researchers to model biologging data with HMMs that are more interpretable, accurate, and efficient to fit than existing methods.
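
For readers unfamiliar with HMM likelihoods, the following minimal sketch shows the standard scaled forward algorithm for an HMM with Gaussian emissions, with optional known state labels at some time steps. It is a generic illustration, not the speaker's hierarchical, weighted-likelihood, or EM method; the function names, the Gaussian emission assumption, and the convention of marking unlabelled steps with `None` are all assumptions made here.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def hmm_log_likelihood(obs, init, trans, means, sds, labels=None):
    """Log-likelihood of a Gaussian-emission HMM via the scaled forward algorithm.

    init[k]     : probability of starting in state k
    trans[j][k] : probability of moving from state j to state k
    labels[t]   : known state index at time t, or None when unlabelled
    """
    n = len(init)
    log_lik = 0.0
    alpha = None
    for t, x in enumerate(obs):
        dens = [normal_pdf(x, means[k], sds[k]) for k in range(n)]
        if labels is not None and labels[t] is not None:
            # A label rules out every state that contradicts it.
            dens = [d if k == labels[t] else 0.0 for k, d in enumerate(dens)]
        if t == 0:
            alpha = [init[k] * dens[k] for k in range(n)]
        else:
            alpha = [
                sum(alpha[j] * trans[j][k] for j in range(n)) * dens[k]
                for k in range(n)
            ]
        # Rescale at every step to avoid numerical underflow on long series.
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik
```

Each step costs on the order of the squared number of states, so a single likelihood evaluation is linear in the length of the series; this is the cost that becomes demanding when the series is very long and the likelihood must be evaluated many times during fitting.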