Teaching

I am fortunate to have had the opportunity to teach in a variety of roles. These have included:

Developing and teaching modules in statistics, data science and data ethics.
Designing courses for both in-person and remote learning, predominantly at the postgraduate level.
Supervising undergraduate, postgraduate and doctoral research projects.
Adapting and leading short courses on scientific writing and communication.
Running workshops and computer labs for undergraduate and postgraduate modules.
Speaking at university open days and providing one-to-one tuition to high school students.

I am a Fellow of the Higher Education Academy, which acknowledges my commitment to enhancing the quality, effectiveness, and accessibility of my teaching activities. I achieve this through reflection, learning from others, and engaging with educational research. I am a co-founder and organiser of the Mathematics Teaching Seminar at Imperial College London, which aims to promote collaborative, innovative and evidence-based approaches to mathematics education.

My Courses

Effective Data Science

Notes Videos Resources

Model building and evaluation are necessary but not sufficient skills for the effective practice of data science. In this module you will develop the technical and personal skills that are required to work successfully as a data scientist within an organisation.

During this module you will critically explore how to:

effectively scope and manage a data science project;
work openly and reproducibly;
efficiently acquire, manipulate, and present data;
interpret and explain your work for a variety of stakeholders;
ensure that your work can be put into production;
assess the ethical implications of your work as a data scientist.

This interdisciplinary course draws from fields including statistics, computing, management science and data ethics. Each topic will is investigated through a selection of lecture videos, conference presentations and academic papers, supported by hands-on lab exercises and readings on industry best-practices as published by recognised professional bodies.

Ethical Machine Learning and Data Science

Reading List Part 1 Part 2 Part 3

Data-driven decision making is now pervasive and impacts us all. Your data is used by others to make decisions about who you are, how you will behave, and what options should be made available to you. Predictive models are used to decide anything from the promotions that are offered to you by a retailer through to whether your loan application is granted by a bank.

The ways in which these predictive models can fail mathematically form a core part of the training for an aspiring statistician, data scientist or machine learning practitioner. In contrast, the potential for ethical failures in these same models is rarely covered in-depth as part of this initial training. As a result, these ethical modes of failure are often not considered until those predictive models have been put into production and are actively causing harm.

To prevent this harm, the ethical impacts of using data to make decisions must be made core to the curriculum of both statistics and data science. This course aims to address that gap.

The course takes a practical and technical approach to identifying these ethical issues. It has a strong mathematical focus and will not not require the authoring of extended essays or moral treaties. Throughout the course, you will discover actionable ways in which these topics may be integrated into a data science workflow at a range of levels.

First Year Research Project: Stochastic Point Processes

Notes

Course materials to guide a first exploration of stochastic point processes.

We will have four sessions, each focusing in on one aspect of stochastic point processes: theory, simulation, inference and applications. The materials for these sessions focus mainly on the Poisson process in one dimension, which considered both as a special example of a stochastic process and also as a statistical model for point pattern data. We will explore this simple model together and I will point you in the direction of interesting areas to further explore for yourself.

These materials were created as a starting point for undergraduate research projects at Imperial College London. The intended reader is an undergraduate student at the end of their first year of study, who is familiar with the basics of probability and statistical inference and has some programming experience. The materials might also be useful to anyone with a similar background who is learning about point process models for the first time.

Supervised Learning

Videos

This course introduces the framework of supervised learning with an emphasis on principled and uncertainty-aware modelling. (It is a stats course after all!)

We discuss both classical and modern approaches to regression and classification problems, along with issues that arise when working with flexible non-parametric models. We start with a review of the linear modelling framework for regression and classification problems before extending this class of models, first to generalized linear models and then generalised additive models. Following this, we consider a range of modern non-parametric alternatives, discussing both their benefits and drawbacks.

Throughout the course, we will evaluate and compare the performance of these models on datasets from a variety of scientific problem domains.

Extreme Value Theory

Notes Videos

The majority of approaches in statistics and data science aim to learn about the “typical” levels of a process. In some cases though, we really want to understand those outcomes that are unusually large or small. We might want to understand when the weakest link in a chain will break, to predict record-breaking athletic performances or to describe the size and frequency of the largest financial losses.

To answer these questions we must extrapolate beyond the range of observed outcomes. This can be a risky business and requires careful, domain-specific considerations. Extreme value theory gives us a principled, mathematically motivated framework for modelling the largest (or smallest) values within a dataset.

This course will introduce you to a formal strategy for modelling the tails of random variables. We will develop this modelling framework based on an area of probability theory that is rarely studied at undergraduate level, paying close attention to the issues that come with using such asymptotically motivated models. We will see how techniques from standard statistical inference can be adapted into this modelling framework and apply it to a range of examples from finance and the environmental sciences.