Who are the best F1 racing drivers?

Masters in Data Science Project, Summer 2025

Project outline

In Formula One motor racing, points are awarded to drivers at the end of each race depending on their finishing position. These points are added up over the season (along with a few additional points, for fastest laps, etc.), and at the end of the season the driver with the most points is declared the champion. But this raises many questions. Is this person really the best driver that season? And how should we measure and compare the quality of drivers across their careers? Are the best drivers simply those who win the most championships, or is there more to it than that?

A different way of allocating points based on position might lead to a different champion in a given season (and different points allocations have been used over the years). Can we identify the best drivers independently of a given points system? Would some kind of points-based system be useful for identifying the best drivers across their careers? Is the best race driver the same as the driver who is best at qualifying, or are these slightly different skills?

More fundamentally, it is clear that the cars produced by the different F1 teams are not all of the same standard. Perhaps the best driver doesn’t have the most points at the end of the season simply because they are not in the best car. Is it possible to separate the effect of car quality from that of driver ability, to level the playing field and identify the driver who would be best if they all drove identical cars? This question is clearly of interest to the top F1 teams when deciding who should replace their current drivers.
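To see how the choice of points allocation alone can change the champion, here is a minimal sketch in Python. The two points tables are the 10-6-4-3-2-1 scheme used from 1991 to 2002 and the 25-18-15-12-10-8-6-4-2-1 scheme used from 2010. The drivers A, B and C and their finishing positions are invented purely for illustration, and extras such as fastest-lap points are ignored.

```python
# Points per finishing position under two schemes that F1 has used.
POINTS_1991 = {1: 10, 2: 6, 3: 4, 4: 3, 5: 2, 6: 1}
POINTS_2010 = {1: 25, 2: 18, 3: 15, 4: 12, 5: 10, 6: 8, 7: 6, 8: 4, 9: 2, 10: 1}

# Hypothetical six-race season (invented data): finishing position per race.
# A wins often but has two poor races; B is second every time.
RESULTS = {
    "A": [1, 1, 1, 1, 9, 9],
    "B": [2, 2, 2, 2, 2, 2],
    "C": [5, 5, 5, 5, 1, 1],
}


def season_totals(results, scheme):
    """Sum each driver's points; positions outside the scheme score zero."""
    return {
        driver: sum(scheme.get(pos, 0) for pos in positions)
        for driver, positions in results.items()
    }


def champion(results, scheme):
    """Return the driver with the highest season total under the scheme."""
    totals = season_totals(results, scheme)
    return max(totals, key=totals.get)


if __name__ == "__main__":
    for name, scheme in [("1991-2002", POINTS_1991), ("2010 onwards", POINTS_2010)]:
        print(name, season_totals(RESULTS, scheme), "champion:", champion(RESULTS, scheme))
```

In this toy season the same results crown A under the older scheme (40 points to B's 36) and B under the newer one (108 points to A's 104), because the newer scheme rewards consistent second places relatively more.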

In this project, the student will try to answer some of these questions using a combination of statistical modelling and publicly available F1 data, such as that provided by the Python package FastF1.
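As a rough illustration of what separating car quality from driver ability might look like, the sketch below fits an additive model (observed deficit = driver effect + team effect) by ordinary least squares with NumPy. This is a deliberately simple fixed-effects stand-in, not the multilevel or Bayesian models used in the papers listed below. The drivers D1 to D3, the teams FastCar and SlowCar, and all the numbers are invented. The approach relies on at least one driver (here D2) having driven for both teams, which is what links the two cars.

```python
import numpy as np

# Hypothetical toy data (invented): driver, team, average lap-time deficit in seconds.
OBS = [
    ("D1", "FastCar", 0.5),
    ("D1", "FastCar", 0.5),
    ("D2", "FastCar", 0.3),
    ("D2", "SlowCar", 1.3),
    ("D3", "SlowCar", 1.0),
    ("D3", "SlowCar", 1.0),
]

drivers = sorted({d for d, _, _ in OBS})
teams = sorted({t for _, t, _ in OBS})

# Design matrix: one indicator column per driver, then one per team.
X = np.zeros((len(OBS), len(drivers) + len(teams)))
y = np.zeros(len(OBS))
for i, (d, t, deficit) in enumerate(OBS):
    X[i, drivers.index(d)] = 1.0
    X[i, len(drivers) + teams.index(t)] = 1.0
    y[i] = deficit

# The system is rank deficient (a constant can be shifted between drivers and
# teams), so lstsq returns the minimum-norm least-squares solution.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# Only differences between drivers are identified, so centre the effects.
driver_eff = coef[: len(drivers)]
driver_eff = driver_eff - driver_eff.mean()

raw_mean = {d: float(np.mean([v for dd, _, v in OBS if dd == d])) for d in drivers}
adjusted = dict(zip(drivers, driver_eff.tolist()))

# Lower deficit is better in both rankings.
raw_ranking = sorted(drivers, key=raw_mean.get)
adjusted_ranking = sorted(drivers, key=adjusted.get)

if __name__ == "__main__":
    print("raw ranking:     ", raw_ranking)
    print("adjusted ranking:", adjusted_ranking)
```

On this invented data the raw averages favour D1, who only ever drove the faster car, while the adjusted effects favour D3. Real data would be noisy and far less balanced, which is one motivation for the multilevel approaches in the reading list.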

Pre-requisites

The student will need strong statistical modelling skills, good programming skills in R and/or Python, and a keen interest in F1 motor racing. The student should have taken the optional module MATH43515: Multilevel Modelling in order to undertake this project.

Some relevant resources

Papers

Eichenberger, R., Stadelmann, D. (2009) Who is the best Formula 1 driver? An economic approach to evaluating talent, Economic Analysis and Policy, 39(3):389-406.
Phillips, A. J. K. (2014) Uncovering Formula One driver performances from 1950 to 2013 by adjusting for team and competition effects, Journal of Quantitative Analysis in Sports, 10(2):261-278.
Bell, A., Smith, J., Sabel, C. E., Jones, K. (2016) Formula for success: Multilevel modelling of Formula One Driver and Constructor performance, 1950–2014, Journal of Quantitative Analysis in Sports, 12(2):99-112.
Henderson, D. A., Kirrane, L. J. (2018) A comparison of truncated and time-weighted Plackett-Luce models for probabilistic forecasting of Formula One results, Bayesian Analysis, 13(2):335-358.
van Kesteren, E-J., Bergkamp, T. (2023) Bayesian analysis of Formula One race results: disentangling driver skill and constructor advantage, Journal of Quantitative Analysis in Sports, 19(4):273-293.
Fry, J., Brighton, T., Fanzon, S. (2024) Faster identification of faster Formula 1 drivers via time-rank duality, Economics Letters, 237:111671.

Web links

Formula One (wikipedia)
Formula1.com
Data
- FastF1
- OpenF1
- Best Kaggle F1 datasets (e.g. this one)
Statistical methods (wikipedia)