Predicting Formula 1 Tire Degradation

Introduction and Motivation

Formula 1 is defined by milliseconds. Tire performance is a critical factor influencing lap times, pit stop strategy, and overall race outcomes. Yet, understanding when a tire will "fall off the cliff" is often part science, part art.

The goal of this project is to build a predictive model that estimates how many laps into a stint a tire will begin to degrade significantly. We do this by combining lap timing, tire metadata, driver and team information, and weather data across multiple seasons of F1 racing.

Data Collection and Labeling

Using the FastF1 Python library, we collected lap-by-lap data for every race from the 2022 season through the 2025 Canadian GP. Each lap record includes lap time, compound, tire age, driver and team statistics, track information, weather readings, and session metadata.
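As a rough illustration of the collection step, the sketch below loads one race with FastF1 and reduces it to a few lap-level columns. The function names load_race_laps and tidy_laps, the column selection, and the LapTimeSeconds column are assumptions made for this example, not the project's actual code. The column names themselves (Driver, Team, LapNumber, Stint, Compound, TyreLife, LapTime) are the ones FastF1 exposes on its laps table.

```python
import pandas as pd

# Lap-level columns as named by FastF1's laps table.
LAP_COLUMNS = ["Driver", "Team", "LapNumber", "Stint", "Compound", "TyreLife", "LapTime"]


def load_race_laps(year: int, gp: str) -> pd.DataFrame:
    """Download the laps of one race with FastF1 (requires network access)."""
    import fastf1  # imported lazily so tidy_laps can be used without FastF1 installed

    session = fastf1.get_session(year, gp, "R")  # "R" selects the race session
    session.load(telemetry=False)  # skip car telemetry; lap timing is enough here
    return pd.DataFrame(session.laps)


def tidy_laps(laps: pd.DataFrame) -> pd.DataFrame:
    """Keep the selected columns and convert lap times to float seconds."""
    out = laps[LAP_COLUMNS].dropna(subset=["LapTime"]).copy()
    out["LapTimeSeconds"] = out["LapTime"].dt.total_seconds()
    return out.drop(columns="LapTime").reset_index(drop=True)
```

A call such as tidy_laps(load_race_laps(2023, "Monza")) would then return one row per timed lap. Repeating it over every event in a season is left out of the sketch.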

For each stint, we label the lap at which performance degradation begins. We define it as the first lap whose time exceeds a dynamic threshold, computed from the rolling mean and rolling standard deviation of lap times within the stint:

threshold = rolling_mean + 1.5 * rolling_std

Only stints unaffected by retirements or collisions are included.
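The threshold rule above can be sketched with pandas rolling windows. The write-up does not state the window length or whether the current lap is included in its own statistics, so the five-lap window and the use of only the preceding laps (via shift) are assumptions of this example, as is the function name degradation_lap.

```python
from typing import Optional

import pandas as pd


def degradation_lap(lap_times: pd.Series, window: int = 5, k: float = 1.5) -> Optional[int]:
    """Return the 1-based stint lap whose time first exceeds the threshold, or None."""
    # Statistics are taken over the `window` laps before the current one, so a slow
    # lap does not inflate its own threshold (an assumption of this sketch).
    prior = lap_times.shift(1).rolling(window, min_periods=window)
    threshold = prior.mean() + k * prior.std()
    # Laps without a full window have a NaN threshold and compare as False.
    exceeded = (lap_times > threshold).to_numpy()
    if not exceeded.any():
        return None
    return int(exceeded.argmax()) + 1


# Illustrative lap times in seconds (made up, not race data).
stint = pd.Series([90.0, 90.1, 89.9, 90.2, 90.0, 90.1, 90.0, 92.5, 93.0])
label = degradation_lap(stint)
```

Applied per driver and stint, for example through a groupby over those two columns, this would produce one label per stint. Stints in which no lap crosses the threshold return None and would need separate handling.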

Feature Engineering

A wide variety of features are engineered from the raw data:

Modeling Approach

The target variable is tire_life, the number of laps into a stint before degradation begins. We applied a PowerTransformer to the target to bring its distribution closer to normal, because tire degradation does not progress linearly.
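A minimal sketch of the target transform, using scikit-learn's PowerTransformer on synthetic data. The gamma-distributed values stand in for tire_life and are not real race data. The default Yeo-Johnson method is used here because the write-up does not say which method was chosen.

```python
import numpy as np
from sklearn.preprocessing import PowerTransformer

# Synthetic, right-skewed stand-in for tire_life (not real race data).
rng = np.random.default_rng(0)
tire_life = rng.gamma(shape=4.0, scale=4.0, size=500) + 1.0

# Yeo-Johnson is the default method; the output is standardized to
# zero mean and unit variance.
transformer = PowerTransformer()

# scikit-learn transformers expect a 2-D array, hence the reshape.
transformed = transformer.fit_transform(tire_life.reshape(-1, 1)).ravel()

# Predictions made on the transformed scale must be mapped back to laps.
recovered = transformer.inverse_transform(transformed.reshape(-1, 1)).ravel()
```

The inverse step matters for reporting: an error computed on the transformed scale is not expressed in laps.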

We trained and tuned five regressors (XGBoost, CatBoost, Random Forest, Gradient Boosting, and LightGBM), using OptunaSearchCV for hyperparameter optimization.

Each model is wrapped in a Pipeline including preprocessing steps (e.g., one-hot encoding for categorical features).
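The pipeline structure might look like the sketch below, built with scikit-learn only. The column names, the synthetic data, and the use of TransformedTargetRegressor to apply the target transform are assumptions of this example. A random forest stands in for any of the regressors, and the hyperparameter search is left out.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer, TransformedTargetRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, PowerTransformer

CATEGORICAL = ["compound", "team"]  # hypothetical column names


def build_model() -> TransformedTargetRegressor:
    """One-hot encode categoricals, pass other columns through, transform the target."""
    preprocess = ColumnTransformer(
        [("onehot", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL)],
        remainder="passthrough",  # numeric columns are used unchanged
    )
    pipeline = Pipeline([
        ("preprocess", preprocess),
        ("regressor", RandomForestRegressor(n_estimators=50, random_state=0)),
    ])
    # Fits on the power-transformed target and inverts it again at predict time.
    return TransformedTargetRegressor(regressor=pipeline, transformer=PowerTransformer())


# Synthetic data for illustration only (not real race data).
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame({
    "compound": rng.choice(["SOFT", "MEDIUM", "HARD"], size=n),
    "team": rng.choice(["A", "B", "C"], size=n),
    "track_temp": rng.uniform(20.0, 50.0, size=n),
    "stint_number": rng.integers(1, 4, size=n),
})
base = X["compound"].map({"SOFT": 12.0, "MEDIUM": 20.0, "HARD": 28.0})
y = (base + rng.gamma(2.0, 2.0, size=n)).to_numpy()

model = build_model().fit(X, y)
predictions = model.predict(X.head(5))  # predictions are back on the lap scale
```

Keeping the encoder inside the pipeline means it is refit on each training fold during cross-validation, which avoids leaking category information from validation rows.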

Results

Performance metrics (MAE):

Model               MAE (laps)
XGBoost             1.700
CatBoost            1.715
Random Forest       1.736
Gradient Boosting   1.749
LightGBM            1.792

The best model (XGBoost) predicts the onset of tire degradation with a mean absolute error of 1.7 laps.
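For readers less familiar with the metric, mean absolute error is the average size of the gap between predicted and labeled laps, regardless of direction. The numbers below are made up to show the arithmetic and are not taken from the project's results.

```python
import numpy as np

# Made-up labels and predictions, in laps (not project results).
actual = np.array([18, 22, 15, 30])
predicted = np.array([20, 21, 17, 28])

# Absolute errors are 2, 1, 2, 2, so the mean is 7 / 4.
mae = float(np.mean(np.abs(actual - predicted)))
print(mae)  # 1.75
```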

Conclusion

This project demonstrates that tire degradation in Formula 1 can be modeled with a mean absolute error under two laps using open telemetry data. Integrating such a model into simulators or strategy planners could benefit both race engineers and fans.

Code & Reproducibility

All code, data processing functions, and model training pipelines are available in the project's GitHub repository.