Building a Football Prediction Model: Week 4

This week I am adding a time series model (SARIMA) to better react to team momentum and dips in form. Again, I am predicting matches days ahead of time to increase transparency.

What is SARIMA?

Seasonal AutoRegressive Integrated Moving Average (SARIMA) is a time-series forecasting model designed to predict future values based on historical patterns.

SARIMA captures changes in performance over a short window (5 matches), which should let the model pick up trends and momentum sooner than the current model does. Examples include improvements in chance creation or temporary dips caused by player injuries or fixture congestion.

SARIMA is highly sensitive to short-term form, but it lacks contextual understanding of opponent attacking/defending strength. It has been given a 30% weighting in my model.
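As a minimal sketch of how a 30% weighting could be applied, assume the SARIMA forecast and the existing model's xG are combined as a simple weighted average. The function name `blend_xg` and the input values are hypothetical, and the real model may combine the two differently.

```python
def blend_xg(baseline_xg: float, sarima_xg: float, sarima_weight: float = 0.3) -> float:
    """Weighted average of the baseline and SARIMA expected-goals estimates."""
    if not 0.0 <= sarima_weight <= 1.0:
        raise ValueError("sarima_weight must be between 0 and 1")
    # The baseline model keeps whatever weight SARIMA does not take.
    return (1.0 - sarima_weight) * baseline_xg + sarima_weight * sarima_xg


# Hypothetical inputs (not taken from the real model output).
blended = blend_xg(baseline_xg=1.20, sarima_xg=1.80)
print(round(blended, 2))
```

Keeping the weight as a parameter makes it easy to try other splits later without touching the rest of the pipeline.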

Below are the expected goals (xG) for next week’s set of matches, with and without SARIMA modelling.

Take Sunderland, for example. After their 3-0 win against Burnley, their xG has risen sharply from 0.38 to 1.01 to reflect the recent short-term momentum. SARIMA treats Sunderland’s recent match-by-match xG as a time series and detects an upward trend.
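To give a feel for what "detecting an upward trend" means, here is a small sketch using first differences, the match-to-match change that a time-series model can work with. This is not SARIMA itself, and the xG values are made up for illustration.

```python
# Hypothetical match-by-match xG values, oldest first (not real data).
recent_xg = [0.6, 0.7, 0.9, 1.2, 1.6]

# First differences: the change in xG from one match to the next.
changes = [later - earlier for earlier, later in zip(recent_xg, recent_xg[1:])]

# A positive average change points to an upward trend over the window.
average_change = sum(changes) / len(changes)
trending_up = average_change > 0
print(f"average change per match: {average_change:+.2f}, trending up: {trending_up}")
```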

Adding SARIMA in Python

Below is a screenshot of the code, with a simple explanation of each parameter. The standard notation for the model is SARIMA(𝑝,𝑑,𝑞)(𝑃,𝐷,𝑄)m, and each term is defined below.

There are 3 trend elements (p, d, q) and 4 seasonal elements (P, D, Q, m) in the model.

p = autoregressive order (how many previous values of the series are used to predict the current value)

d = differencing order (how many times the series is differenced, so the model works with the change between matches)

q = moving average order (how many previous prediction errors are fed into the prediction to smooth out randomness)

P = seasonal autoregressive order

D = seasonal differencing order

Q = seasonal moving average order

m = seasonal period (the number of matches in one seasonal cycle), set to 5.

All other Business

My aim is to improve the model in some tangible way every week. A fun way of tracking the model’s accuracy would be to set aside a small budget for the project (e.g. £10) and place bets on games based on the model’s predictions. It’s important to remember that betting sites build in a 10-15% margin for themselves, stacking the odds against the bettor no matter the outcome.

