The core of any tournament simulation is how it determines outcome probabilities for individual matches. While there are many ways to do this, I adopted FIFA’s Elo-based rating system.

An Elo rating system is a statistical method for measuring the skill levels of players or teams, based on match outcomes and opponent strength.

Conveniently, this allows for updating ratings after each match and provides a formula for expected results between teams with any given rating difference.

The formula I used was:

E(D) = 1 / (10^{-D/600} + 1)

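This gives a team's expected result from a match (E), which can be read roughly as its chance of winning with a draw counted as half a win, based on the difference between its Elo rating and its opponent's rating (D). The result is a number between 0 and 1, where values close to 0 mean an almost certain loss and values close to 1 mean an almost guaranteed victory.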
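As a minimal sketch, the formula can be written in Python as follows. The function name is illustrative and is not taken from the original simulation.

```python
def expected_result(rating_diff: float) -> float:
    """Expected result E(D) for a team whose Elo rating exceeds
    its opponent's by rating_diff points (D may be negative)."""
    return 1.0 / (10.0 ** (-rating_diff / 600.0) + 1.0)


if __name__ == "__main__":
    for diff in (-600, 0, 600):
        print(diff, round(expected_result(diff), 3))
```

For example, two equally rated teams each get 0.5, while a team rated 600 points above its opponent gets about 0.91.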

However, to simulate the full tournament, we must contend with progression from the group stage, where goal difference is often a crucial component. To do this, I’ve leant on work by German researcher Patrick Heuer and his colleagues, who provide a goal difference distribution that can be used as a starting point for our simulations. I standardised their distribution as:

To adapt it for games between opponents with a given ratings difference, an exponential tilt is employed:

p_D(d) ∝ p(d) × 10^{0.365 × d × (D/600)}

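This means we take the symmetrical distribution shown above and multiply it by another factor which depends on both the goal difference (d) and the difference in Elo ratings (D). When D is positive, the factor is greater than 1 for positive goal differences and less than 1 for negative ones, so probability shifts towards wins for the higher-rated team. The result is then rescaled so that the probabilities sum to 1.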

The factor 0.365 is chosen so that the expected outcome implied by this tilted goal difference distribution matches the one given by the Elo formula as closely as possible.
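The tilt can be sketched in Python as below. Note the assumptions: the base distribution here is a made-up symmetric placeholder, not the standardised distribution used in the article, and the function names are illustrative. Scoring a draw as half a win when computing the expected outcome is also an assumption, made so the figure can be compared with the Elo formula.

```python
def tilt_distribution(base, rating_diff, factor=0.365):
    """Apply the exponential tilt p_D(d) ∝ p(d) × 10^{factor × d × (D/600)}
    and renormalise so the probabilities sum to 1."""
    weights = {
        d: p * 10.0 ** (factor * d * rating_diff / 600.0)
        for d, p in base.items()
    }
    total = sum(weights.values())
    return {d: w / total for d, w in weights.items()}


def expected_outcome(dist):
    """Expected outcome from a goal difference distribution,
    assuming a win scores 1, a draw 0.5 and a loss 0."""
    return sum(
        p * (1.0 if d > 0 else 0.5 if d == 0 else 0.0)
        for d, p in dist.items()
    )


# Hypothetical symmetric base distribution over goal differences (placeholder values).
BASE = {-3: 0.05, -2: 0.10, -1: 0.20, 0: 0.30, 1: 0.20, 2: 0.10, 3: 0.05}

if __name__ == "__main__":
    tilted = tilt_distribution(BASE, rating_diff=200)
    for d in sorted(tilted):
        print(d, round(tilted[d], 3))
    print("expected outcome:", round(expected_outcome(tilted), 3))
```

With a rating difference of zero the tilt leaves the base distribution unchanged, and the expected outcome is 0.5. How closely the tilted figure tracks the Elo formula at other rating differences depends on the base distribution actually used.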

The maths involved