
**Simulating a Football Season**

This is a nice example of how statistics are used in modeling – similar techniques are used when gambling companies are creating odds or when computer game designers are making football manager games. We start with some statistics. The soccer stats site has the data we need from the 2018-19 season, and we will use this to predict the outcome of the 2019-20 season (assuming teams stay at a similar level, and that no-one was relegated in 2018-19).

**Attack and defense strength**

For each team we need to calculate:

- Home attack strength
- Away attack strength
- Home defense strength
- Away defense strength.

For example, for Liverpool (LFC):

LFC Home attack strength = (LFC home goals in 2018-19 season)/(average home goals in 2018-19 season)

LFC Away attack strength = (LFC away goals in 2018-19 season)/(average away goals in 2018-19 season)

LFC Home defense strength = (LFC home goals conceded in 2018-19 season)/(average home goals conceded in 2018-19 season)

LFC Away defense strength = (LFC away goals conceded in 2018-19 season)/(average away goals conceded in 2018-19 season)
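As a rough sketch of the ratios above, the snippet below computes home attack and home defense strengths for a tiny made-up league. The team names and goal totals are hypothetical placeholders, not the real 2018-19 figures, and the "average" is assumed to be the mean total per team.

```python
# Hypothetical season totals, for illustration only (not the real 2018-19 data).
home_goals_scored = {"Team A": 55, "Team B": 40, "Team C": 25}
home_goals_conceded = {"Team A": 10, "Team B": 20, "Team C": 30}


def strengths(totals):
    """Divide each team's total by the league average of those totals."""
    average = sum(totals.values()) / len(totals)
    return {team: value / average for team, value in totals.items()}


home_attack = strengths(home_goals_scored)
home_defense = strengths(home_goals_conceded)

for team in home_goals_scored:
    print(f"{team}: home attack {home_attack[team]:.2f}, "
          f"home defense {home_defense[team]:.2f}")
```

A strength of 1 means league average. Note that for defense a value below 1 is good, since it means the team concedes fewer goals than average. The away strengths would be computed the same way from the away totals.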

**Calculating lambda**

We can then use a Poisson model to work out some probabilities. First, though, we need to find our lambda value. To make life easier, we can use the fact that the lambda value of a Poisson distribution is its mean, and use this mean directly to give an approximate predicted score.

So, for example, if Liverpool are playing at home to Arsenal, we work out Liverpool’s lambda value as:

LFC home lambda = league average home goals per game x LFC home attack strength x Arsenal away defense strength.

We would work out Arsenal’s away lambda as:

Arsenal away lambda = league average away goals per game x Arsenal away attack strength x Liverpool home defense strength.

Putting in some values gives a home lambda for Liverpool of 3.38 and an away lambda for Arsenal of 0.68. So we would expect Liverpool to win 3-1 (rounding each value to the nearest integer).
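The same calculation can be sketched in a few lines of Python. The input figures are the ones quoted for this fixture later in the post; the function name `expected_goals` is just an illustrative choice.

```python
def expected_goals(league_average, attack_strength, opponent_defense_strength):
    """Lambda for one side: league average x attack x opponent's defense."""
    return league_average * attack_strength * opponent_defense_strength


# Figures quoted in this post for Liverpool (home) v Arsenal (away).
lfc_home_lambda = expected_goals(1.57, 1.84, 1.17)
arsenal_away_lambda = expected_goals(1.25, 1.30, 0.42)

# Round each lambda to the nearest integer to get a single predicted score.
predicted_score = (round(lfc_home_lambda), round(arsenal_away_lambda))

print(f"LFC home lambda: {lfc_home_lambda:.2f}")
print(f"Arsenal away lambda: {arsenal_away_lambda:.2f}")
print(f"Predicted score: {predicted_score[0]}-{predicted_score[1]}")
```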

**Using Excel**

I then used an Excel spreadsheet to work out the home goals in each fixture in the league season (green column represents the home teams)

and then used the same method to work out the away goals in each fixture in the league (yellow column represents the away team)

I could then round these numbers to the nearest integer and fill in the scores for each match in the table:

Then I was able to work out the point totals to produce a predicted table:

Here we had both Liverpool and Manchester City on 104 points, but with Manchester City having a better goal difference, so winning the league again.
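For readers who prefer code to a spreadsheet, here is a minimal sketch of turning a list of predicted scores into a league table. The fixtures and scores are hypothetical, and the usual rules are assumed: 3 points for a win, 1 for a draw, with ties on points broken by goal difference.

```python
from collections import defaultdict

# Hypothetical rounded predictions: (home team, away team, home goals, away goals).
predicted_results = [
    ("Team A", "Team B", 3, 1),
    ("Team B", "Team C", 1, 1),
    ("Team C", "Team A", 0, 2),
    ("Team B", "Team A", 1, 2),
    ("Team C", "Team B", 2, 0),
    ("Team A", "Team C", 2, 0),
]


def build_table(results):
    """Return (team, points, goal difference) rows, best team first."""
    points = defaultdict(int)
    goal_difference = defaultdict(int)
    for home, away, home_goals, away_goals in results:
        goal_difference[home] += home_goals - away_goals
        goal_difference[away] += away_goals - home_goals
        if home_goals > away_goals:
            points[home] += 3
        elif home_goals < away_goals:
            points[away] += 3
        else:
            points[home] += 1
            points[away] += 1
    rows = [(team, points[team], goal_difference[team]) for team in goal_difference]
    # Sort by points, then goal difference, both descending.
    return sorted(rows, key=lambda row: (row[1], row[2]), reverse=True)


table = build_table(predicted_results)
for team, pts, gd in table:
    print(f"{team}: {pts} points, goal difference {gd:+d}")
```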

**Using a Poisson model**

The Poisson model allows us to calculate probabilities. The model is:

P(k goals) = (e^{-λ}λ^{k})/k!

λ is the symbol lambda, the expected number of goals that we calculated before.

So, for example with Liverpool at home to Arsenal we calculate

Liverpool’s home lambda = league average home goals per game x LFC home attack strength x Arsenal away defense strength.

**Liverpool’s home lambda = 1.57 x 1.84 x 1.17 = 3.38**

Therefore

P(Liverpool score 0 goals) = (e^{-3.38}3.38^{0})/0! = 0.034

P(Liverpool score 1 goal) = (e^{-3.38}3.38^{1})/1! = 0.12

P(Liverpool score 2 goals) = (e^{-3.38}3.38^{2})/2! = 0.19

P(Liverpool score 3 goals) = (e^{-3.38}3.38^{3})/3! = 0.22

P(Liverpool score 4 goals) = (e^{-3.38}3.38^{4})/4! = 0.19

P(Liverpool score 5 goals) = (e^{-3.38}3.38^{5})/5! = 0.13 etc.

**Arsenal’s away lambda = 1.25 x 1.30 x 0.42 = 0.68**

P(Arsenal score 0 goals) = (e^{-0.68}0.68^{0})/0! = 0.51

P(Arsenal score 1 goal) = (e^{-0.68}0.68^{1})/1! = 0.34

P(Arsenal score 2 goals) = (e^{-0.68}0.68^{2})/2! = 0.12

P(Arsenal score 3 goals) = (e^{-0.68}0.68^{3})/3! = 0.03 etc.
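These probabilities can be checked with a short Python sketch of the Poisson formula. The function name `poisson_pmf` is an illustrative choice, and the lambda values are the rounded ones used above.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """P(k goals) = e^(-lambda) * lambda^k / k!"""
    return exp(-lam) * lam**k / factorial(k)


liverpool_lambda = 3.38
arsenal_lambda = 0.68

for k in range(6):
    print(f"P(Liverpool score {k}) = {poisson_pmf(k, liverpool_lambda):.3f}")

for k in range(4):
    print(f"P(Arsenal score {k}) = {poisson_pmf(k, arsenal_lambda):.3f}")
```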

**Probability that Arsenal win**

Arsenal can win if:

Liverpool score 0 goals and Arsenal score 1 or more

Liverpool score 1 goal and Arsenal score 2 or more

Liverpool score 2 goals and Arsenal score 3 or more etc.

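i.e. the approximate probability of Arsenal winning is:

0.034 x 0.49 + 0.12 x 0.15 + 0.19 x 0.03 = 0.04.

Here 0.49 = 1 - 0.51 is the probability that Arsenal score 1 or more, 0.15 = 1 - 0.51 - 0.34 is the probability that they score 2 or more, and 0.03 approximates the probability that they score 3 or more.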

Using the same method we could work out the probability of a draw and a Liverpool win. This is the sort of method that bookmakers will use to calculate the probabilities that ensure they make a profit when offering odds.
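A minimal sketch of the full win/draw/loss calculation is given below. It assumes the two teams' goal counts are independent Poisson variables (the hand calculation above makes the same assumption) and truncates the sum at 10 goals per team, which is an arbitrary cut-off chosen here because higher scores have negligible probability.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """P(k goals) = e^(-lambda) * lambda^k / k!"""
    return exp(-lam) * lam**k / factorial(k)


def match_outcome_probabilities(home_lambda, away_lambda, max_goals=10):
    """Sum P(home = h) * P(away = a) over every scoreline up to max_goals each."""
    home_win = draw = away_win = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, home_lambda) * poisson_pmf(a, away_lambda)
            if h > a:
                home_win += p
            elif h == a:
                draw += p
            else:
                away_win += p
    return home_win, draw, away_win


# Liverpool (home, lambda 3.38) v Arsenal (away, lambda 0.68).
home_win, draw, away_win = match_outcome_probabilities(3.38, 0.68)
print(f"Liverpool win: {home_win:.3f}")
print(f"Draw: {draw:.3f}")
print(f"Arsenal win: {away_win:.3f}")
```

The Arsenal win probability comes out close to the 0.04 found by hand, with the remaining probability split between a draw and a Liverpool win.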

**Modeling Volcanoes – When will they erupt?**

A recent post by the excellent Maths Careers website looked at how we can model volcanic eruptions mathematically. This is part of an important branch of mathematics that assigns risk to events, and its methods are widely used by statisticians and insurers. Given that large-scale volcanic eruptions have the potential to end modern civilisation, it’s also useful to know how likely the next large eruption is.

The Guardian has recently run a piece on the dangers that large volcanoes pose to humans. Iceland’s Eyjafjallajökull volcano, which erupted in 2010, caused over 100,000 flights to be grounded and cost the global economy over $1 billion, and yet this was only a very minor eruption historically speaking. For example, the Tambora eruption in Indonesia (1815) was so big that the explosion could be heard over 2000km away, and the 200 million tonnes of sulphur that were emitted spread across the globe, lowering global temperatures by 2 degrees Celsius. This led to widespread famine as crops failed, and to tens of thousands of deaths.

**Super volcanoes**

Even this destruction is insignificant when compared to the potential damage caused by a super volcano. These volcanoes, like the one underneath Yellowstone Park in America, have the potential to wipe out millions in the initial explosion and to send enough sulphur and ash into the air to cause a “volcanic winter” of significantly lower global temperatures. The graphic above shows that the ash from a Yellowstone eruption could cover the ground of about half the USA. The resultant widespread disruption to global food supplies and travel would be devastating.

So, how can we predict the probability of a volcanic eruption? The easiest model to use, if we already have an estimated average rate of eruption, is the Poisson distribution:

Pr(X = k) = (e^{-λ}λ^{k})/k!

This formula calculates the probability that X equals a given value k, where λ is the mean of the distribution. If X represents the number of volcanic eruptions, we have Pr(X ≥ 1) = 1 - Pr(X = 0). This gives us a formula for the probability of at least one eruption: 1 - e^{-λ}. For example, the Yellowstone super volcano erupts around once every 600,000 years. Therefore, if λ is the mean number of eruptions per year, we have λ = 1/600,000 ≈ 0.00000167, and 1 - e^{-λ} is also ≈ 0.00000167. This gets more interesting if we look at the probability over a range of years. We can do this by modifying the formula to 1 - e^{-tλ}, where t is the number of years in our range.

So the probability of a Yellowstone eruption in the next 1000 years is 1 - e^{-0.00167} ≈ 0.00167, and the probability in the next 10,000 years is 1 - e^{-0.0167} ≈ 0.017. So we have a little under a 2% chance of this eruption in the next 10,000 years.
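The formula 1 - e^{-tλ} is easy to sketch in Python. The function name `eruption_probability` is an illustrative choice, and the rate is the one-eruption-per-600,000-years estimate used above.

```python
from math import exp


def eruption_probability(rate_per_year, years):
    """Probability of at least one eruption in `years`: 1 - e^(-t * lambda)."""
    return 1 - exp(-rate_per_year * years)


yellowstone_rate = 1 / 600_000  # mean eruptions per year

p_1000 = eruption_probability(yellowstone_rate, 1_000)
p_10000 = eruption_probability(yellowstone_rate, 10_000)

print(f"Next 1,000 years: {p_1000:.5f}")
print(f"Next 10,000 years: {p_10000:.4f}")
```

For rates this small, the result is almost exactly tλ itself, which is why the one-year probability is so close to λ.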

A far smaller volcano, like Katla in Iceland, has erupted 16 times in the past 1100 years, giving an average of one eruption every ≈ 70 years. This gives λ = 1/70 ≈ 0.014. So we can expect this to erupt in the next 10 years with probability 1 - e^{-0.14} ≈ 0.13, and in the next 30 years with probability 1 - e^{-0.42} ≈ 0.34.

The models for volcanic eruptions can get a lot more complicated, especially as we often don’t have accurate data from which to estimate λ. λ can be estimated using a technique called Maximum Likelihood Estimation, which you can read about here.
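For the simple Poisson model, the maximum likelihood estimate of λ is just the observed number of events divided by the length of the observation period. The sketch below applies this to the Katla figures (16 eruptions in 1100 years) without rounding the rate to 1/70, so the results differ slightly from the rounded values above. The function names are illustrative.

```python
from math import exp


def estimate_rate(event_count, observation_years):
    """Maximum likelihood estimate of a Poisson rate: events per year."""
    return event_count / observation_years


def eruption_probability(rate_per_year, years):
    """Probability of at least one eruption in `years`: 1 - e^(-t * lambda)."""
    return 1 - exp(-rate_per_year * years)


katla_rate = estimate_rate(16, 1100)

p_10 = eruption_probability(katla_rate, 10)
p_30 = eruption_probability(katla_rate, 30)

print(f"Estimated rate: {katla_rate:.4f} eruptions per year")
print(f"Next 10 years: {p_10:.3f}")
print(f"Next 30 years: {p_30:.3f}")
```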
