**Statistics to win penalty shoot-outs**

With the World Cup upon us again we can perhaps look forward to yet another heroic defeat on penalties by England. England are in fact the worst of the major footballing nations at taking penalties, having won only 1 of their 7 shoot-outs at the Euros and World Cups. Indeed, of the 35 penalties England have taken in shoot-outs they have missed 12 – a miss rate of over 30%. Germany, by comparison, have won 5 out of 7 – with a miss rate of only 15%.

With the stakes in penalty shoot-outs so high there have been a number of studies to look at optimum strategies for players.

**Shoot left when ahead**

One study published in Psychological Science looked at all the penalties taken in penalty shoot-outs in the World Cup since 1982. What they found was pretty incredible – goalkeepers have a subconscious bias for diving to the right when their team is behind.

As is clear from the graphic, this is not a small bias towards the right, but a very strong one. When their team is behind, the goalkeeper apparently favours his (likely) strong side 71% of the time. The strikers’ shots, meanwhile, continue to be placed either left or right with roughly the same likelihood as in the other situations. So, this built-in bias makes the goalkeeper much less likely to help his team recover from a losing position in a shoot-out.

**Shoot high**

Analysis by Prozone looking at the data from the World Cups and European Championships between 1998 and 2010 compiled the following graphics:

The first graphic above shows the part of the goal that scoring penalties were aimed at. With most strikers aiming bottom left and bottom right it’s no surprise to see that these were the most successful areas.

The second graphic which shows where penalties were saved shows a more complete picture – goalkeepers made nearly all their saves low down. A striker who has the skill and control to lift the ball high makes it very unlikely that the goalkeeper will save his shot.

The last graphic also shows the risk involved in shooting high. This data shows where all the off-target missed penalties were aimed. Unsurprisingly, strikers who were aiming down the middle of the goal managed to hit the target! Interestingly, strikers aiming for the right corner (as the goalkeeper stands) were far more likely to drag their shot off target than those aiming for the left side. Perhaps this is to do with them being predominantly right-footed and the angle of their shooting arc?

**Win the toss and go first**

The Prozone data also showed the importance of winning the coin toss – 75% of the teams who went first went on to win. Equally, missing the first penalty is disastrous to a team’s chances – such teams went on to lose 81% of the time. The statistics also reveal a huge psychological element. Players who needed to score to keep their teams in the competition scored only a miserable 14% of the time. It would be interesting to see how these statistics are replicated over a larger data set.

**Don’t dive**

A different study which looked at 286 penalties from both domestic leagues and international competitions found that goalkeepers are actually best advised to stay in the centre of the goal rather than diving to one side. This had quite a significant effect on their ability to save the penalties – increasing the likelihood from around 13% to 33%. So, why don’t more goalkeepers stay still? Well, again this might come down to psychology – a diving save looks more dramatic and showcases the goalkeeper’s skill more than standing stationary in the centre.

**So, why do England always lose on penalties?**

There are some interesting psychological studies which suggest that England suffer more than other teams because English players are inhibited by their high public status (in other words, there is more pressure on them to perform – and hence that pressure is harder to deal with). One such study noted that the best penalty takers are the ones who compose themselves prior to the penalty. England’s players start to run to the ball only 0.2 seconds after the referee has blown – making them much less composed than other teams.

However, I think you can read too much into the psychology – the answer is probably simpler: other teams beat England because they have technically better players. English footballing culture revolves much less around technical skill than it does elsewhere in Europe and South America – and when it comes to penalty shoot-outs this has a dramatic effect.

As we can see from the statistics, players who are technically gifted enough to lift their shots into the top corners give the goalkeepers virtually no chance of saving them. England’s less technically gifted players have to rely on hitting it hard and low to the corner – which gives the goalkeeper a much higher percentage chance of saving them.

**Test yourself**

You can test your penalty taking skills with this online game from the Open University – choose which players are best suited to the pressure, decide what advice they need and aim your shot in the best position.

If you liked this post you might also like:

Championship Wages Predict League Position? A look at how statistics can predict where teams finish in the league.

Premier League Wages Predict League Positions? A similar analysis of Premier League teams.

This carries on the previous investigation into Farey sequences, and is again based on the current Nrich task Ford Circles. Below are the Farey sequences for F_{2}, F_{3} and F_{4}. You can read about Farey sequences in the previous post.

This time I’m going to explore the link between Farey sequences and circles. First we need the general equation for a circle with centre (p,q) and radius r:

(x – p)^{2} + (y – q)^{2} = r^{2}

Therefore

**Circle 1:**

(x – a/b)^{2} + (y – 1/(2b^{2}))^{2} = (1/(2b^{2}))^{2}

has centre:

(a/b, 1/(2b^{2}))

and radius:

1/(2b^{2})

**Circle 2:**

(x – c/d)^{2} + (y – 1/(2d^{2}))^{2} = (1/(2d^{2}))^{2}

has centre:

(c/d, 1/(2d^{2}))

and radius:

1/(2d^{2})

Now we can plot these circles in Geogebra – and look for the values of a,b,c,d which lead to the circles touching at a point.

**When a = 1, b = 2, c = 2, d = 3:**

Do we notice anything about the numbers a/b and c/d? Well, a/b = 1/2 and c/d = 2/3 are consecutive terms in the F_{3} sequence. So do other consecutive terms in the Farey sequence also generate circles touching at a point?

**a = 1, b = 1, c = 2, d = 3**

Again we can see that the fractions 1/1 and 2/3 are consecutive terms in the F_{3} sequence. So by drawing some more circles we can graphically represent all the fractions in the F_{3} sequence:

So these four circles represent the four non-zero fractions in the F_{3} sequence!

And this is the visual representation of the non-zero fractions in the F_{4} sequence. Amazing!
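As a quick numerical check, here is a short Python sketch (my own, not from the original post) using the standard Ford circle construction, where the fraction p/q gets a circle with centre (p/q, 1/(2q^{2})) and radius 1/(2q^{2}). It verifies that consecutive Farey fractions do indeed give circles touching at a point:

```python
from math import isclose, hypot

def ford_circle(p, q):
    """Centre and radius of the Ford circle for the fraction p/q."""
    r = 1 / (2 * q * q)
    return (p / q, r), r

def touching(p, q, r, s):
    """True if the circles for p/q and r/s touch at a single point,
    i.e. the distance between centres equals the sum of the radii."""
    (x1, y1), r1 = ford_circle(p, q)
    (x2, y2), r2 = ford_circle(r, s)
    return isclose(hypot(x2 - x1, y2 - y1), r1 + r2)

# Consecutive non-zero fractions in F_3: 1/3, 1/2, 2/3, 1/1
print(touching(1, 3, 1, 2))  # True
print(touching(1, 2, 2, 3))  # True
print(touching(2, 3, 1, 1))  # True
# Non-consecutive fractions need not touch:
print(touching(1, 3, 1, 1))  # False
```

The same function could be used to check any pair of fractions from a larger Farey sequence.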

**Modelling more Chaos**

This post was inspired by Rachel Thomas’ Nrich article on the same topic. I’ll carry on the investigation suggested in the article. We’re going to explore chaotic behavior – where small changes to initial conditions lead to widely different outcomes. Chaotic behavior is what makes modelling (say) weather patterns so complex.

**f(x) = sin(x)**

This time let’s do the same with f(x) = sin(x).

**Starting value of x = 0.2**

**Starting value of x = 0.2001**

**Both graphs superimposed**

This time the graphs do not show any chaotic behavior over the first 40 iterations – a small difference in initial conditions has made a negligible difference to the output. Even after 200 iterations we get the two values x = 0.104488151 and x = 0.104502319.

**f(x) = tan(x)**

Now this time with f(x) = tan(x).

**Starting value of x = 0.2**

**Starting value of x = 0.2001**

**Both graphs superimposed**

This time both graphs remained largely the same up until around the 38th data point – with large divergence after that. Let’s see what would happen over the next 50 iterations:

Therefore we can see that tan(x) is much more susceptible to small initial state changes than sin(x). This makes sense by considering the graphs of tan(x) and sin(x). Sin(x) remains bounded between -1 and 1, whereas tan(x) is unbounded with asymptotic behaviour as we approach pi/2.
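This difference in sensitivity can be reproduced with a short Python sketch (my own, not from the original post), iterating each function from the two nearby starting values used above:

```python
from math import sin, tan

def orbit(f, x0, n):
    """Iterate x -> f(x) n times and return the final value."""
    x = x0
    for _ in range(n):
        x = f(x)
    return x

# sin(x): the two nearby starting values stay close together
a, b = orbit(sin, 0.2, 200), orbit(sin, 0.2001, 200)
print(abs(a - b))  # a tiny difference, of order 1e-5

# tan(x): the orbits blow up near pi/2 and then decorrelate
diffs = []
x, y = 0.2, 0.2001
for _ in range(100):
    x, y = tan(x), tan(y)
    diffs.append(abs(x - y))
print(max(diffs) > 1)  # True: the orbits diverge wildly
```

The tan orbits grow slowly at first (tan(x) > x for 0 < x < pi/2), then one iterate lands close enough to pi/2 to produce a huge value – which matches the divergence seen around the 38th data point above.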

This is a mini investigation based on the current Nrich task Farey Sequences.

As Nrich explains:

I’m going to look at Farey sequences (though I won’t worry about rearranging them in order of size). Here are some of the first Farey sequences. The missing fractions are all ones which simplify to a fraction already on the list (e.g. 2/4 is missing because this is the same as 1/2)

You should be able to notice that the next Farey sequence always contains the previous Farey sequence, so the problem becomes working out which of the new fractions added will not cancel down to something already on the list.

**Highest Common Factors**

Fractions will not cancel down (simplify) if the numerator and denominator have a highest common factor (HCF) of 1. For example 2/4 simplifies because the highest common factor of 2 and 4 is 2. Therefore both top and bottom can be divided by 2. 4/5 does not simplify because the HCF of 4 and 5 is 1.

We call two numbers which have an HCF of 1 **relatively prime.**

For example, for the number 4: 1 and 3 are both relatively prime to 4 (HCF of 1 and 4 = 1, HCF of 3 and 4 = 1).

**Relatively prime numbers**

2: 1

3: 1,2

4: 1,3

5: 1,2,3,4

6: 1,5

7: 1,2,3,4,5,6

8: 1,3,5,7

9: 1,2,4,5,7,8

You might notice that these give the required numerators for any given denominator – i.e. when the denominator is 9, we want numerators of 1, 2, 4, 5, 7 and 8.

**Euler totient function**

Euler’s totient function is a really useful function in number theory – which counts the number of relatively prime numbers a given number has. For example from our list we can see that 9 has 6 relatively prime numbers.

Euler’s totient function is defined as ϕ(n) = n(1 – 1/p_{1})(1 – 1/p_{2})… , where p_{1}, p_{2}, … are the distinct prime factors of n – it’s not as complicated as it looks! The strange symbol in the usual definition is the product symbol – i.e. we multiply terms together. It’s easiest to understand with some examples. To find Euler’s totient function we first work out the prime factors of a number. Say we have the number 8. The prime factorisation of 8 is 2^{3}. Therefore the only unique prime factor is 2.

Therefore the Euler totient function tells me to simply do 8 (1 – 1/2) = 4. This is how many relatively prime numbers 8 has.

Let’s look at another example – this time for the number 10. 10 has the prime factorisation 5 x 2. Therefore it has 2 unique primes, 2 and 5. Therefore the Euler totient function tells me to do 10(1-1/2)(1-1/5) = 4.

One more example, this time with the number 30. This has prime factorisation 2 x 3 x 5. This has unique prime factors 2,3,5 so I will do 30(1 -1/2)(1-1/3)(1-1/5) =8.
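The worked examples above can be checked with a short Python sketch (my own, not from the original post) which applies the product formula directly:

```python
def totient(n):
    """Count the integers from 1 to n that are relatively prime to n,
    using Euler's product formula over the distinct prime factors."""
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            result -= result // p      # multiply by (1 - 1/p)
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:                          # a remaining prime factor
        result -= result // m
    return result

print(totient(8))   # 4
print(totient(10))  # 4
print(totient(30))  # 8
```

These match the three hand calculations above, and totient(9) = 6 matches the list of relatively prime numbers earlier in the post.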

**An equation for the number of fractions in the Farey sequence**

Therefore I can now work out how many fractions will appear in a given Farey sequence. I notice that for (say) F_{5} I will add Euler’s totient for n = 2, n = 3, n = 4 and n = 5. I then add 2 to account for 0/1 and 1/1. Therefore I have:

number of terms in F_{n} = 2 + ϕ(2) + ϕ(3) + … + ϕ(n)

For example, to find F_{6}:

number of terms in F_{6} = 2 + ϕ(2) + ϕ(3) + ϕ(4) + ϕ(5) + ϕ(6) = 2 + 1 + 2 + 2 + 4 + 2 = 13

There are lots of things to investigate about Farey sequences – could you prove why all Farey sequences have an odd number of terms? You can also look at how well the number of terms in the Farey sequence is approximated by the following equation:

number of terms in F_{n} ≈ 3n^{2}/π^{2}

For example, when n = 10 the approximation 3(10)^{2}/π^{2} gives about 30.4, and when n = 1000 the approximation 3(1000)^{2}/π^{2} gives about 303,964.

These results compare reasonably well as an estimation to the real answers of 33 and 304,193 respectively.
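Both the exact count and the approximation can be checked in a few lines of Python (a sketch of my own, reusing the totient function from the product formula):

```python
from math import pi

def totient(n):
    """Euler's totient via the product over distinct prime factors."""
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            result -= result // p
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:
        result -= result // m
    return result

def farey_length(n):
    """Number of terms in the Farey sequence F_n (including 0/1 and 1/1)."""
    return 2 + sum(totient(k) for k in range(2, n + 1))

print(farey_length(10))              # 33
print(farey_length(1000))            # 304193
print(round(3 * 10**2 / pi**2))      # about 30: the n = 10 approximation
```

So the exact counts of 33 and 304,193 agree with the post, while the 3n^{2}/π^{2} estimate comes in slightly low each time.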


**Modelling Chaos**

This post was inspired by Rachel Thomas’ Nrich article on the same topic. I’ll carry on the investigation suggested in the article. We’re going to explore chaotic behavior – where small changes to initial conditions lead to widely different outcomes. Chaotic behavior is what makes modelling (say) weather patterns so complex.

Let’s start as in the article with the function:

**f(x) = 4x(1-x)**

We can then start an iterative process where we choose an initial value, calculate f(x) and then use this answer to calculate a new f(x) etc. For example when I choose x = 0.2, f(0.2) = 0.64. I then use this value to find a new value f(0.64) = 0.9216. I used a spreadsheet to plot 40 iterations for the starting values of x = 0.2 and x = 0.2001. This generated the following spreadsheet (cut to show the first 10 terms):

I then imported this table into Desmos to map how the change in the starting value from 0.2 to 0.2001 affected the resultant graph.

**Starting value of x = 0.2**

**Starting value of x = 0.2001**

**Both graphs superimposed**

We can see that for the first 10 terms the graphs are virtually the same – but then we get a wild divergence, before the graphs seem to synchronize more closely again. One thing we notice is that the data is bounded between 0 and 1. Can we prove why this is?

If we start with a value of x such that:

0<x<1.

then when we plot f(x) = 4x – 4x^{2} we can see that the graph has a maximum at x = 1/2:


Therefore, since the maximum value of f(x) on this interval is f(1/2) = 1 and the minimum is 0, any starting value of x between 0 and 1 will return a new value also bounded between 0 and 1. Starting values outside the interval 0 ≤ x ≤ 1 will tend to negative infinity, because the x^{2} term grows much more rapidly than x.
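The iteration described above can be reproduced in a few lines of Python (a sketch of my own, not part of the original spreadsheet):

```python
def f(x):
    """One step of the logistic map with a = 4."""
    return 4 * x * (1 - x)

xs, ys = [0.2], [0.2001]
for _ in range(40):
    xs.append(f(xs[-1]))
    ys.append(f(ys[-1]))

# Every iterate stays bounded between 0 and 1...
print(all(0 <= v <= 1 for v in xs + ys))  # True
# ...but the 0.0001 gap in starting values blows up to order 1
print(max(abs(p - q) for p, q in zip(xs, ys)))
```

The difference roughly doubles each step, so the 0.0001 gap reaches order 1 after about 13 iterations – matching the wild divergence seen after the first 10 or so terms.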

**f(x) = ax(1-x)**

Let’s now explore what happens as we change the value of a whilst keeping our initial starting values of x = 0.2 and x = 0.2001

a = 0.8

Both graphs are superimposed here, but they are identical at the scale we are using. We can see that both values are attracted to 0 (we can say that 0 is an **attractor** for our system).

a = 1.2

Again both graphs are superimposed but are identical at the scale we are using. We can see that both values are attracted to 1/6 (we can say that 1/6 is an **attractor** for our system).

In general, for f(x) = ax(1-x) with -1≤x≤1, the attractors are given by x = 0 and x = 1 – 1/a, but it depends on the starting conditions as to whether we will end up being attracted to this point.

**f(x) = 0.8x(1-x)**

So, let’s look at f(x) = 0.8x(1-x) for different starting values -1 ≤ x ≤ 1. Our attractors are given by x = 0 and x = 1 – 1/0.8 = -0.25.

When our initial value is x = 0 we remain at the point x = 0.

When our initial value is x = -0.25 we remain at the point x = -0.25.

When our initial value is x < -0.25 we tend to negative infinity.

When our initial value is -0.25 < x ≤ 1 we tend towards x = 0.

**Starting value of x = -0.249999:**

Therefore we can say that x = 0 is a **stable attractor**, initial values close to x = 0 will still tend to 0.

However x = -0.25 is a **fixed point** rather than a stable attractor, as:

x = -0.250001 will tend to negative infinity very rapidly,

x = -0.25 stays at x = -0.25.

x = -0.249999 will tend towards 0.

Therefore there is a stable equilibrium at x = 0 and an unstable equilibrium at x = -0.25.
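The three behaviours can be demonstrated with a small Python sketch (my own, not from the original post), using a large threshold as a stand-in for divergence to negative infinity:

```python
def f(x):
    """One step of the logistic map with a = 0.8."""
    return 0.8 * x * (1 - x)

def iterate(x0, n=500):
    """Iterate from x0; return -inf as a sentinel once clearly diverging."""
    x = x0
    for _ in range(n):
        x = f(x)
        if abs(x) > 1e6:
            return float('-inf')
    return x

print(iterate(0.5))        # tends to the stable attractor 0
print(iterate(-0.249999))  # just inside: also tends to 0
print(iterate(-0.25))      # the unstable fixed point: stays at -0.25
print(iterate(-0.250001))  # just outside: diverges to -infinity
```

Near x = -0.25 the derivative f'(x) = 0.8 – 1.6x equals 1.2, so small deviations grow by a factor of 1.2 each step – which is exactly why the fixed point is unstable, while f'(0) = 0.8 < 1 makes x = 0 stable.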


**Modelling tides: What is the effect of a full moon?**

Let’s have a look at the effect of the moon on the tides in Phuket. The Phuket tide table above shows the height of the tide (meters) on given days in March, with the hours along the top. So if we choose March 1st (full moon) we get the following graph:

**Phuket tide at full moon:**

If I use the standard sine regression on Desmos I get the following:

This doesn’t look like a very useful graph – but the R squared value is very close to one – so what’s gone wrong? Well, Desmos has done what we asked it to do – found a sine curve that goes through the points. It’s just that it’s chosen a b value close to 120 – meaning that the curve has a very small period. To prevent Desmos doing this, we need to fix the period first. If we are working in radians then we use the formula period = 2pi/b. Looking at the original graph we can see that the period is around 12 hours. Therefore we have:

period = 2pi/b

12 = 2pi/b

b = 2pi/12 or pi/6.

Plotting this new graph gives something that looks a lot nicer:
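As a rough illustration of the fixed-period approach, here is a Python sketch (with made-up illustrative numbers standing in for a row of the tide table – not the real Phuket data). Once b is fixed, the model a·sin(bt + c) + d is linear in A·sin(bt) + B·cos(bt) + d, so the amplitude can be recovered by projecting the data onto sin and cos over one full period:

```python
from math import sin, cos, pi, hypot

b = 2 * pi / 12   # fix the period at 12 hours, so b = pi/6

# Illustrative tide heights: a 1.21 m amplitude sine about a 2 m mean level
ts = list(range(12))                       # one full 12-hour period, hourly
heights = [1.21 * sin(b * t + 0.7) + 2.0 for t in ts]

# Project the data onto sin(bt) and cos(bt):
# h(t) = A sin(bt) + B cos(bt) + d, with amplitude sqrt(A^2 + B^2)
n = len(ts)
A = (2 / n) * sum(h * sin(b * t) for h, t in zip(heights, ts))
B = (2 / n) * sum(h * cos(b * t) for h, t in zip(heights, ts))
d = sum(heights) / n

print(round(hypot(A, B), 2))  # 1.21 -- the amplitude is recovered
print(round(d, 2))            # 2.0  -- the mean tide level
```

With real, noisy tide data the same projection gives a least-squares estimate of the amplitude rather than an exact recovery.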

**Phuket tide at new moon:**

**Analysis:**

Both graphs show a very close fit to the original data – though both under-value the tide at 2300. We can see that the full moon has indeed had an effect on the amplitude of the sine curves – with the amplitude of 1.21m for the full moon and only 1.03m for the new moon.

**Further study:**

We could then see if this relationship holds throughout the year – is there a general formula to explain the moon’s effect on the amplitude? We could also see how we have to modify the sine wave to capture the tidal height over an entire week or month. Can we capture it with a single equation (perhaps a damped sine wave?) or is it only possible as a piecewise function? We could also use some calculus to find the maximum and minimum points.

There is a very nice pdf which goes into more detail on the maths behind modelling tides here. There we go – a nice simple investigation which can be expanded in a number of directions.


**Circular Motion: Modelling a ferris wheel**

This is a nice simple example of how the Tracker software can be used to demonstrate the circular motion of a Ferris wheel. This is sometimes asked in IB maths exams – so it’s nice to get a visual representation of what is happening.

First I took a video from YouTube of a Ferris wheel, loaded it into Tracker, and then used the program to track the position of a single carriage as it moved around the circle. I then used Tracker’s graphing capabilities to plot the height of the carriage (y) against time (t). This produces the following graph:

As we can see this is a pretty good fit for a sine curve. So let’s use the regression tool to find what curve fits this:

The pink curve with the equation:

y = -116.1sin(0.6718t+2.19)

fits reasonably well. If we had the original dimensions of the wheel we could scale this so the y scale represented the metres off the ground of the carriage.
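We can pull some basic facts straight out of these fitted parameters with a quick Python check (my own sketch; y is in Tracker's pixel units):

```python
from math import pi, sin

# Parameters from Tracker's regression (pixels for y, seconds for t)
a, b, c = -116.1, 0.6718, 2.19

period = 2 * pi / b
print(round(period, 2))  # 9.35 -- about 9.4 seconds per revolution

def height(t):
    """Height of the carriage (pixels, relative to the wheel's centre)."""
    return a * sin(b * t + c)

# After one full period the carriage is back where it started
print(abs(height(0) - height(period)) < 1e-6)  # True
```

The amplitude |a| = 116.1 pixels is the radius of the wheel in pixel units – which is exactly the quantity we would rescale if we knew the real dimensions.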

There we go! Short and simple, but a nice starting point for an investigation on circular motion.

**The Folium of Descartes**

The folium of Descartes is a famous curve named after the French philosopher and mathematician Rene Descartes (pictured top right). As well as significant contributions to philosophy (“I think therefore I am”) he was also the father of modern geometry through the development of the x,y coordinate system of plotting algebraic curves. As such the Cartesian plane (as we call the x,y coordinate system) is named after him.

**Pascal and Descartes**

Descartes was studying what is now known as the folium of Descartes (folium coming from the Latin for leaf) in the first half of the 1600s. Prior to the invention of calculus, the ability to calculate the gradient at a given point was a real challenge. He placed a wager with Pierre de Fermat, a contemporary French mathematician (of Fermat’s Last Theorem fame) that Fermat would be unable to find the gradient of the curve – a challenge that Fermat took up and succeeded with.

**Calculus – implicit differentiation:**

Today, armed with calculus and the method of implicit differentiation, finding the gradient at a point for the folium of Descartes is more straightforward. The original Cartesian equation is:

x^{3} + y^{3} = 3axy

which can be differentiated implicitly to give:

dy/dx = (ay – x^{2})/(y^{2} – ax)

Therefore if we take (say) a =1 and the coordinate (1.5, 1.5) then we will have a gradient of -1.

**Parametric equations**

It’s sometimes easier to express a curve in a different way to the usual Cartesian equation. Two alternatives are polar coordinates and parametric coordinates. The parametric equations for the folium are given by:

x = 3at/(1 + t^{3})

y = 3at^{2}/(1 + t^{3})

In order to use parametric equations we simply choose a value of t (say t =1) and put this into both equations in order to arrive at a coordinate pair in the x,y plane. If we choose t = 1 and have set a = 1 as well then this gives:

x(1) = 3/2

y(1) = 3/2

therefore the point (1.5, 1.5) is on the curve.
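All three calculations – the point from the parametric form, membership of the curve, and the gradient from implicit differentiation – can be verified with a short Python sketch (my own, not from the original post):

```python
def on_folium(x, y, a=1):
    """Check whether (x, y) satisfies the folium x^3 + y^3 = 3axy."""
    return abs(x**3 + y**3 - 3 * a * x * y) < 1e-9

def gradient(x, y, a=1):
    """dy/dx from implicit differentiation: (ay - x^2) / (y^2 - ax)."""
    return (a * y - x**2) / (y**2 - a * x)

def parametric(t, a=1):
    """The parametric form: x = 3at/(1+t^3), y = 3at^2/(1+t^3)."""
    return 3 * a * t / (1 + t**3), 3 * a * t**2 / (1 + t**3)

print(parametric(1))        # (1.5, 1.5)
print(on_folium(1.5, 1.5))  # True
print(gradient(1.5, 1.5))   # -1.0
```

This confirms the gradient of -1 at (1.5, 1.5) found above.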

You can read a lot more about famous curves and explore the maths behind them with the excellent “50 famous curves” from Bloomsburg University.

**Project Euler: Coding to Solve Maths Problems**

Project Euler, named after one of the greatest mathematicians of all time, has been designed to bring together the twin disciplines of mathematics and coding. Computers have become ever more integral in the field of mathematics – and creative coding can now be a method of solving mathematics problems just as much as creative mathematics has always been.

The first problem on the Project Euler Page is as follows:

*If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23.*

*Find the sum of all the multiples of 3 or 5 below 1000.*

This is a reasonably straightforward maths problem which we can solve using the summation of arithmetic sequences (I’ll solve it below!) – but it’s more interesting to see how computer code can be written to solve the same problem. Given that I am something of a coding novice, I went to the Project Nayuki website, which has an archive of solutions. Here is a slightly modified version of the solution given on Project Nayuki, designed to run in Java:

The original file can be copied from here, I then pasted this into an online JAVA site jdoodle. The only modification necessary was to replace:

*public final class p001 implements EulerSolution* with *public class p001*

Then after hitting execute you get the following result:

i.e. the solution is returned as 233,168. Amazing!

But before we get carried away, let’s check the answer using some more old-fashioned maths. We can break the question down into finding the sum of the multiples of 3 under 1000, plus the sum of the multiples of 5 under 1000, minus the sum of the multiples of 15 under 1000 (as these will have been double counted). i.e:

(3 + 6 + 9 + … 999) + (5 + 10 + 15 + … 995) – (15 + 30 + 45 + …990)

This gives:

S_{333} = 333/2 (2(3) + 332(3)) = 166,833

+

S_{199} = 199/2 (2(5) + 198(5)) = 99,500

–

S_{66} = 66/2 (2(15) + 65(15)) = 33,165

166,833 + 99,500 – 33,165 = 233,168 as required.
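The same check can be run in a few lines of Python (a stand-in for the Java version, using the triangular-number identity 1 + 2 + … + k = k(k+1)/2 for the series sums):

```python
def sum_multiples(limit, divisors):
    """Brute-force sum of all n < limit divisible by any of the divisors."""
    return sum(n for n in range(limit) if any(n % d == 0 for d in divisors))

# Brute force agrees with the Java solution
print(sum_multiples(1000, [3, 5]))  # 233168

# The same via the arithmetic-series sums above
s3 = 3 * (333 * 334) // 2    # 3 + 6 + ... + 999  = 3(1 + 2 + ... + 333)
s5 = 5 * (199 * 200) // 2    # 5 + 10 + ... + 995
s15 = 15 * (66 * 67) // 2    # 15 + 30 + ... + 990
print(s3 + s5 - s15)         # 233168
```

Both routes give 233,168, matching the Java output.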

Now that we have seen that this works we can modify the original code. For example if we replace:

if (i % 3 == 0 || i % 5 == 0)

with

if (i % 5 == 0 || i % 7 == 0)

This will find the sum of all the multiples of 5 or 7 below 1000. Which returns the answer 156,361.

Replacing the same line with:

if (i % 5 == 0 || i % 7 == 0 || i % 3 == 0)

will find the sum of all the multiples of 3 or 5 or 7 below 1000, which returns the answer 271,066. To find this using the previous method we would have to do:

Sum of 3s + Sum of 5s + Sum of 7s – Sum of 15s – Sum of 21s – Sum of 35s + Sum of 105s. Which starts to show why using a computer makes life easier.
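Changing the divisors in a Python stand-in for the Java code reproduces both modified results:

```python
def sum_multiples(limit, divisors):
    """Brute-force sum of all n < limit divisible by any of the divisors."""
    return sum(n for n in range(limit) if any(n % d == 0 for d in divisors))

print(sum_multiples(1000, [5, 7]))     # 156361
print(sum_multiples(1000, [3, 5, 7]))  # 271066
```

The brute-force loop quietly handles all the inclusion-exclusion bookkeeping that the by-hand method requires.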

This would be a nice addition to any investigation on Number Theory – or indeed a good project for anyone interested in Computer Science as a possible future career.

**Spotting Asset Bubbles**

Asset bubbles are formed when a service, product or company becomes massively over-valued only to crash, taking with it most of its investors’ money. There are many examples of asset bubbles in history – the Dutch tulip bulb mania and the South Sea bubble are two of the most famous historical examples. In the tulip mania bubble of 1636-37, the price of tulip bulbs became astronomically high – as people speculated that the rising prices would keep rising yet further. At its peak a single tulip bulb was changing hands for around 10 times the annual wage of a skilled artisan, before crashing to become virtually worthless.

More recent bubbles include the Dotcom crash of the early 2000s – where investors piled in trying to spot in what ways the internet would revolutionise businesses. Huge numbers of internet companies tried to ride this wave by going public with share offerings. This led to massive overvaluation and a crash when investors realised that many of these companies were worthless. Pets.com is often given as an example of this exuberance – its stock collapsed from $11 to $0.19 in just 6 months, taking with it $300 million of venture capital.

Therefore spotting the next bubble is something which economists take very seriously. You want to spot the next bubble, but equally not to miss out on the next big thing – a difficult balancing act! The graph at the top of the page is given as a classic bubble. It contains all the key phases – an initial slow take-off, a steady increase as institutional investors like banks and hedge funds get involved, an exponential growth phase as the public get involved, followed by a crash and a return to its long term mean value.

**Comparing the Bitcoin graph to an asset bubble**

The above graph is charting the last year of Bitcoin growth. We can see several similarities – so let’s try and plot this on the same axis as the model. The orange dots represent data points for the initial model – and then I’ve fitted the Bitcoin graph over the top:

It’s not a bad fit – if this was going to follow the asset bubble model then it would be about to crash rapidly before returning to the long term mean of around $4000. Whether that happens or it continues to rise, you can guarantee that there will be thousands of economists and stock market analysts around the world doing this sort of analysis (albeit somewhat more sophisticated!) to decide whether Bitcoin really will become the future of money – or yet another example of an asset bubble to be studied in economics textbooks of the future.
