This is a great puzzle which the Guardian ran last week:

*Fill in the equations below using any of the basic mathematical operations, +, –, x, ÷, and as many brackets as you like, so that they make arithmetical sense.*

*10 9 8 7 6 5 4 3 2 1 = 2017*

There are many different ways of solving this – see if you can find the simplest way!

Scroll down to see some possible answers.

I had a play around with this and this is my effort:

(10 + (9 x 8 x 7) – (6 + 5)) x 4 + 3 + (2 x 1) = 2017

An even nicer answer was provided on the Guardian – which doesn’t even need brackets:

10 x 9 x 8 x 7 / 6 / 5 x 4 x 3 + 2 – 1 = 2017
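Both candidate answers can be checked quickly, for example in Python:

```python
# Verify the two solutions to the Guardian puzzle 10 9 8 7 6 5 4 3 2 1 = 2017.
mine = (10 + (9 * 8 * 7) - (6 + 5)) * 4 + 3 + (2 * 1)
guardian = 10 * 9 * 8 * 7 / 6 / 5 * 4 * 3 + 2 - 1
print(mine, guardian)  # both evaluate to 2017
```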

Any other solutions?

**Finger Ratio Predicts Maths Ability?**

Some of the studies on the 2D:4D finger ratios (as measured in the picture above) are interesting when considering what factors possibly affect mathematical ability. A 2007 study by Mark Brosnan from the University of Bath found that:

*“Boys with the longest ring fingers relative to their index fingers tend to excel in math. The boys with the lowest ratios also were the ones whose abilities were most skewed in the direction of math rather than literacy.*

*With the girls, there was no correlation between finger ratio and numeracy, but those with higher ratios–presumably indicating low testosterone levels–had better scores on verbal abilities. The link, according to the researchers, is that testosterone levels in the womb influence both finger length and brain development.*

*In men, the ring (fourth) finger is usually longer than the index (second); their so-called 2D:4D ratio is lower than 1. In females, the two fingers are more likely to be the same length. Because of this sex difference, some scientists believe that a low ratio could be a marker for higher prenatal testosterone levels, although it’s not clear how the hormone might influence finger development.”*

In the study, Brosnan photocopied the hands of 74 boys and girls aged 6 and 7. He worked out the 2D:4D finger ratio by dividing the length of the index finger (2D) by the length of the ring finger (4D). The researchers then compared the finger ratios with standardised UK maths and English tests. The differences found were small, but significant.

Another study of 136 men and 137 women looked at the link between finger ratio and aggression. The results are plotted in the graph above – which clearly shows that this data follows a normal distribution. The men are represented by the blue line, the women by the green line and the overall cohort in red. You can see that the male distribution is shifted to the left, as males have a lower mean ratio (males: mean 0.947, standard deviation 0.029; females: mean 0.965, standard deviation 0.026).

Taking mean ± 2 standard deviations gives a range of 0.889–1.005 for males and 0.913–1.017 for females – approximately 95% of the male and female populations will fall into these ranges. (These are ranges for individual ratios, not confidence intervals for the mean.)
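These ranges can be reproduced from the means and standard deviations above – a minimal sketch using the normal distribution (mean ± 2 standard deviations covers roughly 95% of a population):

```python
from statistics import NormalDist

# Digit-ratio distributions from the aggression study quoted above.
male = NormalDist(mu=0.947, sigma=0.029)
female = NormalDist(mu=0.965, sigma=0.026)

for name, dist in [("male", male), ("female", female)]:
    lo, hi = dist.mean - 2 * dist.stdev, dist.mean + 2 * dist.stdev
    coverage = dist.cdf(hi) - dist.cdf(lo)  # fraction of the population inside the range
    print(f"{name}: {lo:.3f}-{hi:.3f} covers {coverage:.1%}")
```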

The correlation between digit ratio and everything from personality and sexuality to sporting ability and management ability has been studied. If a low 2D:4D ratio is indeed due to testosterone exposure in the womb (which is not confirmed), then that raises the question of why testosterone exposure should affect mathematical ability. And if it is not connected to testosterone, then what is responsible for the correlation between digit ratios and mathematical talent?

I think this would make a really interesting Internal Assessment investigation at either Studies or Standard Level. It also works well as a class investigation into correlation and scatter diagrams at KS3 and IGCSE. Does the relationship still hold when you look at algebraic skills rather than numeracy? Or is algebraic talent distinct from numeracy talent?

A detailed academic discussion of the scientific literature on this topic is available here.


**Modelling Radioactive decay**

We can model radioactive decay of atoms using the following equation:

**N(t) = N_{0}e^{-λt}**

Where:

**N_{0}**: the initial quantity of the element

**λ**: the radioactive decay constant

**t**: time

**N(t)**: the quantity of the element remaining after time t.

So, for Carbon-14, which has a half life of 5730 years (this means that after 5730 years exactly half of the initial amount of Carbon-14 atoms will have decayed), we can calculate the decay constant **λ**.

After 5730 years, N(5730) will be exactly half of N_{0}, therefore we can write the following:

**N(5730) = 0.5N_{0} = N_{0}e^{-5730λ}**

therefore:

**0.5 = e^{-5730λ}**

and if we take the natural log of both sides and rearrange we get:

**λ = ln(1/2) / -5730**

**λ ≈ 0.000121**

We can now use this to solve problems involving Carbon-14 (which is used in Carbon-dating techniques to find out how old things are).

e.g. You find an old parchment and after measuring the Carbon-14 content you find that it is just 30% of what a new piece of parchment would contain. How old is this parchment?

We have:

**N(t) = N_{0}e^{-0.000121t}**

**N(t)/N_{0} = 0.30 = e^{-0.000121t}**

Taking the natural log of both sides and rearranging:

**t = ln(0.30)/(-0.000121)**

**t ≈ 9950 years old**
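The whole calculation – deriving the decay constant from the half life, then dating the sample – can be sketched in a few lines:

```python
import math

# Carbon dating: derive λ from the half-life, then solve 0.30 = e^(-λt) for t.
half_life = 5730  # years
lam = math.log(2) / half_life     # decay constant λ ≈ 0.000121
age = math.log(0.30) / (-lam)     # age of a sample with 30% of its original C-14
print(f"λ ≈ {lam:.6f}, age ≈ {age:.0f} years")
```

(Using the unrounded λ gives an age a couple of years different from the 9950 above, which used λ rounded to 0.000121.)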

**Probability density functions**

We can also do some interesting maths by rearranging:

**N(t) = N_{0}e^{-λt}**

**N(t)/N_{0} = e^{-λt}**

and then plotting **N(t)/N_{0}** against time.

**N(t)/N_{0}** has a range between 0 and 1: when t = 0, N(t)/N_{0} = 1, and as t gets large, N(t)/N_{0} approaches 0.

We can then manipulate this into the form of a probability density function – by finding the constant a which makes the area underneath the curve equal to 1:

**∫_{0}^{∞} ae^{-λt} dt = 1**

Solving this gives a = λ. Therefore the following integral:

**∫_{t1}^{t2} λe^{-λt} dt**

will give the fraction of atoms which will have decayed between times t1 and t2.

We could use this integral to work out the half life of Carbon-14 as follows:

**∫_{0}^{t} 0.000121e^{-0.000121x} dx = 0.5**

Solving this gives t = 5728.5, which is what we’d expect (given our earlier rounding of the decay constant).

We can also now work out the expected (mean) time that an atom will exist before it decays. To do this we use the following equation for finding E(t) of a probability density function f(t):

**E(t) = ∫_{0}^{∞} t f(t) dt**

and if we substitute in our equation we get:

**E(t) = ∫_{0}^{∞} tλe^{-λt} dt**

Now, we can integrate this by parts:

**E(t) = [-te^{-λt}]_{0}^{∞} + ∫_{0}^{∞} e^{-λt} dt = 0 + 1/λ**

So the expected (mean) life of an atom is given by 1/λ. In the case of Carbon-14, with a decay constant λ ≈ 0.000121, we have an expected life of a Carbon-14 atom as:

**E(t) = 1/0.000121**

**E(t) ≈ 8264 years**

Now that may sound a little strange – after all the half life is 5730 years, which means that half of all atoms will have decayed after 5730 years. So why is the mean life so much higher? Well it’s because of the long right tail in the graph – we will have some atoms with very large lifespans – and this will therefore skew the mean to the right.
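The gap between the half life and the mean life is easy to see by simulation: individual atom lifetimes follow an exponential distribution, whose long right tail pulls the mean above the median. A minimal sketch:

```python
import math
import random

# Simulate many Carbon-14 atom lifetimes (exponentially distributed with rate λ).
lam = 0.000121
random.seed(1)
lifetimes = sorted(random.expovariate(lam) for _ in range(100_000))

mean_life = sum(lifetimes) / len(lifetimes)
median_life = lifetimes[len(lifetimes) // 2]  # half the atoms have decayed by this time

print(f"median ≈ {median_life:.0f} years (theory: ln2/λ ≈ {math.log(2)/lam:.0f})")
print(f"mean   ≈ {mean_life:.0f} years (theory: 1/λ ≈ {1/lam:.0f})")
```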

**Amanda Knox and Bad Maths in Courts**

This post is inspired by the recent BBC News article, “Amanda Knox and Bad Maths in Courts.” The article highlights the importance of good mathematical understanding when handling probabilities – and how mistakes by judges and juries can sometimes lead to miscarriages of justice.

**A scenario to give to students:**

*A murder scene is found with two types of blood – that of the victim and that of the murderer. As luck would have it, the unidentified blood has an incredibly rare blood disorder, only found in 1 in every million men. The capital and surrounding areas have a population of 20 million – and the police are sure the murderer is from the capital. The police have already started cataloging all citizens’ blood types for their new super crime-database. They already have nearly 1 million male samples in there – and bingo – one man, Mr XY, is a match. He is promptly marched off to trial, there is no other evidence, but the jury are told that the odds are 1 in a million that he is innocent. He is duly convicted. The question is, how likely is it that he did not commit this crime? *

**Answer:**

*We can be around 90% confident that he did not commit this crime. Assuming that there are approximately 10 million men in the capital, then were everyone catalogued on the database we would have on average 10 positive matches. Given that there is no other evidence, there is therefore only around a 1 in 10 chance that he is guilty. Even though P(match | innocent) = 1/1,000,000, P(innocent | match) ≈ 9/10.*
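The scenario can be run as a quick Bayes calculation – a sketch mirroring the approximation in the answer above (10 million men, a 1-in-a-million match rate, no other evidence):

```python
# Expected number of innocent men who match purely by coincidence.
men = 10_000_000
p_match_given_innocent = 1 / 1_000_000
expected_innocent_matches = (men - 1) * p_match_given_innocent  # ≈ 10

# One match is the real murderer; the rest are innocent coincidences.
p_guilty_given_match = 1 / (1 + expected_innocent_matches)
print(f"P(guilty | match) ≈ {p_guilty_given_match:.2f}")  # roughly 1 in 10
```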

**Amanda Knox**

Eighteen months ago, Amanda Knox and Raffaele Sollecito, who were previously convicted of the murder of British exchange student Meredith Kercher, were acquitted. The judge at the time ruled out re-testing a tiny DNA sample found at the scene, stating that, “The sum of the two results, both unreliable… cannot give a reliable result.”

This logic, however, whilst intuitive, is not mathematically correct. As explained by mathematician Coralie Colmez in the BBC News article, by repeating relatively unreliable tests we can make them more reliable – the larger the pooled sample size, the more confident we can be in the result.

**Sally Clark**

One of the most (in)famous examples of bad maths in the court room is that of Sally Clark – who was convicted of the murder of her two sons in 1999. It has been described as, “one of the great miscarriages of justice in modern British legal history.” Both of Sally Clark’s children died from cot-death whilst still babies. Soon afterwards she was arrested for murder. The case was based on a seemingly incontrovertible statistic – that the chance of 2 children from the same family dying from cot-death was 1 in 73 million. Experts testified to this, the jury were suitably convinced and she was convicted.

The crux of the prosecutor’s case was that it was so statistically unlikely that this had happened by chance, that she must have killed her children. However, this was bad maths – which led to an innocent woman being jailed for four years before her eventual acquittal.

**Independent Events**

The 1 in 73 million figure was arrived at by simply looking at the probability of a single cot-death (1 in 8,500) and then squaring it – because it had happened twice. However, this method only works if both events are independent – and in this case they clearly weren’t. Any biological or social factors which contribute to the death of one child due to cot-death will also mean that another sibling is at elevated risk.
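The flawed calculation can be reproduced directly. The tenfold elevated sibling risk in the second calculation below is purely an illustrative assumption (not a figure from the case), to show how sensitive the result is to the independence assumption:

```python
# Squaring the single cot-death probability assumes the two deaths are independent.
p_single = 1 / 8500
p_double_if_independent = p_single ** 2
print(f"1 in {1 / p_double_if_independent:,.0f}")  # ≈ 1 in 72 million

# Hypothetical: if shared factors raised a sibling's risk tenfold,
# the figure changes by an order of magnitude.
p_double_dependent = p_single * (10 / 8500)
print(f"1 in {1 / p_double_dependent:,.0f}")  # ≈ 1 in 7 million
```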

**Prosecutor’s Fallacy**

Additionally this figure was presented in a way which is known as the “prosecutor’s fallacy” – the 1 in 73 million figure (even if correct) didn’t represent the probability of Sally Clark’s innocence, because it should have been compared against the probability of the alternative explanation, a double homicide. In other words, the probability of the evidence given innocence is not the same as the probability of innocence given the evidence. In mathematical language, P(evidence | innocent) is not equal to P(innocent | evidence).

Subsequent analysis of the Sally Clark case by a mathematics professor concluded that rather than having a 1 in 73 million chance of being innocent, actually it was about 4-10 times more likely this was due to natural causes rather than murder. Quite a big turnaround – and evidence of why understanding statistics is so important in the courts.

This topic has also been highlighted recently by the excellent ToK website, Lancaster School ToK.


Golden Balls, hosted by Jasper Carrott, is based on a version of the Prisoner’s Dilemma. For added interest, try to predict what the two contestants are going to do. Are there any psychological cues to pick up on?

Game theory is an interesting branch of mathematics with links across a large number of disciplines – from politics to economics to biology and psychology. The most well known example is that of the Prisoner’s Dilemma. (Illustrated below). Two prisoners are taken into custody and held in separate rooms. During interrogation they are told that if they testify to everything (ie betray their partner) then they will go free and their partner will get 10 years. However, if they both testify they will both get 5 years, and if they both remain silent then they will both get 6 months in jail.

So, what is the optimum strategy for prisoner A? In this version he should testify – because whichever strategy his partner chooses this gives prisoner A the best possible outcome. Looking at it in reverse, if prisoner B testifies, then prisoner A would have been best testifying (gets 5 years rather than 10). If prisoner B remains silent, then prisoner A would have been best testifying (goes free rather than 6 months).
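The argument above can be sketched as a payoff matrix, with a check that testifying dominates whatever the partner does (sentences in years, so lower is better):

```python
# Entries: (prisoner A's sentence, prisoner B's sentence), in years.
payoff = {
    ("testify", "testify"): (5, 5),
    ("testify", "silent"): (0, 10),
    ("silent", "testify"): (10, 0),
    ("silent", "silent"): (0.5, 0.5),  # 6 months each
}

# "Testify" is a dominant strategy for A: better whatever B chooses.
for b_choice in ("testify", "silent"):
    assert payoff[("testify", b_choice)][0] < payoff[("silent", b_choice)][0]
print("Testifying is a dominant strategy for prisoner A")
```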

This brings in an interesting moral dilemma – even if the prisoner and his partner are innocent, each is placed in a situation where it is in his best interest to testify against his partner – thus increasing the likelihood of an innocent man being sent to jail. This situation represents a form of plea bargaining – which is more common in America than in Europe.

Part of the dilemma arises because if both men know that the optimum strategy is to testify, then they both end up with lengthy 5 year jail sentences. If only they could trust each other to be altruistic rather than selfish and both remain silent, they would get away with only 6 months each. So does mathematics provide an amoral framework? i.e. in this case the mathematically optimum strategies are not “nice,” but selfish.

Game theory became quite popular during the Cold War, as the matrix above represented the state of the nuclear stand-off. The threat of Mutually Assured Destruction (MAD) meant that neither the Americans nor the Russians had any incentive to strike first, because that would inevitably lead to a retaliatory strike – with catastrophic consequences. The above matrix uses negative infinity to represent the worst possible outcome, whilst both sides not striking leads to a positive pay-off. Such a game has a very strong Nash Equilibrium – i.e. there is no incentive to deviate from the non-strike policy. Could the optimal maths strategy here be said to be responsible for saving the world?

Game theory can be extended to evolutionary biology – and is covered in Richard Dawkins’ The Selfish Gene in some detail. Basically, whilst it is an optimum strategy to be selfish in a single round of the prisoner’s dilemma, iterated games (i.e. games repeated a number of times) actually tend towards a co-operative strategy. If someone is nasty to you on round one (i.e. by testifying) then you can punish them the next time. So with the threat of punishment, a mutually co-operative strategy is superior.
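A minimal sketch of the iterated game: tit-for-tat (co-operate first, then copy the opponent’s last move) against an always-selfish strategy. Payoffs here follow the common points-gained convention (higher is better) rather than years in jail:

```python
# Per-round payoffs: C = co-operate, D = defect; entries are (player A, player B).
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def play(strat_a, strat_b, rounds=20):
    history_a, history_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        a, b = strat_a(history_b), strat_b(history_a)  # each sees the other's history
        pa, pb = PAYOFF[(a, b)]
        score_a += pa
        score_b += pb
        history_a.append(a)
        history_b.append(b)
    return score_a, score_b

tit_for_tat = lambda opp: "C" if not opp else opp[-1]
always_defect = lambda opp: "D"

print(play(tit_for_tat, tit_for_tat))    # mutual co-operation: (60, 60)
print(play(tit_for_tat, always_defect))  # defector punished from round two: (19, 24)
```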

You can actually play the iterated Prisoner’s Dilemma game as an applet on the website Game Theory. Alternatively pairs within a class can play against each other.

An interesting extension is this applet, also on Game Theory, which models the evolution of 2 populations – residents and invaders. You can set different responses – and then see what happens to the respective populations. This is a good reflection of interactions in real life – where species can choose to live co-operatively, or to fight for the same resources.

The first stop for anyone interested in more information about Game Theory should be the Maths Illuminated website – which has an entire teacher unit on the subject – complete with different sections, a video and pdf documents. There’s also a great article on Plus Maths – Does it Pay to be Nice? – all about this topic. There are a lot of different games which can be modelled using game theory – and many are listed here. These include the Stag Hunt, Hawk/Dove and the Peace War game. Some of these have direct applicability to population dynamics, and to the geo-politics of war versus peace.


**Graham’s Number – literally big enough to collapse your head into a black hole**

Graham’s Number is a number so big that it would *literally* collapse your head into a black hole were you fully able to comprehend it. And that’s not hyperbole – the informational content of Graham’s Number is so astronomically large that it exceeds the maximum amount of entropy that could be stored in a brain-sized piece of space – i.e. a black hole would form before all the data content could be fully processed. This is a great introduction to notation for *really* big numbers. Numberphile have produced a fantastic video on the topic:

Graham’s Number makes use of Knuth’s up-arrow notation (explanation from Wikipedia):

In the series of hyper-operations we have:

1) Multiplication:

a × b = a + a + … + a (with b copies of a)

For example, 3 × 4 = 3 + 3 + 3 + 3 = 12

2) Exponentiation:

a↑b = a × a × … × a (with b copies of a)

For example, 3↑4 = 3 × 3 × 3 × 3 = 81

3) Tetration:

a↑↑b = a↑(a↑(…↑a)) (a power tower of b copies of a)

For example, 3↑↑4 = 3^(3^(3^3)) = 3^(3^27) = 3^7625597484987

4) Pentation:

a↑↑↑b = a↑↑(a↑↑(…↑↑a)) (with b copies of a)

and so on.

For example, 3↑↑↑2 = 3↑↑3 = 3^(3^3) = 3^27 = 7625597484987

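The hyper-operations can be sketched as one recursive function. Only tiny inputs are feasible, which is rather the point – each extra arrow makes the numbers explode:

```python
# Knuth's up-arrow notation: knuth(a, n, b) computes a ↑^n b,
# where one arrow is exponentiation and each extra arrow iterates the previous level.
def knuth(a, n, b):
    if n == 1:
        return a ** b  # base case: a↑b = a^b
    if b == 1:
        return a       # a ↑^n 1 = a for any number of arrows
    return knuth(a, n - 1, knuth(a, n, b - 1))

print(knuth(3, 1, 4))  # 3↑4 = 81
print(knuth(2, 2, 4))  # 2↑↑4 = 2^(2^(2^2)) = 65536
print(knuth(3, 2, 3))  # 3↑↑3 = 3^27 = 7625597484987
```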
This can clearly lead to some absolutely huge numbers very quickly. Graham’s Number – which was arrived at mathematically as an upper bound for a problem relating to vertices on hypercubes – is (explanation from Wikipedia):

**G = g_{64}, where g_{1} = 3↑↑↑↑3 and g_{n} = 3↑^{g_{n-1}}3**

where the number of *arrows* in each layer, starting at the top layer, is specified by the value of the next layer below it, and where a superscript on an up-arrow indicates how many arrows there are. In other words, *G* is calculated in 64 steps: the first step is to calculate *g*_{1} with four up-arrows between 3s; the second step is to calculate *g*_{2} with *g*_{1} up-arrows between 3s; the third step is to calculate *g*_{3} with *g*_{2} up-arrows between 3s; and so on, until finally calculating *G* = *g*_{64} with *g*_{63} up-arrows between 3s.

So, a number so big it can’t be fully processed by the human brain. This raises some interesting questions about maths and knowledge – Graham’s Number is an example of a number that exists but is beyond full human comprehension – it is therefore an example of an upper bound of human knowledge. Will there always be things in the Universe which are beyond full human understanding? Or can mathematics provide a shortcut to knowledge that would otherwise be inaccessible?

If you enjoyed this post you might also like:

How Are Prime Numbers Distributed? Twin Primes Conjecture – a discussion about the amazing world of prime numbers.

Wau: The Most Amazing Number in the World? – a post which looks at the amazing properties of Wau

This is a really interesting puzzle to study – which fits very well when studying geometric series, proof and the history of maths.

The two most intuitive answers are either that it has no sum or that it sums to zero. If you group the pattern into pairs, then each pair (1, -1) = 0. However if you group the pattern by first leaving the 1, then grouping pairs of (-1,1) you would end up with a sum of 1.

Firstly it’s worth seeing why we shouldn’t just use our formula for the sum of an infinite geometric series:

**S = a/(1 – r)**

with r as the multiplicative constant of -1. This formula requires that the absolute value of r is less than 1 – otherwise the series will not converge.

The series 1,-1,1,-1…. is called Grandi’s series – after a 17th century Italian mathematician (pictured) – and sparked a few hundred years worth of heated mathematical debate as to what the correct summation was.

Using the Cesàro method (explanation pasted from here):

If *a*_{n} = (−1)^{n+1} for *n* ≥ 1, then {*a*_{n}} is the sequence

1, −1, 1, −1, …

Then the sequence of partial sums {*s*_{n}} is

1, 0, 1, 0, …

so whilst the series does not converge, if we calculate the terms of the sequence {(*s*_{1} + … + *s*_{n})/*n*} we get:

1, 1/2, 2/3, 1/2, 3/5, 1/2, 4/7, …

so that

(*s*_{1} + … + *s*_{n})/*n* → 1/2 as *n* → ∞.
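The Cesàro means are easy to compute numerically – the running average of the partial sums settles down to 1/2 even though the partial sums themselves never converge:

```python
# Grandi's series 1 - 1 + 1 - 1 + ... and the Cesàro means of its partial sums.
terms = [(-1) ** (n + 1) for n in range(1, 10_001)]  # 1, -1, 1, -1, ...

partial_sums = []
total = 0
for t in terms:
    total += t
    partial_sums.append(total)  # 1, 0, 1, 0, ...

cesaro = []
running = 0
for n, s in enumerate(partial_sums, start=1):
    running += s
    cesaro.append(running / n)  # average of the first n partial sums

print(cesaro[:5])  # starts 1, 1/2, 2/3, 1/2, 3/5 ...
print(cesaro[-1])  # ... and converges to 0.5
```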

So, using different methods we have shown that this series “should” have a summation of 0 (grouping in pairs), or that it “should” have a sum of 1 (grouping in pairs after the first 1), or that it “should” have no sum as it simply oscillates, or that it “should” have a Cesàro sum of 1/2 – no wonder it caused so much consternation amongst mathematicians!

This approach can be extended to the complex series, which is looked at in the blog God Plays Dice

This is a really great example of how different proofs can sometimes lead to different (and unexpected) results. What does this say about the nature of proof?

**The Mathematics of Crime and Terrorism**

The ever excellent Numberphile have just released a really interesting video looking at what mathematical models are used to predict terrorist attacks and crime. Whereas a Poisson distribution assumes that events that happen are completely independent, it is actually the case that one (say) burglary in a neighbourhood means that another burglary is much more likely to happen shortly after. Therefore we need a new distribution to model this. The one that Hannah Fry talks about in the video is called the Hawkes process – which gets a little difficult. Nevertheless this is a nice video for showing the need to adapt models to represent real life data.
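A minimal sketch of a Hawkes (self-exciting) process, simulated with Ogata’s thinning algorithm. The parameter values are illustrative, not fitted to any crime data: mu is the background rate, and each event temporarily raises the intensity by alpha, decaying at rate beta – so one burglary makes another more likely shortly afterwards:

```python
import math
import random

def simulate_hawkes(mu=0.5, alpha=0.8, beta=1.2, t_max=50.0, seed=42):
    """Simulate event times with intensity mu + sum of alpha*exp(-beta*(t - ti))."""
    random.seed(seed)
    events, t = [], 0.0
    while True:
        # Intensity just after time t upper-bounds the intensity until the
        # next event, because the exponential kernel only decays.
        lam_bar = mu + sum(alpha * math.exp(-beta * (t - ti)) for ti in events)
        t += random.expovariate(lam_bar)
        if t >= t_max:
            break
        lam_t = mu + sum(alpha * math.exp(-beta * (t - ti)) for ti in events)
        if random.random() <= lam_t / lam_bar:  # thinning: accept with prob λ(t)/λ̄
            events.append(t)
    return events

events = simulate_hawkes()
print(f"{len(events)} events - note the clustered bursts after early events")
```

A plain Poisson process would be the special case alpha = 0, where events are completely independent.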

**The Wason Selection Task – a logical puzzle**

The Wason Selection Task is a logical problem designed to show how bad we are at making logical decisions. Wason first used it in 1968 – and found that only 10% of the population would get the correct answer. Indeed around 65% of the population make the same error. Here is the task: four cards are placed on a table, showing D, K, 3 and 7 on their visible faces; each card has a letter on one side and a number on the other.

The participants were given the following instructions:

*Here is a rule: “every card that has a D on one side has a 3 on the other.” Your task is to select all those cards, but only those cards, which you would have to turn over in order to discover whether or not the rule has been violated.*

Give yourself a couple of minutes to work out what you think the answer is – then read on.

The correct answer is to pick the D card and the 7 card

This result is normally quite unexpected – but it highlights one of the logical fallacies that we often fall into:

A implies B does not mean that B implies A

All cats have 4 legs. (cat = A, has 4 legs = B: A implies B)

All 4-legged animals are cats. (B implies A)

We can see that we would make a logical error if we concluded from the first statement that all 4-legged animals were cats.

In the logic puzzle we need to turn over only 2 cards, D and 7. This is surprising because most people will also say that you need to turn over the card with a 3. First we need to be clear about what we are trying to do: we want to find evidence that the rule we are given is false.

If we turn over the D and find a number other than 3, we have evidence that the rule is false – therefore we need to turn over D.

If we turn over the 7 and find a D on the other side, we have evidence that the rule is false – therefore we need to turn over the 7.

But what about the 3? If we turn over the 3 and find a D then we have no evidence that the rule is false (which is what we are looking for). If we turn over the 3 and find another letter then this **also** gives us no evidence that the rule is false. After all our rule says that all Ds have 3s on the other side, but it **doesn’t** say that all 3s have Ds on the other side.
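The reasoning above can be brute-forced: for each visible face, list what could be hidden on the reverse, and turn the card over only if some hidden value could violate the rule:

```python
# Brute-force the Wason task for the rule "every card with a D has a 3 on the other side".
visible = ["D", "K", "3", "7"]
letters = ["D", "K"]
numbers = ["3", "7"]

def violates(letter, number):
    return letter == "D" and number != "3"

must_turn = []
for face in visible:
    if face in letters:
        possibilities = [(face, n) for n in numbers]   # hidden side is a number
    else:
        possibilities = [(l, face) for l in letters]   # hidden side is a letter
    if any(violates(l, n) for l, n in possibilities):
        must_turn.append(face)

print(must_turn)  # ['D', '7']
```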

**Are mathematicians better at this puzzle than historians?**

Given the importance of logical thought in mathematics, people have done studies to see if undergraduate students in maths perform better than humanities students on this task. Here are the results:

You can see that there is a significant difference between the groups. Maths students correctly guessed the answer D7 29% of the time, but only 8% of history students did. The maths university lecturers performed best – getting the answer right 43% of the time.

**Making different mistakes**

You can also analyse the mistakes that students made – by looking only at the proportions of incorrect selections. Here again there are significant differences which show that the groups are thinking about the problem in different ways. DK7 was chosen by around 1/5 of both maths students and maths lecturers, but by hardly any history students.

You can read about these results in much more depth in the following research paper Mathematicians and the Selection Task – where they also use Chi Squared testing for significance levels.


Sinx/x can’t be integrated into an elementary function – instead we define:

**Si(x) = ∫_{0}^{x} (sint/t) dt**

where Si(x) is a special function. This may sound strange – but we have already come across a similar case with the integral of 1/x. There we define the integral of 1/x as ln(x): ln(x) is a function with its own graph, and we can use it to work out definite integrals of 1/x. For example the integral of 1/x from 1 to 5 will be ln(5) – ln(1) = ln(5).

The graph of Si(x) looks like this:

Or, on a larger scale:

You can see that it has rotational symmetry about the origin (Si(x) is an odd function), an oscillating motion, and that as x gets large it approaches a limit. In fact this limit is π/2.

Because Si(0) = 0, you can write the following integrals as:

**∫_{0}^{x} (sint/t) dt = Si(x) – Si(0) = Si(x)**

**How to integrate sinx/x ?**

It’s all very well to define a new function – and say that this is the integral of sinx/x – but how was this function generated in the first place?

Well, one way to integrate difficult functions is to use Taylor and Maclaurin expansions. For example the Maclaurin expansion of sinx/x for values near x = 0 is:

**sinx/x = 1 – x^2/6 + x^4/120 + O(x^6)**

This means that in the domain close to x = 0, the function sinx/x behaves in a similar way to the polynomial above. The last part of this expression, O(x^6), just means that everything else in this expansion will be of order x^6 or greater.

**Graph of sinx/x**

**Graph of 1 – x^2/6 + x^4/120**

In the region close to x=0 these functions behave in a very similar manner (this would be easier to see with similar scales so let’s look on a GDC):

So for the region above (x between 0 and 2) the 2 graphs are virtually indistinguishable.

Therefore if we want to integrate sinx/x for values close to 0 we can just integrate our new function 1 – x^2/6 + x^4/120 and get a good approximation.

Let’s try how accurate this is. We can use Wolfram Alpha to tell us that:

**∫_{0}^{π/2} (sinx/x) dx ≈ 1.371**

and let’s use Wolfram to work out the integral of our polynomial approximation as well:

**∫_{0}^{π/2} (1 – x^2/6 + x^4/120) dx ≈ 1.371**

Our approximation is accurate to 3 dp, 1.371 in both cases. If we wanted greater accuracy we would simply use more terms in the Maclaurin expansion.
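The value 1.371 corresponds to integrating from 0 to π/2 (Si(π/2) ≈ 1.371). A sketch verifying both results numerically – Simpson’s rule for sinx/x, and the exact antiderivative for the polynomial:

```python
import math

def f(x):
    return math.sin(x) / x if x != 0 else 1.0  # limit of sinx/x at x = 0 is 1

def simpson(g, a, b, n=1000):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    s = g(a) + g(b) + sum((4 if i % 2 else 2) * g(a + i * h) for i in range(1, n))
    return s * h / 3

b = math.pi / 2
exact = simpson(f, 0, b)
poly = b - b**3 / 18 + b**5 / 600   # antiderivative of 1 - x^2/6 + x^4/120 at π/2
print(round(exact, 3), round(poly, 3))  # both 1.371
```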

So, by using the Maclaurin expansion for terms near x = 0 and the Taylor expansion for terms near x = a we can build up information as to the values of the Si(x) function.
