Anscombe’s Quartet was devised by the statistician Francis Anscombe to illustrate how important it is not to rely on statistical measures alone when analyzing data.  To do this he created four data sets that produce nearly identical statistical measures.  The scatter graphs above were generated by the Python code here.

Statistical measures

1) Mean of x values in each data set = 9.00
2) Standard deviation of x values in each data set  = 3.32
3) Mean of y values in each data set = 7.50
4) Standard deviation of y values in each data set  = 2.03
5) Pearson’s Correlation coefficient for each paired data set = 0.82
6) Linear regression line for each paired data set: y = 0.500x + 3.00

When looking at this data we would be forgiven for concluding that these data sets must be very similar – but really they are quite different.
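The six measures above can be checked directly.  The following is a minimal sketch using only Python’s standard library (it is not the plotting code linked above).  The helper name `summarise` is my own, and the sketch assumes the sample standard deviation (dividing by n - 1), which is the convention that reproduces the figures quoted.

```python
from math import sqrt
from statistics import mean, stdev

# Anscombe's four data sets; A, B and C share the same x values.
X_ABC = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
DATASETS = {
    "A": (X_ABC, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "B": (X_ABC, [9.14, 8.14, 8.74, 8.77, 9.26, 8.1, 6.13, 3.1, 9.13, 7.26, 4.74]),
    "C": (X_ABC, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "D": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
          [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarise(x, y):
    """Return the six summary measures for one paired data set."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx  # gradient of the least-squares line of y on x
    return {
        "mean_x": mx,
        "sd_x": stdev(x),  # sample standard deviation (n - 1)
        "mean_y": my,
        "sd_y": stdev(y),
        "r": sxy / sqrt(sxx * syy),  # Pearson's correlation coefficient
        "slope": slope,
        "intercept": my - slope * mx,
    }


SUMMARIES = {name: summarise(x, y) for name, (x, y) in DATASETS.items()}

for name, s in SUMMARIES.items():
    print(name, {key: round(value, 2) for key, value in s.items()})
```

All four rows should agree with each other, and with the list above, to two decimal places, even though the underlying data are very different.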

Data Set A:

x = [10,8,13,9,11,14,6,4,12,7,5]

y = [8.04, 6.95,7.58,8.81,8.33, 9.96,7.24,4.26,10.84,4.82,5.68]

Data Set A does indeed fit a linear regression, so it would be appropriate to use the line of best fit for predictive purposes.

Data Set B:

x = [10,8,13,9,11,14,6,4,12,7,5]

y = [9.14,8.14,8.74,8.77,9.26,8.1,6.13,3.1,9.13,7.26,4.74]

You could fit a linear regression to Data Set B – but this is clearly not the most appropriate regression line for this data.  Some quadratic or higher power polynomial would be better for predicting data here.
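As a rough check of that claim, the sketch below fits a quadratic to Data Set B by solving the least-squares normal equations by hand, then compares its residuals with those of the line y = 0.500x + 3.00.  It uses only the standard library, and the function names `fit_quadratic` and `sum_squared_residuals` are my own.

```python
X_B = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
Y_B = [9.14, 8.14, 8.74, 8.77, 9.26, 8.1, 6.13, 3.1, 9.13, 7.26, 4.74]


def fit_quadratic(x, y):
    """Least-squares fit of y = a*x^2 + b*x + c via the normal equations."""
    s = [sum(xi ** k for xi in x) for k in range(5)]  # s[k] = sum of x^k
    t = [sum(yi * xi ** k for xi, yi in zip(x, y)) for k in range(3)]  # sum of y*x^k
    # Augmented 3x3 system for the unknowns (a, b, c).
    m = [
        [s[4], s[3], s[2], t[2]],
        [s[3], s[2], s[1], t[1]],
        [s[2], s[1], s[0], t[0]],
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda row: abs(m[row][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(3):
            if row != col:
                factor = m[row][col] / m[col][col]
                m[row] = [mr - factor * mc for mr, mc in zip(m[row], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


def sum_squared_residuals(x, y, model):
    """Sum of squared vertical distances between the data and a model."""
    return sum((yi - model(xi)) ** 2 for xi, yi in zip(x, y))


a, b, c = fit_quadratic(X_B, Y_B)
sse_linear = sum_squared_residuals(X_B, Y_B, lambda v: 0.5 * v + 3.0)
sse_quadratic = sum_squared_residuals(X_B, Y_B, lambda v: a * v * v + b * v + c)

print(f"quadratic: y = {a:.4f}x^2 + {b:.4f}x + {c:.4f}")
print(f"SSE linear = {sse_linear:.3f}, SSE quadratic = {sse_quadratic:.6f}")
```

The quadratic’s residuals should come out far smaller than the line’s, which is what the shape of the scatter graph suggests.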

Data Set C:

x = [10,8,13,9,11,14,6,4,12,7,5]

y = [7.46,6.77,12.74,7.11,7.81,8.84,6.08,5.39,8.15,6.42,5.73]

In Data Set C we can see the effect of a single outlier – we have 10 points in pretty much a perfect linear correlation, and then a single outlier.  For predictive purposes we would be best investigating this outlier (checking that it does conform to the mathematical definition of an outlier), and then potentially doing our regression with this removed.
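One way to make that check concrete is the 1.5 × IQR rule.  The sketch below is a simplification under stated assumptions: it applies the rule to the y values only (a fuller treatment would look at residuals from the fitted line), and it uses the default quartile method of `statistics.quantiles`, which may differ slightly from your calculator’s.  The helper `fit_line` is my own name.

```python
from math import sqrt
from statistics import mean, quantiles

X_C = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
Y_C = [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]


def fit_line(x, y):
    """Return (slope, intercept, r) for the least-squares line of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx, sxy / sqrt(sxx * syy)


# 1.5 x IQR fences on the y values.
q1, _, q3 = quantiles(Y_C, n=4)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

outliers = [(xi, yi) for xi, yi in zip(X_C, Y_C) if yi < lower or yi > upper]
kept = [(xi, yi) for xi, yi in zip(X_C, Y_C) if lower <= yi <= upper]

# Refit with the flagged point(s) removed and compare.
x_kept, y_kept = zip(*kept)
slope_all, intercept_all, r_all = fit_line(X_C, Y_C)
slope_kept, intercept_kept, r_kept = fit_line(x_kept, y_kept)

print("flagged as outliers:", outliers)
print(f"all points:      y = {slope_all:.3f}x + {intercept_all:.3f}, r = {r_all:.4f}")
print(f"outlier removed: y = {slope_kept:.3f}x + {intercept_kept:.3f}, r = {r_kept:.4f}")
```

With the flagged point removed, r should move very close to 1 and the regression line changes noticeably, which shows how much influence that one point had.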

Data Set D:

x = [8,8,8,8,8,8,8,19,8,8,8]

y = [6.58,5.76,7.71,8.84,8.47,7.04,5.25,12.50,5.56,7.91,6.89]

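In Data Set D we can also see the effect of a single outlier: 10 points lie in a vertical line at x = 8, and a single outlier at x = 19 determines the slope of the regression line on its own.  Clearly drawing a line of best fit for this data is not appropriate.  Removing the outlier does not rescue it either, because the remaining points all share the same x value, so a regression of y on x is undefined for them.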
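To see how much the fitted line in Data Set D depends on that single point, here is a short sketch (standard library only, variable names my own).  It relies on the fact that when x takes just two distinct values, the least-squares line passes through the mean y value at each of them.

```python
from statistics import mean

X_D = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
Y_D = [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]

# Separate the cluster at x = 8 from the lone point.
rest = [(xi, yi) for xi, yi in zip(X_D, Y_D) if xi == 8]
outlier_x, outlier_y = next((xi, yi) for xi, yi in zip(X_D, Y_D) if xi != 8)

# Spread of x within the cluster: zero, so the slope formula Sxy / Sxx
# has no value once the lone point is removed.
x_rest = [xi for xi, _ in rest]
mean_x_rest = mean(x_rest)
sxx_rest = sum((xi - mean_x_rest) ** 2 for xi in x_rest)

# The full regression line joins the cluster's mean to the lone point,
# so its slope is fixed entirely by where that one point sits.
mean_y_rest = mean(yi for _, yi in rest)
slope_full = (outlier_y - mean_y_rest) / (outlier_x - mean_x_rest)

print("Sxx without the lone point:", sxx_rest)
print(f"slope set by the lone point: {slope_full:.4f}")
```

The slope should match the 0.500 quoted in the statistical measures, even though ten of the eleven points carry no information about how y changes with x.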

The moral of the story

So the moral here is to always use graphical analysis alongside statistical measures.  A very common mistake for IB students is to rely on Pearson’s product moment correlation coefficient without really looking at the scatter graph to decide whether a linear fit is appropriate.  If you do this then you could end up with a very low mark in the E category as you will not show good understanding of what you are doing.  So always plot a graph first!

[Figure: bar chart of 2012-13 Premier League wage bills]

Is there a correlation between Premier League wages and league position?

The Guardian has just released its 2012-13 Premier League season data analysis – which shows exactly how much each club in the Premier League spent on wages last year (see the bar chart above).  This can be easily plotted on a scatter graph to test how strong the correlation is between spending and league position. (y axis is league position, x axis is wage bill in millions of pounds).

[Figure: scatter graph of league position (y) against wage bill in millions of pounds (x)]

The mean spending on wages is 89 million pounds.  Our regression line is y = -0.08x + 17.52.  We can see some of the big outliers are QPR (with a big wage bill but a low Premier League position) and Everton (with a low wage bill relative to others who finished in a similar position).

Pearson’s product moment correlation coefficient (r) is -0.73.  This is negative because in our case league position is numerically lower the higher up the league you are.  This shows a pretty strong correlation between wage spending and league position.  An r value of -1 would be a perfect correlation in our case, whereas 0 would be no correlation.
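A least-squares line always passes through the point (mean x, mean y), which gives a quick sanity check on the quoted equation.  The sketch below assumes a 20-club league, so the mean position is 10.5.  The function name is illustrative only.

```python
def predicted_position(wage_bill_millions):
    """League position predicted by the post's line y = -0.08x + 17.52."""
    return -0.08 * wage_bill_millions + 17.52


MEAN_WAGE_BILL = 89  # millions of pounds, as quoted in the post
MEAN_POSITION = sum(range(1, 21)) / 20  # positions 1 to 20 (assumed league size)

at_mean = predicted_position(MEAN_WAGE_BILL)
print(f"predicted position at the mean wage bill: {at_mean:.2f}")
print(f"mean league position: {MEAN_POSITION}")
```

The prediction at the mean wage bill is about 10.4, close to 10.5.  A small gap like this is what you would expect when the slope, intercept and mean have all been rounded.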

Is there a correlation between turnover and league position?

[Figure: bar chart of Premier League club turnover]

We can also see what the correlation is between league position and overall club turnover (see the bar chart above).  Here we can see there is a huge gulf between the top few clubs and everyone else in the league.  There is only a 40 million pound difference between Wigan, the bottom ranked club for revenue, and Newcastle, with the 7th biggest revenue.  But then there is a massive jump up to the clubs with the top 6 revenues.

[Figure: scatter graph of league position (y) against turnover in millions of pounds (x)]

This time we have a mean turnover of 128 million pounds and a regression line of y = -0.05x + 16.89.  The Pearson’s r value this time is r = -0.79, so the correlation is slightly stronger than for wages, and is a strong correlation overall.  So, both wage bills and turnover provide a pretty good predictor of where a team will finish, and also a decent yardstick to measure how well a team has done relative to their resources.
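One way to compare the two predictors is to square each r value.  For a simple linear regression, r squared is the proportion of the variation in league position accounted for by the line.  This is an extra step that is not in the original analysis, shown here as a small sketch using the two r values quoted above.

```python
# r values quoted in the post for each predictor of league position.
R_VALUES = {"wage bill": -0.73, "turnover": -0.79}

# Squaring removes the sign and gives the proportion of variation explained.
R_SQUARED = {name: r ** 2 for name, r in R_VALUES.items()}

for name, value in R_SQUARED.items():
    print(f"{name}: r = {R_VALUES[name]:+.2f}, r^2 = {value:.2f}")
```

On these figures a straight line on wage bill accounts for roughly 53% of the variation in league position, and a straight line on turnover for roughly 62%.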

About

All content on this site has been written by Andrew Chambers (MSc. Mathematics, IB Mathematics Examiner). Please contact here for information on webinar training or for business ideas.
