If you are a teacher then please also visit my new site: intermathematics.com for over 2000 pdf pages of resources for teaching IB maths!

Anscombe’s Quartet – the importance of graphs!

Anscombe’s Quartet was devised by the statistician Francis Anscombe to illustrate why it is important not to rely on statistical measures alone when analyzing data.  To do this he created 4 data sets which produce nearly identical statistical measures.  The scatter graphs above were generated by the Python code here.

Statistical measures

1) Mean of x values in each data set = 9.00
2) Standard deviation of x values in each data set  = 3.32
3) Mean of y values in each data set = 7.50
4) Standard deviation of y values in each data set  = 2.03
5) Pearson’s Correlation coefficient for each paired data set = 0.82
6) Linear regression line for each paired data set: y = 0.500x + 3.00
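
The six measures above can be checked with a short Python sketch.  This is not the code used to make the graphs, just a minimal check using the standard library, with the four data sets copied from this post.  The helper name summarise is my own.  Note that the quoted standard deviations match the sample (n − 1) version, which is what statistics.stdev computes.

```python
from math import sqrt
from statistics import mean, stdev

# Anscombe's four data sets, keyed by the labels used in this post.
X_ABC = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
DATA = {
    "A": (X_ABC, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "B": (X_ABC, [9.14, 8.14, 8.74, 8.77, 9.26, 8.1, 6.13, 3.1, 9.13, 7.26, 4.74]),
    "C": (X_ABC, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "D": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
          [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarise(x, y):
    """Return the six summary measures for one paired data set (hypothetical helper)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx  # gradient of the least-squares line of y on x
    return {
        "mean_x": mx,
        "sd_x": stdev(x),  # sample standard deviation (divides by n - 1)
        "mean_y": my,
        "sd_y": stdev(y),
        "r": sxy / sqrt(sxx * syy),  # Pearson's correlation coefficient
        "slope": slope,
        "intercept": my - slope * mx,
    }


for label, (x, y) in DATA.items():
    s = summarise(x, y)
    print(
        f"{label}: mean_x={s['mean_x']:.2f} sd_x={s['sd_x']:.2f} "
        f"mean_y={s['mean_y']:.2f} sd_y={s['sd_y']:.2f} r={s['r']:.2f} "
        f"y = {s['slope']:.3f}x + {s['intercept']:.2f}"
    )
```

Each of the four printed lines should agree with the list above to the number of decimal places shown.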

When looking at this data we would be forgiven for concluding that these data sets must be very similar – but really they are quite different.

Data Set A:

x = [10,8,13,9,11,14,6,4,12,7,5]

y = [8.04, 6.95,7.58,8.81,8.33, 9.96,7.24,4.26,10.84,4.82,5.68]

Data Set A does indeed fit a linear regression, so it would be appropriate to use the line of best fit for predictive purposes.

Data Set B:

x = [10,8,13,9,11,14,6,4,12,7,5]

y = [9.14,8.14,8.74,8.77,9.26,8.1,6.13,3.1,9.13,7.26,4.74]

You could fit a linear regression to Data Set B, but this is clearly not the most appropriate regression line for this data.  A quadratic (or higher-degree polynomial) model would be better for making predictions here.
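
To see how much better a quadratic does for Data Set B, here is a rough sketch that fits both a straight line and a quadratic by least squares and compares the sum of squared residuals.  It uses only plain Python (solving the normal equations with a small Gaussian elimination), and the helper names solve_linear_system, fit_polynomial and sse are my own, not from any library.

```python
def solve_linear_system(a, b):
    """Solve a small square system a.z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        z[i] = (m[i][n] - sum(m[i][j] * z[j] for j in range(i + 1, n))) / m[i][i]
    return z


def fit_polynomial(x, y, degree):
    """Least-squares coefficients [c0, c1, ...] of c0 + c1*x + ... via the normal equations."""
    powers = [sum(xi ** k for xi in x) for k in range(2 * degree + 1)]
    a = [[powers[i + j] for j in range(degree + 1)] for i in range(degree + 1)]
    b = [sum(yi * xi ** i for xi, yi in zip(x, y)) for i in range(degree + 1)]
    return solve_linear_system(a, b)


def sse(x, y, coeffs):
    """Sum of squared residuals for the polynomial with the given coefficients."""
    return sum(
        (yi - sum(c * xi ** k for k, c in enumerate(coeffs))) ** 2
        for xi, yi in zip(x, y)
    )


# Data Set B from this post.
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y = [9.14, 8.14, 8.74, 8.77, 9.26, 8.1, 6.13, 3.1, 9.13, 7.26, 4.74]

linear = fit_polynomial(x, y, 1)
quadratic = fit_polynomial(x, y, 2)
sse_linear = sse(x, y, linear)
sse_quadratic = sse(x, y, quadratic)

print("linear coefficients   :", [round(c, 4) for c in linear])
print("quadratic coefficients:", [round(c, 4) for c in quadratic])
print("SSE linear vs quadratic:", round(sse_linear, 4), round(sse_quadratic, 6))
```

The quadratic leaves almost no residual error, whereas the straight line leaves a large one, even though the line is what the summary statistics describe.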

Data Set C:

x = [10,8,13,9,11,14,6,4,12,7,5]

y = [7.46,6.77,12.74,7.11,7.81,8.84,6.08,5.39,8.15,6.42,5.73]

In Data Set C we can see the effect of a single outlier: 10 of the 11 points lie on an almost perfectly straight line, and the remaining point sits well above it.  For predictive purposes we would be best investigating this outlier (checking that it does conform to the mathematical definition of an outlier), and then potentially doing our regression with it removed.
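
As a sketch of that check, the code below applies the 1.5 × IQR rule to the y values of Data Set C and then refits the line without any flagged point.  Applying the rule to the y values alone is my assumption about which definition of an outlier is meant.  The quartiles come from statistics.quantiles (Python 3.8+), which for these 11 values gives the same quartiles as the median-of-halves method.  The helper fit_line is my own.

```python
from math import sqrt
from statistics import mean, quantiles

# Data Set C from this post.
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y = [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]


def fit_line(x, y):
    """Return (slope, intercept, r) for the least-squares line of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx, sxy / sqrt(sxx * syy)


# 1.5 x IQR rule on the y values (an assumed choice of outlier definition).
q1, _median, q3 = quantiles(y, n=4)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [(xi, yi) for xi, yi in zip(x, y) if yi < lower_fence or yi > upper_fence]

# Refit using only the points that were not flagged.
kept = [(xi, yi) for xi, yi in zip(x, y) if (xi, yi) not in outliers]
kept_x = [p[0] for p in kept]
kept_y = [p[1] for p in kept]

slope_all, intercept_all, r_all = fit_line(x, y)
slope_kept, intercept_kept, r_kept = fit_line(kept_x, kept_y)

print("flagged outliers:", outliers)
print(f"all points : y = {slope_all:.3f}x + {intercept_all:.2f}, r = {r_all:.3f}")
print(f"outlier out: y = {slope_kept:.3f}x + {intercept_kept:.2f}, r = {r_kept:.5f}")
```

With the flagged point removed the remaining points are almost perfectly linear, and the gradient is noticeably shallower than the 0.500 quoted for the full data set.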

Data Set D:

x = [8,8,8,8,8,8,8,19,8,8,8]

y = [6.58,5.76,7.71,8.84,8.47,7.04,5.25,12.50,5.56,7.91,6.89]

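In Data Set D we can also see the effect of a single outlier: 10 of the 11 points lie on the vertical line x = 8, and the remaining point sits at x = 19.  Clearly a line of best fit is not appropriate here either: its gradient is determined entirely by the single outlier, and with the outlier removed every x value is the same, so no regression line of y on x can be found at all.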
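
A small sketch makes the role of that one point in Data Set D concrete.  It computes the least-squares gradient for all 11 points, compares it with the gradient of the line joining the point at x = 19 to the mean of the points at x = 8, and shows that once the point at x = 19 is dropped the x values have no spread at all, so the gradient formula would divide by zero.  The variable names are my own.

```python
from statistics import mean

# Data Set D from this post.
x = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
y = [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]


def spread_sums(x, y):
    """Return (Sxx, Sxy), the sums used in the least-squares gradient Sxy / Sxx."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxx, sxy


# Gradient using all 11 points.
sxx_all, sxy_all = spread_sums(x, y)
slope_all = sxy_all / sxx_all

# Split the data into the 10 points at x = 8 and the single point at x = 19.
kept_y = [yi for xi, yi in zip(x, y) if xi == 8]
far_y = [yi for xi, yi in zip(x, y) if xi == 19][0]

# Gradient of the line joining (8, mean of the y values at x = 8) to (19, 12.50).
two_point_slope = (far_y - mean(kept_y)) / (19 - 8)

# Without the point at x = 19 there is no variation in x, so Sxx is zero.
sxx_kept, _ = spread_sums([8] * len(kept_y), kept_y)

print(f"gradient from all 11 points : {slope_all:.4f}")
print(f"gradient through two anchors: {two_point_slope:.4f}")
print(f"Sxx without the x = 19 point: {sxx_kept}")
```

The two gradients agree, which shows that the regression line here is really just the line through one point and the average of the other ten.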

The moral of the story

So the moral here is: always use graphical analysis alongside statistical measures.  A very common mistake for IB students is to rely on Pearson’s product-moment correlation coefficient without really looking at the scatter graph to decide whether a linear fit is appropriate.  If you do this then you could end up with a very low mark in the E category, as you will not show good understanding of what you are doing.  So always plot a graph first!
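
If you want to plot the four data sets yourself, here is one possible sketch.  It is not the code used for the graphs in this post.  It assumes you have matplotlib installed, and the function name plot_quartet and the output file name are my own choices.  Each panel shows the scatter graph together with the shared regression line y = 0.5x + 3.

```python
# The four data sets from this post, keyed by the labels used above.
X_ABC = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "A": (X_ABC, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "B": (X_ABC, [9.14, 8.14, 8.74, 8.77, 9.26, 8.1, 6.13, 3.1, 9.13, 7.26, 4.74]),
    "C": (X_ABC, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "D": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
          [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def plot_quartet(path="anscombe.png"):
    """Save a 2 x 2 grid of scatter graphs, each with the line y = 0.5x + 3 (hypothetical helper)."""
    # Imported inside the function so the data above can be reused without matplotlib.
    import matplotlib.pyplot as plt

    fig, axes = plt.subplots(2, 2, figsize=(8, 6), sharex=True, sharey=True)
    for ax, (label, (x, y)) in zip(axes.flat, QUARTET.items()):
        ax.scatter(x, y)
        # The regression line shared by all four data sets, drawn from x = 3 to x = 20.
        ax.plot([3, 20], [0.5 * 3 + 3, 0.5 * 20 + 3], color="red")
        ax.set_title(f"Data Set {label}")
    fig.tight_layout()
    fig.savefig(path)
    return path
```

Calling plot_quartet() saves the four graphs to a single image file, and a quick look at them shows straight away which data sets the line y = 0.5x + 3 actually describes.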
