Spotting fake data with Benford’s Law
In the current digital age it has never been easier to fake data – and so it has never been more important to have tools to detect data that has been faked. Benford’s Law is an extremely useful way of testing data, because when people fake data they tend to do so in a predictable way. Benford’s Law looks at the probability that a number in a certain data set (many measurements, street addresses, stock prices etc.) begins with a given digit (its leading digit). Whilst we might expect the leading digits (d) to be equally likely to occur, in reality they follow this equation:

$$P(d) = \log_{10}\left(1 + \frac{1}{d}\right), \quad d = 1, 2, \ldots, 9$$
So, for example, we can see that a leading digit of 1 (about 30.1%) is much more likely than a leading digit of 9 (about 4.6%).
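As a minimal sketch, the probabilities for all nine leading digits can be tabulated from the standard form of Benford’s Law, P(d) = log10(1 + 1/d). Python is my choice here rather than anything used in the original post, and the name `benford_probability` is purely illustrative.

```python
import math

def benford_probability(d):
    """Probability that the leading digit equals d (1 to 9) under Benford's Law."""
    if d not in range(1, 10):
        raise ValueError("leading digit must be an integer from 1 to 9")
    return math.log10(1 + 1 / d)

# Probability of each possible leading digit.
probabilities = {d: benford_probability(d) for d in range(1, 10)}

for d, p in probabilities.items():
    print(f"{d}: {p:.1%}")
```

The nine probabilities sum to 1, since every non-zero number has exactly one leading digit.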
Testing some data
I wanted to test some data to see if it did indeed follow Benford’s Law. So, I downloaded an Excel file with 531 data points from the CDC website. This gave the moving 7-day average Covid cases per 100,000 people for every day from 12th March 2020 to 3rd October 2021. I then used the nice Excel techniques shown above in the video to manipulate the data into a useful form. Once this had been done I could then use Desmos to plot this data (dot plot and left aligned frequency histogram). You can see this data below:
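The same tallying step could also be done in code instead of Excel. The sketch below is an assumed Python alternative, not the method from the video: the helper names `leading_digit` and `leading_digit_counts` are hypothetical, and the sample values are made up for illustration (they are not the CDC figures).

```python
from collections import Counter

def leading_digit(value):
    """Return the first non-zero digit of a number, or None for zero."""
    for ch in str(abs(value)):
        if ch in "123456789":
            return int(ch)
    return None

def leading_digit_counts(values):
    """Count how often each digit 1 to 9 appears as a leading digit."""
    digits = (leading_digit(v) for v in values)
    counts = Counter(d for d in digits if d is not None)
    return {d: counts.get(d, 0) for d in range(1, 10)}

# Made-up illustrative values, NOT the CDC data.
sample = [0.0432, 1.87, 12.5, 103.2, 0.91, 25.0, 1.02]
counts = leading_digit_counts(sample)
print(counts)
```

Scanning the text form of the number for its first non-zero digit handles values below 1 (such as 0.0432) without any special cases; zeros are skipped because they have no leading digit.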
The red curve is the continuous (rather than discrete) curve created by working out the expected frequencies for each digit. On Desmos I generated this by the following equation:
We can see that our data largely follows our expected curve – so we would not have any evidence to suggest faked data! We could conduct a Chi-Squared test to measure the goodness of fit of our data (this is also explained in the video).
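A Chi-Squared goodness of fit test along these lines can be sketched in Python. This is an assumption about how such a test might be set up, not a reproduction of the video: expected counts are taken as n × log10(1 + 1/d), and the statistic is compared against 15.507, the 5% critical value for 8 degrees of freedom. Because the CDC figures are not reproduced here, the leading digits of the first 1000 powers of 2 are used as a stand-in data set.

```python
import math

def benford_expected(n):
    """Expected count of each leading digit among n values under Benford's Law."""
    return {d: n * math.log10(1 + 1 / d) for d in range(1, 10)}

def chi_squared_statistic(observed):
    """Pearson chi-squared statistic for observed leading-digit counts (keys 1 to 9)."""
    n = sum(observed.values())
    expected = benford_expected(n)
    return sum((observed[d] - expected[d]) ** 2 / expected[d] for d in range(1, 10))

# 5% critical value of the chi-squared distribution with 9 - 1 = 8 degrees of freedom.
CRITICAL_VALUE_5_PERCENT = 15.507

# Stand-in data set (an assumption): leading digits of 2**0, 2**1, ..., 2**999.
observed = {d: 0 for d in range(1, 10)}
for n in range(1000):
    observed[int(str(2 ** n)[0])] += 1

statistic = chi_squared_statistic(observed)
print(f"chi-squared = {statistic:.3f}")
if statistic < CRITICAL_VALUE_5_PERCENT:
    print("No significant deviation from Benford's Law at the 5% level")
else:
    print("Significant deviation from Benford's Law at the 5% level")
```

A statistic below the critical value means there is no evidence at the 5% level that the data departs from Benford’s Law; it does not prove the data is genuine.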
Conclusion
This is a simple but effective method to test for faked data. If data fails this test it doesn’t necessarily mean it was faked (e.g. data on heights of men in cm will clearly have nearly all 1s as leading digits!), but many real-life data sets do follow this rule, particularly those that span several orders of magnitude. Try to find your own data (ideally a large data set) and test it for yourself.
If you are a teacher then please also visit my new site: intermathematics.com for over 2000 pdf pages of resources for teaching IB maths!
Benford’s Law – Using Maths to Catch Fraudsters
Benford’s Law is a very powerful and counter-intuitive mathematical rule which describes the distribution of leading digits (i.e. the first digit in any number). You would probably expect that distribution to be uniform – that a leading 9 occurs as often as a leading 1. But this, whilst intuitive, is false for a large number of datasets. Accountants looking for fraudulent activity and investigators looking for falsified data use Benford’s Law to catch criminals.
The probability function for Benford’s Law, for a leading digit d from 1 to 9, is:

$$P(d) = \log_{10}\left(1 + \frac{1}{d}\right)$$
This clearly shows that a 1 is by far the most likely leading digit – and that you have just over a 60% chance of the leading digit being 1, 2 or 3. Any criminal trying to make up data who didn’t know this law could be easily caught out.
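The 60% figure can be checked by hand. Taking the standard statement of Benford’s Law, P(d) = log10(1 + 1/d), the three logarithms combine into a single one because the fractions inside them telescope:

```latex
\begin{aligned}
P(1) + P(2) + P(3) &= \log_{10}\frac{2}{1} + \log_{10}\frac{3}{2} + \log_{10}\frac{4}{3} \\
&= \log_{10}\left(\frac{2}{1}\cdot\frac{3}{2}\cdot\frac{4}{3}\right) = \log_{10} 4 \approx 0.602
\end{aligned}
```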
Scenario for students 1:
You are a corrupt bank manager who is secretly writing cheques to your own account. You can write any cheques for any amount – but you want it to appear natural so as not to arouse suspicion. Write yourself 20 cheque amounts. Try not to get caught!
Look at the following fraudulent cheques that were written by an Arizona manager – can you see why he was caught?
Scenario for students 2:
Use the probability function to find the probability of each leading digit. Then look at the leading digits of the first 50 Fibonacci numbers. Does the law hold?
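Once students have tried this by hand, one way to check their tallies is a short Python script. This is only a sketch: it assumes the sequence starts 1, 1, 2, 3, ..., uses the standard formula P(d) = log10(1 + 1/d) for the expected counts, and the function name `first_fibonacci_numbers` is an illustrative choice.

```python
import math
from collections import Counter

def first_fibonacci_numbers(count):
    """Return the first `count` Fibonacci numbers, starting 1, 1, 2, 3, ..."""
    numbers, a, b = [], 1, 1
    for _ in range(count):
        numbers.append(a)
        a, b = b, a + b
    return numbers

fib = first_fibonacci_numbers(50)

# Tally the first character of each number's decimal representation.
observed = Counter(int(str(f)[0]) for f in fib)

print("digit  observed  expected")
for d in range(1, 10):
    expected = len(fib) * math.log10(1 + 1 / d)
    print(f"{d:>5}  {observed[d]:>8}  {expected:>8.1f}")
```

With only 50 numbers the observed counts can only roughly match the expected ones, so it is worth increasing the count to see how the fit changes.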
There is also an excellent Numberphile video on Benford’s Law. Wikipedia has a lot more on the topic, as does the Journal of Accountancy.
If you enjoyed this topic you might also like:
Amanda Knox and Bad Maths in Courts – some other examples of mathematics and the criminal justice system.
Cesaro Summation: Does 1 – 1 + 1 – 1 … = 1/2? – another surprising mathematical result.
Essential Resources for IB Teachers
If you are a teacher then please also visit my new site. This has been designed specifically for teachers of mathematics at international schools. The content now includes over 2000 pages of pdf content for the entire SL and HL Analysis syllabus and also the SL Applications syllabus. Some of the content includes:
- Original pdf worksheets (with full worked solutions) designed to cover all the syllabus topics. These make great homework sheets or in class worksheets – and are each designed to last between 40 minutes and 1 hour.
- Original Paper 3 investigations (with full worked solutions) to develop investigative techniques and support both the exploration and the Paper 3 examination.
- Over 150 pages of Coursework Guides to introduce students to the essentials behind getting an excellent mark on their exploration coursework.
- A large number of enrichment activities such as treasure hunts, quizzes, investigations, Desmos explorations, Python coding and more – to engage IB learners in the course.
There is also a lot more. I think this could save teachers 200+ hours of preparation time in delivering an IB maths course – so it should be well worth exploring!
Essential Resources for both IB teachers and IB students
1) Exploration Guides and Paper 3 Resources
I’ve put together a 168 page Super Exploration Guide to talk students and teachers through all aspects of producing an excellent coursework submission. Students always make the same mistakes when doing their coursework – get the inside track from an IB moderator! I have also made Paper 3 packs for HL Analysis and also Applications students to help prepare for their Paper 3 exams. The Exploration Guides can be downloaded here and the Paper 3 Questions can be downloaded here.