
Can Artificial Intelligence (ChatGPT) Get a 7 on an SL Maths Paper?

ChatGPT is a large language model that was trained using machine learning techniques. One of the standout features of ChatGPT is its mathematical abilities. It can perform a variety of calculations and solve equations.  This advanced capability is made possible by the model’s vast amounts of training data and its ability to understand and manipulate complex mathematical concepts. 

I didn’t write that previous paragraph – I asked ChatGPT to do it for me. The current version of ChatGPT is truly stunning – so I thought I would see if it is able to get a 7 on an SL Mathematics paper.  I simply typed the questions into the interface to see what the result was.

AI vs an SL Paper


I chose a 2018 SL Paper.  Let’s look at a breakdown of its scores (the full PDF of the AI’s answers and marks is available to download here).

(1) A function question.   AI gets 5 out of 6.  It makes a mistake by not swapping x and y for the inverse function.
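
For reference, the step the AI skipped is the standard inverse-function procedure: swap x and y, then rearrange for y. A generic sketch using a hypothetical function (the actual exam function isn’t reproduced here):

```latex
y = 2x + 3 \;\Rightarrow\; x = 2y + 3 \;\Rightarrow\; y = \frac{x - 3}{2} \;\Rightarrow\; f^{-1}(x) = \frac{x - 3}{2}
```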

(2) A box and whisker plot question. AI gets 2 out of 6.  It doesn’t know the IB’s definition of outlier.
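
For reference, the definition the AI missed: the IB flags a data point as an outlier when it lies more than 1.5 interquartile ranges beyond the nearer quartile:

```latex
x < Q_1 - 1.5 \times \text{IQR} \quad \text{or} \quad x > Q_3 + 1.5 \times \text{IQR}, \qquad \text{where } \text{IQR} = Q_3 - Q_1
```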

(3) Interpreting a graph.  AI gets 0 out of 6.  It needs some diagrammatic recognition to be able to do this.

(4) Functions and lines.  AI gets 4 out of 7.  Bafflingly, it solves 2 + 4 - c = 5 incorrectly.
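
The equation it fumbled needs only one line of arithmetic:

```latex
2 + 4 - c = 5 \;\Rightarrow\; 6 - c = 5 \;\Rightarrow\; c = 1
```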

(5) Integration and volume of revolution.  AI gets 1 out of 7.  The integral is incorrect (off by a factor of 1/2), and it doesn’t substitute the limits into the integral.
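
For reference, the standard volume-of-revolution formula about the x-axis (with a and b standing in for the question’s limits, which aren’t reproduced here):

```latex
V = \pi \int_a^b y^2 \, dx
```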

(6) Vectors from a diagram.  AI gets 0 out of 6.  It needs some diagrammatic recognition to be able to do this.

(7) Normals to curves.  AI gets 7 out of 7.

(8) Inflection points and concavity.  AI gets 12 out of 13.  It solves 6x + 18 < 0 incorrectly on the last line!
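
The final inequality takes just two routine steps:

```latex
6x + 18 < 0 \;\Rightarrow\; 6x < -18 \;\Rightarrow\; x < -3
```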

(9) Vectors in 3D.  AI gets 7 out of 16.  Solves cos(OBA) = 0 incorrectly and can’t find the area of a triangle based on vector information.
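
Both slips involve standard results: cos θ = 0 gives θ = 90°, and the triangle’s area then follows from the two edge vectors at B (the actual vectors come from the question and aren’t reproduced here):

```latex
\cos(O\hat{B}A) = 0 \;\Rightarrow\; O\hat{B}A = 90^\circ, \qquad \text{Area} = \tfrac{1}{2}\,|\vec{BO}|\,|\vec{BA}|\,\sin(O\hat{B}A) = \tfrac{1}{2}\,|\vec{BO}|\,|\vec{BA}|
```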

(10) Sequences and trig.  AI  gets 11 out of 15.  

Total:  49/90.  This is a Level 5.  [The Level 6 boundary is 54; Level 7 is 65+.]

Considering that there were 2 full questions (worth 12 marks between them) that had to be skipped, this is pretty good.  It did make some very surprising basic mistakes – but overall it was still able to achieve a solid IB Level 5, and it did this in about 5-10 minutes (the only slow part was entering the questions).  If this system were hooked up to text recognition and diagram recognition and then fine-tuned for IB Maths, I think it would be able to get a Level 7 very easily.

Engines like Wolfram Alpha are already exceptional at doing maths, as long as the questions are first interpreted to isolate the maths required.  This seems to be a step change – a system able to simply process all the information as presented and then work out for itself what maths is required.

So, what does this mean?  Well, probably that no field of human thought is safe!  AI systems are now unbelievably impressive at graphic design, art, coding, essay writing and chat functions – so creative fields that were previously considered too difficult for computers are now very much in play.