
[Image: bar chart comparing GPT-4 and ChatGPT exam percentile scores across subjects]

GPT-4 vs ChatGPT. The beginning of an intelligence revolution?

The above graph (image source) is one of the most incredible bar charts you’ll ever see – it compares the capabilities of GPT-4, OpenAI’s new large language model, with those of its previous iteration, ChatGPT.  As we can see, GPT-4 is now able to score in the top 20% of test takers across a staggering range of subjects.  This on its own is amazing – but the really incredible part is that the green sections represent improvements since ChatGPT – and ChatGPT was only released 3 ½ months ago.

GPT-4 is now able to pass nearly all AP subjects, would pass the bar exam to qualify as a lawyer, and is even making headway on Olympiad-style mathematics papers.   You can see that ChatGPT had already mastered many of the humanities subjects – and that GPT-4 has now begun to master the sciences, maths, economics and law.

We can see an example of GPT-4’s mathematical improvements below, taken from a recently released research paper.  Both models were asked a reasonably challenging integral problem:

[Image: the integral problem given to both models]

GPT-4’s response:

[Images: GPT-4’s worked solution to the integral]

GPT-4 is correct – and the solution is excellently explained – whereas the ChatGPT response (viewable in the paper) was completely wrong.  It’s not just that GPT-4 is now able to do maths like this – after all, so can Wolfram Alpha – but that the large language model training method allows it to do complicated maths as well as everything else.  The research paper this appears in is entitled “Sparks of Artificial General Intelligence”, because this appears to be the beginning of the holy grail of AI research: a model which has intelligence across multiple domains, and as such begins to reach human levels of intelligence across multiple measures.

An intelligence explosion?

Nick Bostrom’s Superintelligence came out several years ago to discuss the ideas behind the development of intelligent systems.  In it he argues that we can probably expect explosive growth – perhaps even over days or weeks – once a system reaches a critical level of intelligence and then drives its own further development.  Let’s look at the maths behind this.  We start by modelling the rate of growth of intelligence over time:

$$\frac{dI}{dt} = D \times R$$

where I is the intelligence of the system, D is the optimisation power and R is the responsiveness to optimisation (both defined below).

Optimisation power is a measure of how much resource power is being allocated to improving the intelligence of the AI.  The resources driving this improvement are going to come from the company working on the project (in this case OpenAI), and also from global research into AI in the form of published peer-reviewed papers on neural networks etc.  However there is also the potential for the AI itself to work on the project to improve its own intelligence.  We can therefore say that the optimisation power is given by:

$$D = D_{\text{company}} + D_{\text{world}} + D_{\text{AI}}$$

Whilst the AI system is still undeveloped and unable to contribute meaningfully to its own intelligence improvements we will have:

$$D_{\text{AI}} \approx 0 \;\;\Rightarrow\;\; D = D_{\text{company}} + D_{\text{world}}$$

If we assume that the company provides a constant investment in optimising their AI, and similarly that there is a constant investment worldwide, then we can treat the optimisation power as a constant:

$$D = D_{\text{company}} + D_{\text{world}} = k$$

Responsiveness to optimisation describes how easily a system can be improved.  For example, a system which is highly responsive can be improved with minimal resource power, whereas a system which shows very little improvement despite a large investment of resource power has low responsiveness.

If we also assume that responsiveness to optimisation, R, remains constant over some timeframe then we can write:

$$\frac{dI}{dt} = kR$$

We can then integrate this by separating the variables:

$$\int dI = \int kR \, dt \;\;\Rightarrow\;\; I = kRt + c$$

This means that the intelligence of the system grows in a linear fashion over time.

However, when the AI system reaches a certain threshold of intelligence it will become the main resource driving its own intelligence improvements, with a contribution much larger than that of the company or the world.  At this point we can say:

$$D \approx D_{\text{AI}} = I$$

In other words the optimisation power is a function of the AI’s current level of intelligence.  This then creates a completely different growth trajectory:

$$\frac{dI}{dt} = IR$$

Which we again solve as follows:

$$\int \frac{1}{I} \, dI = \int R \, dt \;\;\Rightarrow\;\; \ln I = Rt + c \;\;\Rightarrow\;\; I = Ae^{Rt}$$

This means that the intelligence of the system now grows exponentially over time.

What does this mean in practice in terms of AI development?

[Figure: intelligence against time, showing linear growth (red) up to point A, exponential growth (purple) between points A and B, then levelling off towards the system’s maximum capacity]

We can see above an example of how we might expect such intelligence development to look.  The first section (red) is marked by linear growth over short periods.  As R or D is altered this may create lines with different gradients, but growth is not explosive.

At point A the AI system gains sufficient intelligence to be the main driving force in its own future intelligence gains.  Note that this does not mean that it has to be above the level of human intelligence when this happens (though it may be) – simply that in the narrow task of improving intelligence it is now superior to the efforts of the company and the world’s researchers.

So, at point A the exponential growth phase begins (purple) – in this diagram taking the AI system explosively past human intelligence levels.  Then at some unspecified point in the future (B on the diagram), this exponential growth ends and the AI approaches the maximum capacity intelligence for the system.
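As a rough illustration, we can simulate this two-phase model numerically.  Everything in the sketch below is a made-up choice (the values of k and R, the threshold at point A and the maximum capacity); it is only intended to show the shape of the curve, not to model any real system.

```python
import matplotlib.pyplot as plt

# Illustrative constants only - none of these values come from the model above.
k = 1.0           # constant optimisation power from the company + the world
R = 0.1           # responsiveness to optimisation
threshold = 5.0   # point A: the AI starts driving its own improvement
capacity = 500.0  # the maximum intelligence the system can support
dt, steps = 0.1, 1200

I = 1.0
times, history = [], []
for step in range(steps):
    # dI/dt = kR below the threshold (linear), dI/dt = IR above it (exponential)
    dI = k * R if I < threshold else I * R
    I = min(I + dI * dt, capacity)
    times.append(step * dt)
    history.append(I)

plt.plot(times, history)
plt.xlabel("time")
plt.ylabel("intelligence I(t)")
plt.show()
```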

So it is possible that there will be an intelligence explosion once an AI system gets close to human levels of intelligence – and based on current trends this looks to be well within reach in the next 5 years.  So hold on tight – things could get very interesting!


Creating a Neural Network: AI Machine Learning

A neural network is a type of machine learning algorithm modeled after the structure and function of the human brain. It is composed of a large number of interconnected “neurons,” which are organized into layers. These layers are responsible for processing and transforming the input data and passing it through to the output layer, where the final prediction or decision is made.

Image recognition


Neural networks can be used to classify images of (say) cats and dogs by training a model on a large dataset of labeled images. The model is presented with an input image, and its job is to predict the correct label (e.g., “cat” or “dog”) for the image.

To train the model, the input images are passed through the network and the model makes predictions based on the patterns it has learned from the training data. If the prediction is incorrect, the model adjusts the weights of the connections between neurons in order to improve its accuracy on future predictions.
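As a rough, hypothetical illustration of this idea (this is not the model built below, and the data is made up), a single artificial neuron can be trained with the classic perceptron update rule, nudging its weights whenever a prediction is wrong:

```python
import numpy as np

# Four made-up 4-pixel "images" with labels 1 = "cat", 0 = "dog".
X = np.array([[1, 0, 1, 0],
              [1, 1, 1, 0],
              [0, 0, 1, 1],
              [0, 1, 0, 1]])
y = np.array([1, 1, 0, 0])

weights = np.zeros(4)
bias = 0.0

for epoch in range(10):
    for pixels, label in zip(X, y):
        prediction = 1 if pixels @ weights + bias > 0 else 0
        error = label - prediction
        # Only adjusts the weights when the prediction is incorrect (error != 0)
        weights += error * pixels
        bias += error

print(weights, bias)
```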

Our own model

[Image: the “perfect face” drawn on a 5 by 5 grid]

I want to create a very simple model to “recognise” faces.  I first start with a 5 by 5 grid, and define what I think is a perfect face.  This is shown above.  I can then convert this to numerical values by defining the white spaces as 0 and the black squares as 1.

[Image: the perfect face as a 5 by 5 grid of 0s and 1s]

I can then represent this information as 5 column vectors:

[Image: the perfect face written as 5 column vectors]

Building a weighting model

[Image: the weighting grid, with black, grey and white squares]

Next I need to decide which squares would be acceptable for a face.  I’ve kept the black squares for the most desirable, and then added some grey shading for squares that could also be included. I can then convert this into numerical data by deciding on a weight that each square should receive:

[Image: the weighting grid as numerical values]

Here I am using 1 to represent a very desirable square, 0.5 for a somewhat desirable square and -1 for an undesirable square.  I can also represent this weighting model as 5 column vectors:

[Image: the weighting model written as 5 column vectors]

Using the dot product

I can then find the sum of the dot products of the 5 x vectors with the 5 w vectors.  In formal notation this is given by:

$$\sum_{i=1}^{5} \mathbf{x}_i \cdot \mathbf{w}_i$$

What this means is that I find the dot product of x_1 and w_1 and then add this to the dot product of x_2 and w_2 etc. For example with:

[Image: the vectors x₁ and w₁]

This gives:

[Image: the dot product x₁ · w₁]

Which is:

[Image: the value of x₁ · w₁]

Doing this for all 5 vectors gives:

[Image: the sum of all 5 dot products, giving a score of 5]

So my perfect face has a score of 5.  I can therefore give an upper and lower bound for what would be considered a face.  Let’s say:

[Image: the chosen upper and lower bounds for a face score]
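Here is a minimal Python sketch of this scoring model.  The grids below are only illustrative guesses at a face-like layout (the actual grids are shown in the images above); the weight values of 1, 0.5 and -1 and the perfect score of 5 follow the description.

```python
import numpy as np

# Illustrative "perfect face": 1 = black square, 0 = white square.
perfect_face = np.array([
    [0, 0, 0, 0, 0],
    [0, 1, 0, 1, 0],   # eyes
    [0, 0, 1, 0, 0],   # nose
    [0, 1, 0, 1, 0],   # corners of the mouth
    [0, 0, 0, 0, 0],
])

# Illustrative weights: 1 = very desirable, 0.5 = somewhat desirable, -1 = undesirable.
weights = np.array([
    [-1,  0.5, -1,   0.5, -1],
    [-1,  1,   -1,   1,   -1],
    [-1,  0.5,  1,   0.5, -1],
    [-1,  1,    0.5, 1,   -1],
    [-1, -1,    0.5, -1,  -1],
])

def face_score(grid):
    # The sum of the dot products of the five column vectors x_i . w_i
    # equals the sum of the element-wise products of the two grids.
    return float(np.sum(grid * weights))

print(face_score(perfect_face))  # 5.0 for this illustrative perfect face
```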

Testing our model: A Face

[Image: a test grid to check against the model]

I want to see if the above image would be recognised as a face by our model.  This has the following:

[Image: the column vectors for this grid]

And when we calculate the sum of the dot products we get:

[Image: the resulting score, which lies within the bounds]

Which would be recognised as a face.

Testing our model: Not a Face

[Image: a randomly generated 5 by 5 grid]

There are 2^25 different patterns that can be generated (over 33 million), so we would expect that the majority do not get recognised as a face.  I randomly generated a binary string of length 25 and created the image above.  When we use our model it returns:

[Image: the resulting score, which lies outside the bounds]

Which would not be recognised as a face.

Using Python and modifying the design

[Image: the modified weighting grid]

I decided to modify the weighting so that undesirable squares received -2, to make them less likely to appear.  I also changed the acceptance bounds so that a score between 4.5 and 5.5 inclusive is recognised as a face.

[Image: the modified numerical weights and score bounds]

I then wrote some Python code to randomly generate 200,000 images and run this algorithm to check whether each one was recognised as a face or not.
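As a rough sketch, the experiment might be coded like this.  The weighting grid is again only an illustrative guess, but the -2 penalty, the 4.5 to 5.5 acceptance bounds and the 200,000 trials follow the description above:

```python
import numpy as np

# Illustrative modified weights: 1 = very desirable, 0.5 = somewhat desirable, -2 = undesirable.
weights = np.array([
    [-2,  0.5, -2,   0.5, -2],
    [-2,  1,   -2,   1,   -2],
    [-2,  0.5,  1,   0.5, -2],
    [-2,  1,    0.5, 1,   -2],
    [-2, -2,    0.5, -2,  -2],
])

rng = np.random.default_rng(0)
faces = []
for _ in range(200_000):
    grid = rng.integers(0, 2, size=(5, 5))  # a random 25-bit pattern as a 5x5 grid
    score = np.sum(grid * weights)          # sum of the column dot products
    if 4.5 <= score <= 5.5:                 # recognised as a face
        faces.append(grid)

print(f"{len(faces)} of 200,000 random grids recognised as faces")
```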

The results

[Image: a selection of the randomly generated grids recognised as faces]

You can see some of the results above – whilst not perfect, they do have a feel of a face about them.  And my favourite is below:

[Image: a generated face with a cheeky grin]

A nice cheeky grin!  You can see the power of this method – this was an extremely simple model and yet it achieves some results very quickly.  With images of 1 million pixels and much more advanced weighting algorithms, modern AI systems can accurately identify and categorise a huge variety of images.


Can Artificial Intelligence (ChatGPT) Get a 7 on an SL Maths paper?

ChatGPT is a large language model that was trained using machine learning techniques. One of the standout features of ChatGPT is its mathematical abilities. It can perform a variety of calculations and solve equations.  This advanced capability is made possible by the model’s vast amounts of training data and its ability to understand and manipulate complex mathematical concepts. 

I didn’t write that previous paragraph – I asked ChatGPT to do it for me. The current version of ChatGPT is truly stunning – so I thought I would see if it is able to get a 7 on an SL Mathematics paper.  I simply typed the questions into the interface to see what the result was.

AI vs an SL Paper

[Image: the 2018 SL Mathematics paper used for the test]

I chose a 2018 SL Paper.  Let’s look at a breakdown of its scores (the full pdf of the AI’s answers and marks is available to download here).

(1) A function question.   AI gets 5 out of 6.  It makes a mistake by not swapping x and y for the inverse function.

(2) A box and whisker plot question. AI gets 2 out of 6.  It doesn’t know the IB’s definition of outlier.

(3) Interpreting a graph.  AI gets 0 out of 6.  It needs some diagrammatic recognition to be able to do this.

(4) Functions and lines.  AI gets 4 out of 7.  Bafflingly, it solves 2 + 4 - c = 5 incorrectly.

(5) Integration and volume of revolution.  AI gets 1 out of 7.  The integral is incorrect (off by a factor of 1/2), and it doesn’t substitute the limits into the integral.

(6) Vectors from a diagram.  AI gets 0 out of 6.  It needs some diagrammatic recognition to be able to do this.

(7) Normals to curves.  AI gets 7 out of 7.

(8) Inflection points and concavity.  AI gets 12 out of 13.  It solves 6x + 18 < 0 incorrectly on the last line!

(9) Vectors in 3D.  AI gets 7 out of 16.  It solves cos(OBA) = 0 incorrectly and can’t find the area of a triangle from vector information.

(10) Sequences and trig.  AI gets 11 out of 15.

Total: 49/90.  This is a Level 5.  [The Level 6 boundary is 54 marks; Level 7 is 65+.]

Considering that there were 2 full questions that had to be skipped this is pretty good.  It did make some very surprising basic mistakes – but overall it was still able to achieve a solid IB Level 5, and it did this in about 5-10 minutes (the only slow part was entering the questions).  If this system were hooked up to text recognition and diagram recognition, and then fine-tuned for IB Maths, I think it would be able to get a Level 7 very easily.

Engines like Wolfram Alpha are already exceptional at doing maths as long as questions are interpreted to isolate the maths required.  This seems to be a step change – with a system able to simply process all information as presented and then to interpret what maths is required by itself.  

So, what does this mean?  Well, probably that no field of human thought is safe!  AI systems are now unbelievably impressive at graphic design, art, coding, essay writing and chat functions – so creative fields which were previously considered too difficult for computers are now very much in play.
