GPT-4 vs ChatGPT. The beginning of an intelligence revolution?

The above graph (image source) is one of the most incredible bar charts you’ll ever see – it compares the capabilities of GPT-4, OpenAI’s new large language model, with those of its previous iteration, ChatGPT.  As we can see, GPT-4 is now able to score in the top 20% of test takers across a staggering range of subjects.  This on its own is amazing – but the really incredible part is that the green sections represent improvements since ChatGPT – and ChatGPT was only released 3½ months ago.

GPT-4 is now able to pass nearly all AP subjects, would pass the bar exam to qualify as a lawyer, and is even making headway on Olympiad-style mathematics papers.  You can see that ChatGPT had already mastered many of the humanities subjects – and that GPT-4 has now begun to master the sciences, maths, economics and law.

We can see an example of the mathematical improvements in GPT-4 below, taken from a recently released research paper.  Both AIs were asked a reasonably challenging integral problem:

GPT-4 response:

GPT-4 is correct – and excellently explained – whereas the ChatGPT response (viewable in the paper) was completely wrong.  It’s not just that GPT-4 is now able to do maths like this – after all, so can Wolfram Alpha – but that the large language model training method allows it to do complicated maths as well as everything else.  The research paper this appears in is entitled “Sparks of Artificial General Intelligence”, because this appears to be the beginning of the holy grail of AI research: a model which has intelligence across multiple domains, and as such begins to reach human levels of intelligence across multiple measures.

An intelligence explosion?

Nick Bostrom’s Superintelligence came out several years ago, discussing the ideas behind the development of intelligent systems.  In it he argues that we can probably expect explosive growth – perhaps even over days or weeks – as a system reaches a critical level of intelligence and then drives its own further development.  Let’s look at the maths behind this.  We start by modelling the rate of growth of intelligence over time:

dI/dt = O × R

where I is the intelligence of the system, O is the optimisation power applied to it, and R is the system’s responsiveness to optimisation.

Optimisation power is a measure of how much resource is being allocated to improving the intelligence of the AI.  The resources driving this improvement will come from the company working on the project (in this case OpenAI), and also from global research into AI in the form of published peer-reviewed papers on neural networks and so on.  However, there is also the potential for the AI itself to work on improving its own intelligence.  We can therefore say that the optimisation power is given by:

O = C + W + A

where C is the contribution of the company, W is the contribution of worldwide research, and A is the contribution of the AI itself.

Whilst the AI system is still undeveloped and unable to contribute meaningfully to its own intelligence improvements, its own contribution to the optimisation power will be negligible, and the optimisation power will come almost entirely from the company and from worldwide research.

If we assume that the company provides a constant investment in optimising their AI, and similarly that there is a constant investment worldwide, then we can treat the optimisation power as a constant:

O = D

Responsiveness to optimisation describes how easily a system can be improved upon.  For example, a system which is highly responsive can be easily improved with minimal resource power, whereas a system which shows very little improvement despite a large investment in resource power has low responsiveness.

If we also assume that the responsiveness to optimisation, R, remains constant over some timeframe, then we can write:

dI/dt = RD

We can then integrate this by separating the variables:

∫ dI = ∫ RD dt,  giving  I = RDt + c

This means that the intelligence of the system grows in a linear fashion over time.
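This linear behaviour is easy to check numerically.  Below is a short Python sketch – all the constants (R, D and the starting intelligence I0) are invented purely for illustration – which steps a constant growth rate forward in time using Euler’s method and compares the result with the straight-line solution:

```python
# Euler integration of the linear-phase model dI/dt = R * D,
# compared against the closed-form solution I(t) = R*D*t + I0.
# R, D and I0 are invented values for illustration only.

def simulate_linear(R, D, I0, t_end, steps=10_000):
    dt = t_end / steps
    I = I0
    for _ in range(steps):
        I += R * D * dt   # the growth rate R*D is constant
    return I

R, D, I0 = 0.5, 2.0, 1.0
numeric = simulate_linear(R, D, I0, t_end=10.0)
closed_form = R * D * 10.0 + I0
print(numeric, closed_form)  # both ≈ 11.0
```

As expected, the numerical trajectory sits exactly on the straight line – doubling D (or R) just doubles the gradient, it never produces explosive growth.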

However, when the AI system reaches a certain threshold of intelligence, it will become the main resource driving its own intelligence improvements (its contribution being much larger than that of the company or the world).  At this point we can say that the optimisation power is dominated by the AI’s own contribution:

O ≈ A

In other words, the optimisation power is now a function of the AI’s current level of intelligence.  Taking the simplest case, in which the optimisation power is proportional to the intelligence itself (with the constant of proportionality absorbed into R), this creates a completely different growth trajectory:

dI/dt = RI

Which we again solve by separating the variables:

∫ (1/I) dI = ∫ R dt,  so  ln I = Rt + c,  and therefore  I = I₀e^(Rt)

where I₀ is the intelligence of the system at t = 0.

This means that the intelligence of the system now grows exponentially.
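One concrete consequence of exponential growth is a fixed doubling time: with dI/dt = RI, the intelligence doubles every ln(2)/R units of time regardless of its current level.  A minimal Python sketch, with invented values of R and I0:

```python
import math

# In the exponential phase I(t) = I0 * e^(R*t), so intelligence doubles
# every ln(2)/R time units.  R and I0 are invented values for illustration.

def intelligence(t, I0=1.0, R=0.5):
    return I0 * math.exp(R * t)

doubling_time = math.log(2) / 0.5
print(doubling_time)                # ≈ 1.386
print(intelligence(doubling_time))  # ≈ 2.0 – one doubling from I0 = 1
```

This is exactly why the growth is described as explosive: each doubling takes the same amount of time as the last, so the absolute gains accelerate without limit (until something else caps them).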

What does this mean in practice in terms of AI development?

We can see above an example of how we might expect such intelligence development to look.  The first section (red) is marked by linear growth over short periods.  As R or D is altered this may create lines with different gradients, but growth is not explosive.

At point A the AI system gains sufficient intelligence to become the main driving force in its own future intelligence gains.  Note that this does not mean it has to be above the level of human intelligence when this happens (though it may be) – simply that in the narrow task of improving intelligence it is now superior to the efforts of the company and the world’s researchers.

So, at point A the exponential growth phase begins (purple) – in this diagram taking the AI system explosively past human intelligence levels.  Then at some unspecified point in the future (B on the diagram), this exponential growth ends and the AI approaches the maximum capacity intelligence for the system.
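The whole trajectory in the diagram – linear growth, take-off at point A, then levelling off towards capacity at B – can also be sketched numerically.  This is only an illustration: the threshold value, the capacity, the logistic damping term used to model the cap, and every constant below are assumptions made for the sketch, not taken from Bostrom.

```python
# A numerical sketch of the three-phase trajectory: linear growth while
# optimisation power is the constant D, exponential growth once the
# intelligence I passes the threshold A (the AI now drives its own
# improvement), and saturation near a capacity I_max.  The logistic
# factor (1 - I/I_max) modelling the cap, and all constants, are
# assumptions for illustration only.

def simulate(R=0.5, D=1.0, A=5.0, I_max=100.0, I0=1.0, t_end=40.0, dt=0.01):
    I, t, history = I0, 0.0, []
    while t < t_end:
        power = D if I < A else I              # the AI takes over past the threshold
        I += R * power * (1 - I / I_max) * dt  # damping shuts growth off near I_max
        t += dt
        history.append(I)
    return history

trajectory = simulate()
print(trajectory[-1])  # approaches I_max = 100
```

Plotting the trajectory reproduces the qualitative shape of the diagram: a shallow straight line, a sudden explosive rise at the threshold, and a plateau at the system’s maximum capacity.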

So it is possible that there will be an intelligence explosion once an AI system gets close to human levels of intelligence – and based on current trends this looks to be well within reach in the next 5 years.  So hold on tight – things could get very interesting!