**Using Maths to model the spread of Coronavirus (COVID-19)**

This coronavirus is the latest virus to warrant global fears over a disease pandemic. Throughout history we have seen pandemic diseases such as the Black Death in Middle Ages Europe and the Spanish Flu at the beginning of the 20th century. More recently we have seen HIV responsible for millions of deaths. In the last few years there have been scares over bird flu and SARS – yet neither fully developed into a major global health problem. So, how contagious is COVID-19, and how can we use mathematics to predict its spread?

Modelling disease outbreaks with real accuracy is an incredibly important job for mathematicians, and countries around the world employ medical statisticians to do it. Understanding how diseases spread, and how fast they can spread through populations, is essential to developing effective medical strategies to minimise deaths. If you want to save lives maybe you should become a mathematician rather than a doctor!

Currently scientists know relatively little about the new virus, but they do know that it belongs to the same coronavirus family as SARS and MERS, both of which can cause serious respiratory problems. Scientists are particularly interested in trying to discover how infectious the virus is, how long a person remains contagious, and whether people can be contagious before they show any symptoms.

**In the case of COVID-19 we have the following early estimated values:** [From a paper published by medical statisticians in the UK on January 24]

**R₀** estimated at between 3.6 and 4. This is defined as how many people an infectious person will pass on their infection to in a totally susceptible population. The higher the R₀ value, the more quickly an infection will spread. By comparison seasonal flu has an R₀ value around 2.8.

**Total number infected** by January 21: prediction interval 9,217–14,245. Of these, an estimated 3,050–4,017 currently have the virus and the others have recovered (or died). This is based on an estimate that only around 5% of cases have been diagnosed. By February 4th they predict 132,751–273,649 will be infected.

**Transmission rate β** estimated at 1.07. β represents the transmission rate per day – so on average an infected person will infect another 1.07 people a day.

**Infectious period** estimated at 3.6 days. We can therefore calculate μ (the per capita recovery rate) by μ = 1/(3.6). This tells us how quickly people will be removed from the infectious population (having either recovered and become immune, or died).
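As a quick check on these estimates, here is a minimal Python sketch. It assumes the standard result for the basic SIR model that R₀ = β/μ (that relationship is not stated in the paper summary above), and the variable names are just illustrative.

```python
# Early estimates quoted above
beta = 1.07                   # transmission rate per day
infectious_period_days = 3.6  # average time a person stays infectious

# Per capita recovery rate: the reciprocal of the infectious period
mu = 1 / infectious_period_days

# Assumption: in the basic SIR model, R0 = beta / mu
r0 = beta / mu

print(f"mu = {mu:.4f} per day")
print(f"R0 = {r0:.3f}")
```

This gives μ ≈ 0.2778 per day and R₀ ≈ 3.852, which sits inside the quoted range of 3.6 to 4.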

**SIR Model**

The starting point is the SIR model. This looks at how much of the population is susceptible to infection (S), how many of these go on to become infectious (I), and how many of these are removed (R) from the population being considered (i.e. they either recover and thus won’t catch the virus again, or die).

The Guardian datablog have an excellent graphic to show the contagiousness relative to deadliness of different diseases. We can see that seasonal flu has an R₀ value of around 2.8 and a fatality rate of around 0.1%, whereas measles has an R₀ value of around 15 and a fatality rate of around 0.3%. This means that measles is much more contagious than seasonal flu.

Notice that we have nothing in the top right hand corner (very deadly and very contagious). This is just as well, as that could be enough to seriously dent the human population. Most diseases we worry about fall into two categories: contagious and not very deadly, or not very contagious and deadly.

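The SIR (susceptible, infectious, removed) model, which can be used to model the spread of diseases like flu, is given by three equations:

dS/dt = -β S I / N

dI/dt = β S I / N - μ I

dR/dt = μ I

Here β is the transmission rate, μ is the per capita recovery rate and N is the total population.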

dS/dt represents the rate of change of those who are susceptible to the illness with respect to time. dI/dt represents the rate of change of those who are infected with respect to time. dR/dt represents the rate of change of those who have been removed with respect to time (either recovered or died).

For example, if dI/dt is high then the number of infected people is rapidly increasing. When dI/dt is zero there is no change in the number of infected people (the number of infections remains steady). When dI/dt is negative the number of infected people is decreasing.
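To see when dI/dt changes sign, we can factor the infection equation. This is a short worked step assuming the standard SIR form, with β the transmission rate, μ the per capita recovery rate and N the total population:

$$
\frac{dI}{dt} = \frac{\beta S I}{N} - \mu I = I\left(\frac{\beta S}{N} - \mu\right)
$$

Since I is positive, the sign depends only on the bracket:

$$
\frac{dI}{dt} > 0 \iff \frac{S}{N} > \frac{\mu}{\beta}
$$

With the early estimates β = 1.07 and μ = 1/3.6, the threshold is μ/β = 1/(3.6 × 1.07) ≈ 0.26. In this simple model, then, the number of infected people keeps rising until the susceptible share of the population falls to roughly 26%.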

**Modelling for COVID-19**

N is the total population. Let’s take the population of Wuhan as 11 million.

μ is the per capita recovery rate (calculated by μ = 1/(duration of illness)). We have μ = 1/3.6 = 5/18.

β is the transmission rate, which we take as approximately 1.07.

Therefore our 3 equations for rates of change become:

dS/dt = -1.07 S I / 11,000,000

dI/dt = 1.07 S I / 11,000,000 - (5/18) I

dR/dt = (5/18) I
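The three rates can be evaluated directly for any values of S and I. The sketch below does this for a hypothetical snapshot with S = 11,000,000 and I = 3,500; the function name `sir_rates` is my own and is not part of any library.

```python
def sir_rates(S, I, beta=1.07, mu=5 / 18, N=11_000_000):
    """Return (dS/dt, dI/dt, dR/dt) in people per day for the SIR model."""
    new_infections = beta * S * I / N  # flow from susceptible to infectious
    new_removals = mu * I              # flow from infectious to removed
    dS = -new_infections
    dI = new_infections - new_removals
    dR = new_removals
    return dS, dI, dR

dS, dI, dR = sir_rates(S=11_000_000, I=3_500)
print(f"dS/dt = {dS:,.1f} people per day")
print(f"dI/dt = {dI:,.1f} people per day")
print(f"dR/dt = {dR:,.1f} people per day")
```

For this snapshot the model has about 3,745 people leaving the susceptible group per day and about 972 being removed, so the infectious group grows by roughly 2,773 per day. The three rates always sum to zero, because people only move between groups.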

Unfortunately these equations are very difficult to solve by hand, but luckily we can use a computer program or spreadsheet to plot what happens. We need to assign starting values for S, I and R: the numbers of people susceptible, infectious and removed. With the following values for January 21: S = 11,000,000, I = 3,500, R = 8,200, β = 1.07, μ = 5/18, I designed an Excel spreadsheet to carry out the calculation.

This gives a prediction of around 3.9 million people infected within 2 weeks! We can see that the SIR model we have used is quite simplistic (and significantly different to the expert prediction of around 200,000 infected).
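The same kind of calculation can be sketched in Python. This version assumes the model is stepped forward one day at a time (a forward Euler step of 1 day), which is my guess at how a spreadsheet would do it rather than a description of the actual formulas used; the function name `simulate_sir` is illustrative.

```python
def simulate_sir(S0, I0, R0, beta, mu, N, days):
    """Step the SIR equations forward one day at a time (forward Euler)."""
    S, I, R = float(S0), float(I0), float(R0)
    rows = [(0, S, I, R)]
    for day in range(1, days + 1):
        new_infections = beta * S * I / N  # people moving from S to I today
        new_removals = mu * I              # people moving from I to R today
        S -= new_infections
        I += new_infections - new_removals
        R += new_removals
        rows.append((day, S, I, R))
    return rows

# Starting values for January 21, as given in the text
rows = simulate_sir(S0=11_000_000, I0=3_500, R0=8_200,
                    beta=1.07, mu=5 / 18, N=11_000_000, days=13)

for day, S, I, R in rows:
    print(f"day {day:2d}  S = {S:12,.0f}  I = {I:12,.0f}  R = {R:12,.0f}")
```

Under these assumptions, thirteen daily steps from the January 21 starting row should leave the infectious count close to 3.9 million. A smaller time step, or a proper differential equation solver, would give somewhat different numbers, so treat the output as a rough illustration.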

So, we can try to make things more realistic by adding some real life considerations. The current value of β (the transmission rate) is 1.07, i.e. an infected person will infect another 1.07 people each day. We can significantly reduce this if we expect that infected people are quarantined effectively so that they do not interact with other members of the public, and indeed if people who are not sick avoid going outside. So, let's take β as (say) 0.6 instead and rerun the spreadsheet.

Here we can see that this change to β has had a dramatic effect on our model. Now we are predicting around 129,000 infected after 14 days, which is much more in line with the estimate in the paper above.
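To compare transmission rates side by side, here is a small self-contained sketch. It again assumes daily Euler steps from the January 21 starting values, and the middle value β = 0.8 is a hypothetical extra case added purely for illustration.

```python
def infectious_after(days, beta, mu=5 / 18, N=11_000_000,
                     S=11_000_000.0, I=3_500.0):
    """Return the infectious count after the given number of daily Euler steps."""
    for _ in range(days):
        new_infections = beta * S * I / N
        # Both updates use the values from the start of the day
        S, I = S - new_infections, I + new_infections - mu * I
    return I

for beta in (1.07, 0.8, 0.6):
    count = infectious_after(13, beta)
    print(f"beta = {beta:4.2f}: about {count:,.0f} infectious after 13 daily steps")
```

With these assumptions, cutting β from 1.07 to 0.6 should reduce the day 13 infectious count from millions to the low hundreds of thousands, a drop of well over an order of magnitude.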

As we are seeing exponential growth in the spread, small changes to the parameters will have very large effects. There are more sophisticated SIR models which can then be used to better understand the spread of a disease. Nevertheless we can see clearly from the spreadsheet the interplay between susceptible, infected and recovered which is the foundation for understanding the spread of viruses like COVID-19.

[Edited in March to use the newly designated name COVID-19]