**Using Maths to model the spread of Coronavirus (COVID-19)**

This coronavirus is the latest virus to warrant global fears over a disease pandemic. Throughout history we have seen pandemic diseases such as the Black Death in Middle Ages Europe and the Spanish Flu at the beginning of the 20th century. More recently we have seen HIV responsible for millions of deaths. In the last few years there have been scares over bird flu and SARS – yet neither fully developed into a major global health problem. So, how contagious is COVID-19, and how can we use mathematics to predict its spread?

Modelling disease outbreaks accurately is an incredibly important job for mathematicians, and all countries employ medical statisticians to do it. Understanding how diseases spread, and how fast they can spread through populations, is essential to developing effective medical strategies to minimise deaths. If you want to save lives maybe you should become a mathematician rather than a doctor!

Currently scientists know relatively little about the new virus, but they do know that it belongs to the same coronavirus family as SARS and MERS, which can both cause serious respiratory problems. Scientists are particularly interested in trying to discover how infectious the virus is, how long a person remains contagious, and whether people can be contagious before they show any symptoms.

**In the case of COVID-19 we have the following early estimated values:** [From a paper published by medical statisticians in the UK on January 24]

**R₀ between 3.6 and 4.** This is defined as how many people an infectious person will pass on their infection to in a totally susceptible population. The higher the R₀ value, the more quickly an infection will spread. By comparison, seasonal flu has an R₀ value of around 2.8.

**Total number infected** by January 21: prediction interval 9,217–14,245. Of these, an estimated 3,050–4,017 currently have the virus and the others have recovered (or died). This is based on an estimate that only around 5% of cases have been diagnosed. By February 4th they predict 132,751–273,649 will be infected.

**Transmission rate β** estimated at 1.07. β represents the transmission rate per day – so on average an infected person will infect another 1.07 people a day.

**Infectious period** estimated at 3.6 days. We can therefore calculate μ (the per capita recovery rate) by μ = 1/3.6. This tells us how quickly people will be removed from the infectious population (either because they have recovered and become immune, or because they have died).
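As a quick consistency check on these estimates, here is a minimal sketch that assumes the standard relationship for the basic SIR model, R₀ = β/μ. That relationship is not stated in the summary above, so treat it as an added assumption; the variable names are just for illustration.

```python
# Early estimates quoted in the article (not settled values).
beta = 1.07               # transmission rate per day
infectious_period = 3.6   # days

mu = 1 / infectious_period   # per capita recovery rate per day
r0 = beta / mu               # assumed relationship for the basic SIR model

print(f"mu = {mu:.4f} per day")
print(f"R0 = beta / mu = {r0:.2f}")
```

Under that assumption R₀ is simply 1.07 × 3.6, which sits inside the quoted range of 3.6 to 4.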

**SIR Model**

The basic model used here is the SIR model. The SIR model looks at how much of the population is susceptible to infection (S), how many of these go on to become infectious (I), and how many of these are removed (R) from the population being considered (i.e. they either recover and thus won’t catch the virus again, or die).

The Guardian datablog has an excellent graphic showing the contagiousness of different diseases relative to their deadliness. We can see that seasonal flu has an R₀ value of around 2.8 and a fatality rate of around 0.1%, whereas measles has an R₀ value of around 15 and a fatality rate of around 0.3%. This means that measles is much more contagious than seasonal flu.

Notice that there is nothing in the top right-hand corner of the graphic (very deadly and very contagious). This is just as well, as such a disease could be enough to seriously dent the human population. Most diseases we worry about fall into two categories: contagious and not very deadly, or not very contagious and deadly.

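A SIR (susceptible, infectious, removed) model, which can be used to model the spread of diseases like flu, is described by the following three equations, where β is the transmission rate, μ is the per capita recovery rate and N is the total population:

dS/dt = -β S I / N

dI/dt = β S I / N - μ I

dR/dt = μ I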

dS/dt represents the rate of change of those who are susceptible to the illness with respect to time. dI/dt represents the rate of change of those who are infected with respect to time. dR/dt represents the rate of change of those who have been removed with respect to time (either recovered or died).

For example, if dI/dt is large and positive then the number of infected people is increasing rapidly. When dI/dt is zero there is no change in the number of infected people (new infections exactly balance removals, so the number infected remains steady). When dI/dt is negative the number of infected people is decreasing.
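To make the sign of dI/dt concrete, here is a small sketch. In the basic SIR model dI/dt = β S I / N - μ I, which factorises as I (β S / N - μ), so the sign changes where S = N μ / β. The parameter values are the early estimates quoted earlier and a population of 11 million for Wuhan; the particular S and I values tried below are made up purely for illustration.

```python
def infection_rate_of_change(S, I, beta, mu, N):
    """dI/dt for the basic SIR model: new infections minus removals."""
    return beta * S * I / N - mu * I


N = 11_000_000   # population of Wuhan, as used in the article
beta = 1.07      # transmission rate per day (early estimate)
mu = 5 / 18      # per capita recovery rate per day (1 / 3.6)

# dI/dt = I * (beta * S / N - mu), so its sign flips where S = N * mu / beta.
threshold_S = N * mu / beta

# Illustrative (made-up) states on either side of the threshold.
growing = infection_rate_of_change(S=10_000_000, I=1_000, beta=beta, mu=mu, N=N)
shrinking = infection_rate_of_change(S=2_000_000, I=1_000, beta=beta, mu=mu, N=N)

print(f"threshold S = {threshold_S:,.0f}")
print(f"dI/dt with S = 10,000,000: {growing:+.1f} per day")
print(f"dI/dt with S =  2,000,000: {shrinking:+.1f} per day")
```

With these numbers the threshold is a little under 2.9 million susceptible people: above it the number infected grows, below it the number infected shrinks.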

**Modelling for COVID-19**

N is the total population. Let’s take the population of Wuhan as 11 million.

μ is the per capita recovery rate (calculated as μ = 1/(duration of illness)). We have μ = 1/3.6 = 5/18.

β is the transmission rate, approximately 1.07.

Therefore our 3 equations for rates of change become:

dS/dt = -1.07 S I / 11,000,000

dI/dt = 1.07 S I / 11,000,000 - (5/18) I

dR/dt = (5/18) I

Unfortunately these equations are very difficult to solve – but luckily we can use a computer program or spreadsheet to plot what happens. We need to assign starting values for S, I and R – the numbers of people susceptible, infectious and removed. With the following values for January 21: S = 11,000,000, I = 3500, R = 8200, β = 1.07, μ = 5/18, I designed the following Excel spreadsheet (instructions on what formula to use here):

This gives a prediction that around 3.9 million people will be infected within 2 weeks! We can see that the SIR model we have used is quite simplistic (and gives a result significantly different from the expert prediction of around 200,000 infected).
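The same calculation can be sketched in Python. The spreadsheet formulas themselves are not reproduced in this post, so the sketch below assumes the simplest approach: apply the three rate equations once per day (a forward Euler step of one day), with day 1 holding the starting values. The function name and layout are illustrative only.

```python
def simulate_sir(S, I, R, beta, mu, N, days):
    """Advance the SIR equations one day at a time (forward Euler, step = 1 day).

    Returns a list of (day, S, I, R) rows; day 1 holds the starting values.
    """
    rows = [(1, S, I, R)]
    for day in range(2, days + 1):
        new_infections = beta * S * I / N   # people moving from S to I today
        new_removals = mu * I               # people moving from I to R today
        S = S - new_infections
        I = I + new_infections - new_removals
        R = R + new_removals
        rows.append((day, S, I, R))
    return rows


# Starting values for January 21, as given in the article.
rows = simulate_sir(S=11_000_000, I=3_500, R=8_200,
                    beta=1.07, mu=5 / 18, N=11_000_000, days=14)

for day, S, I, R in rows:
    print(f"day {day:2d}  S = {S:12,.0f}  I = {I:12,.0f}  R = {R:12,.0f}")
```

Under that one-day-step assumption, the I column on day 14 comes out at roughly 3.9 million, in line with the figure quoted above. A smaller time step or a proper ODE solver would give somewhat different numbers, which is another reminder of how rough this model is.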

So, we can try to make things more realistic by adding some real-life considerations. The current value of β (the transmission rate) is 1.07, i.e. an infected person will infect another 1.07 people each day. We can significantly reduce this if we expect that infected people are quarantined effectively so that they do not interact with other members of the public, and indeed if people who are not sick avoid going outside. So, if we take β as (say) 0.6 instead we get the following table:

Here we can see that this change to β has had a dramatic effect on our model. Now we are predicting around 129,000 infected after 14 days, which is much more in line with the estimate in the paper above.

As we are seeing exponential growth in the spread, small changes to the parameters will have very large effects. There are more sophisticated SIR models which can then be used to better understand the spread of a disease. Nevertheless we can see clearly from the spreadsheet the interplay between susceptible, infected and recovered which is the foundation for understanding the spread of viruses like COVID-19.
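To see this sensitivity directly, here is a short sketch that reruns the model for a few values of β. It again assumes a one-day forward Euler step with day 1 holding the January 21 starting values, which is a guess at how the spreadsheet works rather than a documented method. The value β = 0.8 is an arbitrary in-between choice added for comparison.

```python
def infectious_on_day(beta, days=14, S=11_000_000, I=3_500,
                      mu=5 / 18, N=11_000_000):
    """Number infectious on the given day, stepping the SIR equations daily."""
    for _ in range(days - 1):   # day 1 holds the starting values
        new_infections = beta * S * I / N
        # The right-hand side uses the old I for both infections and removals.
        S, I = S - new_infections, I + new_infections - mu * I
    return I


for beta in (0.6, 0.8, 1.07):   # 0.8 is an arbitrary in-between value
    count = infectious_on_day(beta)
    print(f"beta = {beta:4.2f} -> about {count:,.0f} infectious on day 14")
```

Reducing β from 1.07 to 0.6 is less than a halving of the transmission rate, yet the day-14 figure drops from millions to a little over a hundred thousand.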

[Edited in March to use the newly designated name COVID-19]

## 9 comments


January 28, 2020 at 10:34 pm

mrcreamermaths: “β the transmission rate as approximately 1.07”

β represents the contact rate – which is how likely someone will get the disease when in contact with someone who is ill.

Have I missed something – how can the contact rate be > 1 ?

It is a probability isn’t it?

January 28, 2020 at 10:53 pm

Ibmathsresources.com: Thanks for your comment. I don’t think I worded the explanation very well. If you think of beta as the average number of disease-spreading contacts made by each infected individual per day, then a beta of 1.07 (per day) means that on average an infected person will infect 1.07 other people each day. I might try to make that clearer in the post! Sometimes SIR models fold the division by N into beta as well (i.e. in this case 1.07/11,000,000), but in the equations above the division by N is already there.

January 29, 2020 at 7:41 pm

mrcreamermaths: Thanks. I found the explanation below, which clears it up for me.

I recall the 80s as a young man in my 20s. When no one was discussing the maths (no internet for geeks). But Maggie was a trained scientist and UK policy was very effective in persuading people to change their habits so that p, the transmission rate fell sharply which was good. And also γ, the total contact rate fell sharply, about which I was more equivocal.

https://en.wikipedia.org/wiki/Transmission_risks_and_rates

The effective contact rate (denoted β) in a given population for a given infectious disease is measured in effective contacts per unit time. This may be expressed as the total contact rate (the total number of contacts, effective or not, per unit time, denoted γ), multiplied by the risk of infection, given contact between an infectious and a susceptible individual. This risk is called the transmission risk and is denoted p. Thus:

β = γ × p

The total contact rate, γ, will generally be greater than the effective contact rate, β, since not all contacts result in infection. That is to say, p is almost always less than 1 and it can never be greater than 1, since it is effectively the probability of transmission occurring.

January 29, 2020 at 9:48 pm

Ibmathsresources.com: Thanks for the added information above about β. Looking at the pictures on the news of deserted streets in Wuhan, it looks like the Chinese government will be pretty successful in reducing the contact rate, so hopefully this will be effective in slowing any spread…

February 7, 2020 at 3:34 pm

Edith Tang: Sorry, may I ask if the link to the formulas for creating the SIR spreadsheet still works? And why is the number of people removed 8200 on the first day?

February 7, 2020 at 6:39 pm

Ibmathsresources.com: The values are from the study published on 21st Jan.

S = 11,000,000 as there are around 11 million in Wuhan.

I = 3500 as there were an estimated 3500 (approx) infected

R = 8200 as there were an estimated 8200 (approx) recovered.

It looks like the website with the link to the Excel formula is down.

February 7, 2020 at 10:51 pm

Edith Tang: Thanks very much! Then may I ask which formula you used for the Excel sheet?

February 18, 2020 at 11:01 am

Moises López: Please! I need the formula to do my IA. Could you do another spreadsheet with all the formulas? I don’t know about the interaction rate.

October 22, 2020 at 7:38 am

joe: damn, when this was written there were “9,217–14,245” total cases, now there’s 41,000,000