
The Calculus behind the Coronavirus

As the SARS-CoV-2 virus (aka the coronavirus) continues to spread around the world, people are starting to ask long-term questions, such as 'how many people are estimated to die from this virus?', 'there aren't that many people infected, so why is it so dangerous?' and 'how long until it's safe to go out again?'. These questions, surprisingly, can all be answered using mathematics.


To remove ourselves from this virus-infested world, let's imagine a completely new virus-infested world. This virus, dubbed the FakeFlu, is highly infectious but takes a relatively short amount of time to recover from. Rather than jumping straight into the maths, let's just simulate the spread of the virus using a bit of computer science and basic epidemiological modelling.


In epidemiology (the study of epidemics), one model of the way diseases spread is known as the SIR model, standing for Susceptible, Infected and Recovered. Some people prefer the term Removed instead of Recovered, as some people don't recover from the disease, but die instead. Being the optimist that I am, I will be using the word Recovered, in the hope that the little dots we use in our model recover from the FakeFlu and don't die. The basic theory is that people who are susceptible move into the infected category at a rate set by a transmission coefficient, beta, and people in the infected category move into the recovered category at a rate set by a recovery coefficient, gamma. Keep these in mind, as they will come into play later on. For now, just watch the disease spread through the population of a small town of dots.




As you can see, the FakeFlu spreads slowly at first, only infecting maybe 2-4 new people per day. Eventually, however, as the number of infected people rises, more people are in direct contact with someone infected (a dot in one of the 8 surrounding squares), so the number of new infections per day rises too. Notice, however, that there is only a thin band of infected people at the outer edge of the 'ripple' of the disease. This is important: even if only a small proportion of the population is infected at any one time, the disease can still spread very quickly, and by the end of 31 days the entire population has been infected at one point or another. This is why even a small number of currently infected people can be a cause for major concern, as it doesn't mean that the disease has died down.
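A town of dots like this can be sketched in a few lines of Python. The rules of the original animation aren't spelled out, so everything below is an assumption made for illustration: a 21x21 grid, one infected dot in the centre, `beta` as the daily chance that an infected dot infects each susceptible neighbour in the 8 surrounding squares, and `gamma` as the daily chance that an infected dot recovers.

```python
import random

SUSCEPTIBLE, INFECTED, RECOVERED = "S", "I", "R"


def step(grid, beta, gamma, rng):
    """Advance the town by one day and return the new grid."""
    size = len(grid)
    new_grid = [row[:] for row in grid]
    for r in range(size):
        for c in range(size):
            if grid[r][c] != INFECTED:
                continue
            # An infected dot may infect each susceptible neighbour
            # in the 8 surrounding squares (assumed rule).
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    nr, nc = r + dr, c + dc
                    inside = 0 <= nr < size and 0 <= nc < size
                    if (dr or dc) and inside:
                        if grid[nr][nc] == SUSCEPTIBLE and rng.random() < beta:
                            new_grid[nr][nc] = INFECTED
            # The infected dot itself may recover today (assumed rule).
            if rng.random() < gamma:
                new_grid[r][c] = RECOVERED
    return new_grid


def simulate(size=21, beta=0.3, gamma=0.2, days=31, seed=0):
    """Return a list of (susceptible, infected, recovered) counts per day."""
    rng = random.Random(seed)
    grid = [[SUSCEPTIBLE] * size for _ in range(size)]
    grid[size // 2][size // 2] = INFECTED  # patient zero in the centre
    history = []
    for _ in range(days + 1):
        cells = [cell for row in grid for cell in row]
        history.append((cells.count(SUSCEPTIBLE),
                        cells.count(INFECTED),
                        cells.count(RECOVERED)))
        grid = step(grid, beta, gamma, rng)
    return history


for day, (s, i, r) in enumerate(simulate()):
    print(f"day {day:2d}: S={s:3d} I={i:3d} R={r:3d}")
```

Changing `beta`, `gamma` or `seed` gives a different outbreak each time, but the three counts always add up to the size of the town.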


Now, if we start to graph the number of susceptible, infected and recovered people over time, we begin to see some shapes which are very common in maths - and if they're common in maths, the chances are we can use them to predict future cases, as long as the cases follow the established trend.




These shapes are known as logistic curves (the susceptible and recovered graphs) and a shape similar to a bell curve (the infected graph). Where do these shapes come from?


These shapes are governed by a set of differential equations, which tell us how the susceptible, infected and recovered populations change over time. Using some basic assumptions about how diseases spread, we can write down three differential equations which, together, give the rate of change of each of these three populations.


The first is the susceptible differential equation. This equation tells us how the number of susceptibles changes over time. This rate of change is never positive, as (ignoring birth rates) the number of people who have never been infected by the virus can only go down: the only way to leave the susceptible group is to become infected. Moreover, the rate is proportional to I, the number of infected people at any given time - if there are more infected people, you have a greater chance of becoming infected. The number of infected people must be multiplied by beta, the transmission rate, as you will not become infected by every single infected person you come into contact with - the transmission rate accounts for this. The rate is also proportional to S/N, the number of susceptible people at any given time divided by the total population. This is important: if susceptible people only make up a small proportion of the population, then most people are either currently infected or have recovered, and so they can't become infected again. This means the disease has very few new hosts, and so it can't spread as fast.
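Written out in symbols, the description above corresponds to the following equation (a sketch in LaTeX notation, with S, I and N as in the text and \beta standing for beta, the transmission rate):

```latex
\frac{dS}{dt} = -\beta \, I \, \frac{S}{N}
```

The minus sign is what makes the number of susceptibles fall over time.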


The first part of the infected differential equation is similar to the susceptible differential equation. Since the only place for susceptible people to go is the infected category, the number of people who leave the susceptible category per day must equal the number of people who join the infected category per day, i.e. the rate of decrease of susceptibles is the rate of increase of infected. The rate of change of the infected population also includes a second part - the number of people who leave the infected category. This number is proportional to the number of infected - if more infected people exist, more of them will recover. Now, not every infected person recovers at the same time or at the same rate, and this is where the recovery coefficient, gamma, comes into play: it sets the fraction of infected people who recover per day, on average. Adding the first part and subtracting the second gives the overall rate of change of infected people.
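In symbols, the two parts described above combine as follows (again a sketch in LaTeX notation, with \beta and \gamma standing for beta and gamma):

```latex
\frac{dI}{dt} = \beta \, I \, \frac{S}{N} - \gamma \, I
```

The first term is the flow of newly infected people in, and the second is the flow of recovering people out. The infected population grows only while the first term is the larger of the two.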


Finally, the recovered differential equation is very straightforward. The number of people who recover per day depends on the recovery coefficient, gamma, and the number of infected people overall, by the same reasoning as in the infected differential equation. Therefore, dR/dt is gamma*I.
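In the same LaTeX notation, the recovered equation and a quick sanity check look like this. The check relies on the assumption, implicit in the description above, that nobody enters or leaves the town, so S + I + R always equals N:

```latex
\frac{dR}{dt} = \gamma \, I
\qquad\text{and}\qquad
\frac{dS}{dt} + \frac{dI}{dt} + \frac{dR}{dt} = 0
```

Everyone who leaves one category turns up in the next, so the three rates cancel out.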



Actually solving these differential equations exactly is difficult. It took the 1918 Spanish Flu, which killed roughly 2% of the world population at the time, to show people that they needed to be solved, and the pandemic was (partially) the reason research in this area received funding. However, using a numerical method known as Euler's Method, we can use these differential equations to approximate the true curves.


Euler's method is very straightforward. It uses the differential equation to produce a gradient at a point, which can be used to produce a line tangent to the true curve at that point. It then assumes that the tangent line is a 'good enough' approximation for the curve close to the original point. We can then take a small step in x and use the tangent line to find the approximate change in y, which gives a new value of y. Repeating this process from each new point approximates the original function. To approximate the function more accurately, simply use smaller steps in x.
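As a minimal sketch of the idea, here is Euler's method in Python applied to a toy equation rather than the FakeFlu. The function names `euler` and `growth`, and the test equation dy/dx = y (whose true solution is e^x), are choices made for this illustration.

```python
import math


def euler(f, x0, y0, h, steps):
    """Approximate the solution of dy/dx = f(x, y) starting from (x0, y0)."""
    xs, ys = [x0], [y0]
    for _ in range(steps):
        slope = f(xs[-1], ys[-1])      # gradient of the tangent line here
        ys.append(ys[-1] + h * slope)  # follow the tangent for a step of h
        xs.append(xs[-1] + h)
    return xs, ys


def growth(x, y):
    """Toy equation dy/dx = y, whose exact solution through (0, 1) is e^x."""
    return y


# Approximate y(1) = e with smaller and smaller steps in x.
for steps in (10, 100, 1000):
    _, ys = euler(growth, 0.0, 1.0, 1.0 / steps, steps)
    print(f"{steps:5d} steps: y(1) = {ys[-1]:.5f}, error = {abs(ys[-1] - math.e):.5f}")
```

Each tenfold cut in the step size shrinks the error, at the cost of ten times as many steps.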



Finally, we can use this method to produce a graph which could predict the effects of our FakeFlu disease. Here we go!
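For anyone who wants to reproduce a graph like this, here is a rough sketch of Euler's method applied to the three SIR equations. The parameter values (a town of 1000 people, one initial case, beta = 0.5, gamma = 0.1, a step of 0.1 days) are placeholders picked for illustration, not the values behind the FakeFlu graph.

```python
def sir_euler(n, i0, beta, gamma, days, dt=0.1):
    """Step the SIR equations forward with Euler's method.

    Returns a list of (time, susceptible, infected, recovered) tuples.
    """
    s, i, r = float(n - i0), float(i0), 0.0
    history = [(0.0, s, i, r)]
    for k in range(1, round(days / dt) + 1):
        new_infections = beta * i * s / n  # people moving from S to I per day
        new_recoveries = gamma * i         # people moving from I to R per day
        s -= dt * new_infections
        i += dt * (new_infections - new_recoveries)
        r += dt * new_recoveries
        history.append((k * dt, s, i, r))
    return history


# Placeholder parameters, chosen only for illustration.
history = sir_euler(n=1000, i0=1, beta=0.5, gamma=0.1, days=60)
peak_day, peak_infected = max(((t, i) for t, _, i, _ in history),
                              key=lambda point: point[1])
print(f"infections peak around day {peak_day:.1f} "
      f"with about {peak_infected:.0f} people infected")
```

Plotting the second, third and fourth entries of each tuple against the first gives the susceptible, infected and recovered curves.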


Interestingly, our differential equations tell us that the infected population would reach a much higher peak much earlier than the simulated population actually did. This shows that our model was not perfect - in fact, it was better than perfect. It predicted a worse scenario than what actually happened, which (hopefully) would prompt world leaders to take more drastic action to prevent the disease from becoming as bad as the model predicts.


For fun, I decided to apply this model to real epidemics to simulate how those diseases would spread through our small town of dots. Below are the models for cholera, a disease which ravaged industrial England and France, and Ebola, the disease that made headlines in 2014-2016.


Cholera:


Ebola:


I'll leave the reader to make their own deductions about the severity of each disease, and the model's successes and failures. Thank you for reading!
