COVID-19 DYNAMICS
by
Garrett A. Hughes
Principal Architect
ModelingComplexSystems.net
Although they have great potential utility, computers are practically useless without a fundamental understanding of how engineering systems work. | ||
―Steven C. Chapra and Raymond P. Canale― |
Part One: Infection Dynamics
Abstract
Modeling and Simulating the infection dynamics of a system consisting of the SARS-CoV-2 novel coronavirus in an exposed human population.
Describes how to construct a discrete event System Dynamics model of the system. Defines important state variables and system parameters. Simulations focus on description of the infection rates under different initial conditions. Illustrates counterintuitive nature of system behavior at low rates of infection. Part One of a three part series. Additional series articles to include (1) recovery/mortality/health care impacts, and (2) protective measures effectiveness. For demonstration purposes, this web-based article facilitates distribution of the Java language program constructed for this project. All .txt data files associated with the described simulations are made available. The .txt files include additional data of interest. This website provides access to ongoing simulation results associated with this and future articles using an easily accessible table format. A link to a private WordPress discussion forum is provided where all aspects of this and future articles may be discussed.
Introduction
This is the first in a series of articles that will address questions related to the worldwide spread of the coronavirus, SARS-CoV-2, and the disease that it causes, COVID-19. These articles will feature the construction of a discrete event System Dynamics model that will help answer these questions, and aid us in making the choices needed to combat the spread of this virus.
Questions that we will address in this series of articles include:
- How fast will the virus spread
- What is the probability of any individual becoming infected
- How many people will become infected
- What is the potential mortality rate
- What measures can we take to prevent the spread of the virus
- How effective will these measures be
- When will this be over and activities returned to normal
- What will be the nature of the loading on health care systems
Disclaimer: These articles do not provide answers to questions that can be construed as personal medical advice.
The information contained in these articles is designed to help us all make the best possible choices to defend ourselves against a biological threat to our lives and livelihoods.
Why this Modeling Effort?
You may wonder why the world needs another model to describe this epidemic when there are so many models of an epidemiological nature already available. Some of these are described in an excellent text by Fred Brauer et al. Mathematical Models in Epidemiology. Other epidemiological models of a networked nature are covered by Mark Newman in his equally informative text Networks: An Introduction. Yet we persist.
First and foremost, since this is primarily an educational website, we want to demonstrate how models, in general, can be built from scratch to solve problems. What better opportunity to do this, than now, amidst a problem whose relevancy derives from both its local and global nature.
Second, although we might hope that all models are created equal, they are not. That does not necessarily mean that they cannot be accurate in their predictions, it is just that some may be more difficult to understand, and more difficult to use. By building our own model, using techniques with a proven reputation for transparency, we can tailor a model to our own needs, while fully understanding what's going on underneath the hood. We also get to choose the tools with which to build and interact with the model, and have our choice of the programming language in which to construct it.
Finally, we acquire the benefits of having to fully understand a system in order to build a model with the appropriate system and problem behavior. And our using the model will allow us to evaluate the truth or falsehood of statements about the nature of the system's behavior, or the efficacy of a particular solution.
Overall Plan
We plan to construct the full model in three parts: the first of which is described in this paper. This part deals only with representing the infection rate. We will model how the system behaves under different parameters associated with the incubation period. And we will examine the effect of varying the number of contacts that an infected individual can have with the susceptible population on a daily basis. In part two — a separate article— we will construct the recovery portion of the system, and in part three — another article — we will add the effect of protective sheltering, often refereed to in the literature as suppression.
We will begin with the simplest possible configuration of the systems of interest. We will add complexity only as required by evidence obtained from real world scenarios, or as a function of our insight into the potential behavior of the systems. Our plan is to build a model that will scale accurately from very small to very large populations existing in regions that vary greatly in density.
After part three has been completed we will attempt to verify the accuracy of the complete model by comparing its predicted behavior with that of real world systems. In any event, the model should at least be able to predict the expected behavior modes of real world systems. In systems containing multiple feedback loops, such as these, predicting behavior can be far from intuitive. The results of this analysis will be compiled in yet another article.
Problems Addressed in This Article
One of the problems that we address in this article is how fast this virus can spread in an exposed population. We will be able to simulate the spread of the virus in a population of arbitrary size: this includes both very small and very large populations.
We will look at the influence of the length of the incubation period on the rate of infection. We will determine the probability of individuals becoming infected during the course of the epidemic. We will explore how changing the opportunities that an infected individual can have to interact with the general population will affect the infection rate. And we will see how the infection rate varies with the sizes of both the infected and susceptible populations.
We will also be able to estimate how quickly an epidemic of this nature can peak in a fully exposed population. That in itself should be frightful enough to keep us motivated to combat the spread of the virus, but also direct our attention to find an optimal solution as quickly as possible.
Our Approach
We will model system behavior using the approach known formally as System Dynamics. We will modify this approach as needed to have the model represent discrete rather than continuous events. We will code the model using the Java programming language. The source code will be available for download.
The construction of this model will allow us to duplicate the capabilities of two generic, mathematical, epidemiological models: the fully- mixed susceptible-infected model, aka the SI model, and the fully-mixed susceptible-infected-recovered, or SIR model. We will, however, expand on their capabilities.
Assumptions
We are limiting our modeling and description of COVID-19 dynamics in this article to infection dynamics.
We assume that once a person becomes infected and is incubating the virus, that they are capable of infecting others in the susceptible population. Once their incubation period completes, they present with symptoms and are isolated from the susceptible population. We are aware that this is not always the case, but are making this assumption to simplify our initial model. If necessary, we can always include this and any other exceptions in the model at a later date.
In this initial modeling effort we assume that the incubation period can be characterized by a uniform probability distribution resulting in an integer random variable whose minimum value is one day. Other distributions may easily be implemented as required. In the case of the distribution being uniform, the user gets to choose the minimum and the maximum value of the incubation period. Each person in the susceptible population who becomes infected is assigned their own incubation period. In this sense the incubating population is agent-based (modeled individually).
Once a person presents with the disease we keep them in the presenting population for the remainder of the simulation. They have no further effect on the susceptible population. Again we realize that caregivers may indeed become infected while treating the sick, but we have obviated that possibility at this time. If necessary we will add that aspect in the model when we include the recovery section.
The most important assumption in the model is that of the maximum infection multiplier. This is the maximum number of people in the susceptible population to which the virus can be transmitted on a single day by a person incubating the virus. This value sets an upper limit on the transmission rate. The user inputs a value for the maximum infection multiplier at runtime. During a simulation run, The actual value used for the infection multiplier depends on the fraction of people remaining in the susceptible population. This assumption is the same assumption made by epidemiologists when building a generic epidemic model that they refer to as an SI model.
We assume that the total population does not change during a model run, and we use this number as a checksum of the three different populations that we simulate.
Causal-Loop Diagram
A causal-loop (CL) diagram is a significant feature of modeling in the System Dynamics paradigm. A CL diagram is a means of expressing the logical behavior of a system as a linked series of cause and effect relationships. That is, the effect in one relationship becomes the cause in the next. One can also think of a CL diagram as a logical argument, where each entity in the diagram represents either a premise or a conclusion depending on its spatial position as a member of an ordered pair. CL diagrams can be and often are n-dimensional (although difficult to depict in other than three space!). Another way of thinking about CL diagrams is as a network of relations, where a relation can be inferred by the presence of a linking arrow between two objects. In that vein we refer to a CL diagram as a directed graph harboring a network of relations. (For a precise definition of the digraph of a relation see Bernard Kolman et al., Discrete Mathematical Structures, ch 4, 4 ed.)
System dynamicists are mildly obsessed with feedback loops and consider them to be the most important logical means of discovering the behavior of a system. Drawing Cl diagrams is a relatively straightforward way of depicting feedback loops in a system. Going one step further, we define a system as being "complex", if it contains one or more feedback loops..
The real advantage of CL diagrams is that they can be used to determine the fundamental, behavior-driving aspects of a system. They are to system dynamicists what Feynman diagrams are to physicists: simple but loaded with information content. An experienced system dynamicist can intuit much of the behavior of a system simply by looking at a CL diagram. If new to the game, just drawing a CL diagram will help reveal much of the complex behavior of a system.
Drawing CL diagrams is the first step in a System Dynamics analysis of a system. It is the first in a series of steps designed to capture both the qualitative and quantitative aspects of a system's behavior. Once drawn, CL diagrams are translated into a more formal diagram using symbols that have distinct semantic content. These depictions are called Flow Diagrams and we will encounter them in the next section. We will use them to add considerable detail to our model description. Flow diagrams are the system dynamicist's equivalent of the UML diagrams used to design computer software. In fact, they are used for the same purpose.
At this point we would like you to open the CL diagram that we are using to represent the COVID-19 infection dynamics.
Infection Dynamics Causal-Loop Diagram
We have discovered three feedback loops that drive the behavior of infection dynamics. These loops connect three populations of interest: the susceptible population, the incubating population, and the presenting population.
Starting with the susceptible population we notice that it is being depleted by the infection rate. An increase in the infection rate will cause a decrease in the susceptible population. This relation is one of inverse proportionality and that fact is displayed by placing a minus sign next to the arrowhead of the link connecting the infection rate to the susceptible population.
The susceptible population in turn influences a variable called the infection multiplier. The infection multiplier is the number of persons per day that an infected individual can infect. An increase in the susceptible population will cause an increase in the infection multiplier. To show that this relation is one of direct proportionality, a plus sign is placed next to the arrowhead leading from the susceptible population to the infection multiplier.
Two important additional relations affect the infection multiplier: the total population and the maximum infection multiplier. Note that an increase in the total population would result in a decrease of the infection multiplier, and that an increase in the maximum infection multiplier would increase the infection multiplier. What the maximum infection multiplier actually does is increase or decrease the range of the infection multiplier.
Note: The fact that a decrease of the maximum infection multiplier would cause a decrease in the infection multiplier does not warrant a minus sign next to the arrowhead of the link. That sign is determined by the proportional relationship. A proportional change warrants a plus sign, while an inversely proportional relationship warrants a minus signThe infection multiplier in turn influences the value of the infection rate, which increase as the infection multiplier increases.
Now we are prepared to discuss the influence of this loop itself on the behavior of the system. We will refer to it as the susceptible population loop. Note the presence of a minus sign in the center of the loop. Such loops are referred to as negative feedback loops. We give the loop that designation because the combination of relations that comprise the loop act together to keep the system in a stable state. The word homeostasis comes to mind. It works like this.
Suppose the infection rate were to increase for some reason, all else being equal. That would cause the susceptible population to decrease, which would in turn cause the infection multiplier to decrease. And here is the important part: a decrease in the infection multiplier would cause the infection rate to decrease. When a change in a variable in a loop causes a change in the opposite direction to that same variable, after the perturbation has traversed the loop, we call that loop a negative feedback loop. This is the mechanism of homeostasis at work. A perturbation in one direction is always compensated for by a change in the opposite direction once the initial signal has traversed the loop.
Contrast that behavior with the next loop in the series. An increase in the infection rate will cause an increase in the incubating population, which will in turn cause an increase in the infection rate! The initial perturbation is amplified over and over as it traverses the loop. Compound interest is generated in this fashion. Both the infection rate and the incubating population will grow without bounds. This is the cause of exponential growth or decline in a system. Note that a decrease in the infection rate will cause a corresponding decrease in the incubating population and will continue until the incubating population is depleted. We are going to see both exponential growth and decline of the infection rate when we simulate the behavior of this system. Remember that the susceptible population loop acts to keep the infection rate under control. A loop that generates continual amplification of a signal is called a positive feedback loop, and is indicated by placing a plus sign in the center of the loop.
When an object or variable is a member of two loops simultaneously, it will be acting in response to two different signals. When the loops are of opposite polarity, as these are here, you can expect some very interesting behavior. For some period of time one loop may remain dominant, and then as conditions change, the other loop may assume dominance. We'll witness the dramatic effects of this competition in the rapid rise and decline of the infection rate during a coronavirus epidemic: particularly the one we are experiencing while this article is being written.
The last loop in this chain harbors the presentation rate, that is, the rate at which asymptomatic, infected individuals actually present with symptoms. The presentation rate acts to deplete the incubating population: its goal being to bring the incubating population to zero. Not all negative feedback loops have a goal of zero. Your body's goal is to maintain a fixed core temperature ranging between 36.1°C and 37.2°C. Your body accomplishes this feat using a highly redundant set of control systems regulated by negative feedback loops. Note that this last loop also acts to deplete the incubating population based on the incubation period.
Follow the bouncing cursor. An increase in the incubation period will cause a decrease in the presentation rate, which in turn will cause an increase in the incubating population and thus cause the infection rate to grow even larger. That is, the longer an infected individual incubates the virus, the faster the infection rate will become.
Awareness of the presence and location of positive and negative feedback loops within a system, can act as a guide to suggesting ways to change the behavior of a system in a desired direction. On the other hand feedback loops explain why it is so difficult to manage problems in biosystems like human beings with simple cause and effect solutions.
The last link in the chain from presentation rate to presenting population is not a part of a loop. The link indicates that a rise of the presentation rate will cause a corresponding rise in the presenting population. A fall in the presentation rate will not necessarily cause a decline in the presenting population but will act to slow its rate of growth. This is another connotation of the plus sign associated with a positive link polarity.
Some feedback loops are very large. In fact in this diagram you can traverse a loop which begins and ends at the presentation rate, tracking through the entire system in the process:
presentation rate -> incubating population -> infection rate -> susceptible population -> infection multiplier -> infection rate -> incubating population ->presentation rate
How can you determine the polarity of this loop or any other without having to account for each and every positive and negative sign in the loop? Simply count the number of negative links. If the number is odd, it is a negative feedback loop. If the number is even it is a positive feedback loop.
The fact that this larger loop is positive is an indication that the system responsible for the infection rate is inherently unstable. This should give us cause for concern. We are certainly directly experiencing the nature of this instability at this time giving us great cause for concern.
Note: Because we assume that most people reading this article will be unfamiliar with CL diagramming, we have towed the party line in our description. If you read the first textbook on this subject (Michael R. Goodman, Study Notes in System Dynamics, 1974), you will find that our description does not deviate from his. However, experience has shown that the polarity of feedback loops cannot always be predicted a priori, nor can we say for sure without further understanding, what the actual nature of the relations are that make up the nodes and links of a CL diagram. We may be dealing with a nonlinear function that can exhibit both amplification and degradation of a process (but not at the same time). If this is the case, the polarity of a link and the loop may change dynamically at runtime. Ideally we should have our quantitative model indicate the polarities graphically at runtime. Nevertheless, drawing CL diagrams is an extremely valuable means of discovering feedback loops, and well worth the time and effort to construct them. By sharing our CL diagrams with our colleagues and readers, we commit to an interpretation of our concept of the nature of a system. This allows others to critique and enhance our concept of a system's structure and potential behavior.
Flow Diagram
The System Dynamics methodology uses flow diagrams to specify the quantitative nature of a system. Flow diagrams are constructed with symbols that have specific meaning. System Dynamics and flow diagramming were introduced by Jay Forrester in the late 1950s and early 1960s (See his work Industrial Dynamics, 1961). System Dynamics was expanded to include Cauasal-Loop diagramming, the best earliest examples of which can be found in the work of Dennis Meadows et al. in Dynamics of Growth in a Finite World (1974).
At this time you should open the Infection Dynamics flow diagram, as it will be the centerpiece of the discussion that follows.
Infection Dynamics Flow Diagram
Our discussion will focus on the infection dynamics of the COVID-19 disease. Infection dynamics are modeled in the section of the flow diagram enclosed by the red dotted lines. We have included other sections of the complete model to illustrate those aspects of the system that we will be discussing in articles Part Two and Part Three of this series. The sections of the flow diagram illustrating these Parts will be modified at that time to include their full scope. We have simply stubbed them in here.
The symbols in the flow diagram correspond to those which can be found in Goodman (op. cit.), with one exception—the queue—which we will be discussing in detail.
The rectangles in the diagram are know as "Levels" and their function is to represent the accumulation of objects or value within the system. By objects we mean just about anything that can increase or decrease in number or amount, and whose numerical value is preserved from time step to time step during a simulation. In our case the levels are keeping track of the sizes of the susceptible population, the incubating population, and the presenting population.
We are modeling only infection dynamics in this article. Therefore it is not necessary to release the presenting population into the recovered state. We assume that the presenting population plays no role in infecting other people, as visibly ill patients are normally isolated from the susceptible population at the time they become symptomatic. We realize that health care workers may indeed become infected as a result of their work, but consider their numbers to be small relative to those being infected by the incubating population. This assumption can always be addressed at a later time.
We model rates of change with symbols that are intended to represent valves. Some System Dynamics models present in the literature use a more conventional symbol for a valve that does not include the attached rectangle. In either case, these symbols are known as "Rates". The rates in our model are the infection rate and the presentation rate. The infection rate models the number of people per day who become infected but remain asymptomatic. They are moved out of the susceptible population and into the incubating population. The presentation rate is the number of people per day that present with symptoms. These people are moved from the incubating population to the presenting population.
Of note is the term "queue" appearing in both the incubating population and the presenting population levels. In effect, people who enter these levels are placed in a multiserver queue and remain there based on the number of days that they will be incubating the virus. No such modeling artifact exists in a traditional Systems Dynamics formulation of a level. Yes, there is an entity called a "Delay" but it is not of a discrete nature and the time spent in a delay is based on an exponential distribution. Here we need a discrete formulation to accurately generate the given number of people who will exit the incubating population when they become symptomatic. We will discuss just how those queues are implemented in the "Code" section of this article.
The infection rate is the rate at which people become infected with the virus, and are moved from the susceptible population to the incubating population, albeit without symptoms. The infection rate is computed as a function of the incubating population and a variable called the infection multiplier. The infection multiplier is called an "auxiliary variable". Auxiliary variables are represented using circles. They are used to compute any variables that are not rates or levels. There are several different varieties of auxiliary variables in the System Dynamics toolkit. We show only two of them in the flow diagram.
The dotted lines ending in arrowheads indicate the source of a variable used in the computation of another variable. Notice that the infection multiplier is computed using the current value of the susceptible population along with two other sources. These other sources are constants, which are represented by a short solid line segment. Constants are either coded into the model if they are natural constants like the value of Pi, or entered by the user at runtime. In this model both the total population and the maximum infection multiplier are entered at runtime.
One of the variables of most interest during an epidemic is the probability of becoming infected on any given day. This probability is computed and reported by the auxiliary variable in the model called the probability of infection. We compute it by dividing the infection rate for a given day by the susceptible population at the beginning of that day.
The incubation period assigned to any individual simulated in this model is generated by a special type of auxiliary variable. Its semantic is designated by a double circle because it is a stand-alone function. The function we are using here distributes the incubation periods in a uniform fashion between a minimum and maximum entered by the user at runtime. There is nothing sacred about the use of this function. Other functions can easily be coded and added to the choices available to the user. The incubation period for this coronavirus has been cited as having a range from 2 to 14 days, with a mean of 5 days. This is obviously not a uniform distribution (the mean is not 8), but appropriate values based on current statistics can be used to build a distribution that does meet these requirements. Just be aware that the distribution that we are currently using assigns a value for the incubation period to each incoming member of the incubating population with equal probability.
We will discuss many of the functional relationships we have mentioned here in the following section, where we will review the code that implements these functions.
Technical Note: System Dynamics (SD) in Goodman's (op. cit.) and Forrester's (op. cit.) formulation was designed from the bottom up to model behavior that takes place in continuous time. SD can be used elegantly to model both physical and human events that occur in continuous time. These include phenomena such as the behavior of the planets in the solar system, or the average growth and decline of the GDP in a national economy. Normally the user of SD uses a commercial modeling language and integrated development environment (IDE) to build a model. The mathematical basis [pun intended] of a traditional SD model is that of a Markov process using a probability transition matrix to drive the system from one state to the next. That rate is chosen by the user to be consistent with minimizing numerical roundoff and truncation errors. This makes for rather lengthy simulation runs if the time constants of the rate equations are orders of magnitude smaller than the specified time horizon. (Known as a a "stiff" model in the jargon)
One of the imposed side effects of this Markov approach is that every Level in the model grows and declines at an exponential rate, which is not always suitable for modeling a system's behavior. Modern SD languages obviate this inherent behavior to some extent. In light of this constraint we feel that the best use of SD is to take advantage of the descriptive power of causal-loop and flow diagram illustrations, then write our own one-off code for each model. In that way we obviate any restrictions that the ideology of its design and the IDE may place on modeling system behavior.
Code
Now it's time to take a look under the hood.
For purposes of the following discussion you will not be required to install and execute the code; just follow along as we explain what it does. Download the PDF file of the code here:
COVID-19 Dynamics Java Code PDF
If you want to create an executable you can download a zip file containing the Java source code here:
COVID-19 Dynamics Java Code ZIP
Note: the COVID-19 Infection Dynamics model is written in the Java programming language derived from version 13.0.2 of the JDK. The code is designed to run from the Command console. See our website (the one you are probably reading this from now) for detailed instructions on how to install the JDK and build an executable program. To obtain the instructions navigate the following path: Table of Contents -> Studies Year One -> Problems. Consult the solutions to problems 1 and 2.
It will help to have the flow diagram figure open as well.
The following code descriptions are preceded with the line numbers of the code.
- 1-17
Comments to document the nature of the program, its creation date, version number and author.
- 19-26
Code that will be imported from the Java programming library.
- 29
Top of the container class holding our code. The container is called COVID19.
- 31:54
Variable names and declarations. These variable live all the while the program is running. Note the variable names that correspond to those of the objects in the flow diagram.
- 61-298
Functions (known as methods in the Java programming language) that are called during the execution of the program. This keeps the main code from becoming overly cluttered and makes them easy to find and maintain.
- 299
The beginning of the main program.
- 311-316
Calls to the functions that provide the initial values and parameters to the program.
- 318
The declaration of the "queue" that holds the members of the incubating population in groups according to their collective incubation periods.
- 322-331
Code that initializes the queue. This is not really a queue as normally written in code, but more of a shift register, which shifts its queue population toward the head of the queue each day by one position. When a group reaches the head of the queue—the zeroth location in the register—that group is moved from the incubating population to the presenting population.
- 333-359
Documentation on how the incubation period is assigned to each member entering the queue,and how they are placed in the appropriate location in the shift register which is implemented by an array type.
- 365
Beginning of Time (figuratively speaking).
- 368
Open a buffered file to catch the data generated by the program . This is the same text file used to generate the graphs of the values of the variables as they change throughout the program run. A failure to open the file as requested will generate an error which is "caught" and ignored near the bottom of the main code, but causes the program to fail more or less gracefully.
- 374
Beginning of the loop which the program executes for each day in the simulation.
- 385
Computation of the infamous infection multiplier. In essence the infection multiplier is a linear function of the form y = mx + b, which you know from high school algebra is the equation of straight line. Here the infMul takes on the role of y, the MAX_infMul is the slope m, and the ratio of susceptible population to the total population plays the role of x. The constant b is equal to zero. The effect of this function is to lower the value of the infection multiplier as the susceptible population decreases relative to the total population. This simulates the difficulty that an infected person has of infecting a member of the susceptible population as that population decreases. This function need not be linear, but this assumption seems to work fine in practice.
- 396--451
Computation of other variables in the model which are well documented in the code.
- 454-461
Computes the current population of the incubating queue by summing up the number of members of each group waiting at their days-to-go position in the queue.
- 469 -471C
Computes the checksum by which we assure that the total population in the model remains the same over the entire run. Great pains are taken in the code to use integer arithmetic to assure integer values are always assigned to population variables. This works pretty well except when the populations get up in the billions. We lose track of a few members at this stage of the game, but still maintain a pretty good track record!
- 478
End the loop which ends a days work. The program loops back to the "do" statement (line 374) if there are more days to go. Otherwise the program falls out of the loop. Cleans up after itself. Prints the current time. and falls through the catch phrase to end the program.
- 487
The catch phrase is only executed on an "IOException" error which is tossed out if our data file fails to open on request for some reason.
- 492
Done executing the main program.
- 494
Bottom of the container class called "COVID19", which is holding our code.
Similitude
In his excellent text on networks (op. cit.), Mark E. J. Newman devotes a densely packed chapter to epidemics: both the networked and non-networked
variety. In his first model description, Newman explains the nature of the SI epidemiological model. This is the simplest of approaches to modeling an epidemic. The reason for such a designation lies in its assumptions, which make use of a "fully-mixed or mass-action approximation", and the fact that the infected portion of the population remains as potential contacts. Here is Newman's description of this garden variety approach.
Quote: it is assumed that every individual has an equal chance, per unit time, of coming into contact with every other— people mingle and meet completely at random in this approach. This is, of course, not a realistic representation of the way the world is. In the real world, people have contact with only a small fraction of the population of the world, and that fraction is not chosen at random, which is precisely why networks play an important role in the spread of the disease. Nonetheless a familiarity with traditional approaches will be useful in our study of network epidemiology, so we will spend a little time looking at the basic principles.
If there is one thing we have learned over the years about modeling a system, it is: always start with the simplest possible representation. If that representation works to describe the behavior of the system, and is also capable of mimicking the problem of interest, model no further. That is exactly the approach we are taking in developing our model of COVID-19 dynamics. We are starting with the simplest possible representation of the system. At some point we will compare its predictions with what is going on in our unreal world, and make modifications only as needed.
The model in this article only attempts to model the potential infection rates of a fully exposed population in the real world system. That is why our formulation is similar to the SI (susceptible-infected) model familiar to epidemiologists. Interestingly enough, our model was derived on first principles, quite independently of any description of the SI model. Our next article will include the recovering population, and most likely would resemble the well known SIR (susceptible-infected-recovered) model. But you are in for a surprise; we may already be at that level with our infection dynamics only model.
In Newman's presentation of the SI model, he derives an expression for the infection rate. His expression reads
βSX/n
which has units of [ persons/unit time ]. The symbols are defined as follows
- β
The average number of contacts an individual has with other members of the extant population per unit time (includes both infected and susceptible individuals) [(persons/unit time)/person]
- S
the number of susceptible individuals in the population [persons]
- X
the number of infected individuals in the population [persons]
- n
the total population [persons]
- S/n
the probability that any person you meet is in the susceptible population [dimensionless]
- βS/n
number of contacts that an infected person has with a susceptible person per unit time [(persons/unit time)/person]
Thus ( βS/n )*X gives the average rate of new infections [persons/unit time]
Compare this result with our formulation of the infection rate (infR) as it appears in the code.
- infR = infMul * popInc [persons/day]
- infMul
the infection multiplier [ (persons/day)/person ]
- popInc
incubating population, i.e. infected individuals
- infMul = Max_infMul * (popSus/popTot) [ (persons/day)/person ]
- MAX_infMul
maximum infection multiplier [ (persons/day)/person ]
- popSus
susceptible population [persons]
- popTot := total population [persons]
infR = Max_infMul * (popSus/popTot) * popInc
The two formulations are the same in form with one subtle but important difference. Our infected population is not the same as the infected population in the SI model. Our infected population, which acts on the susceptible population, consists only of those individuals who are incubating the virus. The infected individuals in the presenting population play no role in the actual infection process. For that reason the asymptotic behavior of the susceptible population approaches but does not reach zero. Some portion of the susceptible population should escape infection altogether! Newman even shows this behavior for the SIR model in Figure 17.2 of his text.
Our model will decrease the infection rate much faster than the SI model. You can clearly see why by looking at the logic displayed in the causal-loop diagram. An increase in the presentation rate causes a decrease in the incubating population. A decrease in the incubating population causes a corresponding decrease in the infection rate.
But notice: here's where our initial logic, embodied in the causal-loop diagram, breaks down. The link polarities indicate that a decrease in the infection rate will cause an increase in the susceptible population, which will in turn cause an increase in the infection multiplier, causing a corresponding increase in the infection rate. What's going on? It turns out that the relation between the susceptible population and the infection rate is a nonlinear one. If you were to plot infection rate as a function of the susceptible population you would witness the nonlinearity for yourself. In fact, it is this nonlinearity that causes the presenting population to grow in a logistic fashion. But we are getting way ahead of ourselves here. We'll do that plot after we run some simulations. Just be aware that causal loop diagrams have this limitation when viewed as a simple linear cause and effect relation.
Even though Mark somewhat grudgingly describes the mathematical formulations of the SI and SIR models, in his quest to move on to something that he deems more appropriate, we feel that our "SI/SIR" approach is, in reality, not far off the mark of reality, and may indeed suffice to describe the behavior of the systems of interest we are studying. Recent events have demonstrated how quickly SAR-CoV-2 can move through a region, almost as if the region consisted of a fully mixed population. Indeed, most of us have quite a bit of mobility, and take advantage of that mobility in our daily lives to go shopping, visit with friends, get gas, receive packages from across the country or world, gather in churches and schools and regional events, and much more, on a daily basis. It is only our living quarters that remain in a fixed location connected by a network of roads. We ourselves are not constrained to remain there—at least not in the absence of an epidemic. (We will look at the efficacy of "sheltering" in article three of this series.) It is because of our personal physical mobility that the virus has so many opportunities to mix into our susceptible populations, generating a large number of random encounters. For that reason an SI/SIR modeling formulation may turn out to be a viable approach.
Simulations
The following I/O table provides one location on this website for viewing all of the current and future runs that we will associate with the "COVID-19 Infection Dynamics" model. Please open the table for viewing at this time.
COVID-19 Infection Dynamics I/O Table
Every run shown in the "Output" column of this table is associated with an analysis, whose title appears in the "Analysis Title" column of the table. A "run" consists of all the data generated by the model during a simulation carried out on a daily basis for the time horizon specified. The data is arranged in columns headed by the name of the variable associated with the data for each column. All of the data names can be found in the flow diagram or in the code. Clicking on a title in the "Analysis Title" column will open a PDF file generated from an Excel spreadsheet (just in case you don't have Excel handy). If you want to do your own analysis of the data, you can click on a run number in the "Output" column. The data for that run will be downloaded as a .zip file. Once unzipped (right click on the file name and select "Extract All") , a text file containing all the data from that run will appear. You can load this file into the data processing software of your choice (ours just happens to be Excel or MATLAB).
We only have time to do a partial analysis of each run, so there is plenty of "new" information and understanding to be gained from doing your own data analysis.
All the input parameters that were used in a run are given in the "Input/Output" table. Blank cells take the value of the first non-empty cell above it. The term "RMA" when referring to a population is an abbreviation for the census term "Rochester Metropolitan Area".
Of course, you can always get the Java code up and running to generate any simulation of your choice, but that can be a major undertaking for someone not familiar with the Command console , the JDK, and the Java runtime environment (JRE). If you have problems doing just that, after reading our solutions to the Problems posed for "Studies Year One" (the link is in the TOC), we suggest you raise them in the Forum.
Be sure to check the Site News on the Home page for updates to articles that may include references to new analyses and runs, as well as new articles themselves.
Note: For quick access to this and future I/O Tables, you can find the appropriate link in the Featured Articles Index.
On the Use of Models and Modeling
We use modeling in our everyday lives to make decisions that constantly influence our behavior. This process goes on in our minds in both a conscious or unconsciousness manner. We use these models to form policy on which we act. We are usually fully aware of the expected outcomes and their utility. Even when we must decide between one or more options where the utility to us varies depending on the context. Many of these policies, decisions and outcomes are so ingrained we don't need to give them much thought.
Thought process: "It's Friday night, pizza night. Let's order in." The expectation is that the pizza will be delivered within the hour. The only critical decisions to make are related to the toppings. We trust our decisions on this type of reasoning because we have ordered pizza for a long time, and it has always been good to excellent, especially if we have ordered it from the same establishment. We call these behaviors "habits". In reality they are well-established models in our minds of the way the world works. We don't have to do a lot of modeling (picking toppings) because we are well aware of the consequences, if we select incorrectly, and are sharing the pizza with someone else.
Along the same lines. You walk into a clothing store. Mannequins fill the store in all manner of dress designed to attract your attention to a particular style or fabric. These purport to be literal models of the way you will look if you buy the identical line of clothing. You probably have a great imagination, and reality doesn't set in until you try the outfit on in a dressing room. Good thing you didn't buy it on sale with a "no returns" policy before you tried it on. This is the way we test our mental models when we are about to make a decision that has expensive consequences. "Try before you buy".
Stated yet another way, Models help inform policy which in turn influences our decisions in an—if...then—manner. So much the better if we have the opportunity to test the outcomes of those decisions before we fully adopt a policy, and must accept responsibility for the consequences. At least this is the rational way to adopt policy and make decisions.
Note: For a comprehensive overview of public policy formation, from conceptualization through Implementation, we highly recommend consulting David L. Weimer and Aidan R. Vinig's text: Policy Analysis.
Human beings, as we well know, don't often follow these guidelines, especially if they are unaware of the consequences, choose to willfully ignore the consequences for whatever reasons, or are mentally or physically incapable of doing so.
The subject of human nature, and the propensity of humans to make both rational and non-rational decisions is a much discussed topic in the literature. For more insight into the subject we highly recommend the works of Steven Pinker, Robert Sapolsky, and Daniel Goleman. We link human biology and behavior to modeling because, when presented with numerical models, such as the one we are describing in his article, people react in unpredictable ways. Many people simply reject numerical models, especially those that deal with human biology or human nature: stating that they don't need a computer programmer to tell them what to do. And they are right in one sense. We do not build models to tell people what what to do. (Only the AI folks have that temerity!) We build models to explain the logical consequences of their decisions based on the policies under which they are operating. Some term those policies "intuition" which is nothing more than a synonym for experience.
What really motivates us to build numerical models to aid in decision making are two compelling factors: complexity and uncertainty. We need help of a nature that we can only get from computer programs and technology. We don't build models to recommend a course of action, but to describe a problem in the context of a complex system, give people as many ways of exploring both the nature of the system and the apparent problems, and then let them take the responsibility for adopting a policy leading to decisions that prescribe courses of action. Here we have used System Dynamics to lay out the logic associated with the behavior of a complex system system using causal-loop diagramming. By definition, you can't do this in your head. Some stubborn people insist they can by adopting a solution to a system-wide problem that depends on only one cause-and-effect link in the system. We have seen how this type of decision making can come back around the links in a system to bite you in the ass. That's why these people have trouble sitting down and listening to reason. For practice in seeing the real benefit of using causal-loop diagrams as essentially chained logic diagrams, we recommend applying them to solving problems found in a traditional logic textbook, such as that written by Patrick J. Hurley, Logic: A concise introduction.Note: We at MCS have expanded the nature of causal-loop diagrams, and refer to our formulation as inference diagrams, or inference networks.
The causal-loop diagrams in SD give us a place to start thinking about what the numerical models will look like. In particular they assist us in the hard thinking required to uncover the nature of a complex system.
That leaves us with just one more issue to discuss: Decision Making Under Uncertainty: Models and Choices. Fortunately Charles A. Holloway has written an eponymous book on the subject, which gives the topic far more play, erudition, and depth than we could, or have the space to do here. It turns out that many of the most important decisions we need to make in our lives, especially the life and death decisions associated with COVID-19, must be made under the pale of a great deal of uncertainty. Learning how to make the best choices under these conditions is one of the most helpful attributes of a computer based numerical model. We can run a complex simulation hundreds of times, in most cases, to get the best possible statistics we need to help us make decisions with the best possible outcomes. That approach will help us understand what Nassim Nicholas Taleb means when he says Fooled by Randomness: The hidden role of chance in life and in the markets.
We leave you with one recursive thought: thinking is the hard part.
References
- Brauer, Fred, Carlos Castillo-Chavz, and Zhilan Feng; Mathematical Models in Epidemiology, Springer, 2019.
- Chapra, Steven C., and Raymond P. Canale; Numerical Methos for Engineers, 2nd. ed., McGraw Hill Publishing Company, 1985.
- Ferguson, Neil M. et al.; Impact of non-pharmaceutical interventions (NPIs) to reduce COVID-19 mortality and healthcare demand, Imperial College London, DOI: https://doi:org/10 25561/77482.
- Forrester, Jay W., Industrial Dynamics, The M.I.T. Press,1961.
- Forrester, Jay W., World Dynamics, 2nd. ed., Wright-Allen Press, 1973.
- Goleman, Daniel, Emotional Intelligence:Why it can matter more than IQ, Bantam Books, 1995.
- Goleman, Daniel, Social Intelligence: The new science of human relationships, Bantam Books, 2006.
- Goodman, Michael, Study Notes in System Dynamics, Wright Allen Press, 1974.
- Holloway, Charles A.., Decision Making Under Uncertainty: Models and Choices, Prentice Hall, Inc., 1979.
- Hurley, Patrick J., Logic: A concise introduction, Cengage Learning, 12 ed.
- Kolman, Bernard, Robert C. Busby, Sharon Cutler Ross; Discrete Mathematical Structures, 4 ed., 1987, Prentice Hall,
- Lauer, Stephen A., and Kyra H. Grantz, et al.; The Incubation Period of Coronavirus Disease 2019 (COVID-19) From Publicaly Reported Confirmed Cases: Estimation and Application, Annals of Internal Medicne, doi:10:.7326/M20-0504, 10 March 2020.
- Meadows, Dennis L., William W. Behrens II, Donella H. Meadows, Roger Naill, Jøren Randers, Erich K.O. Zahn, Dynamics of Growth in a Finite World, Wright-Allen Press. 1974.
- Meadows, Donella H., Dennis L. Meadosa, Jørgen Randers, and William W. Behrens III; The Limits to Growth, Universe Books, 1972.
- Newman, E. J., Mark, Networks:An Introduction, Oxford UNiversity Press. 2010.
- Pinker, Steven, How the Mind Works,
W. W. Norton, Inc., 1997.
- Sapolsky, Robert M., Behave: The biology of humans at our best and worst, Penguin Press, 2017.
- Taleb, Nassim Nicholas, Fooled By Randomness: The hidden role of change in life and in the markets, Random House, 2004.
- Weimer, David L. and Aidan E. Vining, Policy Analysis, 5 ed., Routledge, 2011.