A little Pandemic
This post is part of a series of posts explaining some basic concepts of probability theory with the help of POPACheck. Check out the first post here.
A virus outbreak
The COVID-19 pandemic started in 2020 has demonstrated the dangers posed by the worldwide spread of new viruses.
The spread of viruses and other infectious diseases has been, however, extensively studied through epidemiological models for a long time.
These models try to give a mathematical representation of the mechanisms by which the disease spreads into the population, with the purpose of getting insights into how such processes work.
In this post, we show how to encode and analyze one of such models with POPACheck.
Technical Terms
Before starting, we need to introduce a couple of technical terms.
- Basic Reproduction Number \(R_0\). This is the expected number of new infection cases generated by one single infected person. In other words, it is more or less the average number of people to which an infected person transmits the disease. This number is computed by assuming that all other individuals can get the disease, and none of them are immunized naturally (through recovery from prior infection) or artificially (through vaccination).
- Case Fatality Rate (CFR). The CFR is the percentage of infected people who die from the virus, on average. For instance, a CFR of \(1\%\) means that on average, out of 100 people who get the virus, one dies.
Writing a probabilistic program to model the outbreak
COVID-19 had the peculiarity that the CFR was much higher for the elderly than for young people. Several studies have tried to determine the \(R_0\) and the CFR for COVID-19 empirically. However, to make things simpler, in our model we will consider a CFR of \(1\%\) for elderly people, and \(0\%\) for the young. Moreover, we consider the \(R_0\) values reported in the following table:
\(R_0\) | Young | Elderly |
---|---|---|
Young | 1 | 0.5 |
Elderly | 0.5 | 1 |
This means that each age group is more likely to spread the virus within the same age group. With a lower probability, the virus can jump from one group to the other.
We will thus model young and older people separately, with two functions. In our program, a function call models an infection. Here’s the function that represents the infection of a young person:
young() {
u2 y, e;
y = 0 {1 : 3} 1 {1 : 3} 2;
e = 0 {1 : 2} 1;
while (y > 0) {
young();
y = y - 1;
}
if (e > 0) {
elder();
} else {}
}
We declare two variables y
and e
that represent the number of, respectively, young and elderly people that the current young person is going to infect.
These numbers range from 0 to 2, so a 2-bit unsigned integer suffices.
We extract the number of infected people by sampling them from a uniform distribution such that the expected value coincides with the \(R_0\) reported in the table above. Thus, a young person may infect 0, 1, or 2 people respectively with a probability of \(\frac{1}{3}\). We can check that this distribution has the required expected value of 1: \[ E[\mathtt{y}] = \sum_i x_i \cdot P[x_i] = 0 \cdot \frac{1}{3} + 1 \cdot \frac{1}{3} + 2 \cdot \frac{1}{3} = 1 \] Similarly, a young person can infect an elderly person with probability \(\frac{1}{2}\), which corresponds to an expected value of \(\frac{1}{2}\): \[ E[\mathtt{e}] = 0 \cdot \frac{1}{2} + 1 \cdot \frac{1}{2} = \frac{1}{2} \]
Once the values of y
and e
have been extracted, we proceed by actually “launching” the infections.
Since in our model a person being infected is represented by a function call, function young
will first call itself recursively for a number of times equal to the value of y
,
and will then proceed to call elder
, the function representing elderly people, for a number of times equal to the value of e
.
The function representing the elderly is quite similar:
elder() {
bool casualty;
u2 y, e;
y = 0 {1 : 2} 1;
e = 0 {1 : 3} 1 {1 : 3} 2;
if (y > 0) {
young();
} else {}
while (e > 0) {
elder();
e = e - 1;
}
casualty = true {1 : 100} false;
}
This time, the random assignments to y
and e
are exchanged, meaning that an elderly person will infect more fellow elderly people than youngsters, on average.
We also represent the fact that elderly people can die with a \(1\%\) probability through the variable casualty
, to which we assign true
with a probability of \(\frac{1}{100}\).
In our simple model, we assume a person always dies only after possibly infecting other people, hence we extract the value of casualty
only after calls to young
and elder
.
However, it would be possible to modify the model so that, if an elderly person dies, they don’t infect anyone.
Analyzing the model
End of the Pandemic
The first question of interest that we can answer about this model is:
What is the probability that the infection dies out?
In other words, what is the probability that, at some point, all instances of young
and elder
stop making further recursive calls?
This question is equivalent to checking that the whole program terminates. Thus, we ask to POPACheck a termination query, by feeding it the following input file:
probabilistic query: approximate;
program:
young() {
u2 y, e;
y = 0 {1 : 3} 1 {1 : 3} 2;
e = 0 {1 : 2} 1;
while (y > 0) {
young();
y = y - 1;
}
if (e > 0) {
elder();
} else {}
}
elder() {
bool casualty;
u2 y, e;
y = 0 {1 : 2} 1;
e = 0 {1 : 3} 1 {1 : 3} 2;
if (y > 0) {
young();
} else {}
while (e > 0) {
elder();
e = e - 1;
}
casualty = true {1 : 100} false;
}
Note that young
is the first function in the file, meaning that we assume the first person ever being infected—patient zero—is young.
POPACheck answers the following:
Probabilistic Termination Checking
Result:
Lower bound: 0.3028
Upper bound: 0.3028
Total elapsed time: 17.71 ms (1.7709e-2 s)
Meaning that the probability of the infection dying out is \(\sim0.3\).
Infection of an elderly person
Another question that comes to our mind is
What is the probability that at least an elderly person is infected?
To answer it, we edit our input file by replacing the first line with the following:
probabilistic query: quantitative;
formula = F elder;
The formula F elder
uses the finally or eventually operator from Linear Temporal Logic.
We already encountered this operator in our post about the piranha experiment:
it means that something will happen at some point in the future.
in this case, our something is the occurrence of an invocation of the function elder
.
POPACheck answers the following:
Quantitative Probabilistic Model Checking
Query: F elder
Result:
Lower bound: 0.7843
Upper bound: 0.7913
Total elapsed time: 130.3 ms (1.3029e-1 s)
This means that the probability of an elderly person being eventually infected is slightly less than \(0.8\). This is higher than the probability of an individual young person infecting an elderly one (\(\frac{1}{2}\)). The explanation is that a young person has a higher probability of infecting other young people, and each one of them has some probability of infecting an elderly person. Thus, as the number of infected young people increases, so does the probability that one of them infects an elderly person.
Infection of an elderly person
But what is the probability of an elderly person dying?
We let POPACheck answer this question by using the following formula:
formula = F (elder And XNu [elder|casualty]);
This formula ask POPACheck to check the probability that eventually function elder
is called,
and when it terminates (XNu
) the variable casualty
has the value true
.
This formula employs operator XNu
(read chain next up), which states something about what happens when a function terminates.
POPACheck gives us the following answer:
Quantitative Probabilistic Model Checking
Query: F (elder And (XNu [elder| casualty]))
Result:
Lower bound: 0.6921
Upper bound: 0.6991
Total elapsed time: 26.64 s (2.6638e1 s)
Hence, the probability of at least an elderly person dying is around \(\sim 0.7\).
Death preventing infections
What would happen if the death of an elderly person prevented them from infecting anyone?
Try to modify function elder
so that if casualty
is true
, it makes no further calls to functions young
or elder
.
(Hint: move the assignment to casualty
to the top of the function, and use an if
statement to proceed with the rest only if casualty
is false
.)
This post was inspired by the motivating example in the following article:
- T. Winkler, C. Gehnen, and J.-P. Katoen: Model Checking Temporal Properties of Recursive Probabilistic Programs. In Proc. FoSSaCS 2022, LNCS 13242, pp. 449-469, Springer, 2022. [Link]