A little Pandemic

This post is part of a series of posts explaining some basic concepts of probability theory with the help of POPACheck. Check out the first post here.

A virus outbreak

The COVID-19 pandemic started in 2020 has demonstrated the dangers posed by the worldwide spread of new viruses. The spread of viruses and other infectious diseases has been, however, extensively studied through epidemiological models for a long time. These models try to give a mathematical representation of the mechanisms by which the disease spreads into the population, with the purpose of getting insights into how such processes work. In this post, we show how to encode and analyze one of such models with POPACheck.

Technical Terms

Before starting, we need to introduce a couple of technical terms.

Writing a probabilistic program to model the outbreak

COVID-19 had the peculiarity that the CFR was much higher for the elderly than for young people. Several studies have tried to determine the \(R_0\) and the CFR for COVID-19 empirically. However, to make things simpler, in our model we will consider a CFR of \(1\%\) for elderly people, and \(0\%\) for the young. Moreover, we consider the \(R_0\) values reported in the following table:

\(R_0\) Young Elderly
Young 1 0.5
Elderly 0.5 1

This means that each age group is more likely to spread the virus within the same age group. With a lower probability, the virus can jump from one group to the other.

We will thus model young and older people separately, with two functions. In our program, a function call models an infection. Here’s the function that represents the infection of a young person:

young() {
  u2 y, e;

  y = 0 {1 : 3} 1 {1 : 3} 2;
  e = 0 {1 : 2} 1;

  while (y > 0) {
    young();
    y = y - 1;
  }

  if (e > 0) {
    elder();
  } else {}
}

We declare two variables y and e that represent the number of, respectively, young and elderly people that the current young person is going to infect. These numbers range from 0 to 2, so a 2-bit unsigned integer suffices.

We extract the number of infected people by sampling them from a uniform distribution such that the expected value coincides with the \(R_0\) reported in the table above. Thus, a young person may infect 0, 1, or 2 people respectively with a probability of \(\frac{1}{3}\). We can check that this distribution has the required expected value of 1: \[ E[\mathtt{y}] = \sum_i x_i \cdot P[x_i] = 0 \cdot \frac{1}{3} + 1 \cdot \frac{1}{3} + 2 \cdot \frac{1}{3} = 1 \] Similarly, a young person can infect an elderly person with probability \(\frac{1}{2}\), which corresponds to an expected value of \(\frac{1}{2}\): \[ E[\mathtt{e}] = 0 \cdot \frac{1}{2} + 1 \cdot \frac{1}{2} = \frac{1}{2} \]

Once the values of y and e have been extracted, we proceed by actually “launching” the infections. Since in our model a person being infected is represented by a function call, function young will first call itself recursively for a number of times equal to the value of y, and will then proceed to call elder, the function representing elderly people, for a number of times equal to the value of e.

The function representing the elderly is quite similar:

elder() {
  bool casualty;
  u2 y, e;

  y = 0 {1 : 2} 1;
  e = 0 {1 : 3} 1 {1 : 3} 2;

  if (y > 0) {
    young();
  } else {}

  while (e > 0) {
    elder();
    e = e - 1;
  }

  casualty = true {1 : 100} false;
}

This time, the random assignments to y and e are exchanged, meaning that an elderly person will infect more fellow elderly people than youngsters, on average. We also represent the fact that elderly people can die with a \(1\%\) probability through the variable casualty, to which we assign true with a probability of \(\frac{1}{100}\). In our simple model, we assume a person always dies only after possibly infecting other people, hence we extract the value of casualty only after calls to young and elder. However, it would be possible to modify the model so that, if an elderly person dies, they don’t infect anyone.

Analyzing the model

End of the Pandemic

The first question of interest that we can answer about this model is:

What is the probability that the infection dies out?

In other words, what is the probability that, at some point, all instances of young and elder stop making further recursive calls?

This question is equivalent to checking that the whole program terminates. Thus, we ask to POPACheck a termination query, by feeding it the following input file:

probabilistic query: approximate;
program:
young() {
  u2 y, e;

  y = 0 {1 : 3} 1 {1 : 3} 2;
  e = 0 {1 : 2} 1;

  while (y > 0) {
    young();
    y = y - 1;
  }

  if (e > 0) {
    elder();
  } else {}
}

elder() {
  bool casualty;
  u2 y, e;

  y = 0 {1 : 2} 1;
  e = 0 {1 : 3} 1 {1 : 3} 2;

  if (y > 0) {
    young();
  } else {}

  while (e > 0) {
    elder();
    e = e - 1;
  }

  casualty = true {1 : 100} false;
}

Note that young is the first function in the file, meaning that we assume the first person ever being infected—patient zero—is young.

POPACheck answers the following:

Probabilistic Termination Checking
Result:
  Lower bound: 0.3028
  Upper bound: 0.3028

Total elapsed time: 17.71 ms (1.7709e-2 s)

Meaning that the probability of the infection dying out is \(\sim0.3\).

Infection of an elderly person

Another question that comes to our mind is

What is the probability that at least an elderly person is infected?

To answer it, we edit our input file by replacing the first line with the following:

probabilistic query: quantitative;
formula = F elder;

The formula F elder uses the finally or eventually operator from Linear Temporal Logic. We already encountered this operator in our post about the piranha experiment: it means that something will happen at some point in the future. in this case, our something is the occurrence of an invocation of the function elder.

POPACheck answers the following:

Quantitative Probabilistic Model Checking
Query: F elder
Result:
  Lower bound: 0.7843
  Upper bound: 0.7913

Total elapsed time: 130.3 ms (1.3029e-1 s)

This means that the probability of an elderly person being eventually infected is slightly less than \(0.8\). This is higher than the probability of an individual young person infecting an elderly one (\(\frac{1}{2}\)). The explanation is that a young person has a higher probability of infecting other young people, and each one of them has some probability of infecting an elderly person. Thus, as the number of infected young people increases, so does the probability that one of them infects an elderly person.

Infection of an elderly person

But what is the probability of an elderly person dying?

We let POPACheck answer this question by using the following formula:

formula = F (elder And XNu [elder|casualty]);

This formula ask POPACheck to check the probability that eventually function elder is called, and when it terminates (XNu) the variable casualty has the value true. This formula employs operator XNu (read chain next up), which states something about what happens when a function terminates.

POPACheck gives us the following answer:

Quantitative Probabilistic Model Checking
Query: F (elder And (XNu [elder| casualty]))
Result:
  Lower bound: 0.6921
  Upper bound: 0.6991

Total elapsed time: 26.64 s (2.6638e1 s)

Hence, the probability of at least an elderly person dying is around \(\sim 0.7\).

Death preventing infections

What would happen if the death of an elderly person prevented them from infecting anyone?

Try to modify function elder so that if casualty is true, it makes no further calls to functions young or elder.

(Hint: move the assignment to casualty to the top of the function, and use an if statement to proceed with the rest only if casualty is false.)


This post was inspired by the motivating example in the following article: