As of April 2023, about 1% of people who contracted COVID-19 ended up dying. Does this mean you have a 1% chance of dying from COVID-19?

Epidemiologists call that 1% the death rate, calculated by dividing the number of confirmed COVID-19 deaths by the number of confirmed cases. The death rate is a statistic, or something that is calculated from a data set. Specifically, it is a kind of statistic called a sample proportion, which measures the proportion of data that meets certain criteria; in this case, the proportion of COVID-19 cases that ended in death.

The purpose of calculating a statistic such as the death rate is normally to estimate an unknown proportion. In this case, the question is: if every person in the world were infected with COVID-19, what proportion would die? However, some people also use this statistic as a guide for estimating their personal risk.

It is natural to think of such a statistic as a probability. For example, popular sayings that you are more likely to be struck by lightning than to die in a terrorist attack, or more likely to die driving to work than in a plane crash, are based on statistics. But is it correct to take these statements literally?

I’m a mathematician who studies probability theory. During the pandemic, I saw health statistics become a national conversation. The public was inundated with ever-changing data as research unfolded in real time, highlighting specific risk factors such as pre-existing conditions or age. However, it is almost impossible to use these statistics to accurately determine your own personal risk because it varies so widely from person to person and depends on complicated physical and biological processes.

## The Mathematics of Probability

In probability theory, a process is considered random if it has an unpredictable outcome. This unpredictability may simply be due to the difficulty of obtaining the information needed to predict the outcome accurately. Random processes have observable events that can each be assigned a probability, or the tendency of that process to produce that specific result.

A typical example of a random process is the flip of a coin. A coin flip has two possible outcomes, each with a 50% chance. While most people consider this process to be random, an observer who knew the precise force applied to the coin could predict the outcome. But flipping a coin is still considered random, because measuring this force is impractical, and even a small change can produce a different outcome.

A common way of thinking about the 50% probability of heads is that when a coin is flipped many times, you would expect 50% of those flips to be heads. For a large number of flips, almost exactly 50% of the flips will be heads. A mathematical theorem called the law of large numbers guarantees this: it states that the running proportion of outcomes gets closer and closer to the true probability when the process is repeated many times. The more you flip the coin, the closer the running percentage of flips that are heads will get to 50%, essentially with certainty. However, this depends on each repeated coin flip occurring in essentially identical circumstances.
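The law of large numbers can be seen in a small simulation. The sketch below is only an illustration: it assumes a simulated fair coin driven by Python's pseudorandom number generator, and the function name `running_heads_proportion` and the fixed seed are invented for this example.

```python
import random


def running_heads_proportion(num_flips, seed=0):
    """Flip a simulated fair coin num_flips times.

    Returns a list whose i-th entry is the share of heads
    among the first i + 1 flips.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = 0
    proportions = []
    for flip in range(1, num_flips + 1):
        if rng.random() < 0.5:  # heads with probability 0.5
            heads += 1
        proportions.append(heads / flip)
    return proportions


proportions = running_heads_proportion(100_000)
for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"after {n:>7,} flips: {proportions[n - 1]:.4f} heads")
```

The early values can sit well away from 0.5, while the later ones settle close to it. Every simulated flip here has the same 50% chance, which is exactly the "identical circumstances" condition the theorem relies on.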

The 1% death rate of COVID-19 can be thought of as the running percentage of COVID-19 cases that have resulted in death. However, it does not represent the true average probability of death, because the virus, as well as the immunity and behavior of the world’s population, has changed so much over time. Conditions are not constant.

Only if the virus stopped evolving, everyone’s immunity and risk of death were identical and unchanging over time, and there were always people available to become infected, would the death rate, by the law of large numbers, get closer to the true average probability of death over time.

## 1% chance of dying?

The biological process of a disease leading to death is complex and uncertain. It is unpredictable and therefore random. Every person is at real physical risk of dying from COVID-19, although this risk varies over time and place and between individuals. So at best, 1% could be the average chance of death within the population.

Health risks also vary between demographic groups. For example, elderly people have a much higher risk of death than younger people. Tracking COVID-19 infections and their outcomes for a large number of people who are demographically similar to you would provide a better estimate of your personal risk.

The death rate is a probability, but only with respect to the specific data set from which it was calculated. If you were to write the outcome of every COVID-19 case in that data set on a slip of paper and pick one at random from a hat, you would have a 1% chance of selecting a case that ended in death. If you did this only for cases from a certain group, such as elderly people with a higher risk or young children with a lower risk, the percentage would be higher or lower. This is why 1% may not be a good estimate of personal risk for every person across all demographics.
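The hat analogy can be made concrete with a toy data set. The numbers below are invented purely for illustration and are not real COVID-19 figures; the group labels and the helper names `death_share` and `draw_from_hat` are likewise hypothetical. The sketch is built so that the overall share of deaths is 1% while the two groups differ sharply.

```python
import random

# Hypothetical data set (invented for illustration, not real figures):
# each case is a (group, outcome) pair written on one "slip of paper".
cases = (
    [("older", "died")] * 90 + [("older", "survived")] * 1_910
    + [("younger", "died")] * 10 + [("younger", "survived")] * 7_990
)


def death_share(records):
    """Proportion of records whose outcome is 'died'."""
    return sum(outcome == "died" for _, outcome in records) / len(records)


def draw_from_hat(records, num_draws, seed=0):
    """Draw slips at random, with replacement, and return the share that say 'died'."""
    rng = random.Random(seed)
    draws = rng.choices(records, k=num_draws)
    return death_share(draws)


overall = death_share(cases)
older = death_share([c for c in cases if c[0] == "older"])
younger = death_share([c for c in cases if c[0] == "younger"])

print(f"overall: {overall:.2%}, older: {older:.2%}, younger: {younger:.3%}")
print(f"'died' slips in 200,000 random draws: {draw_from_hat(cases, 200_000):.2%}")
```

Drawing from the full hat gives a share of deaths near the overall figure, yet neither group's own share matches it. The overall number describes the hat, not any particular person whose case is in it.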

We can apply this logic to car accidents. The chance of being in a car accident on a 1,000-mile road trip is about 1 in 366. But if you are never near roads or cars, your chance is 0%. The 1-in-366 figure is really only a probability in the sense of drawing names out of a hat. It also applies unevenly across the population, for example because of differences in driving habits and local road conditions.

While a population statistic is not the same as a probability, it can be a good estimate of one, but only if everyone in the population is demographically similar enough that the statistic doesn’t change much when calculated for different subgroups.

The next time you’re faced with a population statistic like this, remember what it actually is: just the percentage of a given population that meets certain criteria. Chances are you’re not average for that population. Your own personal probability may be higher or lower.