Using COVID-19 Testing to Better Understand Bayes’ Theorem

Joe Ramirez
5 min read · Sep 22, 2020

Dr. Trefor Bazett Source

I am currently enrolled in Flatiron School’s data science immersive bootcamp. Recently, we learned about Bayes’ Theorem during the statistics portion of the course. Named after the esteemed British statistician Thomas Bayes, Bayes’ Theorem is a powerful tool for calculating the probability of an event given outside knowledge that may be related to that event. Although I found Bayes’ Theorem illuminating, the rapid pace of the course meant I did not grasp the concept as fully as I would have liked. As a result, I decided to do further research on the topic, and in doing so I discovered a video by Dr. Trefor Bazett, an assistant professor of mathematics and statistics at the University of Victoria. This video not only explained Bayes’ Theorem to me perfectly, it did so using a relevant real-world example: COVID-19 testing.

Photo by the United Nations COVID-19 Response on Unsplash

As the video commences, Dr. Bazett mentions the subtle, but quite important, difference between specificity and sensitivity when it comes to any virus testing.

Sensitivity is the ability of a test to detect the disease when it is actually present. In other words, a test’s sensitivity is its true positive rate. Subtracting the sensitivity from one gives the test’s false negative rate: the share of patients who receive a negative result when they in fact have the disease.

Specificity, on the other hand, is the ability of the test to accurately determine that a patient does not have the disease. This is also known as the test’s true negative rate. With a test that does not have perfect specificity, some patients will receive positive results when they do not in fact have the disease in question. The share of disease-free patients who test positive, which is one minus the specificity, is known as the test’s false positive rate.
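To make the two definitions concrete, here is a minimal Python sketch. The counts are made up purely for illustration (they are not from the video or from any real test), and the variable names are my own.

```python
# Hypothetical results from comparing a test against true disease status.
# All four counts are invented for illustration only.
true_positives = 95     # have the disease, tested positive
false_negatives = 5     # have the disease, tested negative
true_negatives = 990    # no disease, tested negative
false_positives = 10    # no disease, tested positive

# Sensitivity: of everyone who has the disease, what share tested positive?
sensitivity = true_positives / (true_positives + false_negatives)

# Specificity: of everyone without the disease, what share tested negative?
specificity = true_negatives / (true_negatives + false_positives)

# The error rates are simply the complements.
false_negative_rate = 1 - sensitivity
false_positive_rate = 1 - specificity

print(f"sensitivity = {sensitivity:.2f}, false negative rate = {false_negative_rate:.2f}")
print(f"specificity = {specificity:.2f}, false positive rate = {false_positive_rate:.2f}")
```

Notice that sensitivity only looks at people who have the disease, while specificity only looks at people who do not.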

Why does this matter? As we dive into a thorough explanation of how Bayes’ Theorem can apply to COVID-19 testing, you will quickly see the importance of understanding specificity and sensitivity.
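For reference, one common way of writing Bayes’ Theorem is sketched below in LaTeX. The event names A and B are placeholders chosen here for illustration, and A' denotes the complement of A (everything that is not A).

```latex
P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B \mid A)\,P(A) + P(B \mid A')\,P(A')}
```

The denominator is just P(B) split into the two ways B can happen: together with A, or together with A'.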

At first glance, the formula for Bayes’ Theorem might appear undecipherable, which is why a real-world example will help us better understand it.

COVID-19 Testing

Let’s work with some numbers that Dr. Trefor Bazett uses in his video (these are not actual COVID-19 testing numbers). Imagine that COVID-19 tests have 99% specificity and 95% sensitivity, and that COVID-19 has a 1% prevalence in the population. Using these numbers, can we determine the probability of actually having COVID-19 if you test positive? With Bayes’ Theorem we can! To start, let’s write out what we are solving for using the format above.

P(COVID|+)

All that is being said above is just another way of asking what is the probability of actually having COVID-19 given that you test positive.

Now let’s write out the entire equation:
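Based on the terms discussed in the rest of this post, the full equation can be sketched in LaTeX as follows, where COVID' is the complement of COVID and + means a positive test result:

```latex
P(\text{COVID} \mid +) =
  \frac{P(+ \mid \text{COVID})\,P(\text{COVID})}
       {P(+ \mid \text{COVID})\,P(\text{COVID}) + P(+ \mid \text{COVID}')\,P(\text{COVID}')}
```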

Again, this might seem confusing initially, but we were actually given all the information to solve for this!

A 95% sensitivity means that if you in fact have COVID-19, you will test positive 95% of the time. What part of the above equation looks most like that statement? P(+|COVID)! Next, what is P(COVID), the probability of having COVID-19? That was also already given to us as 1%. The apostrophes in the bottom right part of the equation may seem tricky, but they are actually stating something very intuitive. COVID’ represents the complement of COVID. What is the complement? Simply put, it means everything that is not COVID. So P(+|COVID’) is merely asking for the probability of testing positive given that you do not have COVID-19. Was that information given to us as well? Yes!

A 99% specificity means that if you do not have COVID-19, you will test negative 99% of the time. What happens that other 1% of the time? You test positive! So P(+|COVID’) is 1%. Finally, since we know that the probability of having COVID-19 is 1%, we know its complement, or the probability of not having COVID-19, is 99%. Now we will convert the percentages to decimals and try to solve.
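If you would like to check the arithmetic yourself, here is a small Python sketch that plugs the example numbers into the equation. The variable names are my own, and the values are the illustrative ones from the video, not real COVID-19 testing data.

```python
# Example values from this post (illustrative, not real test data).
sensitivity = 0.95   # P(+ | COVID)
specificity = 0.99   # P(- | COVID')
prevalence = 0.01    # P(COVID)

# Complements needed for the denominator.
false_positive_rate = 1 - specificity   # P(+ | COVID')
p_not_covid = 1 - prevalence            # P(COVID')

# Bayes' Theorem: true positives over all positives.
numerator = sensitivity * prevalence
denominator = numerator + false_positive_rate * p_not_covid
posterior = numerator / denominator     # P(COVID | +)

print(f"P(COVID | +) = {posterior:.2f}")
```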

Not bad at all, right? When we solve this, we get an answer of 0.49, which may shock many people. It means that in this particular case, our probability of having COVID-19 if we test positive is only about 50%! How can this be? As Dr. Bazett explains, a specificity of 99% means that if we test 100 people, we will have roughly one false positive. However, since we also know that about one of those 100 people actually has the disease, roughly two people out of 100 will test positive. Since only one of those two really has COVID-19, our answer of approximately 0.50 makes perfect sense!
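The same intuition can be sketched by counting people instead of multiplying probabilities. The population of 10,000 below is an assumption I picked so the counts come out as whole numbers; the rates are the illustrative ones used throughout this post.

```python
# Hypothetical population size, chosen so the counts are whole numbers.
population = 10_000

prevalence = 0.01            # share of people who have COVID-19
sensitivity = 0.95           # share of infected people who test positive
false_positive_rate = 0.01   # share of healthy people who test positive

infected = population * prevalence             # about 100 people
healthy = population - infected                # about 9,900 people

true_positives = infected * sensitivity        # about 95 people
false_positives = healthy * false_positive_rate  # about 99 people

# Of everyone who tests positive, what share is actually infected?
share_infected = true_positives / (true_positives + false_positives)

print(f"true positives:  {true_positives:.0f}")
print(f"false positives: {false_positives:.0f}")
print(f"share of positives who are infected: {share_infected:.2f}")
```

Because healthy people vastly outnumber infected people, even a 1% false positive rate produces about as many false positives as there are true positives.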

These numbers are not real, of course, and were simply used to drive home the concept of Bayes’ Theorem using a relevant example, which I think Dr. Bazett accomplished successfully. Hopefully the video and this breakdown were helpful to you in better understanding this complex, yet powerful, theorem.
