Effect & Cause (#228)
The global medical community is working with unprecedented speed to discover a vaccine and therapeutic treatments for COVID-19. Amid this period of medical research and development, there has been a flood of stories about potential early breakthroughs hitting the news.
While these well-meaning stories are aiming to inspire hope, it’s important to evaluate these new developments with a data-minded perspective. Be hopeful, but recognize the reality of the situation as well.
This is particularly important when evaluating medical studies. When we hear about the recovery rates a promising experimental COVID-19 treatment shows in its early stages, we cannot draw definitive conclusions about its efficacy without more context. For example, we need to establish whether those patients are recovering because of the treatment or whether they would have recovered anyway with rest and time.
One common mistake in logical reasoning is to assume that if two variables change at the same time, it’s because one of those variables is impacting the other. Often, there is a third, hidden variable that causes both to change.
I saw an example of this error in a response to my recent article about CVS’s decision to stop selling cigarettes in 2014. A reader cited an article reporting that people who vape have a lower rate of COVID-19 infection and suggested that vaping somehow protects people from the virus. A more plausible explanation is that vapers, who are disproportionately young, are more likely to be asymptomatic if they are infected with COVID-19 and are therefore not tested or counted at the same rate.
Perhaps unsurprisingly, this gentleman worked in the vaping industry. He may well have been intentionally misusing the statistic and relying on people’s misunderstanding of correlation and causation.
Correlation is a statistical term that indicates two variables change at the same time. There are three statistical classifications of correlation:
- Positive correlation: As Variable A increases, Variable B increases as well—the two variables change in the same direction.
- Negative correlation: As Variable A increases, Variable B decreases, and vice versa. The two variables change at the same time, but in opposite directions.
- No correlation: There is no observable link between Variable A and Variable B—they change independently of each other over time.
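To make the three classifications concrete, here is a minimal sketch that computes the Pearson correlation coefficient, a common measure that ranges from -1 to 1, and labels the result. The sample numbers and the 0.3 cutoff used in `classify` are illustrative assumptions, not figures from any study.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def classify(r, threshold=0.3):
    """Label a coefficient; the threshold is an arbitrary, illustrative cutoff."""
    if r >= threshold:
        return "positive"
    if r <= -threshold:
        return "negative"
    return "none"

# Made-up sample data for Variable A and three candidates for Variable B.
variable_a = [1, 2, 3, 4, 5]
rising = [2, 4, 5, 8, 10]      # moves in the same direction as A
falling = [10, 8, 5, 4, 2]     # moves in the opposite direction
unrelated = [3, 1, 4, 1, 3]    # no consistent pattern relative to A

for name, variable_b in [("rising", rising), ("falling", falling), ("unrelated", unrelated)]:
    r = pearson_r(variable_a, variable_b)
    print(f"{name}: r = {r:.2f} ({classify(r)} correlation)")
```

Note that the coefficient only describes how the two variables move together. It says nothing about why they do.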
When there is positive or negative correlation between two variables, it’s a common error to assume one is causing the other to change. However, correlation does not imply causation. Unless there is clear evidence that one variable directly changes the other, you shouldn’t assume two correlated variables have a cause-and-effect relationship.
Here’s one of my favorite illustrations of this phenomenon. Data show that in Baltimore, Maryland, there is a correlation between ice cream sales and the murder rate. Should we assume that ice cream causes murder and immediately ban the sale of ice cream to protect the public? Of course not.
The reality is that a third variable, hot weather, causes both the homicide rate and ice cream sales to spike at the same time: in summer. When it’s hot, people are more inclined to buy ice cream and more likely to stay out on the streets late at night, creating more opportunities for homicides.
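The hidden-variable effect can be reproduced with simulated data. In the sketch below, temperature drives both ice cream sales and incidents, and neither series influences the other. Every number here (the temperature range, the multipliers, the noise levels) is an invented assumption for illustration, not Baltimore data. The `partial_r` helper applies the standard first-order partial correlation formula to remove the linear effect of the third variable.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def partial_r(xs, ys, zs):
    """Correlation between xs and ys after removing the linear effect of zs."""
    r_xy = pearson_r(xs, ys)
    r_xz = pearson_r(xs, zs)
    r_yz = pearson_r(ys, zs)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical daily data: temperature feeds into both series,
# but sales and incidents never reference each other.
temperature = [rng.uniform(0, 35) for _ in range(2000)]
ice_cream_sales = [10 * t + rng.gauss(0, 30) for t in temperature]
incidents = [0.5 * t + rng.gauss(0, 3) for t in temperature]

raw = pearson_r(ice_cream_sales, incidents)
adjusted = partial_r(ice_cream_sales, incidents, temperature)

print(f"raw correlation: {raw:.2f}")
print(f"correlation after accounting for temperature: {adjusted:.2f}")
```

With these made-up parameters, the raw correlation should come out strongly positive while the adjusted figure should sit close to zero, which is what you would expect when a third variable explains the whole relationship.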
The risk of mistaking correlation for causation is why every reputable medical treatment study includes a control group whose health profiles are similar to those of the patients receiving the treatment. The control group either receives no treatment or receives a placebo, which lets researchers determine whether the treatment is what’s causing the experimental group to recover.
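A quick sketch shows why the control group matters. The patient counts below are entirely hypothetical. The comparison uses a two-proportion z-test, one common way to check whether a gap between two rates is larger than chance alone would explain; real trials rely on more rigorous designs and analyses.

```python
from math import sqrt

def two_proportion_z(recovered_a, total_a, recovered_b, total_b):
    """z-statistic for the difference between two recovery rates."""
    rate_a = recovered_a / total_a
    rate_b = recovered_b / total_b
    pooled = (recovered_a + recovered_b) / (total_a + total_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    return (rate_a - rate_b) / std_err

# Hypothetical trial counts, invented purely for illustration.
treated_recovered, treated_total = 164, 200
control_recovered, control_total = 160, 200

treated_rate = treated_recovered / treated_total
control_rate = control_recovered / control_total
difference = treated_rate - control_rate
z = two_proportion_z(treated_recovered, treated_total,
                     control_recovered, control_total)

print(f"treated recovery rate: {treated_rate:.0%}")
print(f"control recovery rate: {control_rate:.0%}")
print(f"difference: {difference:.1%}, z = {z:.2f}")

# |z| below roughly 1.96 means the gap is not significant at the usual 5% level.
print("significant at 5%" if abs(z) > 1.96 else "not significant at 5%")
```

In this invented scenario, a headline reporting only the treated group's recovery rate would sound impressive, yet the control group recovers at nearly the same rate, so the data give little reason to credit the treatment.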
While potential treatments for COVID-19 are exciting, we need to give time for the proper evaluation to take place before jumping on the bandwagon.
This applies to business as well. Because many people don’t have a clear understanding of correlation and causation, some seek to exploit that gap and manipulate data to move people toward their narrative, as the vaping advocate attempted. This is why we teach this principle to new employees: it helps them avoid these types of mistakes, which can damage credibility.
In general, we need to learn to be skeptical of claims that draw bold conclusions from correlated data points. Otherwise, you might believe Friday Forward caused you to wake up this morning.
Quote of The Week
“In God we trust. All others must bring data.”
– W. Edwards Deming