Let's Do the Cochrane Review of Physical Measures to Reduce the Spread of Viruses
The Study of the Week delves into the recent Cochrane Review because it offers a trove of lessons in critical appraisal
Professor Tom Jefferson of Oxford led a large group of authors for this fifth update of the evidence review of physical measures to interrupt the spread of respiratory viruses. Their last review, in 2020, did not include SARS-CoV-2 infection.
Background Regarding Meta-analyses, Systematic Reviews and Cochrane Reviews
In the pyramid of evidence, a systematic review and meta-analysis sits as the highest-level evidence. You can see from the picture that the lowest levels are expert opinion, case reports, case series, and observational studies.
The pandemic underscored two (of many) problems with observational studies of non-randomized groups. One is dissimilar groups and the other is analytic flexibility.
When groups aren’t chosen at random, one group may do better because it has healthier characteristics—not because of the intervention. And when you study the effect of an intervention over one time period, the choice of a different time period may yield different results.
Randomized trials eliminate these biases. They set a time zero and randomization (mostly) balances known and unknown characteristics.
The problem with RCTs is that they can be selective and inform narrow areas. That is where systematic reviews and meta-analyses come in. These combine the trials to estimate an overall effect.
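As a rough illustration of what "combining the trials" means, here is a minimal sketch of one common pooling approach: fixed-effect inverse-variance weighting on the log relative-risk scale. The three trials and their numbers are hypothetical, invented only for illustration, and this is not necessarily the model the Cochrane authors used.

```python
import math

def pool_fixed_effect(trials):
    """Pool relative risks with inverse-variance weights on the log scale.

    Each trial is (rr, ci_low, ci_high) with a 95% confidence interval.
    Returns (pooled_rr, pooled_ci_low, pooled_ci_high).
    """
    z = 1.96  # normal quantile for a 95% interval
    weighted_sum = 0.0
    total_weight = 0.0
    for rr, low, high in trials:
        # Recover the standard error of log(RR) from the CI width.
        se = (math.log(high) - math.log(low)) / (2 * z)
        weight = 1.0 / se ** 2  # more precise trials count for more
        weighted_sum += weight * math.log(rr)
        total_weight += weight
    pooled_log = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - z * pooled_se),
        math.exp(pooled_log + z * pooled_se),
    )

# Hypothetical trials, NOT taken from the review: (RR, 95% CI low, 95% CI high)
hypothetical_trials = [
    (0.90, 0.70, 1.16),
    (1.05, 0.80, 1.38),
    (0.98, 0.85, 1.13),
]

pooled_rr, pooled_low, pooled_high = pool_fixed_effect(hypothetical_trials)
print(f"Pooled RR {pooled_rr:.2f} (95% CI {pooled_low:.2f} to {pooled_high:.2f})")
```

The pooled interval comes out narrower than that of any single trial, which is the appeal of pooling. It is also why the choice of which trials to enter matters so much.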
Sadly, though, the medical literature is over-populated with poor meta-analyses. That’s because computer software allows anyone to enter trials and get an overall effect.
Cochrane reviews are different. These are considered the gold standard. It’s beyond the scope of this column to explain why this is, but in short, Cochrane reviews are known for their rigor, strict adherence to methodology, pre-registration and transparency.
Changes from Previous Reviews
The authors made one major change in their methods from 2020. For this review, Jefferson and colleagues found sufficient randomized trials and therefore excluded observational studies.
This change enabled more robust evidence summaries from high‐quality studies, which are much less prone to the risk of the multiple biases associated with observational studies.
This was a massive change because during the pandemic, people who chose to use physical interventions (such as masking) were likely to do multiple other things to stop the spread of a virus. That’s why you need randomization.
The Current Study
The authors included 11 new RCTs and cluster RCTs with more than 610,000 participants. Six of the new trials were conducted during the pandemic—two from Mexico, and one each from Denmark, Bangladesh, England, and Norway.
The authors pre-registered three main questions: 1) medical/surgical masks compared to no masks; 2) N95/P2 respirators compared to medical/surgical masks; 3) Hand hygiene compared to control
The Results: (I will skip hand hygiene because of time constraints and because clean hands also prevent bacterial infections.)
For the first question (medical/surgical masks vs no masks), the results are in the picture.
Nine trials considered the risk of any viral illness. The point estimate of the relative risk was 0.95 with 95% CI ranging from 0.84 (a 16% lower rate with masks) to 1.09 (a 9% higher rate with masks). We consider this a non-significant difference.
Six trials considered the risk of SARS-CoV-2 infection. The point estimate was 1.01 with 95% CI ranging from 0.72 (a 28% lower rate) to 1.42 (a 42% higher rate). We also consider this a non-significant difference. (*Notice the width of the confidence intervals.)
For the second question (N95/P2 respirators compared to medical/surgical masks), the results are in the picture.
Three trials conducted in hospital settings with healthcare workers studied the presence of any viral illness. The point estimate of the relative risk was 0.70 with 95% CI ranging from 0.45 (a 55% lower rate with the stronger masks) to 1.10 (a 10% higher rate with the stronger masks). We also consider this a non-significant difference.
Five trials compared the two types of masks using lab-confirmed influenza as the outcome. The point estimate was 1.10 with 95% CI ranging from 0.90 (a 10% lower rate) to 1.34 (a 34% higher rate). We consider this a non-significant result.
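For readers who want to check the arithmetic, here is a small sketch that converts each relative risk bound into a percent change and checks whether the 95% CI includes 1.0 (no difference). The numbers are the ones reported above; the function names are my own, chosen for illustration.

```python
def percent_change(rr):
    """Convert a relative risk into a percent change in the event rate."""
    return round((rr - 1.0) * 100)

def ci_includes_no_effect(ci_low, ci_high):
    """True when the 95% CI contains 1.0, i.e. no difference between groups."""
    return ci_low <= 1.0 <= ci_high

# (comparison and outcome, point estimate, CI low, CI high), as reported above
results = [
    ("Masks vs no masks, any viral illness", 0.95, 0.84, 1.09),
    ("Masks vs no masks, SARS-CoV-2 infection", 1.01, 0.72, 1.42),
    ("N95/P2 vs masks, any viral illness", 0.70, 0.45, 1.10),
    ("N95/P2 vs masks, lab-confirmed influenza", 1.10, 0.90, 1.34),
]

for label, rr, low, high in results:
    verdict = "non-significant" if ci_includes_no_effect(low, high) else "significant"
    print(f"{label}: RR {rr:.2f}, from {percent_change(low):+d}% "
          f"to {percent_change(high):+d}% ({verdict})")
```

All four intervals include 1.0, which is why each result is called non-significant.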
The Conclusions
The authors wrote:
“The pooled results of RCTs did not show a clear reduction in respiratory viral infection with the use of medical/surgical masks.”
“There were no clear differences between the use of medical/surgical masks compared with N95/P2 respirators in healthcare workers when used in routine care to reduce respiratory viral infection.”
But they added an important caveat: there was low to moderate certainty of the evidence. That reduces confidence in the estimate.
Interpretation
I will start with a Tweet from one of the wiser voices during the pandemic, pediatrician Alasdair Munro:
I like this comment because it exposes two extremes that we should avoid.
The first is easy, right? Of course, a mask may work in a physics lab on a robot. But that is not how masks are used in the real world. People take them off to eat or drink; people don’t wear them properly, and, in the case of N95s vs medical masks in the hospital, workers don’t stay in the hospital the entire day and night.
This message transcends masks. It’s the same with procedures and drugs. A drug may exert a clear effect in moving one surrogate marker. But when used in a larger group of humans, it may be ineffective in reducing an important outcome.
Dr. Munro’s other extreme, that this proves that masks do nothing, is also not exactly what this review said.
Some would argue that the systematic review shows an absence of evidence of benefit. Which is true.
But.
The confidence intervals around the estimates allow for both increased infection and decreased infection. That would lead others to argue that absence of evidence of benefit does not equate to evidence of absence of benefit. The authors lent some credence to this idea when they wrote that there was low-to-moderate certainty of the evidence.
They called for a large well-designed RCT that would address these questions—especially the impact of adherence on effect sizes.
My Final Comments
I think we can find a middle ground between Dr. Munro’s two extremes. This involves common sense and consideration of prior probabilities.
First the priors. When I look at a trial, I like to think of it as akin to a medical test.
Medical tests rarely give definitive answers; instead, they update our prior beliefs. For instance, a negative stress test in a person with super-low cardiac risk further strengthens my confidence that this person does not have heart disease.
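The updating a test performs can be sketched with Bayes' rule in odds form. This is only an illustration: the 2% pre-test probability and the negative likelihood ratio of 0.3 are assumed numbers, not values from any study, and the function name is hypothetical.

```python
def update_probability(prior_prob, likelihood_ratio):
    """Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio."""
    prior_odds = prior_prob / (1.0 - prior_prob)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)

# Illustrative assumptions only, not values from any study:
prior = 0.02        # assumed 2% pre-test probability of disease
negative_lr = 0.3   # assumed likelihood ratio of a negative test

posterior = update_probability(prior, negative_lr)
print(f"Pre-test {prior:.1%}, post-test {posterior:.1%}")
```

A low prior combined with a negative result yields an even lower posterior. The test does not prove absence of disease; it shifts the belief.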
It’s the same here. Going into these new RCTs, there was no compelling evidence that masks did much to halt the spread of respiratory viruses. The new trials, with their null results, further strengthened that belief, though we should keep our minds open to change if a large well-conducted trial upends that belief.
But, as it always is, the onus of proof is on the proponents to show us a setting in which an intervention works.
Now to common sense: We in the health field have had three years to observe the use of masks. You walk through a hospital ward and see half the workers with their masks pulled down to take a sip of coffee. And there is no mask use in the cafeteria or break rooms.
Even when talking with a person wearing a regular medical mask, you can see the gaps on the sides that a respiratory virus can easily flow through.
You can love evidence, as I do, but this does not mean one needs to abandon common sense.
Editors' Note: This post will allow comments from paying subscribers. If you like our work, and want to support the Sensible Medicine project of independent ad-free evidence review and the allowance of nuanced argument, please consider becoming a paid supporter.
Editors’ Note #2: I made one edit. I changed the reason for not reviewing the data on hand hygiene. I originally said it was because hand hygiene seemed obvious. That’s a problematic argument because some might say mask use is obvious. Better reasons to exclude it were time constraints and the fact that clean hands also prevent the spread of bacterial pathogens.
I would answer in the following way. If you are looking for a small effect then an RCT is absolutely essential. If there is a black and white effect, then an RCT is not required. The impact of masking at best is tiny, and at worst it may even have a negative effect.
Now in terms of COVID, even if you don't look at the RCTs, it is evident from just comparing similar countries and similar US states that masking had no impact. Compare, for example, Sweden with the UK or with any of the continental European countries.
Lastly, why do you care about flattening the curve? All that does is prolong the agony because the area under the curve remains unchanged. The only reason to attempt to flatten the curve is if the healthcare system becomes completely overwhelmed. For COVID this was never the case, even at the beginning in NYC. Recall that NYC never made use of the hospital ship or the Javits Center. Yes, things were busy, but then they always are in the winter.
As for the article you linked to, just read it again. They make a lot of assertions for which they have absolutely no evidence. For example, they assume masks work and that this is beyond doubt and "settled science". But it's far from settled, and the RCT data clearly show that masks don't work in the real world. Sure, they may do something under carefully controlled lab conditions, such as speaking into a small hole (e.g. the NIH study by Bax & Anfinrud, published initially as a letter in the NEJM with a follow-up paper in PNAS), but that's not relevant to real life.
I've followed John's blog and writings for many years. And he is sensible. A small contribution from down under:
https://theconversation.com/yes-masks-reduce-the-risk-of-spreading-covid-despite-a-review-saying-they-dont-198992