46 Comments

Dr. Mandrola writes in this post:

"The Results

At 60 days, 17.3% of those in the active arm died vs 21.3% in the placebo arm. That is an absolute risk reduction of 4%. The relative risk reduction in death equaled 33%."

I believe there is an error here. Based on the facts as stated, the relative risk reduction in death equaled about 19% (4 ÷ 21.3 ≈ 19%), not 33%.
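To double-check the arithmetic in plain Python, using only the two event rates quoted above:

```python
# Event rates quoted in the post, expressed as proportions
placebo_rate = 0.213   # 21.3% of the placebo arm died by 60 days
active_rate = 0.173    # 17.3% of the active arm died by 60 days

# Absolute risk reduction: the simple difference in event rates
arr = placebo_rate - active_rate        # 0.040, i.e., 4 percentage points

# Relative risk reduction: the ARR as a fraction of the placebo-arm risk
rrr = arr / placebo_rate                # ~0.188, i.e., about 19%, not 33%

print(f"ARR = {arr:.1%}, RRR = {rrr:.1%}")
```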


I think it is more important to know the makeup of the two groups. Oftentimes, participants are cherry-picked or negative results are weeded out over time. Also, since this study might have shown good results, there should be 3 or 4 more done.

We have to face the fact that drug companies are more interested in selling drugs, especially high-priced newer ones, than they are in curing any diseases or helping people live without drugs. That means most of these studies are going to be massaged and manipulated to extract the best outcomes with the happiest smiley faces on them.

It still does not matter because we no longer have any agencies that are honest in their assessments of new drugs. The FDA ain't it no more... not by a long, long shot. They have given up any mission that includes protecting the public from harmful drugs... witness the recent mRNA substance fiasco.


Thanks for boiling this down so well. I wonder whether research ethics boards have biostatisticians as members. Can they truly evaluate scientific validity without them?


It seems fair to say this study failed to show that the antibiotic combo was at least 14% better than placebo. The authors truncated their conclusion statement, but this seems to be what they are saying. I’m not sure what the problem is with that.

What this study does not answer is whether the combo may have been 13% better. Or 12% better. Or some other smaller but still clinically relevant percentage better (however that might be defined or determined).

This is a common issue these days: estimates of control-arm event rates are excessively high, and estimates of effect size are way too large, resulting in underpowered studies. Does that mean potentially useful (but less useful than hoped) therapies are being abandoned? Quite possibly. But I’m not sure what the alternative is. As you say, much larger studies may be ethically dubious and financially prohibitive. Perhaps IRBs need to be more active in the design of the studies they review and approve, to somehow increase the likelihood that a real and relevant active-arm treatment effect will be found by the study as designed and proposed.
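To make the point concrete, here is a minimal sketch of a standard two-proportion sample-size calculation; the 21% control rate and the candidate effect sizes below are illustrative assumptions, not figures taken from the trial:

```python
from scipy.stats import norm

def n_per_arm(p_control: float, p_active: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate per-arm sample size for comparing two proportions (normal approximation)."""
    z_alpha = norm.ppf(1 - alpha / 2)
    z_beta = norm.ppf(power)
    variance = p_control * (1 - p_control) + p_active * (1 - p_active)
    return int((z_alpha + z_beta) ** 2 * variance / (p_control - p_active) ** 2) + 1

# The smaller the true benefit, the faster the required enrollment grows
for assumed_arr in (0.10, 0.07, 0.04):
    n = n_per_arm(p_control=0.21, p_active=0.21 - assumed_arr)
    print(f"assumed ARR of {assumed_arr:.0%}: roughly {n} patients per arm")
```

If designers bank on the largest of those reductions but the true benefit is closer to the smallest, a trial sized from the first line of output has little chance of detecting it.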


"It’s like determining whether a coin is fair with only 10 flips."

No need to figure out if it is a fair coin.

Von Neumann gave a simple solution: flip the coin twice. If it comes up heads followed by tails, then call the outcome HEAD. If it comes up tails followed by heads, then call the outcome TAIL. Otherwise (i.e., two heads or two tails occurred), repeat the process.
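A quick simulation of the trick; the 70% heads bias below is an arbitrary assumption, chosen just to show that the extracted outcomes still come out about 50/50:

```python
import random

def biased_flip(p_heads: float) -> str:
    """One flip of a coin with an arbitrary, unknown-to-the-player bias."""
    return "H" if random.random() < p_heads else "T"

def von_neumann_outcome(p_heads: float) -> str:
    """Flip twice: HT -> HEAD, TH -> TAIL, anything else -> flip again."""
    while True:
        first, second = biased_flip(p_heads), biased_flip(p_heads)
        if first != second:
            return "HEAD" if first == "H" else "TAIL"

outcomes = [von_neumann_outcome(p_heads=0.7) for _ in range(100_000)]
print(outcomes.count("HEAD") / len(outcomes))   # hovers around 0.5 despite the biased coin
```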


This sort of frequentist analysis is fundamentally flawed from the get-go. Any study can only determine P(D|H), the probability of getting data D given hypothesis H. What you really want to know is P(H|D), the probability of the hypothesis given the data. But you can get from P(D|H) to P(H|D) only if you also have a prior P(H) on your set of hypotheses. Then you can use Bayes' Theorem, P(H|D) = P(D|H)P(H)/P(D) (where P(D) is a normalization constant that does not need to be known separately), to get P(H|D).

Any mental translation of a confidence interval on the DATA to a confidence interval on the HYPOTHESIS is completely illegal in frequentist statistics, even though every human being without exception mentally does this. If you do it (and you do), you have inadvertently introduced some prior, and you don't know what it is. Far better to use Bayesian analysis with specified priors. But alas, medicine has been completely conquered by the frequentists.
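A toy illustration of the distinction, with a discrete prior over two hypotheses; every number here is invented purely to show the mechanics of Bayes' Theorem:

```python
# Analyst-chosen prior over the hypotheses: P(H)
prior = {"no_effect": 0.5, "real_effect": 0.5}

# Probability of the observed data under each hypothesis: P(D|H)
# (these would come from a statistical model of the trial; values here are made up)
likelihood = {"no_effect": 0.15, "real_effect": 0.40}

# Bayes' Theorem: P(H|D) = P(D|H) P(H) / P(D), with P(D) as the normalizer
p_data = sum(likelihood[h] * prior[h] for h in prior)
posterior = {h: likelihood[h] * prior[h] / p_data for h in prior}

print(posterior)   # the quantity a reader actually wants: P(H|D)
```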


I don’t see this as implicating RCTs, as your final paragraph implies. The case you are making is for RCT designers to err on the side of confidence in case their pre-study assumptions are wrong.


Thank you for looking at this. I had seen it on Twitter and had forwarded it to Sensible Medicine.

There were people on Twitter who imagined that it just needed more power.

No. It just doesn't work.

"Statistical" thinking and understanding variation does not come naturally to most human beings. Physicians are not immune from this natural deficit.

I also think there is a linguistic problem.

"At 60 days, 17.3% of those in the active arm died vs 21.3% in the placebo arm. That is an absolute risk reduction of 4%. The relative risk reduction in death equaled 33%."

These last two sentences should never be written. You should NEVER write about ARR or RRR or a hazard ratio until AFTER you decide (with risk of error) whether there is anything meaningful to consider.

To write about absolute or relative risk in advance of deciding that there is something meaningful to consider creates an implicit bias.

ARR, RRR, and HR must be operationalized to have meaning if and only if there is a meaningful reason (with risk of error) to consider that an assignable cause is at work.

The default to 95% confidence intervals is also really leading us astray. We should be using 3 standard deviations: 99.7%.

I hate to be harsh, but medicine hides its lack of understanding of really fine-tuned biological causal mechanisms in the space between 2 and 3 standard deviations. Failure to know that one doesn't know, plus hopefulness, plus hubris, plus lack of real statistical thinking, leads to clinical, social, and cultural iatrogenesis. Parrhesia.

Here is a thought: do, or simulate, Deming's Red Bead experiment. Then do it again with the UCL and LCL calculated at +/- 2 standard deviations.
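A rough sketch of that exercise; the 20% red beads, scoops of 50, and 24 draws are the usual classroom values for the Red Bead experiment, not anything taken from the trial:

```python
import random
import statistics

def red_bead_draws(n_draws: int = 24, scoop: int = 50, p_red: float = 0.20) -> list[int]:
    """Each 'worker' scoops 50 beads from a bowl that is 20% red; count the red ones."""
    return [sum(random.random() < p_red for _ in range(scoop)) for _ in range(n_draws)]

def control_limits(counts: list[int], scoop: int, k: float) -> tuple[float, float]:
    """np-chart lower/upper control limits at +/- k standard deviations around the mean."""
    p_bar = statistics.mean(counts) / scoop
    center = scoop * p_bar
    sd = (scoop * p_bar * (1 - p_bar)) ** 0.5
    return center - k * sd, center + k * sd

counts = red_bead_draws()
for k in (3, 2):
    lcl, ucl = control_limits(counts, scoop=50, k=k)
    flagged = sum(c < lcl or c > ucl for c in counts)
    print(f"+/-{k} SD limits: ({lcl:.1f}, {ucl:.1f}); draws flagged as special cause: {flagged}")
```

With purely common-cause variation in the bowl, the tighter +/- 2 SD limits will flag more draws as if something assignable had happened.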


Trying my first comment on Substack. Yikes.

Though the paper describes the power calculation in terms of absolute risk reduction, I think it's more informative to note that they were powering the study for a 50% relative risk reduction. 50% RRR is fairly standard in trials, no?

The fixation on that magic 2-fold risk reduction is a scientific cultural phenomenon, and it means we are stuck looking for silver bullets.


Nice article, John. You say that the journal's interpretation is technically correct. I do not believe that to be the case. IMHO it is grossly inaccurate due to a massive misunderstanding of p-values and hypothesis tests. Sir Ronald Fisher himself said that a large p-value means "get more data". Any journal wanting to say that there is evidence that a treatment doesn't work should be forced to base that on a Bayesian probability that the treatment effect is clinically negligible. For this study you'd find such a probability to be near 0.5, so we haven't a clue about the ineffectiveness of the therapy.
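One way to make that kind of statement concrete is a beta-binomial posterior for each arm. The sketch below uses the event rates quoted in the post, but the arm sizes and the "clinically negligible" threshold (risk ratio above 0.9) are hypothetical assumptions, so it will not reproduce the 0.5 figure exactly:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical arm sizes; event rates match the 17.3% vs 21.3% quoted in the post
n_active, n_placebo = 260, 260
deaths_active = round(0.173 * n_active)
deaths_placebo = round(0.213 * n_placebo)

# Flat Beta(1, 1) priors on each arm's mortality risk give Beta posteriors
p_active = rng.beta(1 + deaths_active, 1 + n_active - deaths_active, 100_000)
p_placebo = rng.beta(1 + deaths_placebo, 1 + n_placebo - deaths_placebo, 100_000)

risk_ratio = p_active / p_placebo

print("P(any benefit, RR < 1)        :", np.mean(risk_ratio < 1.0))
print("P(negligible effect, RR > 0.9):", np.mean(risk_ratio > 0.9))
```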


Maybe they designed the trial to fail, and that way be able to (wrongly) conclude that the cheap and safe treatment doesn't work. I would not be surprised if a very expensive alternative were coming soon.


It has been my experience that most physicians today read only the title and abstract. Few read the conclusions, and nobody (but us old guys) reads the entire paper, including references. It’s embarrassing listening to physicians quote papers that do not substantiate their points. Yet I have no problem pointing out their inadequacies and getting a chuckle.


So it seems you shouldn't try to estimate the sample size until you know the likely outcome in the placebo arm?


Great article! It shows just how manipulative big pharma has been in duping the public re the vaxxes!
