Editor's note: The writer (me) inverted the results of one of the trials used as an example. This post now contains the proper results. Thanks to an astute reader.
This month, the NEJM published a study on an important medical question that may upend the way we use medical evidence.
This has nothing to do with the specific medical question. It has everything to do with how we interpret the results.
First some background.
Medical studies have been like running races. A drug or device beats the standard of care. Or it does not.
But. Unlike a foot race, wherein you can (nearly always) see who wins, the judge of medical studies is statistics.
Did the new treatment reduce the bad outcome so much that the difference meets a statistical threshold? Was the effect true signal and not noise?
This way of judging and declaring a winner creates a challenge for using these medical studies to treat patients.
Two examples explain the challenge of using statistics to judge science.
In large studies, a tiny difference in outcomes, one that is not “clinically” significant, can easily reach statistical significance.
For instance, in the FOURIER trial, the super-expensive PCSK9 inhibitor, evolocumab, reduced a primary composite outcome (CV death, MI, stroke, unstable angina, coronary revascularization) vs placebo.
The absolute risk decrease was just 1.5%, and it was driven by non-fatal outcomes. There was no difference in cardiovascular death or all-cause death. Yet, because trialists enrolled more than 27,000 patients, this tiny difference was highly statistically significant.
And the conclusions were that patients with heart disease benefit from this drug.
Contrast this with the THAPCA trial of cooling pediatric survivors of cardiac arrest.
In smaller studies, a large and potentially clinically significant difference may not reach statistical significance.
In THAPCA, trialists reported that meaningful survival at one year occurred in 20% of patients cooled to 33 degrees vs 12% of those kept at a normal body temperature of 36.8 degrees.
That 8-percentage-point improvement in survival did not reach statistical significance. The conclusion therefore read that therapeutic hypothermia “did not confer a significant benefit in survival with a good functional outcome at 1 year.”
This massive improvement in survival was declared not different because there were too few patients randomized.
The statistical test indicated that these results were not surprising enough under the (null hypothesis) assumption of no real difference. (I know; that is a wacky way to say that we cannot exclude the possibility of noise.)
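To see how much the verdict depends on enrollment rather than on the size of the effect, here is a rough sketch using a textbook pooled two-proportion z-test. The 20% vs 12% rates come from the THAPCA example above; everything else is an assumption for illustration only: the 10.0% vs 8.5% event rates (chosen just to give a 1.5-point gap), all of the arm sizes, and the choice of test, which is not necessarily what either trial used.

```python
import math

def two_proportion_p_value(rate_a, rate_b, n_per_arm):
    """Two-sided p-value from a pooled two-proportion z-test, equal arm sizes."""
    pooled = (rate_a + rate_b) / 2  # pooled event rate when arms are equal
    standard_error = math.sqrt(pooled * (1 - pooled) * 2 / n_per_arm)
    z = (rate_a - rate_b) / standard_error
    # Convert |z| to a two-sided p-value under the normal approximation.
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical 1.5-point gap (10.0% vs 8.5%), hypothetical arm sizes.
small_gap_big_trial = two_proportion_p_value(0.100, 0.085, 13_500)
small_gap_small_trial = two_proportion_p_value(0.100, 0.085, 500)

# 8-point gap (20% vs 12%, as in the text), hypothetical arm sizes.
big_gap_small_trial = two_proportion_p_value(0.20, 0.12, 150)
big_gap_big_trial = two_proportion_p_value(0.20, 0.12, 1_000)

for label, p in [
    ("1.5-point gap, 13,500 per arm", small_gap_big_trial),
    ("1.5-point gap, 500 per arm", small_gap_small_trial),
    ("8-point gap, 150 per arm", big_gap_small_trial),
    ("8-point gap, 1,000 per arm", big_gap_big_trial),
]:
    print(f"{label}: p = {p:.4g}")
```

Under these assumptions, the same small gap crosses the usual 0.05 threshold only in the big trial, and the same large gap misses it only in the small one. The effect did not change; the number of patients did.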
The trial that breaks this dogmatic way of judging science is called the ELAN trial.
Swiss-led investigators asked the question of when to start oral anticoagulation drugs after a patient has a stroke due to a blocked blood vessel in the brain. We call this acute ischemic stroke or AIS.
The two choices are early (within days to a week) or later (about 2 weeks). Current practice is to start later. So early initiation is the active arm.
The primary outcome is reasonable—recurrent stroke, blood clot elsewhere in the body (systemic embolism), bleeding outside the brain, bleeding in the brain or death due to blood vessel disease in the first 30 days. In other words—a composite of bad things that can occur from not treating (more clots) or treating (bleeding).
In multiple centers, slightly more than 2000 patients were randomized to either early or later starting of the anticoagulant drugs.
A primary outcome occurred in 2.9% of the patients in the early arm vs 4.1% in the later treatment arm.
The absolute risk reduction was 1.2 percentage points. In relative terms that is a reduction of about 30%, reported as an odds ratio of 0.70.
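For readers who want to check the arithmetic, here is a small sketch that derives these effect measures from the two reported percentages alone. The function name and the dictionary keys are mine, and because it works from rounded percentages rather than patient counts, the results are approximate.

```python
def effect_measures(risk_early, risk_later):
    """Effect measures for the early arm relative to the later arm."""
    absolute_risk_reduction = risk_later - risk_early
    relative_risk = risk_early / risk_later
    odds_early = risk_early / (1 - risk_early)
    odds_later = risk_later / (1 - risk_later)
    return {
        "arr": absolute_risk_reduction,
        "relative_risk": relative_risk,
        "rrr": 1 - relative_risk,  # relative risk reduction
        "odds_ratio": odds_early / odds_later,
    }

# Primary outcome: 2.9% in the early arm vs 4.1% in the later arm.
elan = effect_measures(0.029, 0.041)

print(f"Absolute risk reduction: {elan['arr'] * 100:.1f} percentage points")
print(f"Relative risk reduction: {elan['rrr'] * 100:.0f}%")
print(f"Odds ratio: {elan['odds_ratio']:.2f}")
```

When events are this rare, the odds ratio and the relative risk land very close together, which is why an odds ratio of 0.70 can be read loosely as a 30% relative reduction.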
The question for you is what to make of these results.
What I love about this study is that we don’t have to worry about dualities of interests. No one is making money from the results. It’s a matter of starting treatment early or later. It's a pure scientific question.
The ELAN authors provide an extremely provocative way to use these results.
Next Monday, on the Study-of-the-Week, I will write about this new approach. We will also discuss it on our podcast next weekend. Stay tuned—and think.
A brief word of thanks. We are shocked at the response this newsletter has achieved. We now measure views in the millions. Thank you x 1000.
I'm a family physician. An 86-year-old care home resident with moderate dementia whom I care for had a major fall leading to a severely swollen arm. She was admitted for 3 days while there was some debate as to whether it was a fracture or not; eventually it was decided not. During the admission she was found to be in atrial fibrillation, so she was discharged on a NOAC. The HAS-BLED score gives her only one point, i.e. age >65. No account was taken of advanced frailty and numerous falls, including the cause of the current admission.
To me, as an old-fashioned doc of 30 yrs, this clearly looked wrong. I felt obliged to speak to her son, and we agreed the NOAC should be stopped, as the potential benefit was tiny and the risk clearly significant.
An error on my part: I forgot to update the care home, and they continued the NOAC. 3 days later she developed intractable nose bleeding and had to attend Accident and Emergency. The hospital's action .... must continue NOAC but have 14 days of tranexamic acid....
Any clinicians on here, please tell me if I'm wrong, but I struggle to think of this as anything other than crass, unthinking tick-box medicine.
For ELAN, I see no major practical difference in composite poor outcomes: 29 out of 1000 vs. 41 out of 1000. That means 1930 out of 2000 (96.5%) had good outcomes with either treatment arm. What could matter, however, is if the incidence of death was overwhelming in either treatment arm.