The Study of the Week Is a Beautiful Example of Science Done Well
Dubious studies are instructive. But so are well-done studies. Let's look today at good science.
This post brought serious criticism. I learned a bunch from it. Please do read the comments, and I wrote a follow-up post here. JMM
Academic medicine sometimes gets it right. This is a positive story about a negative trial.
Neurologist Hooman Kamel from the Weill Cornell Medical Center in NY had an idea about atrial fibrillation and stroke.
Old thinking held that clots formed in the left atrium during periods of irregular rapid fibrillatory activity. Stroke came when these clots moved northward to the brain. The main problem was the aberrant rhythm.
One of the big issues with this theory was that studies had failed to show a strong relationship in time between the stroke and the irregular rhythm. (For purists, it failed on Bradford Hill’s “temporality” criterion of causation.)
Kamel’s new idea about AF and stroke was that the main problem was primary atrial disease. Dilation, fibrosis, and poor contractile function in the left atrium all created a milieu in which clots could form.
The AF episodes we see on an ECG were simply manifestations of atrial disease. He called it an atrial cardiopathy. He explained this in a tremendous paper published in the journal Stroke. I have read that piece many times. It made great sense.
This picture shows the general idea. You can see that it is the abnormal substrate that leads to stroke. AF is partly a bystander.
Kamel’s next step makes me tingle with delight. He did not go on a speakers’ circuit promoting diagnosis and treatment of atrial cardiopathy. Instead, he designed a randomized controlled trial to test this idea.
The Trial
The journal JAMA published the ARCADIA trial last week. Kamel and a team of researchers at numerous hospitals randomized ≈1000 patients who had had a stroke of unknown cause and evidence of atrial cardiopathy to treatment with either an oral anticoagulant (apixaban) or aspirin.
(If the atrial cardiopathy theory were correct, the anticoagulant should beat the anti-platelet drug aspirin.)
They defined atrial cardiopathy using easy clinical criteria — the ECG appearance of p-waves in lead V1, an abnormally high biomarker called BNP, or a large left atrial diameter by echocardiogram. None of these patients had AF at the time of study entry.
Their primary endpoint was simply recurrent stroke. The primary safety endpoint was brain bleeding and other major bleeding.
Clear Results
After nearly 2 years, recurrent stroke occurred in 40 patients in the apixaban group (annualized rate, 4.4%) and 40 patients in the aspirin group (annualized rate, 4.4%) (hazard ratio, 1.00 [95% CI, 0.64-1.55]).
Intracranial bleeds (the worst kind of bleeding) occurred in 0 patients in the apixaban group and 7 in the aspirin group. Other major bleeds occurred in 5 patients in both arms.
So clear was the ARCADIA trial that it had to be stopped early for futility.
During the trial, about 15% of patients developed new AF. Looking specifically at that group, the authors found no benefit of apixaban over aspirin.
The authors concluded that in patients who had stroke of unknown cause and evidence of atrial disease without AF, oral anticoagulation with apixaban did not reduce the risk of recurrent stroke over aspirin.
Comments:
Let’s do the specifics first. It is common to not find an obvious source for stroke. ARCADIA teaches us that there is no reason to put these patients on oral anticoagulation, even when there is evidence of atrial disease.
This data also aligns well with two previous negative trials that tested oral anticoagulation to prevent recurrent stroke in patients who had stroke of unknown origin. One was called NAVIGATE ESUS and the other was called RESPECT ESUS.
The reasons anticoagulation failed in ARCADIA are speculative. Perhaps Kamel and colleagues needed stricter criteria to enroll patients with sicker atria. Perhaps stroke patients have too many competing causes of stroke, say from diseased blood vessels in the head.
The last specific finding to highlight is the disparate rates of brain bleeds. Zero for the oral anticoagulant vs 7 in the aspirin group. Sure, this could be noise, like throwing 7 heads in a row. But that seems unlikely.
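To put a rough number on the coin-flip intuition, here is a small sketch in Python. It assumes the two arms were the same size and that each brain bleed was equally likely to land in either arm. Both are simplifying assumptions for illustration, not figures from the paper, and the function name is made up.

```python
def prob_all_events_in_one_arm(n_events: int, p_arm: float = 0.5) -> float:
    """Chance that every one of n_events falls in a single pre-named arm.

    Assumes each event independently lands in that arm with probability
    p_arm (0.5 = equal-sized arms with equal risk; an assumption).
    """
    return p_arm ** n_events


n_bleeds = 7
# All 7 in the aspirin arm specifically (7 heads in a row).
one_sided = prob_all_events_in_one_arm(n_bleeds)
# All 7 in the same arm, whichever it is (7 heads or 7 tails).
two_sided = 2 * one_sided

print(f"All {n_bleeds} in the aspirin arm: {one_sided:.4f}")
print(f"All {n_bleeds} in the same arm, either one: {two_sided:.4f}")
```

Under those assumptions the chance is a bit under 1% (about 1.6% if you count either arm), which is the sense in which noise "seems unlikely."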
We have strong data from a previous trial, called AVERROES, showing that bleeding rates with apixaban were not worse than aspirin. ARCADIA further confirms the notion that aspirin may not be the “safer” anti-thrombotic drug.
The Most Important Lesson
Professor Kamel has shown us science done well. He and his team had a great idea. It made sense. It explained some inconsistencies in the previous theories.
Yet when it was tested in a randomized controlled trial, the idea did not pan out.
To me, this was not a failed effort. It taught us all a) how science is supposed to work, and b) that there is more work to do in preventing recurrent strokes.
Congratulations to Professor Kamel and his team.
"Absence of evidence is not evidence of absence" strikes again. There is a reason that so many cardiovascular trials that use a lowest-information binary endpoint require 6,000-10,000 patients: they need to have 600 events to be able to nail down the treatment benefit. A study with 80 events on the lowest-power binary endpoint of recurrent stroke, which ignores stroke severity, hospitalization required, re-recurrence of stroke, etc., is at a tremendous disadvantage from the start.
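One way to see where event counts of this size come from is Schoenfeld's approximation for a time-to-event comparison with 1:1 allocation. The sketch below is only an illustration under that approximation; the choice of 80% power, a two-sided alpha of 0.05, and the function name are assumptions, not the calculation used by the trial or by the comment above.

```python
from math import ceil, log
from statistics import NormalDist


def events_needed(hazard_ratio: float, power: float = 0.8, alpha: float = 0.05) -> int:
    """Approximate total events required (Schoenfeld formula, 1:1 allocation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided type I error
    z_power = z.inv_cdf(power)          # desired power
    return ceil(4 * (z_alpha + z_power) ** 2 / log(hazard_ratio) ** 2)


# Hypothetical true effects, from large (0.6) to modest (0.8).
for hr in (0.6, 0.7, 0.8):
    print(f"HR {hr}: about {events_needed(hr)} events for 80% power")
```

With these inputs a modest true effect (HR 0.8) needs on the order of 600 events, and even a large one (HR 0.6) needs more than the 80 events observed here.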
The [0.64, 1.55] 0.95 confidence interval for the hazard ratio means that the data are consistent with up to a 36% reduction or a 55% increase in instantaneous rate of recurrent stroke with anticoagulant. The information is likely sufficient for deciding to stop for futility, i.e., to be fairly certain that going to planned completion at the planned inadequate sample size would not lead to success, if success were defined as a demonstration of efficacy through statistical "significance". That is a far cry from concluding that there is evidence for lack of efficacy.
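The arithmetic behind that reading of the interval can be sketched as follows. The 1.96 multiplier and the sqrt(4 / events) rule of thumb for the standard error of log(HR) are standard approximations brought in for illustration; they are not taken from the paper.

```python
from math import log, sqrt

hr_lower, hr_upper = 0.64, 1.55  # reported 95% CI for the hazard ratio
total_events = 40 + 40           # recurrent strokes across both arms

# Read the limits as percent changes in the hazard.
pct_reduction = (1 - hr_lower) * 100
pct_increase = (hr_upper - 1) * 100

# Back out the standard error of log(HR) from the width of the interval.
se_from_ci = (log(hr_upper) - log(hr_lower)) / (2 * 1.96)
# Rule of thumb for a 1:1 trial: SE of log(HR) is roughly sqrt(4 / events).
se_from_events = sqrt(4 / total_events)

print(f"Up to a {pct_reduction:.0f}% reduction or a {pct_increase:.0f}% increase")
print(f"SE of log(HR): {se_from_ci:.3f} from the CI, {se_from_events:.3f} from the event count")
```

The two standard errors agree closely, which is a reminder that the width of the interval is driven almost entirely by the number of events, not the number of patients.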
The study had a pre-specified margin of 0.6 for an efficacy threshold. In other words, the investigators in their collective wisdom are saying that a 39% reduction in instantaneous risk of recurrent stroke should have been clinically irrelevant. If 0.6 were to be widely accepted by disinterested experts and patients, then we perhaps know enough. But we need a more direct analysis to quantify the evidence. We need the Bayesian posterior probability that the hazard ratio is greater than x, where x is the minimum clinically effective treatment effect as specified by external experts/patients.
This study is one of countless examples where keeping the study going until sufficient evidence is collected for a conclusion about the effect size, and running the trial as a Bayesian sequential design with no planned sample size, would have resulted in far more useful information. One could even do a Bayesian futility analysis to possibly stop earlier than a traditional frequentist futility analysis, e.g., when Pr(HR > 0.9) > 0.95, where 0.9 is replaced with whatever clinical threshold is relevant. This design would allow one to conclude that the treatment didn't work, unlike stopping for futility about H0: HR = 1 with a wide confidence interval.
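As a rough illustration of what such a posterior probability looks like with the published summary numbers, the sketch below treats log(HR) as approximately normal and uses a flat prior. Both are simplifying assumptions, and the function name is hypothetical; a real Bayesian analysis would work from patient-level data with a carefully chosen prior.

```python
from math import log
from statistics import NormalDist


def prob_hr_greater_than(threshold: float, hr_hat: float,
                         ci_low: float, ci_high: float) -> float:
    """Approximate posterior Pr(HR > threshold), flat prior on log(HR)."""
    # Standard error of log(HR) recovered from the 95% confidence limits.
    se = (log(ci_high) - log(ci_low)) / (2 * 1.96)
    # With a flat prior the posterior is centered on the observed log(HR).
    posterior = NormalDist(mu=log(hr_hat), sigma=se)
    return 1 - posterior.cdf(log(threshold))


p_09 = prob_hr_greater_than(0.9, hr_hat=1.00, ci_low=0.64, ci_high=1.55)
p_06 = prob_hr_greater_than(0.6, hr_hat=1.00, ci_low=0.64, ci_high=1.55)

print(f"Pr(HR > 0.9) = {p_09:.2f}")
print(f"Pr(HR > 0.6) = {p_06:.2f}")
print("Futility rule Pr(HR > 0.9) > 0.95 met:", p_09 > 0.95)
```

Under these assumptions the probability that the hazard ratio exceeds 0.9 is roughly two in three, well short of 0.95, while the probability that it exceeds the trial's 0.6 margin is close to 0.99. The conclusion depends heavily on which threshold is considered clinically relevant.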
'Negative' data is STILL data... when did we forget this?