Oh! I did not notice it was a truncated trial. In an RCT, capturing a sufficient number of events is mandatory in order to reach a decisive conclusion. If a sufficient number of events is not captured, the trial should be labeled an insufficient trial, not proof of the null hypothesis. Both Type 1 and Type 2 errors are plentiful in truncated trials. Truncated trials could be considered if a sufficient number of events is captured, albeit with a possibility of Type 1 error.

I agree with you. My take-home message from the trial is that aspirin is not an innocent drug. Beware of intracranial bleeds with aspirin, particularly in susceptible individuals such as poorly controlled hypertensives. Secondly, in medicine, no mechanistic explanation or theory, no matter how strong, should ever be implemented in clinical practice unless proven in meticulously designed and conducted clinical trials.

Unfortunately, this praise is not justified.

“hazard ratio, 1.00 [95% CI, 0.64-1.55].”

So the sample size was too small. One treatment could still be much better than the other, but an underpowered test was not able to tell.

My sophomore-level calculation: sigma = (n p (1-p))^0.5 = (1000 × 0.04 × 0.96)^0.5 ≈ 6.2.

So if the true incidence of stroke were 4% in each arm (40 out of 1000), the observed number would usually be between 28 and 52. If one arm by chance were +1.5 sigma and the other −1.5 sigma, the observed hazard ratio would be 49/31 ≈ 1.58.

So an observed hazard ratio of 1.0 means very little - the true hazard ratio might be above 1.5 or below 1/1.5 ≈ 0.67. [And the correctly calculated limits are 0.64 to 1.55.]
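The back-of-the-envelope arithmetic above can be checked in a few lines. Note the 4% incidence and 1000-patient arms are the commenter's illustrative figures, not data from the trial:

```python
import math

n, p = 1000, 0.04                    # patients per arm, assumed true stroke incidence
expected = n * p                     # 40 expected events per arm
sigma = math.sqrt(n * p * (1 - p))   # binomial standard deviation, about 6.2

# If one arm by chance lands +1.5 sigma high and the other -1.5 sigma low:
up = round(expected + 1.5 * sigma)   # about 49 observed events
down = round(expected - 1.5 * sigma) # about 31 observed events
ratio = up / down                    # apparent hazard ratio about 1.58
print(round(sigma, 1), up, down, round(ratio, 2))
```

Pure chance alone can thus manufacture an apparent hazard ratio near 1.6 at this sample size, which is why the reported CI stretches from 0.64 to 1.55.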

Conclusion of this back-of-the-envelope calculation: the trial of 1000 patients over 2 years was far too small/too short to detect even a major difference between apixaban versus aspirin. The outcome tells us almost nothing.

See Prof. Hartzell’s comment for a more thorough and accurate version of this reasoning. I wanted to check if an intuitive analysis gives the same answer - and it does.

Unless I am misreading the article this study just showed that a particular anticoagulant drug was no better than low dose aspirin in the prevention of recurrent strokes in patients who had experienced cryptogenic (translation: unknown cause) stroke in the past and exhibited left atrial abnormalities (basically just signs of left atrial enlargement) without evidence of known atrial fibrillation. It cites two prior studies comparing other anticoagulant drugs and aspirin with near identical results. Personally, I would like to see controls on no anti-platelet or anticoagulant therapy. I know there have been such studies in the past and I can recall thinking at the time that the percentage benefit was pretty small and essentially canceled out by the negative effects---primarily major bleeding. Perhaps that is why we no longer see control groups on no drugs.

Throughout my medical career I have had doubts about the concept of thrombotic emboli originating in the left atrium. The word "stasis" is often used, giving the impression that clots will form as they would in a pool of blood outside the body. But anyone who has watched contrast media pass through the chambers of the heart and out through the aorta would have to have some doubts about this model. Why should the pathophysiology of infarcts in the brain be fundamentally different from those in myocardial infarction? That is, presumed occlusion brought about by disruption of the endothelium by an atherosclerotic plaque. If the left atrium can build up clots and then release them as emboli, why don't we see similar "embolic" phenomena more often in the upper and lower extremities? The message may be that it is difficult to provide reliable therapy for a process or processes about which we have guesses but no certain knowledge.

Absence of evidence is not evidence of absence strikes again. There is a reason that so many cardiovascular trials that use a lowest-information binary endpoint require 6,000-10,000 patients: they need to have 600 events to be able to nail down the treatment benefit. A study with 80 events on the lowest-power binary endpoint of recurrent stroke, which ignores stroke severity, hospitalization required, re-recurrence of stroke, etc., is at a tremendous disadvantage from the start.
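To see where event counts in the hundreds come from, here is a sketch using Schoenfeld's approximation for the number of events a logrank test needs; the hazard ratio of 0.80 is an assumed example, not a figure from this trial:

```python
import math
from statistics import NormalDist

def events_needed(hr, alpha=0.05, power=0.80):
    """Schoenfeld's approximation for a 1:1 two-arm logrank test:
    total events needed to detect hazard ratio `hr`."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96
    z_b = NormalDist().inv_cdf(power)          # about 0.84
    return 4 * (z_a + z_b) ** 2 / math.log(hr) ** 2

# A modest 20% rate reduction (assumed HR 0.80) already demands
# events on the order of several hundred:
print(round(events_needed(0.80)))
```

At 80% power and a two-sided alpha of 0.05, detecting an assumed HR of 0.80 takes roughly 630 events, which puts a trial with 80 events in perspective.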

The [0.64, 1.55] 0.95 confidence interval for the hazard ratio means that the data are consistent with up to a 36% reduction or a 55% increase in instantaneous rate of recurrent stroke with anticoagulant. The information is likely sufficient for deciding to stop for futility, i.e., to be fairly certain that going to planned completion at the planned inadequate sample size would not lead to success, if success were defined as a demonstration of efficacy through statistical "significance". That is a far cry from concluding that there is evidence for lack of efficacy.

The study had a pre-specified margin of 0.6 for the efficacy threshold. In other words, the investigators in their collective wisdom are saying that a 39% reduction in the instantaneous risk of recurrent stroke would be clinically irrelevant. If 0.6 were widely accepted by disinterested experts and patients, then perhaps we know enough. But we need a more direct analysis to quantify the evidence: the Bayesian posterior probability that the hazard ratio is greater than x, where x is the minimum clinically effective treatment effect as specified by external experts/patients.

This study is one of countless examples where keeping the study going until sufficient evidence is collected for a conclusion about the effect size, running the trial as a Bayesian sequential design with no planned sample size, would have resulted in far more useful information. One could even do a Bayesian futility analysis to possibly stop earlier than a traditional frequentist futility analysis, e.g., when Pr(HR > 0.9) > 0.95, where 0.9 is replaced with whatever clinical threshold is relevant. This design would allow one to conclude that the treatment didn't work, unlike stopping for futility about H0: HR = 1 with a wide confidence interval.
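A minimal sketch of that posterior-probability calculation, assuming a flat prior and a normal approximation to the log hazard ratio reconstructed from the reported confidence interval (the 0.9 threshold is illustrative, as above):

```python
import math

def normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

# Reported trial results: HR 1.00 with 95% CI [0.64, 1.55]
hr, lo, hi = 1.00, 0.64, 1.55
mu = math.log(hr)                                # approximate posterior mean of log(HR)
sd = (math.log(hi) - math.log(lo)) / (2 * 1.96)  # sd recovered from the CI width

threshold = 0.9                                  # illustrative clinical threshold
p_futile = 1 - normal_cdf((math.log(threshold) - mu) / sd)  # Pr(HR > 0.9 | data)
print(round(p_futile, 2))
```

With these numbers Pr(HR > 0.9) comes out near 0.68, well short of the 0.95 the futility rule above would require, which illustrates how far the data are from establishing lack of efficacy.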

This JAMA piece is truly interesting. Was quite intrigued by the zero intracranial bleeds with apixaban vs 7 with aspirin. (The authors do concede that this may be a chance outcome.)

They also mention that the all-cause mortality data presented a noticeably different picture for the two groups, with 12 deaths in the apixaban group and 8 among patients receiving aspirin.

"The secondary safety outcome of all-cause mortality occurred in 12 patients receiving apixaban (annualized rate, 1.8%) and 8 patients receiving aspirin (annualized rate, 1.2%) (HR, 1.53 [95% CI, 0.63-3.75])."

Would have appreciated a deeper look at both of these safety outcomes.

It is fantastic that their team saw it through, from biologic plausibility, to hypothesis, to clinical trial.

I can’t help but think that, had the study been “positive”, it would be in NEJM and not JAMA. It’s maybe not “publication bias”, but perhaps “positivity bias”.

I do hope that they test their theory further with an alternate metric for atrial cardiopathy, such as with left atrial reservoir strain.

'Negative' data is STILL data... when did we forget this?
