Rescuing a Non-Significant Trial with a Meta-Analysis

The Study of the Week shows you how an industry-sponsored trial that failed to deliver a win for its drug can be saved. This is not a good thing.

Sep 22, 2025


Sodium–glucose cotransporter 2 inhibitors (SGLT2i) have been shown to improve outcomes in patients with heart failure (HF), mostly in those with reduced systolic function. They are the fourth of four drug classes for patients with HF; the other three are renin-angiotensin inhibitors, beta-blockers and mineralocorticoid receptor antagonists.

SGLT2i are usually the last drug to be added and are often omitted. The lower usage of SGLT2i relative to the other HF drugs probably has three causes: a) they are the newest class and adoption is slow in medicine, b) they cost more, and c) their effect size is perceived as smaller than that of the other drugs. (Dapagliflozin in DAPA HF showed a strong signal of lower CV death and all-cause death, but empagliflozin in EMPEROR REDUCED failed to show a reduction in CV death or all-cause death.)

One way to sell more product is to induce doctors to start your drug early while the patient is still hospitalized. The idea: get the medicine on the discharge sheet and it’s more likely to stick. But in the seminal DAPA HF trial, patients went through a 2-week screening period before enrollment, and they were enrolled as outpatients.

The goal of the DAPA ACT HF-TIMI 68 trial was to test the efficacy and safety of in-hospital initiation of dapagliflozin in patients with HF.

Trialists randomized about 2400 patients to either dapa or placebo.

The primary efficacy outcome of either CV death or worsening HF at 2 months occurred in 10.9% in the dapagliflozin group vs 12.7% in the placebo group (hazard ratio [HR], 0.86; 95% CI 0.68-1.08; p=0.20).

The individual components of the primary endpoint, worsening HF and CV death (CVD), were also non-significant. All-cause death had an HR of 0.66 (95% CI, 0.43-1.00).

Safety events, including low blood pressure and worsening kidney function, were slightly higher in the dapa arm. Adverse events leading to study drug discontinuation were 3.6% vs 2.2%, dapa vs placebo.

Let’s pause there and see what you would conclude:

You might conclude that despite randomizing more than 1200 patients per group, of which about 1 in 10 had a primary outcome event, there was no significant difference in the primary endpoint.

You might also note the uncertainty in the effect size. The 14% reduction in CVD and HF events may be clinically meaningful, and the lower bound of the 95% confidence interval allows for a 32% reduction. (But the upper bound also allows for dapa being 8% worse.)
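As a rough consistency check on these numbers, the reported HR and 95% CI can be converted back to a standard error on the log scale, and from there to an approximate two-sided p-value. This is only a minimal sketch: it assumes log(HR) is roughly normal and that the interval is a symmetric Wald interval, the function name is made up for illustration, and it is not necessarily how the trialists computed their p-value.

```python
import math

def wald_summary(hr, lo, hi):
    """Approximate log-scale SE, z and two-sided p from an HR and its 95% CI.

    Assumption: log(HR) is approximately normal and the CI is a
    symmetric Wald interval (estimate +/- 1.96 * SE on the log scale).
    """
    se = (math.log(hi) - math.log(lo)) / (2 * 1.96)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return se, z, p

# Primary endpoint as reported in the text: HR 0.86, 95% CI 0.68-1.08
se, z, p = wald_summary(0.86, 0.68, 1.08)
print(f"SE(log HR) = {se:.3f}, z = {z:.2f}, p = {p:.2f}")

# The CI bounds expressed as a percent change in hazard
best_case = (1 - 0.68) * 100   # reduction at the lower bound
worst_case = (1.08 - 1) * 100  # increase at the upper bound
print(f"Compatible with {best_case:.0f}% lower to {worst_case:.0f}% higher hazard")
```

Under these assumptions the back-calculated p-value lands close to the reported 0.20, which suggests the HR, interval, and p-value in the paper are internally consistent.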

Nonetheless, by all standards, this was a non-significant trial. We cannot declare early initiation of dapagliflozin is effective.

This is not—exactly—what the authors concluded.

They added a bonus meta-analysis in which this trial’s non-significant results were combined with a subgroup of patients from one trial and with another non-significant trial. This mixture yielded a 29% reduction in the composite endpoint of CVD or HF events.

It allowed the authors to add this line to the conclusions:

However, the totality of randomized clinical trial data suggests that in-hospital initiation of SGLT2i may reduce the early risk of cardiovascular death or worsening HF and of all-cause mortality.

The two added trials: the SOLOIST trial of sotagliflozin had a subgroup of patients who were started on the drug in-hospital, and the EMPULSE trial tested in-hospital initiation of empagliflozin. Both the SOLOIST and EMPULSE groups were smaller than the DAPA ACT HF-TIMI 68 trial.

The authors justify this move by declaring the meta-analysis as prespecified in the protocol. But I am not sure it was used exactly as it was prespecified. In the trial rationale paper they write:

A trial-level meta-analysis of data from published trials of SGLT2is in patients with acute HF will be used as an informative prior distribution.

The final phrase, "used as an informative prior distribution," implies that they would use the meta-analysis to inform a prior probability distribution for a Bayesian analysis. This was not done in the main paper. Instead, the authors simply pooled the trials to get a combined effect. This is not a Bayesian analysis at all.
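To make the distinction concrete, here is a minimal sketch of what using earlier trial data as an informative prior could look like under a simple normal-normal model on the log HR scale. The prior mean and SD below are hypothetical placeholders, not values taken from SOLOIST, EMPULSE, or the protocol, and the `weight` parameter (a crude way to discount the prior) is an illustrative assumption. The trial's actual Bayesian plan may have been specified quite differently.

```python
import math

def posterior_log_hr(prior_mean, prior_sd, est, se, weight=1.0):
    """Conjugate normal update on the log-HR scale.

    weight in (0, 1] scales the prior precision; weight=1.0 uses the
    prior at full strength, smaller values discount it.
    """
    prior_prec = weight / prior_sd**2
    data_prec = 1 / se**2
    post_prec = prior_prec + data_prec
    post_mean = (prior_prec * prior_mean + data_prec * est) / post_prec
    return post_mean, math.sqrt(1 / post_prec)

def prob_benefit(mean, sd):
    """Posterior probability that HR < 1, i.e. log HR < 0."""
    return 0.5 * math.erfc(mean / (sd * math.sqrt(2)))

# Trial result from the text (HR 0.86, 95% CI 0.68-1.08), with the SE
# back-calculated under a Wald-interval assumption.
est = math.log(0.86)
se = (math.log(1.08) - math.log(0.68)) / (2 * 1.96)

# HYPOTHETICAL prior, for illustration only: centered on HR 0.80.
prior_mean, prior_sd = math.log(0.80), 0.20

for w in (1.0, 0.5):
    m, s = posterior_log_hr(prior_mean, prior_sd, est, se, weight=w)
    lo, hi = math.exp(m - 1.96 * s), math.exp(m + 1.96 * s)
    print(f"weight={w}: posterior HR {math.exp(m):.2f} "
          f"(95% CrI {lo:.2f}-{hi:.2f}), P(HR<1) = {prob_benefit(m, s):.2f}")
```

With `weight=1.0` the posterior mean coincides with a fixed-effect inverse-variance pooled estimate of the prior and the trial; lowering `weight` pulls the result back toward the trial's own estimate. A Bayesian report would also state the prior explicitly and present a posterior probability rather than a pooled p-value.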

That makes me wonder about things. Things I’d rather not think about.

The other problem with this approach is that it is spin. Spin is defined as language that distracts from a non-significant primary. Adding the meta-analysis takes a reader’s attention away from the non-significant primary.

Trials should stand alone. You can meta-analyze underpowered trials in a separate paper. There is a great debate in evidence circles about whether users of evidence should put more weight on single trials or on combinations of trials.

Here, I see a large, well-conducted trial with null results and more safety events in the dapa arm. I put more weight on that than on the post-hoc combination.

A side note: ignore the all-cause death signal because it is noise. If SGLT2i were to reduce death, it would have to be via reduction of CVD or HF. What’s more, there are no previous data suggesting SGLT2i save lives in the first two months.

Another issue was that the two other trials combined in the meta-analysis were substantially different from the main trial. SOLOIST studied a different drug and included patients with diabetes. EMPULSE also studied a different drug and had different endpoints. Combining trials with different procedures is not ideal.

Take-Home
