Incentives and Noninferiority Designs
Noninferiority trials can play an important role in advancing medical care. Years ago, JAMA published a terrific Users' Guide on how to read them. There are times, however, when researchers employ a noninferiority design where a superiority design would seem to make more sense. There are also frequent debates about noninferiority margins and the endpoints used in these trials.
Today, Dr. Scott Matson considers the incentives that drive these decisions.
Adam Cifu
Incentives shape all human endeavors. Even with ethical and moral guardrails, this is as true in biomedical research as in any other field. In many cases, such as the collaborations between the for-profit pharmaceutical industry and academic medicine, there is a clear financial incentive for novel therapies to make it to market, where our system will funnel billions of dollars toward the new pill. Despite the conflict of interest inherent in this bargain, our system closes its eyes and barrels forward, assuming that there is enough confluence of interest that we get more GLP-1s and imatinibs than we get Entrestos and oseltamivirs.
The biomedical market is not tied to true market forces; still, it remains broadly true that a novel therapy that actually helps patients live better or live longer has an easier and more lasting path to riches than one that ekes out statistical significance via dubious surrogate outcomes and questionable trial design. In its current form, the path to market sits almost entirely within the domain of the pharmaceutical industry: intellectual property for novel therapies is either developed in the pre-clinical labs of large pharma companies or purchased from the academy by for-profit firms positioned to carry any financially important discovery through pre-clinical testing. These same companies front roughly 95% of the dollars spent on human clinical trials and are left to design the trials that test for clinical efficacy, where millions of dollars sit on one side of p = 0.05 while a shrug emoji sits on the other; hence the incentive.
Unfortunately, the academy is not immune to incentives either. In a world where academic clinical trialists have a very small pool of federally funded dollars over which to compete, it is imperative that these dollars be used judiciously to pursue clinically meaningful studies that ask questions the industry is not incentivized to explore. There are countless examples of these types of studies being done in novel and creative ways to generate highly generalizable data to inform clinical care. But academic investigations can be prone to their own categories of incentive-driven foibles.
One increasingly observed phenomenon in these studies is the gamification of endpoints and the non-inferiority margin. The noninferiority framework is most compelling when two therapeutic strategies are expected to produce similar clinical outcomes but differ meaningfully in cost, convenience, tolerability, or safety. In such cases, investigators pre-specify a margin of noninferiority — the largest difference in the primary outcome that patients and clinicians would deem acceptable in exchange for the potential advantages of the new approach.
Increasingly, noninferiority trials are designed with wide margins and with endpoints insensitive to the intervention’s plausible effects. Under these conditions, the probability of achieving noninferiority approaches inevitability. These features create a form of publishability bias that has received far less scrutiny than industry-associated bias.
Although noninferiority trials are often justified on pragmatic or ethical grounds, they offer investigators one powerful and underacknowledged advantage: a high probability of achieving statistical significance. Here it is not billions of dollars sitting on one side of the p = 0.05 fence but academic glory from publication in the highest-impact journals. Wide noninferiority margins paired with blunt endpoints make it far easier to demonstrate “success” than in superiority trials, where true differences may be small and the risk of statistical failure is substantial. When investigators must choose how to allocate limited federal funding, the tradeoff becomes stark: a superiority trial with a smaller sample size and a meaningful possibility of failing to reach significance, or a large noninferiority trial with a high likelihood of a publishable p-value.
Because publication in journals such as The New England Journal of Medicine or JAMA strongly influences academic promotion, grant competitiveness, and institutional prestige, the incentive to select trial designs with a greater probability of statistical success is immense. Thus, the noninferiority framework—intended to facilitate rigorous comparison—can become a mechanism by which academic research quietly optimizes for publishability rather than clinical discovery.
A recent example is the EVERDAC trial, which evaluated whether deferring arterial catheterization in shock is noninferior to routine early placement with respect to 28-day mortality. Arterial lines are used to support precise vasopressor titration, timely detection of hypotension, and safe arterial sampling—not to improve short-term survival. Any mortality effect attributable to their presence would therefore be small. Yet the trial used a 5-percentage-point noninferiority margin for mortality, a tolerance far larger than the physiologic effect size one could reasonably expect. Under such a design, noninferiority is not a surprising outcome; it is the default one. Indeed, if a blood-pressure–monitoring strategy were truly capable of producing a 3% or 4% absolute difference in mortality from shock, that would represent a transformative advance in critical care, not an acceptable decrement.
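The arithmetic behind this critique can be made concrete. The sketch below is a back-of-envelope power calculation under assumed numbers, not the trial's actual statistical analysis plan: a hypothetical 30% baseline 28-day mortality, a 5-percentage-point margin, one-sided α = 0.025, and a normal approximation throughout. It shows that when the true difference between arms is zero, a wide-margin noninferiority trial of modest size is very likely to "succeed," while a superiority trial chasing a plausible 2-point mortality effect would need several times the enrollment.

```python
# Hedged back-of-envelope power math (normal approximation) illustrating
# why a wide noninferiority margin makes "success" nearly automatic.
# All inputs are hypothetical illustrations, not the EVERDAC protocol.
from math import sqrt, erf

def phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

Z_ALPHA = 1.959964  # critical value for one-sided alpha = 0.025

def ni_power(n_per_arm, p=0.30, margin=0.05):
    """P(declare noninferiority) when the true difference is exactly zero.

    Noninferiority is declared if the upper confidence bound for the
    risk difference falls below the margin.
    """
    se = sqrt(2 * p * (1 - p) / n_per_arm)  # SE of the risk difference
    return phi(margin / se - Z_ALPHA)

def superiority_n(p1=0.30, p2=0.28, power=0.80):
    """Approximate n per arm to detect a true p1-vs-p2 difference."""
    z_beta = {0.80: 0.841621, 0.90: 1.281552}[power]
    var = p1 * (1 - p1) + p2 * (1 - p2)
    return (Z_ALPHA + z_beta) ** 2 * var / (p1 - p2) ** 2

print(f"Chance of 'noninferior' at n=2000/arm: {ni_power(2000):.2f}")
print(f"Superiority n/arm for a 2-point effect: {superiority_n():.0f}")
```

With 2,000 patients per arm and no true difference, noninferiority is declared roughly nine times out of ten; detecting a genuine 2-point mortality benefit with a superiority design would require on the order of 8,000 patients per arm. That asymmetry is the incentive in numerical form.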
Whether arterial lines are clinically noninferior depends on outcomes the trial did not measure: vasopressor exposure, arterial injury, pain, resource use, or provider workflow. These are the tradeoffs that matter to clinicians, patients, and health systems. Without them, a mortality-based noninferiority conclusion cannot meaningfully adjudicate strategy. Yet trials powered for binary mortality endpoints are more likely to receive editorial priority than those focused on resource use or workflow, reinforcing incentives to optimize for publishability rather than clinical relevance.
These designs do not arise from methodological negligence; they reflect structural incentives within the academic ecosystem. Investigators face pressure to design feasible and publishable studies, making wide margins and blunt endpoints appealing tools for reducing sample size and increasing the likelihood of success. The trial’s conclusion may simplify ICU debates, yet it does not resolve the practical tradeoffs clinicians face. And once substantial public funds are invested in a large noninferiority study, it is unlikely that future funders will support a second trial examining more granular but clinically relevant outcomes.
These forces shape research questions before enrollment begins. When a noninferiority margin is too wide, the trial may succeed while the intervention meaningfully worsens care. When the endpoint is insufficiently sensitive, the trial may declare equivalence while masking clinically important differences. And because noninferiority is often interpreted implicitly as “just as good,” even when margins make that interpretation unwarranted, these trials can exert influence disproportionate to the certainty they provide.
Three changes would lead to a more rigorous approach to noninferiority trial design.

First, noninferiority margins should reflect the largest difference patients and clinicians would tolerate, not the difference required to make trial enrollment feasible. Margin justification should be explicit, mechanistically plausible, and sensitive to clinical context.

Second, primary endpoints must align with how the intervention exerts its effect. Mortality is appropriate for therapies capable of altering survival trajectories; it is far less appropriate for changes to monitoring or workflow.

Third, protocols should avoid suppressing the pathways through which differences might emerge; doing so undermines the capacity of the trial to inform practice.
We need to broaden our understanding of bias. Academic incentives can shape trials toward publishability, just as industry incentives can shape trials toward surrogate-outcome benefit. The safeguard is transparency—about how margins are chosen, how endpoints align with mechanisms, and how design decisions influence the likelihood of statistical success.
Scott Matson is an academic clinician scientist in pulmonary and critical care medicine. His research focuses on translational biomarker discovery and clinical trial design in autoimmune and fibrotic lung disease.



Agree. Industry incentives are overt and obvious. Nice to have a post that shines a light on academic incentives.
I would say that my generation had “hard outcomes” drilled in as the de facto gold standard, but this is a good reminder that hard outcomes that an intervention has no hope of affecting will simply make “noninferiority” inevitable. This lesson should come in a box set, paired with Dr. JMM’s oft-issued refrain that composites of outcomes that move in opposite directions also make noninferiority the default result.
I wonder if the heuristic might be: if you know the outcome before the trial even starts, then that’s a bogus study.
The antiquated academic mantra (incentive) “publish or perish” significantly dilutes the quality of research in all fields of study, medicine being no exception. Perhaps we have too many research institutions, many of which should focus more on teaching. Perhaps the increasing number of noninferiority studies is an indicator that too many federal dollars are made available. Can the country (taxpayers) continue to afford research that provides nebulous outcomes? My radical idea is that ALL federal funding of research be put on hold for 5-10 years; instead, promote private and corporate funding of research with tax credit incentives.