I thought the article we posted a couple of weeks ago, Errors in Science: Self-Correcting, or Self-Propagating?, might have gotten a little too deep in the weeds. I was completely wrong. It stimulated a ton of great comments and emails. Here, Alex Byrnes responds.
Adam Cifu
The recent Sensible Medicine essay, “Errors in Science: Self-Correcting, or Self-Propagating?”, and the paper it’s based on are lovely pieces of metascience. They introduce a useful term, “non-Markovian error,” and emphasize the need for active correction of the record rather than a random walk toward truth.
However, the authors overlook a problem that may be equally important. What if the errors we see in science are non-random and intentional? Intentional doesn't need to mean fraud. Three common sources of error have been known long enough that it's hard to call them unintentional anymore: p-hacking, base rate neglect, and publication bias.
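To see what base rate neglect costs in practice, here is a minimal back-of-the-envelope sketch. The inputs (a 10% prior that a tested hypothesis is true, 80% power, α = 0.05) are illustrative assumptions, not numbers from the essay or the paper:

```python
# Positive predictive value of a "significant" finding:
# of all results with p < alpha, what fraction reflect true effects?
def ppv(prior, power=0.80, alpha=0.05):
    true_positives = power * prior          # true effects that reach significance
    false_positives = alpha * (1 - prior)   # null effects that cross alpha anyway
    return true_positives / (true_positives + false_positives)

# With only 1 in 10 tested hypotheses true, about a third of
# "significant" findings are false, before any p-hacking at all.
print(f"PPV at a 10% prior: {ppv(0.10):.2f}")  # ~0.64
```

Neglect the 10% base rate and p < 0.05 reads like near-certainty; include it and a third of the wins are noise.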
The reason "intentional" matters is that if we misdiagnose the problem, the treatment will be wrong too. If researchers are p-hacking unintentionally, we can appeal to them. But if researchers are using gray-area questionable research practices (QRPs) to get ahead, they can't really be appealed to. The latter explanation fits the data much better.
In the 2010s, early reformers, who were mostly psychologists, answered the question about intentions in a very polite way. Researchers are p-hacking unconsciously, they said, or forgetting their hypothesis. Over the next decade, reformers made tremendous progress, but it became less and less believable that intentions weren't involved.
Even now, and in the face of 10, 20, or 200 years of warning, depending on how you count it, the locus of control is still depicted as somewhere other than the researcher. In an article that came out in May, p-hacking is something that “happens to you.” In the recent, flawed paper about the migration of p-values, the term for moving p-values, p-hacking, never appears, nor is any replacement offered. We’re expected to suppose the notches in the p-curve just happen too.
The authors’ paper on non-Markovian errors doesn't mention intention. Biased non-Markovian errors are described as external to the researcher, and unbiased non-Markovian errors are explicitly attributed to previous researchers.
In reality, both can be deliberate. Choosing to collect two small samples instead of one large one is a non-Markovian attempt at a large effect. Collecting a lot of variables, knowing that you, or someone in your lab, will inevitably stumble on something to support the desired hypothesis, is non-Markovian. In fields where only one direction of result is, as the authors put it, seen as “generally good,” researchers had better be ready to be biased or risk losing their White Hat. And a meta-analysis in one of these fields is all but guaranteed to find the desired result.
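To see how reliably the collect-many-variables strategy pays off, a short simulation helps. The specifics (20 independent null outcomes per study, α = 0.05, 10,000 simulated studies) are my assumptions for illustration only:

```python
import random

random.seed(1)  # reproducible illustration

def one_study(n_outcomes=20, alpha=0.05):
    # Under the null, each p-value is uniform on [0, 1], so each
    # outcome comes up "significant" with probability alpha.
    return any(random.random() < alpha for _ in range(n_outcomes))

# Expected rate: 1 - 0.95**20, i.e. about 64% of studies report
# "something" even when every true effect is exactly zero.
hits = sum(one_study() for _ in range(10_000))
print(f"Studies with at least one significant result: {hits / 10_000:.0%}")
```

No bad luck or forgetfulness is required; measure enough outcomes and a reportable result becomes the expected outcome.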
The authors’ article, on the other hand, does mention intentions, but it uses the standard dichotomy: innocence or deliberate misconduct. Deliberate questionable research, the middle category that dichotomy leaves out, is the most common, most impactful, and least visible of the three.
There’s very little we can do to detect p-hacking, base rate neglect, and publication bias, yet these are now the most obvious causes of the replication crisis. The remedy may not be blame and condemnation, but it is most certainly not pleas.
The core issue in science reform is that we admit that incentives drive QRPs, but then we don't actually give up our incentives, or let anyone without incentives (or the opposite incentives) into the community. “Opposite incentives” is anathema because, by god, some of our incentives were political, and we'll be damned if we let any more of that in.
Our answer to incentives has been to continually appeal to scientists. This part of the authors' article has been written and repeated in papers, lectures, and meetups hundreds, maybe thousands, of times.
One promising answer to the problem is to fund meaningfully adversarial post-publication review (Smoliga, 2025; Piller, 2022; Szabo, 2025). Someone may have a better one. Whatever we do, it can’t be meaningless, toothless repetition of pleas and unlikely explanations. In the 2010s, that was polite — maybe even plausible. In 2025, it is simply unscientific.
Alex Byrnes is an independent researcher and commentator who writes at Red Team of Science.
Photo Credit: Daniela Holzer
As a graybeard who has grappled with the propensity of healthcare to perpetuate error (see Cochrane's Brake: Randomized Controlled Trials and the Doctor's Pen), it seems to me that the errors mentioned are most common among young investigators who have not seen enough of the recurring patterns to include such critiques in their discussion of their own results. For me, it took until my 40s, after years of immersion in my primary research topic, to begin to ask the hard questions, and maybe another 10 years to accept that the propensity to error is nearly universal in healthcare. Perhaps we need to start earlier in our education/training to open the eyes of our students. I remember being astonished by a young Vinay Prasad's recognition of "reversals" as historical evidence of error eventually unmasked. I hold high hopes for his return to the battlefield.
"There’s very little we can do to detect p-hacking, base rate neglect, and publication bias...."
I found a pretty clear-cut case of publication bias earlier this year. I brought it up with some doctors and researchers on Substack and a certain medical science forum that I won't name here. No one cared. Doctors are very willing to be led by the nose and cite something in JAMA uncritically, and they are more prone to motivated reasoning than any group of people I've ever met. The researchers themselves are certainly aware of all the shenanigans. Let's not pretend any of this is unintentional.