
I think the approach taken in the ELAN trial is indeed an interesting one, showcasing a less rigid style of clinical trial interpretation. By not imposing a binary classification of "significant" or "not significant" based on p-values, the study acknowledges the uncertainty inherent in any statistical analysis.

However, there are still considerations to keep in mind when interpreting the results of this study. Confidence intervals, while providing a range of plausible values, should not be misinterpreted as the full range of possible outcomes. A 95% interval comes from a procedure that covers the true value in only 95% of repeated samples, and only under the model's distributional assumptions. This is a reminder that statistical models are simplifications of complex realities.
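To make concrete what the "95%" does and does not promise, here is a minimal coverage simulation (a sketch with made-up numbers, not the trial's data). Across many repeated trials, roughly 95% of intervals built this way contain the true rate; any single interval either does or does not.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up numbers for illustration: true event rate and per-trial sample size.
true_rate, n, trials = 0.10, 1000, 10_000

covered = 0
for _ in range(trials):
    events = rng.binomial(n, true_rate)
    p_hat = events / n
    se = np.sqrt(p_hat * (1 - p_hat) / n)           # Wald standard error
    lo, hi = p_hat - 1.96 * se, p_hat + 1.96 * se   # 95% Wald interval
    covered += lo <= true_rate <= hi                # does it cover the truth?

print(f"coverage over {trials} repeated trials: {covered / trials:.3f}")  # ~0.95
```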

Furthermore, while embracing the uncertainty in trial results is laudable, it is crucial to ensure that trials are adequately powered to detect a clinically meaningful effect. A small sample size, while often necessitated by practical constraints, increases the risk of a Type II error: failing to detect an effect when one truly exists. This can limit the interpretability of the results and their applicability to wider patient populations.

Lastly, while p-values are often criticized for their misuse (p-hacking) and for overreliance on them, they do play an important role in hypothesis testing and in controlling the rate of false-positive findings. A balanced perspective would incorporate effect estimates and their precision (confidence intervals) alongside hypothesis tests (p-values) to inform clinical decisions.


Just note that you are thinking in the frequentist box. Bayesian thinking involves playing the odds rather than thinking about "errors". Computation of the probability of efficacy and the probability of similarity of two treatments would be more helpful IMHO.
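For what it's worth, a minimal sketch of what those two probabilities could look like for a binary outcome, assuming flat Beta(1,1) priors and made-up event counts (not the actual trial data); both numbers fall straight out of posterior samples:

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up event counts for illustration: events / patients per arm.
events_a, n_a = 29, 1006   # treatment A
events_b, n_b = 41, 1007   # treatment B

# Flat Beta(1,1) priors give Beta(events + 1, non-events + 1) posteriors.
p_a = rng.beta(events_a + 1, n_a - events_a + 1, 200_000)
p_b = rng.beta(events_b + 1, n_b - events_b + 1, 200_000)

print("P(A better than B)     :", np.mean(p_a < p_b))
print("P(similar, |diff| < 1%):", np.mean(np.abs(p_a - p_b) < 0.01))
```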


I have less and less faith in guideline writers in recent years. And I am less charitable than Dr. Mandrola tends to be. I have a feeling things are not often on the up-and-up, and that some of it is in fact nefarious. The conflicts of interest these days are so large as to block out the sun.

The latest on HFpEF (i.e., SGLT2 inhibitors for everyone), for example, is nauseating.

I would much prefer if guidelines highlighted those things that are clearly “settled” (if I may use that loaded word), and things that are clearly harmful, and leave everything else as “considerations”, rather than the current completely arbitrary and recipe-centric “classes”.


Absolutely. This is why I tell people that even if an AI is hyper-intelligent, it will set medicine back decades. Guideline writers will supply nefarious instructions, as you say, to the AI. If we had had an AI GP when those Alzheimer's drugs came out, it would have put half the geriatric population on them.

NPs have taken over cardiology in my neck of the US, and they have been applying GDMT uncompromisingly and blindly. I've seen them cause strokes secondary to hypoperfusion more than once, while real trained cardiologists are nowhere to be seen. We've abandoned our profession and we will pay for it.


My PCP is an NP, and you're right that guidelines are all that matter to her. She did, however, respect my wishes not to go on a statin for primary prevention, albeit somewhat disdainfully. However, if I went to a cardiologist, I suspect that s/he would also recommend a statin, despite the risks being greater than the benefits in primary prevention.

https://thennt.com/nnt/statins-persons-low-risk-cardiovascular-disease/


This has nothing to do with physician versus NP. Both would have respected your wishes. An NP respecting your bodily autonomy, which is your legal right, does not qualify them to practice medicine.

And this also comes down to the personal style of each physician; you will find a multitude of opinions depending on the cardiologist. I'm not too gung-ho about prescribing for primary prevention. I am more aggressive in diabetics, but that's secondary prevention.


I have read about some physicians who won't treat patients who choose not to follow their recommendations, although I don't know how common that is, or whether it's motivated by fear of being sued. I once told a doctor who wanted me on a statin for primary prevention "no thanks" and that I would sign a waiver, but she didn't make me do that. She just moved elsewhere! :) It sounds like you are not impressed by NPs' qualifications to be PCPs. Mine has her doctorate, but now I'm wondering if I should switch! The reality is, there just aren't enough MDs doing primary care these days in many locations.


That's something else.

CMS has been strangling PCPs for a while now: they have been drowned in paperwork and have had their compensation tied to patient-driven outcomes. What I mean is, CMS will pay you less if X% of your patients don't undergo colonoscopies, or if X% of your patients have an A1c above a cutoff. It's absolutely insane, and they're doing another wave of it now.

If you want to see the physician perspective on this, a recent thread on r/medicine complains about just that. It's one of the reasons no one does primary care anymore, and one of the MAIN reasons corporations have bought up tons of private PCP practices as of late. You can't afford the overhead of paperwork on your own.

So that PCP of yours had a conflict of interest: Medicaid would pay her practice MORE if you're on a statin. It goes back to the use of guidelines as a hammer, where the government (and insurance companies and hospitals) doesn't want doctors to use their brains, only to put checkboxes on spreadsheets.

As for your DNP, do not leave him/her on account of anything I've said. As you said, it's hard to find an MD, and any care is better than no care, especially if you have a good, trusting relationship with them. Just keep in mind: have a low threshold for seeing a doctor for a second opinion if you feel something's off.

https://www.reddit.com/r/medicine/comments/14a12ys/cms_launches_a_new_model_for_primary_care/?utm_source=share&utm_medium=ios_app&utm_name=ioscss&utm_content=1&utm_term=1


Thanks for sharing the MD perspective. I hope CMS reads r/medicine! The new model sounds good, but the comments tell a different story; I guess it sounds good because CMS used all the right buzzwords. If patient outcomes are what CMS is trying to improve, why are they tying compensation to interventions? Outcomes and interventions are not the same. As we recently learned about colonoscopies, they apparently don't reduce mortality. Same with statins in primary prevention. You said Medicaid would pay my PCP more if I were on a statin. I'm not on Medicaid, so if that's true, then Medicaid pays more for the total # of patients in a practice on a statin, regardless of their insurance? Strange. But I guess I definitely need to ask her about conflicts of interest! I really think practices should be transparent about conflicts of interest, but I guess that's asking too much.

Don't worry....I won't leave my NP for now, but I definitely won't be shy about getting a second opinion if I'm concerned. I think NPs tend to refer more to specialist doctors anyway. That happened to me once with an NP, and the MD seemed rather disdainful that the NP couldn't read my labs correctly and made an unnecessary referral to him, although he was super nice in explaining it all to me very comprehensively. Anyway, something's gotta give with primary care before the system totally breaks!


I will repeat my comment from part 1 that a picture is worth a thousand numbers:

https://postimg.cc/jCFv3C9K

The x-axis is the actual rate of a bad outcome, and the curves show the probability of that value, given the results of the study, with a flat input prior: blue is early administration, orange is late.
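For anyone who wants to reproduce curves of this kind, a minimal sketch under the same setup (binomial outcomes, flat Beta(1,1) prior); the event counts below are placeholders, to be replaced with the trial's actual numbers:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import beta

# Placeholder counts -- substitute the actual events / patients per arm.
events_early, n_early = 29, 1006
events_late, n_late = 41, 1007

x = np.linspace(0, 0.08, 500)
# With a flat prior, the posterior is Beta(events + 1, non-events + 1).
pdf_early = beta.pdf(x, events_early + 1, n_early - events_early + 1)
pdf_late = beta.pdf(x, events_late + 1, n_late - events_late + 1)

plt.plot(x, pdf_early, color="tab:blue", label="early administration")
plt.plot(x, pdf_late, color="tab:orange", label="late administration")
plt.xlabel("actual rate of a bad outcome")
plt.ylabel("posterior density")
plt.legend()
plt.show()
```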

If I were shown these curves as a patient, I would unhesitatingly ask for the blue protocol, unless my physician could clearly articulate a reason for choosing the orange protocol.

Wouldn't you?


Any study of medical and psychological conditions, particularly those dealing with clots, MUST determine vaccination status and the presence of spike protein and what its concentration was/is. In this time of vaccine-related health effects, we need to quantify its true effect on the results.


Don't leave out infection-related health effects, which seem to be more common.


While I love the idea of being creative about clearly conveying uncertainty, I worry that presenting 95% confidence intervals this way is deceptive.

"estimated to range from 2.8 percentage points lower to 0.5 percentage points higher (based on the 95% confidence interval)"

That's not really what the confidence interval means. I could write out the complex statistical definition that confuses everyone, but the bottom line is the authors computed this interval by inverting a statistical test. The confidence interval *is* based around a null-hypothesis significance test. If you are computing a confidence interval, then implicitly or explicitly you are testing statistical significance. Even in this sentence, the 95% refers to the unmentioned statistical test.

The rhetorical decision in this study is more akin to reporting NNT instead of absolute risk reduction: it is a presentation of the same statistical quantity that attempts to convey a different perspective from null-hypothesis testing. But under the hood, CIs and p-values use the same statistical epistemic framing.
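That duality is easy to verify numerically. A sketch with arbitrary counts: a 95% Wald interval for a risk difference excludes zero exactly when the matching z-test, built from the same standard error, gives p < 0.05.

```python
import numpy as np
from scipy.stats import norm

def wald_ci_and_p(e1, n1, e2, n2, level=0.95):
    """CI for a risk difference and the p-value of the matching z-test of
    'difference = 0' -- the same statistical machinery, reported two ways."""
    p1, p2 = e1 / n1, e2 / n2
    diff = p1 - p2
    se = np.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = norm.ppf(0.5 + level / 2)                  # 1.96 for level = 0.95
    ci = (diff - z * se, diff + z * se)
    p_value = 2 * norm.sf(abs(diff) / se)
    return ci, p_value

ci, p = wald_ci_and_p(29, 1006, 41, 1007)          # arbitrary counts
print(f"95% CI: ({ci[0]:+.4f}, {ci[1]:+.4f}), p = {p:.3f}")
print("CI excludes 0 <=> p < 0.05:", (ci[0] > 0 or ci[1] < 0) == (p < 0.05))
```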


True, so perhaps the authors left that out because they didn't want to convey the message that the results were not significant? BTW, could the wide confidence intervals be due to high variability in the sample? Or do you think the trial was simply underpowered?


As a psychologist, I have mixed feelings about this. I recognize that most studies carry at least some degree of uncertainty, and it takes years and millions of dollars to accumulate sufficient evidence to gain confidence in recommendations. Even then, some uncertainty will always remain, since humans are not all the same and therefore do not all respond the same. So it makes sense to "embrace the uncertainty" in medicine and use medical judgment for each individual case, so as not to be robots. (It would be interesting to see what GPT-4 would recommend, and how confidently it would do so.)

The examples presented in the article here are relatively easy ones to decide, but I suspect most cases are not. The problem is that humans are stressed by uncertainty, and placing a very uncertain decision in front of a patient who is already highly stressed is not ideal (see the link below; the original study is linked in the article).

Non-physicians tend to think doctors sound overly confident due to ego, and while that might be part of it (it's a coping mechanism for dealing with all that uncertainty!), I think it's also because a doctor who appears confident (but not arrogant) gives the patient greater confidence and, therefore, less stress. So doctors need to help their patients make an informed decision in a context of uncertainty without placing undue stress on them, which means accurately assessing the person's anxiety level and coping resources and providing the degree of psychological support they need. When a doctor feels unprepared to do so, whether by training or temperament, they should obtain support from somebody who is (e.g., a nurse, PA, or colleague MD).

https://www.theguardian.com/commentisfree/2016/apr/04/uncertainty-stressful-research-neuroscience


You make a good point about the stressful effect of uncertainty. On the other hand, in my experience, some physicians don't put enough emphasis on the uncertainty of their recommendation for surgery, for example. Then, when a bad outcome occurs, patients will recall with anger for a very long time their physician's excessive certainty about a good surgical outcome. Maybe, as suggested by the article you linked, we all need to realize that "If high uncertainty is really unavoidable, if the Buddhists are right, gratification is transitory and suffering inevitable, then, in the big picture, the odds of adversity aren't 50% but 100%. So, if we're concerned about the big picture, we might as well worry less about snakes and practise the art of surrender."


I had a C5-C6 fusion, which my surgeon told me beforehand was 99% effective in relieving pain. In my post-surgery follow-up, I told him the pain was the same as before surgery. I wasn't angry at him for his overconfidence beforehand, but I got very angry at his response. He asked if I had any tingling in my hands, and I said no. He said, "You told me you had tingling in your hands and now you don't, so the surgery was successful." I had never told him I had tingling in my hands (because I didn't), and my only symptom was neck pain, but he basically called me a liar and continued to insist my surgery was successful. OK, THAT made me angry, even though as a shrink, I understand why he had to lie to himself to justify what he said and did. In my case, I did not have a life-threatening illness. I was thinking more about situations in which patients have a life-threatening condition, which is very stressful to begin with, and adding uncertainty about treatment may only exacerbate their anxiety. I'm not saying doctors shouldn't be honest about the uncertainty, and respecting patient autonomy requires such honesty. I just hope that doctors are mindful of the additional stress, and help their patients cope with it. I agree it would be best for both doctors and patients to be more Buddhist-like and practice the art of surrender, but I know such is not the case in our society. Developing such a stance goes against our evolutionary-based existential anxiety, and is not easy. Just ask any monk about how much time and effort they put into it! :)


"Judgment" easily turns into "voodoo".

And voodoo draped in "numbers" is worse than proclaiming "expertise" by putting on a white coat and hanging a stethoscope around your neck.

There is uncertainty.

Point estimates are misleading. I don't think they should ever be given. It should always be a range, aka a confidence or credibility interval.

Abandon 95% (~ 2 sigma) and replace it with 99.7% (3 sigma). This will force medicine to come to grips with its lack of substantive knowledge.

There is a difference between the returns of 100 people going to a casino with the same system that works 95% of the time, and one person going for 100 consecutive days (unless he goes bankrupt before he makes it to 100 days). The former is like a population study; the latter is treating an individual patient.
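That difference can be simulated. A toy sketch with invented payoffs (win 1 unit with probability 0.95, otherwise lose 20, starting bankroll 50): the ensemble's average one-day return is mildly negative, but it says nothing about the individual's risk of ruin over 100 consecutive days.

```python
import numpy as np

rng = np.random.default_rng(1)

# Invented toy system: win +1 with probability 0.95, otherwise lose -20.
WIN_P, WIN, LOSS, BANKROLL, DAYS = 0.95, 1.0, -20.0, 50.0, 100

def bets(size):
    return np.where(rng.random(size) < WIN_P, WIN, LOSS)

# Ensemble view: 100,000 people each play a single day.
print("mean one-day return:", bets(100_000).mean())     # ~ -0.05

# Time view: one person plays 100 consecutive days, stopping if bankrupt.
ruined = 0
for _ in range(10_000):                 # repeat to estimate the ruin rate
    bankroll = BANKROLL
    for outcome in bets(DAYS):          # pre-draw the whole 100-day run
        bankroll += outcome
        if bankroll <= 0:               # absorbing barrier: no recovery
            ruined += 1
            break
print("P(bankrupt within 100 days):", ruined / 10_000)
```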

Human beings, without training and experience, have a hard time distinguishing 1 in a million from 1 in 1,000 from 1 in 100. We manage to simultaneously overestimate and underestimate.

Of course, you have to use judgment.


This is a good article. It's surely not the first time that the use of confidence intervals has been shown to be straightforward, dummy-proof almost, and very helpful --- indeed, almost so helpful that Common Sense can be said to have come into play. Who knew? On the other hand, ritual abuse of NHST has been so widespread in recent decades of academic "work product" that its appearance in journal articles nowadays (regardless of the particular specialty or context) is essentially a recurrent cliche. Sadly, a fairly substantial fraction of medical doctors would be hard-pressed to utter a correct, terse explanation of just what a p-value means in the hypothesis-testing rigmarole. One of my favorite queries to bright senior surgical residents was to ask them about this matter, and a VERY common answer was: "Oh, the p-value is the computed probability that the observed result would have occurred by chance." Then the climax of my pedagogic probing was ushered in with a simple follow-up question: "The chance of exactly what?" A grim silence was almost always the response.


Nice article John. A few statistical points.

First, the interpretation of a confidence interval is very tricky because of the infinite-sampling setup used in traditional frequentist statistics. A CI doesn't mean exactly what you stated, because a CI doesn't have a within-study interpretation. This problem can be somewhat solved by thinking of CIs as "compatibility intervals," as Sander Greenland has pushed. E.g., "the data are compatible with a true unknown crude gross marginal treatment difference of x to y under compatibility criterion C," where C needs to be spelled out, e.g., by reference to 0.95 confidence coverage under some assumed model.

Second, a pooled composite binary outcome (the union of separate outcomes / time to first event) makes various assumptions that are clearly not true, e.g., that events are of equal impact to the patient and that death after a nonfatal endpoint can be ignored. Oversimplified composite-outcome analysis does not answer the question "do patients on treatment B do better than patients on treatment A?". An analysis that respected the data-generating process would be able to answer "did the worst thing that happens to a patient on B in a given day tend to be better than the worst thing that happens to a patient on A on that day?". The probabilities involved in such a longitudinal current-status analysis can be easily converted to differences in expected time in any clinical status of interest.

Finally, your article was written as if the sample size were "golden" and had to be a pre-defined constant. Had the original study had a sequential design, it could have been kept going a while longer until some criterion was met. For example, it is reasonable to experiment until one has an answer, where "answer" could be definitive evidence for a positive effect, for no effect/harm, for similarity of outcomes between treatments, or for futility. When thinking sequentially, instead of trying to interpret equivocal results, studies can be more clinically informative.

Example of futility evidence: Suppose that a study planned to look at the results monthly. In a given month, suppose that the cumulative results show very slight harm from treatment B. The probability that the posterior probability of benefit will ever reach 0.95 or higher at larger sample sizes can be computed. If this probability is below 0.65, for example, one may want to declare futility. Futility can be triggered even earlier by considering posterior probabilities of non-trivial clinical benefit.
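A minimal sketch of one version of that calculation, assuming binary outcomes, flat Beta(1,1) priors, and invented interim counts. For simplicity it checks only the final look, so it understates the "ever reaches 0.95" probability described above:

```python
import numpy as np

rng = np.random.default_rng(2)

# Invented interim data showing very slight harm of treatment B.
e_a, n_a = 20, 250          # events so far on A
e_b, n_b = 23, 250          # events so far on B
n_more = 250                # additional patients per arm still to enroll
SIMS, POST = 4000, 4000

def p_benefit(ea, na, eb, nb):
    """P(rate on B < rate on A | data) under flat Beta(1,1) priors."""
    pa = rng.beta(ea + 1, na - ea + 1, POST)
    pb = rng.beta(eb + 1, nb - eb + 1, POST)
    return np.mean(pb < pa)

hits = 0
for _ in range(SIMS):
    # Draw plausible true rates from the current posterior...
    ta = rng.beta(e_a + 1, n_a - e_a + 1)
    tb = rng.beta(e_b + 1, n_b - e_b + 1)
    # ...simulate the rest of the trial, then re-assess the posterior.
    fa, fb = e_a + rng.binomial(n_more, ta), e_b + rng.binomial(n_more, tb)
    hits += p_benefit(fa, n_a + n_more, fb, n_b + n_more) >= 0.95

print("predictive P(final posterior P(benefit) >= 0.95):", hits / SIMS)
# Below the prespecified cutoff (0.65 in the example above)? Declare futility.
```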


It's been years, but I loved statistics in college and graduate school. I understood the bell curve and confidence intervals, and I have been known to read the leaflets that come with Rx drugs. When I would report a side effect from a drug, my primary care doctor would say, "Hmmm, that's not one of the side effects." I concur wholeheartedly with this article: use judgment!

Comment deleted

Please don't include NNT or NNH. These are harmful to thinking: https://discourse.datamethods.org/t/problems-with-nnt


An event rate of 2% vs 1% would be going from "very unlikely" to "trivially less likely". An event rate of 93% vs 92% is going from "very likely" to "trivially less likely". The similar 1% ARR describes the similarly "trivially less likely" nature of both of the purportedly superior treatments. I don't see any difficulty in parsing that in either instance. It would be far more disingenuous in my book to report the "50% RRR" in the first instance. (The arithmetic is sketched below.)
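The arithmetic behind both scenarios, for anyone who wants to check it:

```python
def describe(control_rate, treated_rate):
    arr = control_rate - treated_rate   # absolute risk reduction
    rrr = arr / control_rate            # relative risk reduction
    nnt = 1 / arr                       # number needed to treat
    print(f"{control_rate:.0%} -> {treated_rate:.0%}: "
          f"ARR {arr:.1%}, RRR {rrr:.0%}, NNT {nnt:.0f}")

describe(0.02, 0.01)   # ARR 1.0%, RRR 50%, NNT 100
describe(0.93, 0.92)   # ARR 1.0%, RRR  1%, NNT 100
```

Same ARR, same NNT; only the headline-friendly RRR differs between the two scenarios.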

Any trial result requires the characterization of the "average pt". Any application of trial data requires extrapolating the results observed in an average pt (and under ideal, precisely controlled trial conditions at that) onto Mr. Smith or Mrs. Jones, who lives in the messy real world. Any data you apply from any study onto any individual pt will carry this uncertainty, regardless of what metric you use to characterize effect size.

Comment deleted

I have to have a friendly disagreement about that. I think there is a chance that NNT is misleading even in ideal situations such as primary prevention. That's because people with mild risk factors may have an NNT that is 10-fold greater than that of people with serious risk factors. NNT uses averages and may not apply to anyone. Primary prevention is best served by focusing on those at higher risk, in some cases.
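In numbers: assuming a constant 25% relative risk reduction (an arbitrary figure for illustration), NNT scales inversely with baseline risk, so a 10-fold lower baseline risk means a 10-fold higher NNT, and the trial-average NNT describes neither group.

```python
def nnt(baseline_risk, rrr=0.25):
    """NNT under an assumed constant relative risk reduction."""
    return 1 / (baseline_risk * rrr)

for risk in (0.20, 0.02):   # serious vs mild risk-factor profiles (invented)
    print(f"baseline risk {risk:.0%}: NNT = {nnt(risk):.0f}")
# 20% baseline -> NNT 20; 2% baseline -> NNT 200.
```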
