come on John; a 0.6% difference; only a pharmaceutical could claim that as something;
surely we are in statin territory here; specious claims of efficacy;
Dion says it well; the absolute risk reduction is so small (over 200 need to spend the cash to claim even one might benefit .......)
Dion comments "since there are always also adverse effects to consider," and do we all believe a pharma is going to really tell us the incidence of problems; ... please .....
I commend this paper to all; please read it; get your head around its implications; https://bmjopen.bmj.com/content/12/12/e060172
it showed that in a study, 40% of heart diagnoses; by cardiologists; looking after ill people; were OVERTURNED by a secret committee; run by the researchers; getting paid by the company; months later, who never saw the patients obviously; come on John; so many of us are so over all of this stuff
raw data was vigorously and systematically re-worked; John: why don't you review that paper sometime? If we saw daylight in one paper; how much more is going on? We admire your honesty and thoroughness; you do a great job.
Well stated and that reference you linked should be read by anyone with a serious interest in reviewing medical studies. I wrote a book on the cholesterol myth a dozen years ago and reviewed probably several hundred journal articles on the subject. I found that a favorite trick used by the researchers pushing the narrative was to use composite end points where most of the differences were loaded into the components that were subject to interpretation such as cause of death. Frequently the cause of death was taken from death certificates (completely unreliable) or, in some cases from conversation with relatives of the deceased. I also advised readers to ignore the statistical analysis and just look at the raw data and use common sense to determine whether the reported differences are of any practical significance. As you pointed out, in this case the differences were so small that it is hard to believe anyone would give it any credence at all.
thank you sir; thanks for all the details; I must look for your book; the great myth; the cholesterol myth; the statin myth; kept going for ever; hot air and nonsense.
The book is out of print but last time I checked Amazon had a few used copies. The title is "The Cholesterol Delusion".
You do a good job of explaining the subtleties in research. I really appreciate your making this kind of analysis available without the new Substack teaser feature. I am not a medical researcher but have always wondered about the small differences so often reported in many research studies. Now I have a better understanding of those numbers. Thank you.
But, to make things simpler, does it matter whether an absolute risk reduction of less than 1% is signal or noise? If it is real then the number needed to treat would be more than 100. And, since there are always also adverse effects to consider, then the overall net clinical benefit is not likely to be relevant here.
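To put numbers on that point, here is a minimal sketch in Python (my own illustration, using the 2.3% vs 2.9% stroke rates quoted in the post) of how an absolute risk reduction of 0.6% translates into a number needed to treat of roughly 167, i.e. well over 100:

```python
# Minimal sketch: absolute risk reduction (ARR) and number needed to treat (NNT),
# using the event rates quoted in the post (2.3% treatment vs 2.9% control).
control_rate = 0.029
treatment_rate = 0.023

arr = control_rate - treatment_rate   # 0.6 percentage points
nnt = 1 / arr                         # about 167 patients treated to avoid one event

print(f"ARR = {arr * 100:.1f} percentage points")
print(f"NNT = {nnt:.0f}")
```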
It certainly seems to be the case that most contemporary cardiology trials overestimate the placebo arm event rate, the effect size of the investigational product, or both. And it’s about finding the balance of enrolling enough patients without enrolling too many, for the reasons Dr. Mandrola has alluded to. It’s an elusive Goldilocks point.
But it also seems quite common that trials end up increasing the sample size and/or follow-up duration when it appears they were underpowered at the outset, in order to accrue more outcome events. That seems to be the real-world compromise. I’m not sure there’s a better solution than that when it comes to predicting the future (i.e., what the trial results will actually show).
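As a rough illustration of the planning problem described above, the sketch below (illustrative numbers only, not the actual trial's design assumptions) uses a simple two-proportion sample-size formula: with an optimistic 6% control event rate halved by treatment, a few hundred patients per arm would suffice, but with the 2.9% vs 2.3% rates quoted in the post the same power requires on the order of ten thousand per arm.

```python
# Rough sketch: per-group sample size for a two-sided comparison of two
# proportions (normal approximation). All inputs are illustrative assumptions,
# not the actual trial's design parameters.
from scipy.stats import norm

def n_per_group(p_control, p_treatment, alpha=0.05, power=0.80):
    z_alpha = norm.ppf(1 - alpha / 2)
    z_beta = norm.ppf(power)
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    return (z_alpha + z_beta) ** 2 * variance / (p_control - p_treatment) ** 2

# Optimistic planning: 6% control event rate, halved by treatment
print(round(n_per_group(0.06, 0.03)))      # roughly 750 patients per arm
# Less favourable reality: 2.9% vs 2.3%, the rates quoted in the post
print(round(n_per_group(0.029, 0.023)))    # on the order of 11,000 per arm
```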
But I disagree that any confidence interval crossing 1 means an underpowered trial. In this case, it depends on what the assumptions about the placebo rate and effect size were. A confidence interval range of 2.2% doesn’t strike me as especially wide. And by definition, any CI that crosses 1 and fails to reject the null will mean that the range of possible effects includes both some degree of harm and some degree of benefit. If one does not accept that as a null result, then when would there ever be a null result?
However, I do object to the NEJM article conclusion statement. If all we wanted to do was to NOT rule out a possible device benefit, then we should simply not do trials. NOT looking at a device will also NOT rule out the possibility of it being beneficial. It’s an idiotic conclusion statement for a NEJM article. The point is to try to rule in a benefit, which, as far as I’m concerned, the proponents have failed to do with the study in question.
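On the point about intervals crossing the null, here is a small sketch (my own, using the post's 2.3% vs 2.9% rates and an assumed 5,000 patients per arm) of a Wald 95% confidence interval for the absolute risk difference; for a difference the null value is 0 rather than 1, and with these illustrative numbers the interval runs from slight harm to roughly a 1.2% benefit:

```python
# Sketch: Wald 95% confidence interval for the absolute risk difference,
# using the post's event rates (2.3% vs 2.9%) and an assumed 5,000
# patients per arm (illustrative, not the real trial's size).
from math import sqrt

n = 5000                           # assumed patients per arm
p_trt, p_ctl = 0.023, 0.029        # event rates from the post

diff = p_ctl - p_trt               # absolute risk reduction (benefit if positive)
se = sqrt(p_trt * (1 - p_trt) / n + p_ctl * (1 - p_ctl) / n)
lo, hi = diff - 1.96 * se, diff + 1.96 * se

print(f"ARR = {diff:.3%}, 95% CI ({lo:.3%} to {hi:.3%})")
```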
If more underpowered studies could be done with similar patients, methods, etc., then maybe a meta-analysis could be done credibly. If only . . .
One cannot know what any drug is really doing to a body, diseased or not. What may appear to be a positive outcome may in fact be the body repairing and healing itself. The one-size-fits-all approach to medicine is a failure. We have constant drug trials, new drugs and protocols and yet while more people than ever are taking more drugs than ever, the results speak for themselves.
We are still getting sicker and more diseased as a nation (US). So I ask, what the heck good is another trial that so often has to have its data re-manufactured into something that might pass muster with the FDA or appear to be of value? A 95% confidence level has not turned the tide: ever-increasing drug usage has not produced a healthier, longer-living population.
Clearly the TAVR trial shows a major bias in its conclusion ("but on the basis of the 95% confidence interval around this outcome, the results may not rule out a benefit of the *device* during TAVR."). If we don't follow the rule of the 95% CI, most trials may have positive results.
Here's my take on the problem:
A post on my substack noted that "The Introduction to a good statistics text will tell you that 'what we do in statistics is to put a number on our intuition.' .... The idea is that you start from the science, from the question to be answered and what the outcome will look like. You propose or apply a mathematical model to the results of your experiment. In other words, the medical or scientific question comes first.... A major defect in the medical literature is that often the opposite is what’s going on — many papers are trying to come up with an intuition to fit a number, trying to derive the science from the statistics. ....The implication, in these cases, is that your experiment did not have independent justification and the significance was revealed by the statistics. The corollary is that the type of experiment becomes more important than its quality."
The description of the case here: “A treatment to reduce stroke is tested in a clinical trial. In the treatment group, 2.3% of patients had a stroke vs 2.9% in the control arm. The question that everyone wants to know …” should be answered, first, by the researchers’ assessment of how meaningful the procedure is relative to the data. Frank Harrell's comment pointing to Bayes may be helpful but it is the (philosophical) idea contained in Bayes that is key: statistics is taken as the belief in the data. Science is expected to be an intellectual activity. We trust that the researcher has enough training to interpret the experiment. Otherwise, who would have hired him? The most distressing thing about the advent of AI is that we ourselves have become like AI.
“For this, we look to the 95% confidence intervals.…” This is wrong. The key phrase in the post is "A treatment..." emphasis on "A." We might look first to our understanding, that is, our belief (a priori in Bayes terms).
Dear Professor Feinman. There is a fundamental flaw in the way biological plausibility is translated into a hypothesis of clinical relevance. This is a pervasive phenomenon that contaminates medical research, especially apparent in RCTs. Overestimation is no surprise. This is the cause of "negative" trials.
I have launched a Substack to discuss the translation of biological plausibility to clinical significance. I invite everyone interested in the topic.
Here is a link: https://thethoughtfulintensivist.substack.com/
Thank you Professor Feinman. The good sense exhibited in your comment led me to order your book. I can't wait to read it. I would recommend that everyone read the link you gave below on Red Meat and Diabetes Statistics.
Thanks for the comment. I think we do need to return to common sense. Of course, many papers that are published would not stand the test of common sense.
Very well written John. A few observations:
The first part of the NEJM conclusion is awful. The second part, about the compatibility interval (aka confidence interval) is good.
Higher effective sample size (and thus higher power) can be achieved by not allowing investigators to use endpoints that have such a high frequency of tied values. Higher real sample sizes can be achieved when necessary by no longer pretending that the sample size is a single fixed number.
Investigators do want to know whether something is signal or noise, but I suspect that what they want to know even more is the probability that the treatment benefits patients, which cannot be obtained from P-values and compatibility intervals. And if NEJM wants to include that first sentence of the conclusion, they should demand that the probability of similarity of outcomes of the two treatments be high. With Bayes you can compute both of these probabilities.
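As a toy illustration of that last point (my own sketch, not Prof. Harrell's analysis), a Bayesian calculation with flat Beta priors, the post's 2.3% vs 2.9% rates, and an assumed 5,000 patients per arm yields both a posterior probability that the treatment event rate is lower than the control rate and a probability that the two rates are similar within a chosen margin; neither quantity can be read off a P-value:

```python
# Toy sketch of the Bayesian point above: compute the probability of benefit
# and the probability of similarity directly. Beta(1,1) priors, binomial
# outcomes; the per-arm counts below are assumed for illustration.
import numpy as np

rng = np.random.default_rng(0)
n = 5000                              # assumed patients per arm
events_trt, events_ctl = 115, 145     # about 2.3% and 2.9% of 5,000

post_trt = rng.beta(1 + events_trt, 1 + n - events_trt, 100_000)
post_ctl = rng.beta(1 + events_ctl, 1 + n - events_ctl, 100_000)

p_benefit = np.mean(post_trt < post_ctl)                   # P(treatment rate lower)
p_similar = np.mean(np.abs(post_trt - post_ctl) < 0.005)   # P(rates within 0.5%)

print(f"P(benefit)    = {p_benefit:.2f}")
print(f"P(similarity) = {p_similar:.2f}")
```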
Dear Dr Mandrola and Prof Harrell. I think the point here is overoptimistic effect estimation. People fail to translate biological plausibility into clinical relevance in multi-disease, multi-treatment scenarios. This has downstream consequences the statistician couldn't see coming. It contaminates Bayesian reasoning as well. As Professor Feinman states above, the medical question comes first, and I think we clinicians are failing to provide good questions. I have recently launched a Substack to discuss biological plausibility. I invite those interested in the topic.
https://thethoughtfulintensivist.substack.com/
the post: richardfeinman.substack.com/p/red-meat-and-diabetes-statistics
My primary question here, and I am a nurse with limited knowledge, is why strong anecdotal evidence cannot be used to make emergency medical decisions. This data compilation on IVM used in care homes in France came out in March of 2020. Residents being treated for scabies with IVM had an astonishingly low rate of Covid infections. If an EUA can be given for an experimental injection, why not for a long-tested and extraordinarily safe drug like IVM? This seems like sensible medicine to me.
https://www.clinmedjournals.org/articles/jide/journal-of-infectious-diseases-and-epidemiology-jide-7-202.php?jid=jide