On yesterday’s podcast, I talked with Bobby Yeh, an academic cardiologist who made a compelling case for enhancing the credibility of observational research. Please do listen. Bobby is one of the smartest people in cardiology today.
After closely reviewing the cardiac literature for the past decade or so, I have become increasingly hopeless about gleaning any useful information from non-randomized retrospective comparisons, mainly because of systematic bias.
Bobby infused me with some hope.
But then came this email. It’s from Paul Dorian, MD. Paul is a cardiologist and Professor of Medicine at the University of Toronto. With permission, I will publish his email. It is a trove of educational material.
John, thank you for your outstanding recent podcasts, and for this reference to two important and thoughtful papers on observational studies. (Editor’s note: links are in the show notes from yesterday’s podcast.)
While I agree with the general points made by both authors, and with their description of the confusion in the community between the usual (unarticulated) purpose of observational studies and how they are reported (either not acknowledging the “causal intent” or using weak terms such as “association”), I think the general problem is summarized by Yogi Berra:
“In theory there is no difference between theory and practice; in practice there is.”
I am very concerned that, even as we strive as a community to do better observational studies, taking into account the important concepts and methods discussed in these papers, there will be an implicit belief that striving for causation means that epidemiological studies will actually get us to the Holy Grail. This is most often a fool’s errand in the real world.
The reasons are multiple, but I will articulate a few:
Most epidemiological studies are done on large data sets that were not collected for the purpose of research. This means that the data on which the observations are based are unreliable, independent of how they are analyzed.
Administrative data sets are often collected for the purposes of billing, costing, or administrative organization, and thus are extremely vulnerable to upcoding and to intentional or accidental omissions. In universal healthcare systems, where these factors are less at play, administrative data sets are completed and compiled by professional coders, who rely on the medical record. I think we can all agree that the medical record is an unreliable reflection of what the caregivers were actually thinking and what their ultimate diagnosis was. If medical practitioners cannot decide whether a small troponin rise in a patient with myocarditis and non-critical coronary disease was due to an infarct, how can coders possibly make an informed decision about it?
More pertinent to your recent This Week in Cardiology podcast: patients with monomorphic ventricular tachycardia and established coronary disease often have a small troponin rise and ST-segment changes on their initial post-cardioversion ECG. How is a professional coder to decide whether this event was an MI? I have not investigated this question, but I would wager that administrative records often indicate an “MI” in this setting, although most of us would not consider this type of event a true myocardial infarction, or at least not the type that requires urgent angiography and coronary revascularization.
A second major problem with administrative data is that they fail to account for the severity of underlying diseases or comorbidities, as opposed to their mere presence or absence. This is a common, probably insurmountable, problem with comorbidities such as hypertension, diabetes, COPD, etc.
In these cases, not only is the diagnosis completely arbitrary on the part of the clinician, but the severity of the condition has an extremely strong influence on its contribution to the outcomes usually measured.
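(Editor’s note: to make this concrete, here is a minimal simulation sketch in Python; every number in it is invented purely for illustration. A binary “diabetes” flag is recorded, the underlying severity score is not, and the treatment truly does nothing.)

```python
# Toy simulation: adjusting for a binary "diabetes: yes/no" flag while
# unmeasured severity drives both treatment choice and outcome.
# All numbers are invented for illustration only.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

severity = rng.uniform(0, 1, n)  # true disease severity (never recorded)
has_dm = severity > 0.3          # the coded flag records presence only

# Sicker patients are less likely to receive the treatment...
treated = rng.uniform(0, 1, n) < 0.7 - 0.4 * severity

# ...and more likely to die. The treatment itself has NO effect.
death = rng.uniform(0, 1, n) < 0.05 + 0.25 * severity

def risk_difference(mask):
    """Death rate in treated minus untreated, within the given subgroup."""
    return death[mask & treated].mean() - death[mask & ~treated].mean()

# Naive comparison: the (useless) treatment looks protective.
print("naive risk difference:", risk_difference(np.ones(n, dtype=bool)))

# "Adjusted" within strata of the binary flag: the bias shrinks but
# persists, because severity still varies within each stratum.
print("within diabetes=0:", risk_difference(~has_dm))
print("within diabetes=1:", risk_difference(has_dm))
```

(The stratified contrasts stay negative even though the true effect is zero, because the yes/no flag carries no information about severity.)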
Another problem, which is possibly surmountable, is the conflation of what one can call “patient outcomes” with “doctor outcomes.” For example, mortality is a patient outcome, provided it can be reliably ascertained. This is not always straightforward in clinical data sets, since deaths out of hospital may be recorded in vital-status databases but not in hospital or healthcare-system databases, especially if patients move.
Outcomes such as rehospitalization or revascularization, on the other hand, are doctor outcomes, heavily influenced by physician biases, economic factors, cultural factors, insurance status, etc.
I would argue that these biases are even more at play in observational studies than in clinical trial settings.
Well-intentioned and sophisticated methods of correcting for bias by indication, such as propensity analysis, are seriously hampered by the assumption that physicians are consistent (within and between doctors) in their treatment decisions. We know this not to be the case.
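(Editor’s note: a minimal sketch of this point, again in Python with invented numbers. The propensity model sees only a recorded covariate, but treatment choice also depends on an unrecorded, physician-specific judgment that tracks prognosis; that is one way inconsistent decisions enter the data. The treatment truly does nothing.)

```python
# Toy sketch: propensity-score stratification when treatment choice also
# depends on an unmeasured, physician-specific judgment of prognosis.
# Invented numbers; the treatment truly has no effect.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 100_000

x = rng.normal(size=n)  # recorded covariate (say, age or ejection fraction)
u = rng.normal(size=n)  # the doctor's gestalt of frailty: never recorded

# Treatment depends on both x and u; u also raises the risk of death.
treated = rng.uniform(size=n) < 1 / (1 + np.exp(-(0.8 * x - 1.0 * u)))
death = rng.uniform(size=n) < 1 / (1 + np.exp(-(-2.0 + 0.5 * x + 1.0 * u)))

# The propensity model can only use the recorded covariate x.
X = x.reshape(-1, 1)
ps = LogisticRegression().fit(X, treated).predict_proba(X)[:, 1]

# Stratify on propensity quintiles and average the within-stratum contrasts.
strata = np.digitize(ps, np.quantile(ps, [0.2, 0.4, 0.6, 0.8]))
diffs = [death[(strata == s) & treated].mean()
         - death[(strata == s) & ~treated].mean() for s in range(5)]
print("propensity-adjusted risk difference:", np.mean(diffs))
print("true risk difference: 0 (the treatment does nothing)")
```

(The adjusted estimate stays firmly negative: no amount of modeling on the recorded covariate can recover what the physician saw but never wrote down.)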
Another commonly used source of data is patient-reported data, such as alcohol consumption, exercise habits, and dietary habits.
These are not only weak approximations of the actual habits individuals have over time; in many cases the measures used are known to be systematically biased. The alcohol-consumption and exercise-habit studies are particularly relevant. There are credible observations that alcohol consumption is systematically underestimated in observational studies, and that the underestimation is asymmetric, with more underestimation at moderate or high intake. As a consequence, the dose-response relationship of alcohol to health outcomes, even if the relationship is “causal,” will be systematically incorrect. Studies comparing self-reported activity and exercise with movement objectively measured by accelerometers show a similar systematic bias, in this case overestimation, which is greater in certain specific populations.
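(Editor’s note: a toy illustration, with invented numbers, of the alcohol point. True risk rises linearly with true intake, heavier drinkers under-report by a larger fraction, and the slope estimated against reported intake comes out systematically wrong.)

```python
# Toy sketch: asymmetric under-reporting of alcohol intake distorts the
# estimated dose-response slope. All numbers invented for illustration.
import numpy as np

rng = np.random.default_rng(2)
n = 100_000

true_drinks = rng.gamma(shape=2.0, scale=5.0, size=n)  # true drinks/week

# Heavier drinkers under-report by a larger fraction (approaching 50%).
reported = true_drinks * (1 - 0.5 * true_drinks / (true_drinks + 10.0))

# True model: risk rises 0.4 units per true drink per week, plus noise.
risk = 0.4 * true_drinks + rng.normal(scale=2.0, size=n)

def slope(x, y):
    return np.polyfit(x, y, 1)[0]  # least-squares dose-response slope

print("slope vs true intake:    ", slope(true_drinks, risk))  # ~0.40
print("slope vs reported intake:", slope(reported, risk))     # steeper
```

(Because the heaviest drinkers report the smallest fraction of their intake, the harm they suffer gets attributed to lower reported doses, and the apparent dose-response curve is steeper than the true one.)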
Unless the data on which the analyses and inferences are based are collected for the purpose of research, rather than for some other reason, and unless subjectively established data are validated for accuracy, precision, and reliability, no analytic legerdemain can compensate for these major limitations.
The long and unfortunate history of observational data on dietary constituents and supplements (vitamins, fish oils, chocolate, etc.), for which persuasive epidemiologic research suggested causal connections that were eventually and thoroughly refuted by randomized, blinded clinical trials, serves as a warning about the risks of introducing discussion of causal effects when only associations are reported.
The arguments advanced in favor of forthrightness in articulating the purposes of epidemiological studies, and of rigor and transparency in their methods, are extremely well supported. However, I worry about “interpretation creep,” whereby careful discussions of the strengths, limitations, and design of observational studies will be spun into causal inferences, whether articulated or not.
By all means, let us continue doing observational studies, since they give us important insights into what is actually happening, and since they allow the formation of important and testable hypotheses.
Unfortunately, in most observational studies, hypotheses are all we’re going to get.
Paul
We at Sensible Medicine are very excited to get this kind of interaction regarding medical science. It is our goal. Thanks for your support. It has been amazing.
Next week I am going to show you what a well-done medical study looks like. It will be a positive, upbeat view of medical science. JMM.
Well put! One of the worst parts of current practice is the charting/coding involved, which is so forced, so artificial, that there is no way you can make a reliable study out of chart codes. For the non-doctors out there, imagine if you were required to fit every single interaction with every unique individual into a prefabricated mold to make some bureaucrat happy. So that, say, watching a stupid TikTok video your annoying coworker shows you and watching the video of your child’s dance recital both have to be officially recorded as “encounter for watching screens.” Now imagine doing stupid, inhuman coding like that all day, every day, and then having another bureaucrat who’s never met you think he can collate it all into a meaningful study! Madness.
I know we will never get rid of this stupid charting waste of time, because lawyers, but my dream is that some day we at least succeed in making lawyers’ lives as stupidly tedious as ours: prohibit them from charging any clients for their time unless they precisely document all their legal reasoning in codes taken from a technical manual written by a robot. It won’t accomplish anything worthwhile or improve the legal profession, but it will be sweet, sweet revenge. :)
This is kind of discouraging, but the good news is that, unless you're a researcher in search of hypotheses to investigate, you can cut WAY back on your medical literature reading! :)