I love case control studies.
When done well, they are simple, relatively easy-to-perform studies that can be hugely impactful, usually by suggesting that an exposure is harmful. Case control studies have saved millions of lives, demonstrating the adverse effects of substances such as tobacco smoke, DES, and dexfenfluramine. Case control studies can also mislead us. A poorly chosen control group or a sprinkling of recall bias can lead to wildly misleading results. This balance of the potential for knowledge or hornswoggling is what makes reading case control studies fun. The articles that report the results of case control studies are also usually short – just like this week’s post.
Design
Patients who have a disease and a group of otherwise similar people who do not have the disease are selected. The investigator then looks retrospectively to determine the frequency of exposure in the two groups.
The advantages of case control studies include that they can be performed quickly and inexpensively. Because they are run retrospectively, they are the only feasible method for studying rare outcomes or those with a long lag time between exposure and outcome. An RCT or cohort study would need to be prohibitively large and long to reach the same conclusion. Like cohort studies, case control studies are generally unencumbered by ethical concerns.
The disadvantages of case control studies are obvious. They often rely on recall or (sometimes unreliable) medical records to determine past exposure. They are prone to confounding as cases and controls often differ with respect to multiple exposures and demographics. And, of course, case control studies cannot establish incidence.
Selection bias
In case control studies, cases and controls almost always differ with respect to potential determinants of outcome. Researchers generally must work with the cases they have -- the cases often have a rare disease -- but they are at liberty to choose controls. The best control is a person who would have been enrolled as a case in the study had they developed the outcome of interest. As in cohort studies, researchers use restriction, matching, and adjustment to manage confounding.
Measurement bias
Measurement bias occurs in case control studies when patients in one group have a better chance of having their exposure and/or outcome detected than those in the comparison group.
Bias/confounding regarding exposure:
Recall bias occurs when the presence of the outcome affects a subject’s recollection of the exposure. Imagine you have recently suffered a miscarriage. You might be more apt to remember (and report) certain exposures than a woman who carried her pregnancy to term. Recall bias can be controlled by confirming exposure with objective records.
“Nesting” your case control study in a prospective study is a nice way of getting around recall bias. Here data is collected prospectively but analyzed retrospectively.
One of my favorite examples of a nested case control study is this study by Paul Ridker and colleagues on CRP as a cardiovascular risk factor. This case control study was nested within an RCT. The researchers capitalized on the fact that participants in the randomized trial had blood drawn and stored before they were randomized to aspirin and/or beta carotene or placebo. The participants were then followed for the onset of a vascular event (MI, CVA, VTE). Cases (participants with a vascular event during follow-up) and controls (those without one) were compared with respect to their exposure to an elevated CRP in the initial blood specimen. Brilliant, right?
Confounding by indication occurs when the presence of the outcome directly affects the exposure. Here is an example. Patients with anginal chest pain in the emergency room are often given the following cocktail of medications: nitroglycerin, oxygen, aspirin, beta-blockers, and morphine. A case control study might show that among ER patients with angina, those exposed to morphine are more likely to die. This might be confounding by indication as severe chest pain would call for morphine but also predict a higher risk of death.
Lastly, if data gatherers are not blinded to whether a participant is a case or a control (and if they know the study’s hypothesis), the participant’s status might affect the recording of the exposure.
Biases/confounding regarding outcome:
You must be sure that controls have the same opportunity for diagnosis as do cases. This avoids the possibility that there are cases hidden in your control group. This is usually not much of a problem since case control studies generally deal with rare diseases.
The Numbers
If you were doing a prospective cohort study (or RCT), you would follow exposed and unexposed (treated and untreated) patients and determine what proportion of each group developed the outcome. Using the standard 2x2 notation (A = exposed with the outcome, B = exposed without, C = unexposed with the outcome, D = unexposed without):
Proportion of exposed pts with outcome = A/(A+B)
Proportion of unexposed pts with outcome = C/(C+D)
The relative risk of developing the outcome would therefore be:
RR = [A/(A+B)]/[C/(C+D)]
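If you like to see the arithmetic, here is a minimal sketch (the counts are invented purely for illustration) of the 2x2 table and the relative risk calculation:

```python
# Hypothetical cohort-style 2x2 table (counts are made up for illustration)
#                 Outcome     No outcome
# Exposed          A = 30      B = 970
# Unexposed        C = 10      D = 990
A, B, C, D = 30, 970, 10, 990

risk_exposed = A / (A + B)      # proportion of exposed patients with the outcome
risk_unexposed = C / (C + D)    # proportion of unexposed patients with the outcome
RR = risk_exposed / risk_unexposed

print(f"Risk in exposed:   {risk_exposed:.3f}")    # 0.030
print(f"Risk in unexposed: {risk_unexposed:.3f}")  # 0.010
print(f"Relative risk:     {RR:.2f}")              # 3.00
```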
In a case control study, a relative risk is meaningless because you begin by selecting a group of cases, rather than seeing how many people become cases over time. The ratio of cases (A+C) to controls (B+D) is fixed by the design, so the proportions above do not reflect true risks. However, you can determine the relative frequencies of exposure in the cases and controls by calculating the odds ratio. For those of you who don’t go to the track much, an odds is the ratio of the probability that something happens to the probability that it does not. Bear with me for a little math here.
The odds of a case being exposed = probability of a case being exposed/probability of a case being unexposed = [A/(A+C)]/[C/(A+C)] = A/C
The odds of a control being exposed = probability of a control being exposed/probability of a control being unexposed = [B/(B+D)]/[D/(B+D)] = B/D
The odds ratio (OR) = odds of a case being exposed/odds of a control being exposed
OR = {[A/(A+C)]/[C/(A+C)]}/{[B/(B+D)]/[D/(B+D)]} = (A/C)/(B/D)
Doing a little algebra (the A+C and B+D terms cancel), this simplifies to AD/BC.
In the end, you can think of an odds ratio as being equivalent to a relative risk because in case control studies the diseases are rare, so A<<B and C<<D, which means A+B ≈ B and C+D ≈ D, so…
RR = [A/(A+B)]/[C/(C+D)] ≈ (A/B)/(C/D) = AD/BC = OR
An odds ratio greater than 1 indicates increased risk, and an OR less than 1 indicates protection. Thus, the odds ratio can be interpreted like a relative risk. The odds ratio is approximately equal to the relative risk when the incidence of the disease is low (less than about 1 in 100 among the unexposed). Because of this, the results of case control studies are sometimes reported as relative risks, even though they are really odds ratios.
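To see the rare-disease approximation with numbers, here is a short sketch using the same made-up counts as above; it computes the odds ratio as AD/BC and compares it with the relative risk:

```python
# Same hypothetical counts as above (rare outcome, so A << B and C << D)
A, B, C, D = 30, 970, 10, 990

odds_case_exposed = A / C       # odds of exposure among cases
odds_control_exposed = B / D    # odds of exposure among controls
OR = odds_case_exposed / odds_control_exposed   # equals A*D / (B*C)

RR = (A / (A + B)) / (C / (C + D))

print(f"Odds ratio:    {OR:.2f}")   # 3.06
print(f"Relative risk: {RR:.2f}")   # 3.00 -- close, because the outcome is rare
```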
Some of my favorite little gems. N.B., a few of the conclusions reached in these studies are dead wrong. I’ll leave it to you to identify them.
Herbst AL, et al. Adenocarcinoma of the vagina. Association of maternal stilbestrol therapy with tumor appearance in young women. NEJM 1971;284:878-81.
MacMahon B, et al. Coffee and Cancer of the Pancreas. NEJM 1981;304:630-633.
Cramer DW, et al. Ovarian Cancer and Talc: A Case-Control Study. Cancer 1982;50:372-376.
Psaty BM, et al. The Risk of Myocardial Infarction Associated with Antihypertensive Drug Therapies. JAMA 1995;274:620-625.
Abenhaim L, et al. Appetite-suppressant drugs and the risk of primary pulmonary hypertension. International Primary Pulmonary Hypertension Study Group. NEJM 1996;335:609-16.
Ridker PM, et al. Inflammation, aspirin, and the risk of cardiovascular disease in apparently healthy men. NEJM 1997;336:973-9.
Teran-Santos J, et al. The association between sleep apnea and the risk of traffic accidents. Cooperative Group Burgos-Santander. NEJM 1999;340:847-51.
Cnattingius S, et al. Caffeine intake and the Risk of First Trimester Spontaneous Abortion. NEJM 2000;343:1839-1845.
Gawande AA, et al. Risk factors for retained instruments and sponges after surgery. NEJM 2003;348:229-35.
D’Souza G, et al. Case–Control Study of Human Papillomavirus and Oropharyngeal Cancer. NEJM 2007;356:1944-56.
Park-Wyllie LY, et al. Bisphosphonate Use and the Risk of Subtrochanteric or Femoral Shaft Fractures in Older Women. JAMA 2011;305:783-789.
Billioti de Gage S, et al. Benzodiazepine use and risk of Alzheimer’s disease: A case-control study. BMJ 2014;349:g5205.
Users’ Guides questions for articles about harm
1) Were there clearly identified comparison groups that were similar with respect to important determinants of outcome, other than the one of interest?
2) Were the outcomes and exposures measured in the same way in the groups being compared?
3) Was follow-up sufficiently long and complete?
4) Is the temporal relationship correct?
5) Is there a dose-response gradient?
6) How strong is the association between exposure and outcome?
7) How precise is the estimate of risk?
8) Are the results applicable to my practice?
9) What is the magnitude of the risk?
10) Should I attempt to stop the exposure?