Critical Appraisal of the DANCAVAS study of Cardiac Screening
This Study of the Week column is a tour de force of critical appraisal of one of the most important trials presented at the recent ESC conference.
Cardiologist and Associate Professor Andrew Foy, MD, from Penn State University, offers his critical appraisal of the recently published DANCAVAS trial of cardiac screening.
Andrew is a friend and mentor. He has greatly influenced my approach to evidence. We have co-authored academic papers together. Andrew is a true scientist. He not only publishes original work but also spends a great deal of time critically appraising science—a rare combination.
His take on DANCAVAS is different from mine. That’s why we are delighted to publish it here. Civil disagreement is one of the core purposes of Sensible Medicine.
You should care about DANCAVAS because it is a rare randomized trial of screening that measured the only outcome that matters when we attempt to help people who complain of nothing—overall mortality. JMM
By Andrew Foy:
Why I think DANCAVAS was a null trial that will not change my practice
Five-year results of the Danish Cardiovascular Screening (DANCAVAS) trial were presented in Barcelona at the annual meeting of the European Society of Cardiology. As you might anticipate, with any large screening trial, the results created buzz and caused controversy. From my perspective, its outcome will not change my practice and I explain why in this column.
Description of the Trial:
The trial enrolled Danish men aged 65 to 74 years, who were identified from the Danish National Central Person Registry and randomized to receive an invitation to undergo screening (invited group) or not (control group). The control group was unaware of the trial-group assignment and their consent was not required.
The primary objective of the trial was to determine whether inviting men to undergo “comprehensive, advanced cardiovascular screening” would reduce the incidence of death from any cause.
The screening program used 7 tests which were intended to identify 11 pre-specified targets. These included: 1) a coronary artery calcium (CAC) score to identify a score greater than the median sex- and age-specific score; 2) an ECG to identify atrial fibrillation; 3) a non-contrast CT scan to identify aneurysms involving the aorta; 4) standard blood pressure measurement to identify high blood pressure; 5) ankle brachial blood pressure measurement to identify peripheral artery disease (PAD); 6) glycated hemoglobin testing to identify an A1c ≥ 6.5%; and 7) measurement of total cholesterol to identify a level ≥ 309 mg/dL.
In each case of a positive finding, a pre-specified action plan was recommended. For all aneurysms, general cardiovascular prevention was advised which included use of aspirin 75 mg daily and atorvastatin 40 mg daily, smoking cessation and a diet low in saturated fats along with surveillance CT testing at an interval determined by size as well as surgical evaluation in cases where the size exceeded a pre-specified cut-off (e.g., ≥55 mm for an ascending aortic aneurysm). For elevated CAC, aspirin and atorvastatin were recommended as per above, and cardiology referral was provided if angina was suspected. For atrial fibrillation, referral for cardiac evaluation and initiation of anticoagulation was advised. All cases of elevated blood pressure, A1c and hypercholesterolemia were provided a referral to a general practitioner for confirmation and treatment.
The analysis of the primary outcome, all-cause death, was performed according to the intention-to-screen principle. Pre-specified subgroup analyses were performed according to age (<70 or ≥70 years); history of comorbid conditions (yes or no) including cardiovascular disease, stroke, myocardial infarction, heart failure, peripheral artery disease, aortic aneurysm, hypertension, and diabetes; and current use of lipid-lowering drugs (yes or no). [A]
Secondary outcomes included all stroke as well as ischemic and hemorrhagic strokes, myocardial infarction, amputation due to vascular disease, aortic dissection and aortic rupture. The authors report cost-effectiveness in a separate paper. Additional outcomes, which the authors refer to as “explanatory outcomes”, were also collected. These were outcomes that could possibly explain differences in the primary and secondary outcomes, if present, and included the percentage of participants who underwent screening, initiation and adherence to preventive medications and elective repair of aortic aneurysm. Safety outcomes were defined as major bleeding, cardiac revascularization, peripheral vascular revascularization, aortic repair, incident cancer that occurred at least 6 months after randomization, death within 30 days after cardiovascular surgery, and a change in quality of life after screening or follow-up. Additional outcomes included changes in quality of life and overdiagnosis and overtreatment; however, in my opinion, these outcomes had too many limitations and thus, do not provide sufficient information to influence my opinion of the trial so I will not discuss them further. [B]
Outcome data were mainly derived from the Danish National Patient Registry and the Danish National Prescription Registry and were censored by December 31, 2021.
The Participants:
From September 2014 through September 2017, 46,611 men underwent randomization and the final population consisted of 46,526 men (29,790 in the control group and 16,736 in the invited group). Within the invited group, 10,471 (63%) underwent screening. While baseline characteristics for the control and invited groups were nearly identical, there were significant differences between the screened and unscreened invited participants (e.g., level of education, work status, ethnicity, diabetes, etc.). [C]
Positive findings among participants in the invited group included a CAC score >400 AU in 33%, an aneurysm in 15%, PAD (according to the ABI parameters used) in 12%, uncontrolled blood pressure in 9% and diabetes in 2%. Less than 1% each were found to have atrial fibrillation or severely elevated cholesterol. Overall, 37% had prevention measures, other than blood pressure treatment, initiated. It is not clear why blood pressure treatment initiated on the basis of screening is excluded from the “prevention initiated” counts. [D]
Main results: 12.6% of individuals in the invited group died compared to 13.1% of patients in the control group (HR 0.95; 95% CI 0.90-1.00). However, there does appear to be a difference in the treatment effect for patients 65-69 years compared to those 70-74 years. For those under 70, the overall death rate was 10.2% vs 11.3% (HR 0.89; 95% CI 0.83-0.96) compared to 16.0% vs 15.8% (HR 1.01; 95% CI 0.94-1.09) for those ≥70, respectively. Note that all comparative data from here will be presented as the invited vs control group. [E]
Among the secondary outcomes, the most important differences were observed for ischemic stroke (4.7% vs 5.3%; HR 0.89; 95% CI 0.81-0.96) followed by myocardial infarction (2.6% vs 2.8%; HR 0.91; 95% CI 0.81-1.03). Hemorrhagic strokes were numerically higher in the invited group but not statistically significant (0.9% vs 0.8%; HR 1.13; 95% CI 0.92-1.38). It is noteworthy that the rate of aortic dissection and rupture as well as major amputation events were basically identical between groups.
For safety outcomes, the biggest differences were for severe bleeding (6.8% vs 6.3%; HR 1.07; 95% CI 1.00-1.15) and its components of intracerebral bleeding (1.6% vs 1.4%; HR 1.15; 95% CI 0.99-1.34) and gastrointestinal bleeding (5.3% vs 5.0%; HR 1.05; 95% CI 0.97-1.14), respectively.
Among explanatory outcomes, the biggest differences were observed for initiation of antiplatelet (22.9% vs 8.3%) and lipid-lowering agents (20.7% vs 9.0%), respectively. There was no difference observed in initiation of anticoagulants, and differences in initiation of antihypertensive and antidiabetic agents were very small. Approximately 0.7% more individuals underwent elective aneurysm repairs in the invited group.
Regarding data on adherence to initiated medications, it is sufficient to say that there were minimal differences, but methodology to assess adherence was very limited. Information regarding adherence or changes in dosage of other therapies (i.e., those that would be indicated and already prescribed) was not recorded.
Summary:
The DANCAVAS trial found that invitation to undergo 7 cardiovascular screening tests reduced the absolute rate of death in Danish men 65 to 74 years old by 0.5 percentage points, and this barely missed the threshold for statistical significance. There is evidence of treatment effect heterogeneity: individuals <70 years experienced a 1.1 percentage point reduction in death, compared with a 0.2 percentage point increase for those ≥70. Differences in death may be related to reductions in stroke and heart attacks, which would be commensurate with the increased initiation of antiplatelet and lipid-lowering agents observed in the invited group.
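To make these absolute differences concrete, here is a minimal Python sketch that converts the death rates quoted in this column into absolute risk reductions and a "number needed to invite". The function names are illustrative and do not come from the trial. The inputs are rounded percentages, and the age-subgroup rates are back-calculated estimates (see footnote E), so the output is a rough point estimate that ignores the confidence intervals.

```python
def absolute_risk_reduction(control_pct, invited_pct):
    """Absolute risk reduction in percentage points (control minus invited)."""
    return control_pct - invited_pct


def number_needed_to_invite(arr_pct):
    """Invitations per death averted; None when there is no reduction."""
    if arr_pct <= 0:
        return None
    return 100.0 / arr_pct


# (invited %, control %) death rates as quoted in this column
death_rates = {
    "all men 65-74": (12.6, 13.1),
    "65-69 years": (10.2, 11.3),  # back-calculated estimate
    "70-74 years": (16.0, 15.8),  # back-calculated estimate
}

for label, (invited, control) in death_rates.items():
    arr = absolute_risk_reduction(control, invited)
    nni = number_needed_to_invite(arr)
    nni_text = f"about {nni:.0f}" if nni is not None else "not defined (no reduction)"
    print(f"{label}: ARR = {arr:+.1f} points, number needed to invite {nni_text}")
```

On the overall point estimate this works out to roughly 200 invitations per death averted over the follow-up period, with the caveat that the confidence interval for the hazard ratio reaches 1.00.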
In response to these results, several physician commenters whom I respect tremendously have opined that these results represent a win for cardiovascular screening, at least in those under 70. In defense of their case, they cite the high bar of all-cause death and analysis based on the intention-to-screen principle that were used in the study. I agree that these should be the standards for determining the efficacy of screening tests in randomized controlled trials.
The all-cause death signal in DANCAVAS is important. Very few commonly used screening tests have demonstrated reductions in all-cause mortality. As an example, breast cancer screening with biennial mammography has a grade B recommendation from the US Preventive Services Task Force for women aged 50 to 74 years, and entire centers within major health systems across the US are dedicated to providing it. However, recommendations for performing it are driven by evidence from RCTs that find that screening reduces breast cancer mortality – not overall mortality. For women aged 50-59 and 60-69 years, results from RCTs show statistically significant relative risk reductions in breast cancer mortality of 14% and 33%, but there are no commensurate reductions in all-cause death, with nearly identical death rates for screening compared to control groups. [2] As a result of these findings for breast cancer screening and other screening tests, debate about the importance of reductions in cause-specific mortality and the utility of these tests persists with no clear answers. [3]
The Positives:
DANCAVAS investigators should be commended for attempting to definitively address the efficacy of cardiovascular screening by using all-cause death as the primary endpoint of the trial. However, it was perhaps not as ambitious as some commenters have suggested. Sticking with the example above, according to the American Cancer Society, the chance a woman will die of breast cancer is about 2.6% or 1 in 39. [4] In contrast, cardiovascular disease accounts for 32% of all global deaths, of which 85% are due to heart attack or stroke. [5] This fraction would be substantially higher in men aged 65 to 74. The attributable fraction of death from breast cancer is very low compared to cardiovascular disease and thus, using all-cause death as the bar for determining efficacy for a “comprehensive, advanced cardiovascular screening program” is appropriate.
Another area where I think the investigators should be commended is on the attention taken to assess a variety of secondary endpoints, safety outcomes, and explanatory outcomes. The investigators are careful to point out that results from these outcomes should be viewed as hypothesis-generating only but their presence aids in the interpretation of the primary result.
Three Major Concerns:
First, I am not confident that a meaningful clinical effect exists and that the findings were not due to chance. In my opinion, treatment effect heterogeneity between age groups undermines confidence in the overall effect. It is not obvious why cardiovascular screening would be beneficial in those 65 to 69 but not 70 to 74 years old. It could be due to a higher competing risk of death from non-cardiovascular causes in older individuals, but information on causes of death is not provided and thus, it is not possible to drill down further on this hypothesis. I do not think the fraction of cardiovascular death is much different between these ages, but perhaps the ability to reduce it through risk factor modification wanes after the age of 69. While that is plausible, it seems like a bit of a stretch.
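One rough way to gauge how different the two age-group estimates are is to compare their log hazard ratios directly. The Python sketch below is not the trial's own analysis. It is a standard approximation that recovers each standard error from the published 95% CI and assumes the two subgroups are independent. The function names are illustrative, the inputs are the rounded hazard ratios quoted above, and the resulting p-value makes no allowance for the number of subgroups examined.

```python
import math


def log_hr_and_se(hr, lower, upper, z_crit=1.96):
    """Log hazard ratio and its standard error recovered from a 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    return math.log(hr), se


def interaction_test(subgroup_a, subgroup_b):
    """z-test for the difference between two independent log hazard ratios."""
    log_a, se_a = log_hr_and_se(*subgroup_a)
    log_b, se_b = log_hr_and_se(*subgroup_b)
    diff = log_a - log_b
    se_diff = math.sqrt(se_a**2 + se_b**2)
    z = diff / se_diff
    # Two-sided p-value from the standard normal distribution
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return math.exp(diff), z, p_two_sided


# (HR, lower 95% limit, upper 95% limit) as quoted in this column
younger = (0.89, 0.83, 0.96)  # men 65-69 years
older = (1.01, 0.94, 1.09)    # men 70-74 years

ratio, z, p = interaction_test(younger, older)
print(f"ratio of hazard ratios = {ratio:.2f}, z = {z:.2f}, two-sided p = {p:.3f}")
```

Even a nominally small p-value here would not explain why the effect should differ by age, and it is only one of several subgroup comparisons.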
I am also skeptical that a 0.5% absolute reduction in all-cause death could be due to cardiovascular screening when there is only a 0.5% absolute difference in stroke and a 0.2% absolute difference in myocardial infarction and no differences in aortic dissection or rupture and major amputation between groups. Furthermore, differences in stroke and myocardial infarction were counterbalanced by a 0.5% absolute increase in severe bleeding.
My second, and more important, contention with the trial involves 2 specific design features. The first is the inclusion of patients with established cardiovascular disease and the second is combining screening tests, some of which are standard (especially for patients with established disease) and others that are more exploratory, into a single screening protocol. This makes it impossible to sort out the effects of the different tests.
Even if we were to accept a small difference in the primary endpoint it certainly does not seem to be driven equally by the various tests. One could easily make a case that CAC or cholesterol testing alone drove the difference in initiation of lipid-lowering therapy and that the other tests provided little-to-no additional benefit. If this were true, it would be reasonable to argue that measurement of cholesterol and blood pressure alone, in conjunction with a simple risk calculator, could be expected to achieve similar results.
My final concern, and something the authors highlight nicely in the discussion, is the issue of external validity. Denmark is an ideal country to show benefit from a national screening program and yet, this trial could not definitively do that. I will not belabor the point, but in the US system I would be far more concerned about overtreatment and overdiagnosis due to testing cascades from abnormal screening results.
In summary, my concerns relate to 1) a small overall treatment effect that did not reach statistical significance, with unexplained heterogeneity between age groups; 2) the mortality effect relative to effects on secondary and explanatory endpoints; 3) inclusion of patients with established cardiovascular disease; 4) use of multiple tests (some standard and others exploratory) with inability to disaggregate individual test effects; and 5) external validity when the program is used in other health systems. Together, these reduce my confidence in the results and are why I will not be changing my current practice.
References
1. Lindholt JS, Søgaard R, Rasmussen LM, Mejldal A, Lambrechtsen J, Steffensen FH, Frost L, Egstrup K, Urbonaviciene G, Busk M, Diederichsen ACP. Five-Year Outcomes of the Danish Cardiovascular Screening (DANCAVAS) Trial. N Engl J Med. 2022 Aug 27. doi: 10.1056/NEJMoa2208681. Epub ahead of print. PMID: 36027560.
2. Nelson HD, Fu R, Cantor A, Pappas M, Daeges M, Humphrey L. Effectiveness of Breast Cancer Screening: Systematic Review and Meta-analysis to Update the 2009 U.S. Preventive Services Task Force Recommendation. Ann Intern Med. 2016 Feb 16;164(4):244-55. doi: 10.7326/M15-0969. Epub 2016 Jan 12. PMID: 26756588.
3. Prasad V, Lenzer J, Newman DH. Why cancer screening has never been shown to "save lives"--and what we can do about it. BMJ. 2016 Jan 6;352:h6080. doi: 10.1136/bmj.h6080. PMID: 26740343.
4. https://www.cancer.org/cancer/breast-cancer/about/how-common-is-breast-cancer.html
5. https://www.who.int/news-room/fact-sheets/detail/cardiovascular-diseases-(cvds)
Footnotes
A. It is odd to me that participants with certain comorbid conditions would be included in this trial since some of the screening tests are part of the traditional standard of care for those conditions (e.g., blood pressure, cholesterol and A1c testing in participants with cardiovascular disease). In my mind, this does not fit with my sense of “screening.”
B. There seems to be overlap in what the authors refer to as explanatory outcomes and safety outcomes (e.g., initiation of medical therapy is explanatory whereas aortic repair is considered safety). Major bleeding fits squarely into the safety category but the others do not, in my opinion.
C. Differences between the screened and unscreened invited participants highlight the importance of assessing outcomes based on the intention-to-screen principle rather than the as-screened approach; otherwise, it would be an observational analysis and the results would likely be heavily confounded.
D. It’s unclear why CAC score >400 AU was presented, as this would seem to be well above the median for men 65 to 74, which was the prespecified threshold for a positive CAC finding. According to the MESA calculator, the 50th percentile score for white men, 65 to 74 years, ranges from 71 to 223 AU.
E. The death rate estimates among age-based subgroups were back-calculated as the exact numbers of individuals in each age group were not provided. The estimate is based on the number of events, which is provided. The total number of patients per subgroup was estimated by multiplying the overall number of patients for each group (invited and control) by 59.4% to arrive at an N for the 65-69 year group and by 40.6% to arrive at an N for the 70-74 year group.
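As a sanity check on the back-calculation in footnote E, the share-weighted average of the two subgroup death rates should land close to the overall rate reported for each arm. The short Python sketch below uses only the percentages quoted in this column. The function and variable names are illustrative, and the check assumes the same 59.4% / 40.6% age split applies to both arms, as the footnote does.

```python
SHARE_65_69 = 0.594  # assumed share of each arm aged 65-69 (footnote E)
SHARE_70_74 = 0.406  # assumed share of each arm aged 70-74 (footnote E)


def pooled_rate(rate_younger_pct, rate_older_pct):
    """Share-weighted average of the two subgroup death rates, in percent."""
    return SHARE_65_69 * rate_younger_pct + SHARE_70_74 * rate_older_pct


invited_pooled = pooled_rate(10.2, 16.0)  # back-calculated invited-group rates
control_pooled = pooled_rate(11.3, 15.8)  # back-calculated control-group rates

print(f"invited: pooled {invited_pooled:.1f}% vs reported 12.6%")
print(f"control: pooled {control_pooled:.1f}% vs reported 13.1%")
```

Both pooled figures round to the overall rates given in the main results, which suggests the estimated subgroup denominators are at least internally consistent.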
Layperson here. I found this a helpful sentence in articulating why all cause mortality is a valid endpoint in this discussion. Thank you.
“The attributable fraction of death from breast cancer is very low compared to cardiovascular disease and thus, using all-cause death as the bar for determining efficacy for a “comprehensive, advanced cardiovascular screening program” is appropriate.”
Very nice review. Like most studies with multiple endpoints, DANCAVAS analyzed the endpoints separately. This fails to answer the global question of "did screened persons fare better?". One should recognize that some endpoints are worse than others, and consider re-analyzing the data to answer the question "was the worst event that happened to a screened person less bad than the worst event happening to a non-screened person?". This can be done with an ordinal outcome analysis. Secondly, the analysis by age group is not valid unless one thinks that magic happens on a specific birthday. Age should only be thought of as a continuous variable, and its effect assessed with a smooth interaction analysis. Finally, continuously scoring the severity of the results of the screening tests, rather than considering the screening result as all-or-nothing, would have given some interesting results.