Evidence Appraisal Is NOT like a Rorschach Test
Let me take a break from specific study review to tell you about a comment I heard this week.
I was involved in a conversation about the recent trials of left atrial appendage closure. A colleague remarked that how one views these trials is like a Rorschach test.
His next sentences described the certainty that the procedure will increase in volume. Another colleague in the conversation added, “read the tea leaves.”
The Rorschach is a psychological projective test developed by Swiss psychiatrist Hermann Rorschach in 1921. It consists of 10 standardized inkblot cards — symmetrical, ambiguous images — that a subject is asked to interpret. The idea is that because the stimuli have no inherent meaning, what you see reveals something about your perception, personality, unconscious concerns, and cognitive style. (Italics mine).
There’s no single “correct” answer to a Rorschach card. Two people can look at the exact same image and describe something entirely different — and both responses are valid, but telling.
What I take from the Rorschach test analogy is that the objective data is the same for everyone but what you (or I) take from it says as much about priors and biases as it does about the data itself.
While this comment is eye-opening, I disagree with it strongly.
Evidence is evidence.
When we do a randomized clinical trial, we measure outcomes in the two arms. Assuming proper randomization and trial procedures, we come up with an estimate of effect size between the two arms. This is not an inkblot.
I also disagree with the idea that priors are a negative bias. On the contrary, trials are like medical tests: they simply strengthen or weaken our prior beliefs.
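To make the medical-test analogy concrete, here is a minimal sketch of how a result updates a prior belief using odds and a likelihood ratio. The prior probabilities and the likelihood ratio are hypothetical numbers chosen only for illustration; they are not estimates from any LAAC trial, and the function name is my own.

```python
def update_probability(prior_prob: float, likelihood_ratio: float) -> float:
    """Return the posterior probability after applying a likelihood ratio.

    Works the same way as interpreting a diagnostic test:
    posterior odds = prior odds * likelihood ratio.
    """
    prior_odds = prior_prob / (1.0 - prior_prob)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical priors that a therapy is beneficial, held by two readers.
skeptic_prior = 0.20
optimist_prior = 0.80

# Hypothetical likelihood ratio below 1: the trial result is less likely
# if the therapy works than if it does not.
unfavorable_lr = 0.5

skeptic_posterior = update_probability(skeptic_prior, unfavorable_lr)
optimist_posterior = update_probability(optimist_prior, unfavorable_lr)

print(f"skeptic:  {skeptic_prior:.2f} -> {skeptic_posterior:.2f}")
print(f"optimist: {optimist_prior:.2f} -> {optimist_posterior:.2f}")
```

Both readers end up at different places, but the same evidence moves both of them in the same direction. That is the difference between a test result and an inkblot.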
The Specifics of Left Atrial Appendage Closure Evidence
Two large LAAC trials, CLOSURE AF and CHAMPION AF, have published results recently.
The non-industry-funded CLOSURE AF trial (N = 912) found that LAAC was not noninferior to best medical therapy (mostly DOACs). The rate of a first primary endpoint event (stroke, systemic embolism, major bleed, CV death) was 26% higher in the LAAC arm. This 3.5% increase in absolute terms did not come close to meeting noninferiority. In fact, technically speaking, LAAC was inferior to best medical therapy. The main driver of the result was a higher rate of major bleeding in the LAAC arm.
The industry-funded CHAMPION AF trial (N = 3000) sits in NEJM as a “positive” trial for LAAC. But its positivity relies solely on flawed choices of endpoints and noninferiority assessments.
For efficacy, the primary endpoint of stroke, systemic embolism, and CV death occurred in 5.7% vs 4.8%, LAAC vs DOAC, respectively. The absolute difference of 0.9% higher in the LAAC arm had a 95% confidence interval of -0.8% to 2.6%. Since the worst-case scenario of 2.6% was less than the declared noninferiority margin of 4.8%, LAAC was declared noninferior.
However, the margin of 4.8% was based on a 12% expected rate in the DOAC arm. A 4.8% margin on an expected 12% translates to a relative risk NI margin of 1.40, or 40% higher ((12 + 4.8) / 12). Because the observed rate came in much lower, at 4.8%, the same absolute margin would allow a relative risk margin of 2.0 ((4.8 + 4.8) / 4.8).
When you calculate the relative risk for the primary endpoint (5.7% vs 4.8%), you get 1.20 (0.87 to 1.66). Note here that 1.66 is more than 1.40. So, had the authors hewed to normal NI practice, the device would have been declared not noninferior.
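The margin arithmetic above can be checked in a few lines. This is a sketch that uses only the figures quoted in this post. The variable and function names are mine, and the relative risk confidence bound is taken as quoted rather than recomputed from event counts.

```python
# Figures quoted in the text for the CHAMPION AF efficacy endpoint (percent).
expected_doac_rate = 12.0  # event rate assumed when the margin was chosen
observed_doac_rate = 4.8   # event rate actually observed in the DOAC arm
absolute_margin = 4.8      # declared absolute noninferiority margin
abs_diff_ci_upper = 2.6    # upper 95% bound of the absolute difference
rr_ci_upper = 1.66         # upper 95% bound of the relative risk, as quoted


def implied_rr_margin(control_rate: float, margin: float) -> float:
    """Relative risk that an absolute margin permits at a given control rate."""
    return (control_rate + margin) / control_rate


planned_rr_margin = implied_rr_margin(expected_doac_rate, absolute_margin)
effective_rr_margin = implied_rr_margin(observed_doac_rate, absolute_margin)

# Noninferiority is met when the upper confidence bound sits below the margin.
noninferior_on_absolute_scale = abs_diff_ci_upper < absolute_margin
noninferior_on_relative_scale = rr_ci_upper < planned_rr_margin

print(f"RR margin implied at design:     {planned_rr_margin:.2f}")
print(f"RR margin at the observed rate:  {effective_rr_margin:.2f}")
print(f"Noninferior on absolute scale:   {noninferior_on_absolute_scale}")
print(f"Noninferior on relative scale:   {noninferior_on_relative_scale}")
```

The same data pass on the absolute scale and fail on the relative scale. The only thing that changed is that the control-arm event rate came in far below what the margin assumed.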
Most damning, however, is why the primary endpoint was higher in the LAAC arm: because ischemic strokes were higher in the LAAC arm.
That’s not even the worst manipulation of results in CHAMPION AF. Worse was their handling of safety events.
Instead of counting all major bleeding from the moment of randomization, the authors declared their primary safety endpoint to be non-procedural bleeding, including nonmajor bleeds. Of course, everyone knows that bleeding is the most common procedural complication, and patients cannot exclude this outcome. Everyone also knows that in an open-label trial, patients on anticoagulation drugs are likely to complain of more nonmajor bleeding, such as bruising, gum bleeding, etc. (This was seen in OPTION).
The topline results shown in the NEJM reveal clear superiority in this “primary safety endpoint.” But it is an utterly flawed endpoint.
The main secondary endpoint of all major bleeding was similar: 5.9% vs 6.4%, HR 0.92 (95% CI, 0.68 to 1.24). So, there was no advantage in safety from this device.
The authors hide this conclusion by testing this all-important endpoint with noninferiority. They claim LAAC is noninferior to DOAC for all major bleeding. But this too is obfuscation because if you allow your new device to be not as good in efficacy, it must be superior in safety. It was not.
On the Matter of Priors
Inherent in the comment that how you view the recent trials is like a Rorschach test is the idea that prior beliefs are akin to a bias. But this is wrongheaded. Priors are critical to interpretation of trial results.
In the case of LAAC, we know that the PROTECT AF trial did not pass FDA muster due to internal validity concerns. The FDA-mandated PREVAIL trial found higher rates of stroke in the Watchman arm, resulting in the device not meeting noninferiority in its first co-primary endpoint of stroke, systemic embolism, and CV death. We also know that the PRAGUE 17 trial of LAAC vs DOAC had too few events to draw any conclusions.
Therefore, going into CLOSURE AF and CHAMPION AF, our priors should have been quite pessimistic that LAAC offered either an efficacy or safety advantage.
That we observed neither higher efficacy (it was worse in both trials) nor superiority in safety only strengthens our pessimistic priors.
Conclusion
Rather than calling interpretation of trials, such as the new LAAC trials, a Rorschach test, I’d liken it more closely to a test of intelligence and neutrality.



The average physician would never be able to do the kind of analysis you do. He/she is oriented to "follow the science", meaning peer-reviewed papers and protocols that are supposed to have done the analysis for him/her. In this sense, I think that your colleague's comment was correct.
John is right, evidence is evidence. That is why the Rorschach has never been a truly useful tool. When it is useful, the abnormal response is obvious and achievable through simple normal interviewing.