Ben Recht:

I understand your point here, but this is also where statistics drives people a bit batty.

1) They study two endpoints: survival and absolute mortality.

2) The power calculation is based on a 25% improvement in survival, with a log-rank test as the underlying test.

3) When translated into absolute mortality, this corresponded to an expected 50% RRR.

Now, if you think a 50% RRR in absolute mortality is wildly optimistic, then you, as the consulting statistician, have to say to the study authors: "Though we decided on an RRR of 25% in terms of survival, when I reinterpret this calculation in terms of cumulative mortality, it corresponds to a 50% RRR. Is that realistic?"
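
To make that translation concrete, here is a minimal sketch of the arithmetic, assuming a 28% control-group mortality and a 14-percentage-point hypothesized improvement in survival (figures quoted later in this thread, not taken from the paper's own calculation):

```python
# Illustrative arithmetic: how an absolute improvement on the survival
# scale becomes a large relative risk reduction (RRR) on the mortality
# scale. Both inputs are assumptions borrowed from later in this thread.

control_mortality = 0.28    # assumed control-group death rate
abs_survival_gain = 0.14    # hypothesized absolute gain in survival

treatment_mortality = control_mortality - abs_survival_gain  # 0.14
rrr = (control_mortality - treatment_mortality) / control_mortality

print(f"treatment mortality: {treatment_mortality:.0%}")  # 14%
print(f"implied RRR:         {rrr:.0%}")                   # 50%
```

The same absolute difference reads as modest on the survival scale (72% vs. 86%) and dramatic on the relative-mortality scale, which is exactly where the confusion creeps in.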

Frank Harrell:

Important points to discuss pre-study. "25% improvement of survival (using a log-rank test)" involves two clashing concepts, but you are right that we need to be clear. I would say that, in general, a hazard ratio of 0.75 (a 25% reduction in relative hazard) is pretty large. A 50% reduction in RR is large too. Which one to emphasize is not clear.
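
The clash is easy to quantify: under proportional hazards, a hazard ratio does not map one-to-one onto a risk ratio. A minimal sketch, assuming proportional hazards and the 28% control-group cumulative mortality quoted elsewhere in this thread:

```python
# Under proportional hazards, treatment survival is control survival
# raised to the hazard ratio: S1(t) = S0(t) ** HR.
# The 28% baseline mortality is an assumption borrowed from this
# thread, not a figure from the paper.

hr = 0.75    # hazard ratio (25% reduction in relative hazard)
p0 = 0.28    # assumed control-group cumulative mortality

p1 = 1 - (1 - p0) ** hr    # treatment-group cumulative mortality
rrr = (p0 - p1) / p0       # RRR on the mortality scale

print(f"treatment mortality: {p1:.1%}")   # ~21.8%
print(f"implied RRR:         {rrr:.1%}")  # ~22.0%
```

At this baseline, a hazard ratio of 0.75 buys only about a 22% RRR in cumulative mortality, well short of 50%, which is one way to see why the two scales should never be conflated in a power calculation.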

Ben Recht:

Because of our conversation, I re-read the "Sample Size Calculation" section and worry we're both a bit wrong. Here's my current read:

1. A previous trial found “patients with a MELD score [Model for End-Stage Liver Disease] greater than or equal to 21 had lower survival by 18.8% compared with patients with a MELD score of less than 21” (a direct quote from the paper).

2. The investigators “hypothesized” that antibiotics would reduce *overall mortality* by 75% of the effect size from this previous trial. That is, they expected 75% of 18.8%, which is about 14 percentage points. Given that the rate of death in the control group was 28%, this meant they were hypothesizing roughly a 50% RRR for antibiotics (a drop from 28% to 14%, a risk ratio of about 2), but they don't explicitly discuss RRR.

3. They then generated their sample size calculation using a log-rank test powered to detect this hypothesized difference in mortality.

I know this is not what you would recommend they do! But I'm just quoting the paper directly here. Very bizarre. Did they not consider the implied RRR? Did they only have a budget for about 300 patients and come up with some hand-wavy reasoning in their power calculation to justify the small trial size? It is impossible to know, unfortunately. But I agree with you that this conversation should have happened before the study.
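
For what it's worth, the budget conjecture is at least arithmetically plausible. A rough sketch of the standard Schoenfeld approximation for log-rank sample sizes, assuming 1:1 allocation, two-sided α = 0.05, 80% power, and an exponential-survival translation of the hypothesized 28% vs. 14% mortality rates; none of these inputs are confirmed by the paper:

```python
import math
from statistics import NormalDist

# Schoenfeld approximation for a log-rank test with 1:1 allocation:
#   required events D = 4 * (z_alpha + z_beta)**2 / (ln HR)**2
# All inputs are assumptions reconstructed from this thread, not the
# paper's actual specification.

alpha, power = 0.05, 0.80
p0, p1 = 0.28, 0.14    # hypothesized control / treatment mortality

# Complementary log-log transform: the hazard ratio implied by these
# cumulative mortality rates under exponential survival.
hr = math.log(1 - p1) / math.log(1 - p0)          # ~0.46

z_alpha = NormalDist().inv_cdf(1 - alpha / 2)     # ~1.96
z_beta = NormalDist().inv_cdf(power)              # ~0.84

events = 4 * (z_alpha + z_beta) ** 2 / math.log(hr) ** 2  # ~52 deaths
n_total = events / ((p0 + p1) / 2)                # ~247 patients

print(f"hazard ratio:    {hr:.2f}")
print(f"required deaths: {events:.0f}")
print(f"total patients:  {n_total:.0f}")
```

That lands in the same ballpark as the roughly 300 patients mentioned above (before any inflation for dropout), though of course it says nothing about the investigators' actual intent.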

Frank Harrell:

Oh my. NEVER use an observed effect size from another trial in a power calculation. A recipe for disaster. Power calculations should always use the clinical effect you would not like to miss. This is not a data-oriented specification.
