For many people, eating and exercising are the highlights of their day. It is not surprising, then, that the media is fascinated with their impact on our health. After years of reading hundreds of studies about exercise -- and probably spending an equal number of hours biking, running, and swimming -- we have reached a simple conclusion: exercise feels good, is probably good for us, and is therefore something we all should do. This, of course, is not a surprising or terribly interesting recommendation. It would certainly not garner clicks if published every week. And you might want more information: what kind of exercise; for how long; at what time of day; with whom? These questions remain unanswered, and the answers are probably of little importance.
Why? Let’s take a brief aside into the world of automobiles. Consider someone who is debating whether to trade in a 2015 Toyota Prius for a 2022 Tesla Model 3. Imagine this person’s primary concern is minimizing her carbon footprint. How does she make the decision? Turns out, it is complicated. A Model 3 produces fewer greenhouse gas emissions per mile than a Prius, even assuming that the electricity powering the car comes from a fossil fuel plant. But this is not the entire calculation. Our environmentally minded friend must also consider the fact that she owns the Prius now and would have to purchase the Tesla. The manufacturing burden of the new car -- starting with the CO2 emitted to mine the aluminum -- will offset the environmental gains per mile driven. How many miles would she need to drive before breaking even on carbon? The calculation might get even more complicated if she factors in the driving habits of the person to whom she sells the Prius, and what car it displaces. If you step back and reflect on the issue, you realize that as much as you agonize over the Prius versus Tesla decision, the bigger effect on the planet would be if everyone driving to work alone in a pickup truck traded in their vehicles for Corollas.
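The break-even question is simple arithmetic once the inputs are known. As a rough sketch, with every number below a made-up placeholder rather than a measured figure for either car, the miles needed to repay the new car's manufacturing emissions are those emissions divided by the per-mile savings:

```python
def break_even_miles(manufacturing_kg_co2, old_kg_per_mile, new_kg_per_mile):
    """Miles needed before the new car's per-mile savings repay the
    CO2 emitted to build it. Returns None if the new car saves nothing."""
    savings_per_mile = old_kg_per_mile - new_kg_per_mile
    if savings_per_mile <= 0:
        return None
    return manufacturing_kg_co2 / savings_per_mile


# Hypothetical inputs, chosen only to illustrate the arithmetic.
miles = break_even_miles(
    manufacturing_kg_co2=10_000,  # assumed emissions from building the new car
    old_kg_per_mile=0.25,         # assumed per-mile emissions of the current car
    new_kg_per_mile=0.125,        # assumed per-mile emissions of the new car
)
print(f"Break-even after about {miles:,.0f} miles")
```

With these placeholder inputs the answer is 80,000 miles. The point is not the number but how sensitive it is: halve the per-mile savings and the break-even distance doubles.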
What does this have to do with exercise? Thinking about society at large, the biggest impact we could have on health with exercise would be to encourage sedentary individuals to adopt regular physical activities. This would have a larger beneficial effect than changing one person’s 7-mile daily run at a 6-minute pace from after work to before. The truth is there is far more to be gained from doing something (rather than nothing) in the first place than further improving health outcomes from minor alterations.
The problem is, these evidence-based thoughts about exercise are not that interesting. Nobody is going to read an article (at least more than once) that says, “If you are not exercising, you should start. If you are exercising, keep up the good work or maybe, if you can, do a bit more.” Moreover, the people who click and read these columns are not sedentary people curious about exercise. They are exercise enthusiasts looking for a tip. Thus, news coverage about exercise is a steady stream of bad science.
Tennis is the best sport for a long life, especially when you hire a ball-boy
The real headline was nearly this outrageous, “The Best Sport for a Longer Life? Try Tennis,” and appeared in the New York Times in September of 2018. The news story makes the claim that playing tennis or badminton increases your lifespan more than swimming, cycling, or jogging.
Any of us could come up with an explanation for this finding. Maybe competitive sports motivate you to push harder, improving your fitness beyond individual exercise. Maybe the social aspect of tennis makes you more likely to play regularly. Maybe tennis benefits you by exercising both your upper and lower body. Maybe it has to do with the mental aspects of tennis -- you are forced to plan your shot and to learn your opponent’s style. Is this the sort of joker who likes to hit a drop shot after a series of baseline ground strokes? Keeping your mind sharp also lengthens life, the theory may go.
Human beings are great at dreaming up stories. If you believe a finding is credible, there is no shortage of rationalizations that you can make up. Medical school is especially good at training aspiring doctors to think this way, to explain any finding based on known human physiology. It is the inductive reasoning we all learned about in middle school science. But remember, these explanations are just stories. Real science treats these stories as hypotheses and then tests them. Our problem, and the problem propagated by churnalism, is that researchers often accept a plausible story as the truth. In this case, the researchers involved were happy to oblige in speculating that “the social aspects of racket games and other team sports are a primary reason that they seem to lengthen lives.”
When you spend a little time with the actual research, problems begin to emerge. Here is how the research was conducted: participants in Copenhagen, Denmark were surveyed about their exercise between 1991 and 1994 and then followed for 25 years. Participants told researchers how they exercised. It turns out that most of the active people in this study were bikers and most bikers did other sports as well (tennis, badminton, soccer, jogging, calisthenics, swimming, and going to the gym). If biking was your most common activity you were considered a biker. Because almost everyone was a biker, you were considered a jogger, for instance, if, after biking, jogging was the sport you did most commonly. Thus the same person was often included in multiple categories.
The researchers found that playing tennis was associated with the longest survival; tennis players lived over 10 years longer than sedentary people. After tennis came badminton, soccer, jogging, cycling, and swimming, in descending order of benefit. Going to the gym was the least beneficial, lengthening survival by only 4 years. When the authors also adjusted for smoking, education, income, diabetes, and alcohol consumption, the benefit was attenuated, but tennis and badminton still stood out, with 9 and 6 years of benefits, respectively.
What is going on here? First, let us articulate the null hypothesis. A null hypothesis is how statisticians describe the world if there is no relationship between an exposure and the outcome of interest. In this case, the null hypothesis would be: assuming you are going to get out of the house and exercise, it doesn’t matter what you do. The alternative hypothesis is that it does matter. Some sports are better for you than others. Authors of studies are always interested in the alternative hypothesis; that is why they are doing the study. Ideally, once the data is in, a researcher should at least seem disinterested. She should look at the data with a cold scientific eye. Here, the research authors seem to be trying to persuade us that the alternative hypothesis is true. If you swim you will live 3 years longer than if you sit around, but if you play tennis you will live 9 years longer. The benefit of playing tennis over swimming is twice as big as the benefit of swimming over doing nothing.
When you hear it stated that way, you might already have doubts. How could that possibly be true? Tennis might be a little better than swimming, but that much better seems preposterous. It seems even more preposterous when you remember that people do more than one activity. Tennis players also play soccer. Soccer players also swim. Swimmers also go to the gym. And, in Denmark it seems, nearly everyone cycles. What makes someone a “tennis player” in this study is that tennis is the second most common activity a person does. Our journalist is committing the 4th sin of churnalism by completely neglecting to consider the plausibility of the results.
Confounding (associated with our 3rd sin) is also likely a problem here. People who choose one sport are different from those who choose another. Tennis players in this population tended to be younger than others and a higher proportion of them had a high household income. Soccer players were almost universally men (95%). The authors did adjust for factors that they inquired about and that they found differed between the groups. For instance, they adjusted for age, sex, and how many hours a week people exercised. But think about the residual confounders, all those differences that were not measured and thus not adjusted for. In order to play tennis you have to have access to a court. Copenhagen has a yearly average daytime high of 52 degrees Fahrenheit and fewer hours of sunshine per year than Anchorage, Alaska. In order to keep up a tennis habit a person probably needs access to an indoor court. That takes a certain level of resources. You also need someone to hit the ball back to you: a colleague or a friend who is free at the same time, or maybe a professional whom you pay. This further selects for people who are economically well off and socially connected. (Writing this paragraph reminds us that we, both upper middle-class doctors who love tennis, do not get to play more than a few times a year. It might be our bitterness about this that made us focus on this study.)
Although the authors adjust for many factors, it is hard to document, measure, and adjust for intangibles like economic advantages and social networks required to play racket sports. As such, it is likely that the longevity differences measured in this study are not because playing tennis, as your second sport, is three times better than swimming as your back-up sport. It is likely that the type of person who can play tennis has the good fortune of having wealth, friends, and resources to afford them a better life. These factors increase longevity, not playing tennis. Tennis becomes a symbol, a marker for folks with such a life. That’s why we speculate that survival would be even better for the folks who can afford also to have a ball boy.
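A toy simulation can make this concrete. The sketch below is not the study's data or method; every number in it (the share of wealthy people, how likely each group is to play tennis, the 8 years of life that wealth buys) is an invented assumption. Tennis is given no effect at all on lifespan, yet tennis players still appear to live longer, because wealth drives both.

```python
import random
from statistics import mean


def simulate(n=20_000, seed=0):
    """Simulate people whose lifespan depends on wealth but NOT on tennis.
    All probabilities and effect sizes are hypothetical."""
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        wealthy = rng.random() < 0.3
        # Wealth makes tennis more likely (courts, free time, partners).
        plays_tennis = rng.random() < (0.6 if wealthy else 0.1)
        # Lifespan depends on wealth and noise only; tennis adds nothing.
        lifespan = 75 + (8 if wealthy else 0) + rng.gauss(0, 5)
        people.append((wealthy, plays_tennis, lifespan))
    return people


def tennis_gap(people):
    """Mean lifespan of tennis players minus mean lifespan of everyone else."""
    players = [life for _, tennis, life in people if tennis]
    others = [life for _, tennis, life in people if not tennis]
    return mean(players) - mean(others)


people = simulate()
print(f"Unadjusted gap: {tennis_gap(people):.1f} years")
for label, flag in (("wealthy", True), ("not wealthy", False)):
    stratum = [p for p in people if p[0] == flag]
    print(f"Gap among the {label}: {tennis_gap(stratum):.1f} years")
```

The unadjusted gap comes out at several years, while the gap within each wealth group is close to zero. The adjustment works here only because wealth is recorded in the simulation. When the confounder is an intangible nobody measured, there is nothing to stratify on, and the spurious gap stays in the results.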
The researchers who performed this study were aware of this. The New York Times article that reported on the research contains a paragraph that reads:
Income and other aspects of people’s lifestyles also likely matter, he (the researcher) says. The researchers tried to account for socioeconomic factors, but it remains possible, he says, that people who have sufficient money and leisure time to play tennis live longer because they have sufficient money and leisure time, not because they play tennis.
And yet, the article quickly moves on, dismissing this most likely explanation. (The classic disclaim and pivot sin). The researcher focuses on a finding that is likely to be spurious and a journalist chooses to deliver clickbait rather than challenging the researcher’s interpretation. What is missed is that this research really does demonstrate something interesting. This article, which looks at the effect of sport on longevity, is showing us that there are real healthcare disparities, even in the famously egalitarian Scandinavian societies, and sport is a pretty good marker for these disparities.
In this way, the journalist commits our 7th sin, incuriousness: missing the real story for the one handed to him. A lack of curiosity probably underlies most churnalism. When a journalist commits any of the churnalistic sins, he is missing the interesting findings buried in research. If chilies are not panaceas, what is it that makes healthy people eat chilies? Some articles bring the level of indifference to such an extreme that it becomes a sin in and of itself. The actual definition of churnalism is reporting press releases, a behavior so egregious that we are not even mentioning articles that practice true churnalism. The articles that we are discussing (and will discuss) reveal an author committing the sin of incuriousness, making no effort to look beneath the researcher's interpretation of the data. These articles seldom ask the author for an alternative interpretation of the data or seek out the views of a disinterested or skeptical source. In this article about tennis in Denmark, there seems to have been no curiosity to look beyond the researchers' spin on their own data.
The cure for depression is being dunked in an ice bath
Betteridge's law of headlines states that the answer to any question in a headline is no. Thus, maybe we should just answer the question raised in a BBC article -- “Can cold water swimming treat depression?”2 -- and move on. This article was based on a case report, an anecdote, of a 24-year-old woman who suffered from depression. Her depression was severe, requiring the use of medications. However, after 1 week of cold-water swimming, she felt better, came off her medications, and was able to remain off them for one year. The journalist who reported on this case added his own opinion: “Like many other people who swim in cold water regularly, I love it, but I also believe it has mental health benefits.”
Most people who exercise will tell you that they feel better after brisk exercise. Vinay is a year-round cyclist and came to love a hard ride in Portland’s icy cold rain. Adam is a swimmer, who finds swimming in frosty Lake Michigan exhilarating. But simply feeling calmer, energized, or more creative after exercise is not the same thing as evidence that it is a treatment for a medical condition.
A case report is generally accepted to be the lowest level of evidence in the medical literature. Case reports are published not to suggest that doctors and patients should try to replicate the events. Case reports are published so that when someone notices something interesting, she can get it out into the world for other people to consider. If other people think, “You know what, I have also noted that my patients with depression get better after adopting a regimen of cold-water swimming,” then one might be inspired to design a study to assess if this is really a beneficial treatment.
What would such a study look like? A sample of people with depression would need to be recruited, randomized to cold-water swimming or another healthy exercise (maybe playing tennis, in Denmark, with the help of a ball-boy), and then followed to a validated endpoint.
Why do we need a sample of people rather than one individual? Because every person is different and before recommending a treatment, we would like to know that it is likely to work in many people and not just in one person, who, by definition, is idiosyncratic. Why do we need a comparison group? Depression, like many diseases, has an unpredictable course. The normal course of depression is for it to wax and wane. Comparing the treatment to a control group allows us to know if an observed benefit is related to the treatment rather than the natural course of the disease. Every day, something happens before something else. The order in which events occur does not mean that the first caused the second. Some 24-year-olds have depression that resolves without doing anything. Some 24-year-olds get a new job, a new girlfriend, or a new car before their improvement. This does not mean we would start recommending that the depressed buy a new car or dump their significant other based on an anecdote. The endpoint in our cold swimming for depression study, the outcome for which we are comparing the treatment group and the control group, must be a validated measure of depression symptoms, or the ability to withdraw antidepressant medications, or maybe even suicide rates.
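The value of the control group can be shown with a small simulation. This is a sketch under invented assumptions, not a model of real depression scores: symptoms fluctuate at random around a long-run level, people start an intervention only during a bad spell, and the "swim" intervention is given an effect of exactly zero.

```python
import random
from statistics import mean


def run_trial(n=40_000, seed=1, treatment_effect=0.0):
    """Return the mean symptom improvement in the swim and control arms.
    Scores, thresholds, and the zero treatment effect are all hypothetical."""
    rng = random.Random(seed)
    improvements = {"swim": [], "control": []}
    for _ in range(n):
        # Symptoms wax and wane around a long-run level of 15 (higher = worse).
        before = rng.gauss(15, 4)
        if before < 20:
            continue  # people seek help only during a bad spell
        arm = rng.choice(["swim", "control"])  # randomization
        after = rng.gauss(15, 4)  # the natural course: back toward the usual level
        if arm == "swim":
            after -= treatment_effect
        improvements[arm].append(before - after)
    return mean(improvements["swim"]), mean(improvements["control"])


swim, control = run_trial()
print(f"Average improvement, swimmers: {swim:.1f} points")
print(f"Average improvement, controls: {control:.1f} points")
print(f"Difference attributable to swimming: {swim - control:.1f} points")
```

Looking only at the swimmers, the intervention appears to work: people who enrolled at their worst are, on average, much better at follow-up. The control arm improves by about the same amount, which reveals that the improvement is the natural course of the condition and not the swimming. A single anecdote is the swimmers' column with one row and no comparison.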
Thus, the BBC article is an example of reporting out of context. It attempts to extrapolate the uncontrolled, though enticingly positive, experience of one person to people with depression.
I continue to read your posts because I'm hoping they will sharpen my own critical thinking about health research. However, especially in light of some of the comments from other readers, I'm becoming concerned that your writing is also contributing to a kind of nihilism regarding health and medical research. That would not be good. Every teacher knows that they must offer examples of good work as well as bad. Are you going to give us some examples of solid research that have contributed to the prevention or treatment of disease (or solid journalism reporting on health research, good or bad)?
Love the article and the comments about it. I am not alone in this.