As coaches and athletes, we’re constantly surrounded by an unending stream of reports on the findings of this or that study. Many of them seem to contradict each other, and it can be confusing how two studies can reach completely different conclusions.
In this article, I’ll cover some of the basic concepts about how a study is designed and how to separate the good from the bad.
Internal Validity and External Validity
Judging the quality of a study consists of examining two areas: internal validity and external validity. Internal validity refers to how well a study was controlled, that is, whether the observed effect really came from the thing being tested. External validity refers to whether or not the results of that study can be generalized to a larger population.
Internal validity is primarily built through accurate measurement instruments (both the equipment and the type of test) and control of confounding factors. A confounding factor is anything unaccounted for that may have caused the effect you observed in your study. Did that new supplement increase the subject’s back squat weight, or was it something else, like the training program the subject used or a change in diet?
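To make the idea of a confounding factor concrete, here is a minimal sketch in Python. Every number and name in it is invented for illustration: the squat gains are made up so that they depend only on the training program, never on the supplement. The naive comparison still makes the supplement look effective, because more supplement takers happened to be on the better program.

```python
from statistics import mean

# Hypothetical subjects: (took_supplement, new_training_program, squat_gain_kg).
# In this made-up data, the gain depends ONLY on the training program.
subjects = [
    (True, True, 10.0),
    (True, True, 10.0),
    (True, False, 2.0),
    (False, True, 10.0),
    (False, False, 2.0),
    (False, False, 2.0),
]


def mean_gain(rows):
    """Average squat gain in kg for a list of subject rows."""
    return mean(gain for _, _, gain in rows)


# Naive comparison: supplement takers vs. everyone else.
took = [s for s in subjects if s[0]]
skipped = [s for s in subjects if not s[0]]
naive_difference = mean_gain(took) - mean_gain(skipped)


def difference_within(program):
    """Supplement vs. no supplement among subjects on the same program."""
    took_same = [s for s in subjects if s[0] and s[1] == program]
    skipped_same = [s for s in subjects if not s[0] and s[1] == program]
    return mean_gain(took_same) - mean_gain(skipped_same)


print(f"Naive difference: {naive_difference:.2f} kg")
print(f"Same program (new): {difference_within(True):.2f} kg")
print(f"Same program (old): {difference_within(False):.2f} kg")
```

Once the comparison is made between subjects on the same training program, the apparent supplement effect disappears. That is what controlling for a confounding factor looks like in miniature.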
External validity is where you need to examine who the subjects were in the study. Were they all men? All women? Elite athletes? Novices? Old? Young? The demographic of the subjects is incredibly important. Look closely at who was included in the study and ask if those subjects represent you or your athletes. The less varied the demographic of the subjects, the smaller the population the results apply to.
What Do Studies Do?
Now we know these concepts, but how in the world do we figure out the level of internal and external validity in the study we are reading? To start, let’s go back to the basics of what a study is and what it does.
A scientific study tries to answer a question by creating a situation that can be observed and measured. In health and fitness, we are mostly looking to observe cause and effect. This caused that. Did taking creatine increase your one-rep-max (1RM) back squat? Did having that last beer make you that much more awesome? I don’t know how many beers you had, so we’ll use a fake study about creatine and increasing the 1RM back squat as our example.
The most basic step in designing a study is defining a dependent variable and an independent variable. In our study, the back squat weight is the dependent variable and the amount of creatine taken is the independent variable. The researcher sets the amount of creatine the participants will take as part of the study design and then measures back squat weights to determine whether they changed. In the question “did this cause that,” the “this” is the independent variable and the “that” is the dependent variable.
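As a rough sketch of how the two variables show up in the data, the Python snippet below treats the creatine dose as the independent variable (set by the researcher) and the change in 1RM back squat as the dependent variable (measured). The doses and squat numbers are invented for illustration and do not come from any real study.

```python
from statistics import mean

# Independent variable: daily creatine dose in grams, chosen by the researcher.
# Dependent variable: 1RM back squat in kg, measured before and after.
# Each pair is (before_kg, after_kg). All values are hypothetical.
measurements = {
    0: [(100.0, 102.0), (120.0, 121.0), (90.0, 93.0)],  # control group
    5: [(100.0, 105.0), (120.0, 124.0), (90.0, 96.0)],  # test group
}


def mean_change(pairs):
    """Average change in 1RM (after minus before) for one group."""
    return mean(after - before for before, after in pairs)


changes = {dose: mean_change(pairs) for dose, pairs in measurements.items()}

# The quantity the study is after: how much more the test group improved.
effect = changes[5] - changes[0]

for dose, change in changes.items():
    print(f"{dose} g/day: mean 1RM change = {change:.1f} kg")
print(f"Difference between groups: {effect:.1f} kg")
```

Note that a difference like this only suggests an effect. Whether it can be trusted depends on everything else about the study design.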
The Role of Bias
One of the most important factors that can affect the quality of a study is bias. Bias is a situation where either the researcher did not account for something that affected the results or the subjects who were selected were more likely to have a specific outcome. Bias affects both internal and external validity. Something that affected the results but was not accounted for is a confounding factor, which reduces internal validity. Selecting subjects who are predisposed to a certain outcome is selection bias, which is tremendously important in health and fitness research and affects external validity.
The Importance of the Randomized Controlled Trial
Another important aspect of study quality is the type of study conducted. While there are several types of studies, including observational and case studies, the only true way to determine cause and effect is using a randomized controlled trial, or RCT. Other types of studies are nice for presenting data, but to say this caused that, you have to use an RCT. Yes, that is a must.
An RCT is the necessary study design for showing cause and effect because it reduces selection bias: every subject in the study has an equal chance of being placed in either group. In an RCT, you collect your test subjects and randomly assign them to two or more groups. At a minimum, you must have a test group and a control group. In our fictional study, the test group would take creatine and the control group wouldn’t take any supplements. The goal of a good study design is to change just one factor, the independent variable, and keep everything else between the study groups as similar as possible.
The one exception to the RCT requirement is review studies, which include literature reviews and meta-analyses. In these studies, the author gathers as many studies as can be found on a certain topic and analyzes their designs and results. This is a great way to get a picture of a much larger body of evidence about that topic.
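The random assignment step of an RCT, described above, can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the function name, the subject labels, and the even two-way split are all hypothetical, and real trials often use more elaborate schemes.

```python
import random


def randomly_assign(subjects, seed=None):
    """Shuffle the subjects and split them into a test and a control group.

    Shuffling gives every subject the same chance of landing in either
    group, so the researcher cannot hand-pick who gets the supplement.
    """
    rng = random.Random(seed)  # a seed only makes the example repeatable
    shuffled = list(subjects)  # copy, so the original list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"test": shuffled[:half], "control": shuffled[half:]}


# Hypothetical participant labels.
subjects = [f"subject_{i}" for i in range(1, 11)]
groups = randomly_assign(subjects, seed=42)

print("Test group (creatine):", groups["test"])
print("Control group (no supplement):", groups["control"])
```

The key point is that chance, not the researcher, decides who ends up in which group.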
Studies Don’t Prove Anything
It is important to understand that scientific studies don’t prove anything. They suggest. They provide supporting evidence. An important part of understanding studies is to never, ever believe that just because a study found such-and-such then that is now a fact.
Sometimes researchers are biased and interpret results incorrectly. Sometimes there’s a confounding factor that wasn’t accounted for. Sometimes you just can’t figure out why the results were the way they were. Look for multiple studies that support the hypothesis. For extra credit, look for studies that don’t support that hypothesis and compare them with the ones that do.
Guidelines to Help You Get the Most From Reading Studies
- Type of study. As stated above, the study has to be an RCT. An abstract won’t always come straight out and call a study an RCT, so look for the keyword “randomized.” If you can’t find it, it’s not an RCT and the results aren’t worth your time.
- Were the researchers blinded? Blinding the researchers means they did not know whether the participants they were testing belonged to the test or control group. It is an absolute requirement for reducing bias and establishing internal validity. (When the participants are also kept unaware of their group, the study is called double-blind.) Like using an RCT, this is a necessity, and the results of a study that doesn’t use it shouldn’t be trusted.
- Who were the participants? Remember, we’re looking for good external validity here. Does the study population represent the general public, you or your athletes, or a group that you don’t even work with? If the participants don’t sound like you or your athletes, the results won’t necessarily apply to you.
- What was the level of training of the participants? One of the golden rules of exercise research is to never, ever use novice athletes. Novice athletes will respond positively to just about anything. Seriously, never buy into the results of a study that used novice athletes.
- What is the difference between the test and control groups? This is going to tell you what the study actually examined, and it’s not uncommon for a poorly designed study to end with the results reflecting something completely different from what was intended. Was the topic of the study the back squat, but the results were determined by testing leg press or leg extension weight? Seriously, this stuff is out there.
This all may sound a bit cynical, but in all honesty, that’s the scientific way. You need to examine everything closely, ask hard questions, and be a difficult sell on any idea. To do anything else is to be a sucker for every fad that comes along. Practice looking at studies, asking these questions, and thinking critically about the quality of the study and meaning of the results. Keep at it and you’ll be crunching through studies like a pro in no time.
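If it helps to practice, the guidelines above can be written down as a simple checklist. The Python sketch below is just one way to do it: the class name, field names, and messages are all made up for illustration, and you still have to answer each question yourself by reading the study.

```python
from dataclasses import dataclass


@dataclass
class StudySummary:
    """Your own yes/no answers after reading a study (hypothetical fields)."""
    randomized: bool
    researchers_blinded: bool
    participants_resemble_my_athletes: bool
    used_novice_athletes: bool
    outcome_matches_topic: bool


def concerns(study):
    """Return a list of red flags based on the guidelines in this article."""
    checks = [
        (study.randomized, "not randomized, so it is not an RCT"),
        (study.researchers_blinded, "researchers were not blinded"),
        (study.participants_resemble_my_athletes,
         "participants do not resemble you or your athletes"),
        (not study.used_novice_athletes, "used novice athletes"),
        (study.outcome_matches_topic,
         "measured outcome does not match the stated topic"),
    ]
    return [message for passed, message in checks if not passed]


# A hypothetical study that is randomized and blinded but used novices.
example = StudySummary(
    randomized=True,
    researchers_blinded=True,
    participants_resemble_my_athletes=True,
    used_novice_athletes=True,
    outcome_matches_topic=True,
)
for message in concerns(example):
    print("Red flag:", message)
```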
There is much more to truly understanding how to read scientific studies, so this article is only an overview to get you started.
Photos courtesy of Shutterstock.