What Is a Case-Control Study? | Definition & Examples
Published on February 4, 2023 by Tegan George . Revised on June 22, 2023.
A case-control study is an observational design that compares a group of participants possessing a condition of interest to a very similar group lacking that condition. Here, the participants possessing the attribute of study, such as a disease, are called the “cases,” and those without it are the “controls.”
It’s important to remember that the case group is chosen because they already possess the attribute of interest. The point of the control group is to provide a baseline for comparison, e.g., to test whether a suspected exposure occurs systematically more often among the cases than among the controls.
Table of contents
- When to use a case-control study
- Examples of case-control studies
- Advantages and disadvantages of case-control studies
- Other interesting articles
- Frequently asked questions
Case-control studies are a type of observational study often used in fields like medical research, environmental health, or epidemiology. While most observational studies are qualitative in nature, case-control studies can also be quantitative , and they often are in healthcare settings. Case-control studies can be used for both exploratory and explanatory research , and they are a good choice for studying research topics like disease exposure and health outcomes.
A case-control study may be a good fit for your research if it meets the following criteria.
- Data on exposure (e.g., to a chemical or a pesticide) are difficult to obtain or expensive.
- The disease associated with the exposure you’re studying has a long incubation period or is rare or under-studied (e.g., AIDS in the early 1980s).
- The population you are studying is difficult to contact for follow-up questions (e.g., asylum seekers).
Retrospective cohort studies use existing secondary research data, such as medical records or databases, to identify a group of people with a common exposure or risk factor and to observe their outcomes over time. Case-control studies conduct primary research , comparing a group of participants possessing a condition of interest to a very similar group lacking that condition in real time.
Case-control studies are common in fields like epidemiology, healthcare, and psychology.
Example: Epidemiology case-control study
You are interested in the relationship between contaminated drinking water and the risk of developing a gastrointestinal illness. Here, the case group would be individuals who have been diagnosed with a gastrointestinal illness, while the control group would be individuals without one.

You would then collect data on your participants’ exposure to contaminated drinking water, focusing on variables such as the source of said water and the duration of exposure, for both groups. You could then compare the two to determine if there is a relationship between drinking water contamination and the risk of developing a gastrointestinal illness.

Example: Healthcare case-control study
You are interested in the relationship between the dietary intake of a particular vitamin (e.g., vitamin D) and the risk of developing osteoporosis later in life. Here, the case group would be individuals who have been diagnosed with osteoporosis, while the control group would be individuals without osteoporosis.

You would then collect information on dietary intake of vitamin D for both the cases and controls and compare the two groups to determine if there is a relationship between vitamin D intake and the risk of developing osteoporosis.

Example: Psychology case-control study
You are studying the relationship between early-childhood stress and the likelihood of later developing post-traumatic stress disorder (PTSD). Here, the case group would be individuals who have been diagnosed with PTSD, while the control group would be individuals without PTSD.
Case-control studies are a solid research method choice, but they come with distinct advantages and disadvantages.
Advantages of case-control studies
- Case-control studies are a great choice if you have any ethical considerations about your participants that could preclude you from using a traditional experimental design .
- Case-control studies are time efficient and fairly inexpensive to conduct because they require fewer subjects than other research methods .
- Case-control studies can examine multiple exposures that may lead to a single outcome. As such, they truly shine when used to study rare outcomes or outbreaks of a particular disease.
Disadvantages of case-control studies
- Like other observational studies, case-control studies run a high risk of research biases. They are particularly susceptible to observer bias, recall bias, and interviewer bias.
- If the exposure of interest is very rare, conducting a case-control study can be very time consuming and inefficient.
- Case-control studies in general have low internal validity, so their findings are not always considered strong evidence on their own.
Case-control studies by design focus on one singular outcome. This makes them very rigid and not generalizable , as no extrapolation can be made about other outcomes like risk recurrence or future exposure threat. This leads to less satisfying results than other methodological choices.
If you want to know more about statistics, methodology, or research bias, make sure to check out some of our other articles with explanations and examples.
- Student’s t -distribution
- Normal distribution
- Null and Alternative Hypotheses
- Chi square tests
- Confidence interval
- Quartiles & Quantiles
- Cluster sampling
- Stratified sampling
- Data cleansing
- Reproducibility vs Replicability
- Peer review
- Prospective cohort study
- Implicit bias
- Cognitive bias
- Placebo effect
- Hawthorne effect
- Hindsight bias
- Affect heuristic
- Social desirability bias
A case-control study differs from a cohort study because cohort studies are more longitudinal in nature and do not necessarily require a control group .
While one may be added if the investigator so chooses, members of the cohort are primarily selected because of a shared characteristic among them. In particular, retrospective cohort studies are designed to follow a group of people with a common exposure or risk factor over time and observe their outcomes.
Case-control studies, in contrast, require both a case group and a control group, as suggested by their name, and usually are used to identify risk factors for a disease by comparing cases and controls.
A case-control study differs from a cross-sectional study because case-control studies are naturally retrospective in nature, looking backward in time to identify exposures that may have occurred before the development of the disease.
On the other hand, cross-sectional studies collect data on a population at a single point in time. The goal here is to describe the characteristics of the population, such as their age, gender identity, or health status, and understand the distribution and relationships of these characteristics.
Cases and controls are selected for a case-control study based on their inherent characteristics. Participants already possessing the condition of interest form the “case,” while those without form the “control.”
Keep in mind that by definition the case group is chosen because they already possess the attribute of interest. The point of the control group is to provide a baseline for comparison, e.g., to test whether a suspected exposure occurs systematically more often among the cases than among the controls.
The strength of the association between an exposure and a disease in a case-control study can be measured using a few different statistical measures , such as odds ratios (ORs) and relative risk (RR).
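As a rough illustration of these measures, the sketch below computes both from a 2×2 exposure table. The counts and function names are hypothetical, not drawn from any study cited here; note that in a case-control design the odds ratio is the directly interpretable measure, because the investigator fixes the numbers of cases and controls.

```python
# Hypothetical 2x2 table (counts made up for illustration):
#             exposed   unexposed
#   cases        a          b
#   controls     c          d

def odds_ratio(a, b, c, d):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    return (a * d) / (b * c)

def relative_risk(a, b, c, d):
    """Risk among the exposed divided by risk among the unexposed.
    Only meaningful when the table comes from a cohort-style sample;
    in a case-control study the case/control totals are fixed by the
    researcher, so the odds ratio is the valid measure."""
    return (a / (a + c)) / (b / (b + d))

print(odds_ratio(20, 80, 10, 90))    # 2.25
print(relative_risk(20, 80, 10, 90))
```

When the outcome is rare, the odds ratio approximates the relative risk, which is why case-control results are often read as if they were risk ratios.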
No, case-control studies cannot establish causality as a standalone measure.
As observational studies , they can suggest associations between an exposure and a disease, but they cannot prove without a doubt that the exposure causes the disease. In particular, issues arising from timing, research biases like recall bias , and the selection of variables lead to low internal validity and the inability to determine causality.
Sources in this article
We strongly encourage students to use sources in their work. You can cite our article (APA Style) or take a deep dive into the articles below.
George, T. (2023, June 22). What Is a Case-Control Study? | Definition & Examples. Scribbr. Retrieved February 15, 2024, from https://www.scribbr.com/methodology/case-control-study/
Schlesselman, J. J. (1982). Case-Control Studies: Design, Conduct, Analysis (Monographs in Epidemiology and Biostatistics, 2) (Illustrated). Oxford University Press.
Study Design 101
Case Control Study
A study that compares patients who have a disease or outcome of interest (cases) with patients who do not have the disease or outcome (controls), and looks back retrospectively to compare how frequently the exposure to a risk factor is present in each group to determine the relationship between the risk factor and the disease.
Case control studies are observational because no intervention is attempted and no attempt is made to alter the course of the disease. The goal is to retrospectively determine the exposure to the risk factor of interest from each of the two groups of individuals: cases and controls. These studies are designed to estimate odds.
Case control studies are also known as "retrospective studies" and "case-referent studies."
- Good for studying rare conditions or diseases
- Less time needed to conduct the study because the condition or disease has already occurred
- Lets you simultaneously look at multiple risk factors
- Useful as initial studies to establish an association
- Can answer questions that could not be answered through other study designs
- Retrospective studies have more problems with data quality because they rely on memory and people with a condition will be more motivated to recall risk factors (also called recall bias).
- Not good for evaluating diagnostic tests because it’s already clear that the cases have the condition and the controls do not
- It can be difficult to find a suitable control group
Design pitfalls to look out for
Care should be taken to avoid confounding, which arises when an exposure and an outcome are both strongly associated with a third variable. Controls should be subjects who might have been cases in the study but are selected independent of the exposure. Cases and controls should also not be "over-matched."
Is the control group appropriate for the population? Does the study use matching or pairing appropriately to avoid the effects of a confounding variable? Does it use appropriate inclusion and exclusion criteria?
There is a suspicion that zinc oxide, the white non-absorbent sunscreen traditionally worn by lifeguards, is more effective at preventing the sunburns that lead to skin cancer than absorbent sunscreen lotions are. A case-control study was conducted to investigate whether exposure to zinc oxide is a more effective skin cancer prevention measure. The study involved comparing a group of former lifeguards who had developed cancer on their cheeks and noses (cases) to a group of lifeguards without this type of cancer (controls) and assessing their prior exposure to zinc oxide or absorbent sunscreen lotions.
This study would be retrospective in that the former lifeguards would be asked to recall which type of sunscreen they used on their face and approximately how often. This could be either a matched or unmatched study, but efforts would need to be made to ensure that the former lifeguards are of the same average age, and lifeguarded for a similar number of seasons and amount of time per season.
Boubekri, M., Cheung, I., Reid, K., Wang, C., & Zee, P. (2014). Impact of windows and daylight exposure on overall health and sleep quality of office workers: a case-control pilot study . Journal of Clinical Sleep Medicine : JCSM : Official Publication of the American Academy of Sleep Medicine, 10 (6), 603-611. https://doi.org/10.5664/jcsm.3780
This pilot study explored the impact of exposure to daylight on the health of office workers (measuring well-being and sleep quality subjectively, and light exposure, activity level and sleep-wake patterns via actigraphy). Individuals with windows in their workplaces had more light exposure, longer sleep duration, and more physical activity. They also reported better scores in the areas of vitality and role limitations due to physical problems, better sleep quality, and fewer sleep disturbances.
Togha, M., Razeghi Jahromi, S., Ghorbani, Z., Martami, F., & Seifishahpar, M. (2018). Serum Vitamin D Status in a Group of Migraine Patients Compared With Healthy Controls: A Case-Control Study . Headache, 58 (10), 1530-1540. https://doi.org/10.1111/head.13423
This case-control study compared serum vitamin D levels in individuals who experience migraine headaches with their matched controls. Over a thirty-day study period, higher serum vitamin D levels were associated with lower odds of migraine headache.
- Odds ratio in an unmatched study
- Odds ratio in a matched study
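In a matched study, the odds ratio comes from the discordant pairs only; as noted in the definitions below, concordant pairs are uninformative. A minimal sketch with hypothetical pair data:

```python
def matched_odds_ratio(pairs):
    """Matched-pair odds ratio: the number of discordant pairs in which
    only the case was exposed, divided by the number in which only the
    control was exposed. Concordant pairs (both exposed or both
    unexposed) are ignored, since they carry no information about the
    exposure-outcome association."""
    case_only = sum(1 for case_exp, ctrl_exp in pairs if case_exp and not ctrl_exp)
    ctrl_only = sum(1 for case_exp, ctrl_exp in pairs if ctrl_exp and not case_exp)
    if ctrl_only == 0:
        raise ValueError("no control-only discordant pairs; odds ratio is undefined")
    return case_only / ctrl_only

# Hypothetical pairs of (case_exposed, control_exposed):
pairs = ([(True, False)] * 6 + [(False, True)] * 3
         + [(True, True)] * 5 + [(False, False)] * 4)
print(matched_odds_ratio(pairs))  # 2.0
```

For an unmatched study, the familiar cross-product (ad)/(bc) over the pooled 2×2 table applies instead.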
Case: A patient with the disease or outcome of interest.

Confounding: When an exposure and an outcome are both strongly associated with a third variable.

Control: A patient who does not have the disease or outcome.

Matched design: Each case is matched individually with a control according to certain characteristics such as age and gender. It is important to remember that the concordant pairs (pairs in which the case and control are either both exposed or both not exposed) tell us nothing about the risk of exposure separately for cases or controls.

Observational assignment: The method of assignment of individuals to study and control groups in observational studies when the investigator does not intervene to perform the assignment.

Unmatched design: The controls are a sample from a suitable non-affected population.
Now test yourself!
1. Case Control Studies are prospective in that they follow the cases and controls over time and observe what occurs.
a) True b) False
2. Which of the following is an advantage of Case Control Studies?
a) They can simultaneously look at multiple risk factors. b) They are useful to initially establish an association between a risk factor and a disease or outcome. c) They take less time to complete because the condition or disease has already occurred. d) b and c only e) a, b, and c
© 2011-2019, The Himmelfarb Health Sciences Library
Prospective vs. Retrospective Studies
A prospective study watches for outcomes, such as the development of a disease, during the study period and relates this to other factors such as suspected risk or protection factor(s). The study usually involves taking a cohort of subjects and watching them over a long period. The outcome of interest should be common; otherwise, the number of outcomes observed will be too small to be statistically meaningful (indistinguishable from those that may have arisen by chance). All efforts should be made to avoid sources of bias such as the loss of individuals to follow up during the study. Prospective studies usually have fewer potential sources of bias and confounding than retrospective studies.
A retrospective study looks backwards and examines exposures to suspected risk or protection factors in relation to an outcome that is established at the start of the study. Many valuable case-control studies, such as Janet Lane-Claypon's 1926 investigation of risk factors for breast cancer, were retrospective investigations. Most sources of error due to confounding and bias are more common in retrospective studies than in prospective studies. For this reason, retrospective investigations are often criticised. If the outcome of interest is uncommon, however, the size of prospective investigation required to estimate relative risk is often too large to be feasible. In retrospective studies the odds ratio provides an estimate of relative risk. You should take special care to avoid sources of bias and confounding in retrospective studies.
Prospective investigation is required to make precise estimates of either the incidence of an outcome or the relative risk of an outcome based on exposure.
Case-control studies are usually, but not exclusively, retrospective; the opposite is true for cohort studies. The following notes relate case-control to cohort studies:
- outcome is measured before exposure
- controls are selected on the basis of not having the outcome
- good for rare outcomes
- relatively inexpensive
- smaller numbers required
- quicker to complete
- prone to selection bias
- prone to recall/retrospective bias
- related methods are risk (retrospective) , chi-square 2 by 2 test , Fisher's exact test , exact confidence interval for odds ratio , odds ratio meta-analysis and conditional logistic regression .
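The 2 by 2 chi-square test listed above can be computed directly from the table counts. This sketch uses hypothetical counts and the Pearson statistic without a continuity correction:

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 degree of freedom, no continuity
    correction) for the 2x2 table [[a, b], [c, d]]:
        chi2 = n * (ad - bc)^2 / ((a+b)(c+d)(a+c)(b+d))"""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

# Hypothetical counts: 10 exposed cases, 20 unexposed cases,
# 20 exposed controls, 10 unexposed controls.
print(round(chi_square_2x2(10, 20, 20, 10), 3))  # 6.667
```

A value this far above the 3.84 critical value (p = 0.05, 1 df) would suggest exposure and outcome are associated; for small expected counts, Fisher's exact test is the usual fallback.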
Cohort studies are usually, but not exclusively, prospective; the opposite is true for case-control studies. The following notes relate cohort to case-control studies:
- outcome is measured after exposure
- yields true incidence rates and relative risks
- may uncover unanticipated associations with outcome
- best for common outcomes
- requires large numbers
- takes a long time to complete
- prone to attrition bias (compensate by using person-time methods)
- prone to the bias of change in methods over time
- related methods are risk (prospective) , relative risk meta-analysis , risk difference meta-analysis and proportions
Copyright © 2000-2023 StatsDirect Limited, all rights reserved.
What Is A Case Control Study?
Editor at Simply Psychology
BA (Hons) Psychology, Princeton University
Julia Simkus is a graduate of Princeton University with a Bachelor of Arts in Psychology. She is currently studying for a Master's Degree in Counseling for Mental Health and Wellness, beginning in September 2023. Julia's research has been published in peer-reviewed journals.
Saul McLeod, PhD
Editor-in-Chief for Simply Psychology
BSc (Hons) Psychology, MRes, PhD, University of Manchester
Saul McLeod, Ph.D., is a qualified psychology teacher with over 18 years' experience working in further and higher education. He has been published in peer-reviewed journals, including the Journal of Clinical Psychology.
A case-control study is a research method in which two groups of people are compared: those with the condition of interest (cases) and those without it (controls). By looking into the past for clues, such as habits or experiences, that differ between the two groups, researchers try to identify what factors might have contributed to the condition in the case group.
The “cases” are the individuals with the disease or condition under study, and the “controls” are similar individuals without the disease or condition of interest.
The controls should have similar characteristics (i.e., age, sex, demographic, health status) to the cases to mitigate the effects of confounding variables .
Case-control studies identify any associations between an exposure and an outcome and help researchers form hypotheses about a particular population.
Researchers will first identify the two groups, and then look back in time to investigate which subjects in each group were exposed to the condition.
If the exposure is found more commonly in the cases than the controls, the researcher can hypothesize that the exposure may be linked to the outcome of interest.
Figure: Schematic diagram of case-control study design. Kenneth F. Schulz and David A. Grimes (2002) Case-control studies: research in reverse . The Lancet Volume 359, Issue 9304, 431 – 434
Quick, inexpensive, and simple
Because these studies use already existing data and do not require any follow-up with subjects, they tend to be quicker and cheaper than other types of research. Case-control studies also do not require large sample sizes.
Beneficial for studying rare diseases
Researchers in case-control studies start with a population of people known to have the target disease instead of following a population and waiting to see who develops it. This enables researchers to identify current cases and enroll a sufficient number of patients with a particular rare disease.
Useful for preliminary research
Case-control studies are beneficial for an initial investigation of a suspected risk factor for a condition. The information obtained from case-control studies then enables researchers to conduct further data analyses to explore any relationships in more depth.
Subject to recall bias
Participants might be unable to remember when they were exposed or omit other details that are important for the study. In addition, those with the outcome are more likely to recall and report exposures more clearly than those without the outcome.
Difficulty finding a suitable control group
It is important that the case group and the control group have almost the same characteristics, such as age, gender, demographics, and health status.
Forming an accurate control group can be challenging, so sometimes researchers enroll multiple control groups to bolster the strength of the case-control study.
Do not demonstrate causation
Case-control studies may show an association between exposures and outcomes, but they cannot demonstrate causation.
A case-control study is an observational study where researchers analyzed two groups of people (cases and controls) to look at factors associated with particular diseases or outcomes.
Below are some examples of case-control studies:
- Investigating the impact of exposure to daylight on the health of office workers (Boubekri et al., 2014).
- Comparing serum vitamin D levels in individuals who experience migraine headaches with their matched controls (Togha et al., 2018).
- Analyzing correlations between parental smoking and childhood asthma (Strachan and Cook, 1998).
- Studying the relationship between elevated concentrations of homocysteine and an increased risk of vascular diseases (Ford et al., 2002).
- Assessing the magnitude of the association between Helicobacter pylori and the incidence of gastric cancer (Helicobacter and Cancer Collaborative Group, 2001).
- Evaluating the association between breast cancer risk and saturated fat intake in postmenopausal women (Howe et al., 1990).
Frequently asked questions
1. What’s the difference between a case-control study and a cross-sectional study?
Case-control studies are different from cross-sectional studies in that case-control studies compare groups retrospectively while cross-sectional studies analyze information about a population at a specific point in time.
In cross-sectional studies , researchers are simply examining a group of participants and depicting what already exists in the population.
2. What’s the difference between a case-control study and a longitudinal study?
Case-control studies compare groups retrospectively, while longitudinal studies can compare groups either retrospectively or prospectively.
In a longitudinal study , researchers monitor a population over an extended period of time, and they can be used to study developmental shifts and understand how certain things change as we age.
Note that a case study, which examines a single subject or case, is distinct from a case-control study; both case-control and longitudinal studies are conducted on groups of subjects.
3. What’s the difference between a case-control study and a retrospective cohort study?
Case-control studies are retrospective as researchers begin with an outcome and trace backward to investigate exposure; however, they differ from retrospective cohort studies.
In a retrospective cohort study , researchers examine a group before any of the subjects have developed the disease, then examine any factors that differed between the individuals who developed the condition and those who did not.
Thus, the outcome is measured after exposure in retrospective cohort studies, whereas the outcome is measured before the exposure in case-control studies.
Boubekri, M., Cheung, I., Reid, K., Wang, C., & Zee, P. (2014). Impact of windows and daylight exposure on overall health and sleep quality of office workers: a case-control pilot study. Journal of Clinical Sleep Medicine: JCSM: Official Publication of the American Academy of Sleep Medicine, 10 (6), 603-611.
Ford, E. S., Smith, S. J., Stroup, D. F., Steinberg, K. K., Mueller, P. W., & Thacker, S. B. (2002). Homocyst (e) ine and cardiovascular disease: a systematic review of the evidence with special emphasis on case-control studies and nested case-control studies. International journal of epidemiology, 31 (1), 59-70.
Helicobacter and Cancer Collaborative Group. (2001). Gastric cancer and Helicobacter pylori: a combined analysis of 12 case control studies nested within prospective cohorts. Gut, 49 (3), 347-353.
Howe, G. R., Hirohata, T., Hislop, T. G., Iscovich, J. M., Yuan, J. M., Katsouyanni, K., … & Shunzhang, Y. (1990). Dietary factors and risk of breast cancer: combined analysis of 12 case—control studies. JNCI: Journal of the National Cancer Institute, 82 (7), 561-569.
Lewallen, S., & Courtright, P. (1998). Epidemiology in practice: case-control studies. Community eye health, 11 (28), 57–58.
Strachan, D. P., & Cook, D. G. (1998). Parental smoking and childhood asthma: longitudinal and case-control studies. Thorax, 53 (3), 204-212.
Tenny, S., Kerndt, C. C., & Hoffman, M. R. (2021). Case Control Studies. In StatPearls . StatPearls Publishing.
Togha, M., Razeghi Jahromi, S., Ghorbani, Z., Martami, F., & Seifishahpar, M. (2018). Serum Vitamin D Status in a Group of Migraine Patients Compared With Healthy Controls: A Case-Control Study. Headache, 58 (10), 1530-1540.
Schulz, K. F., & Grimes, D. A. (2002). Case-control studies: research in reverse. The Lancet, 359 (9304), 431-434.
Case-control study: comparative studies
How to use a case-control study to evaluate your digital health product.
This page is part of a collection of guidance on evaluating digital health products .
A case-control study is a type of observational study. It looks at 2 sets of participants. One group has the condition you are interested in (the cases) and one group does not have it (the controls).
In other respects, the participants in both groups are similar. You can then look at a particular factor that might have caused the condition, such as your digital product, and compare participants from the 2 groups in relation to that.
A case-control study is an observational study because you observe the effects on existing groups rather than designing an experiment where participants are allocated into different groups.
What to use it for
A case-control study can help you to find out if your digital product or service achieves its aims, so it can be useful when you have developed your product (summative evaluation).
It can be a useful method when it would be difficult or impossible to randomise participants, for example, if your product aims to help people with rare health conditions.
Case-control studies have many benefits. They can:
- help to estimate the effects of your digital product when randomisation is not possible
- use existing data, which could be cheaper and easier
- operate with fewer participants compared to other designs
There can also be drawbacks to a case-control study:
- you need to pay careful attention to factors that may influence your results (confounding factors and biases) – see explanation in ‘How to carry out a case-control study’ below
- there may be challenges when accessing pre-existing data
- you cannot draw definitive answers about the effects of your product as you haven’t randomly selected participants for your evaluation
How to carry out a case-control study
In a traditional case-control design, cases and controls are looked at retrospectively – that is, the health condition and the factor that might have caused it have already occurred when you start the study.
Sources of cases and controls typically include:
- routinely collected data at medical facilities
- disease registries
- cross-sectional surveys
Some researchers use the term prospective case-control study when, for example, a prospective group exposed to an intervention is compared to a retrospective control.
Choosing your control
Selecting an appropriate control is an important part of a case-control study. The comparison group should be as similar as possible to the source population that produced the cases. This means the participants will be similar to each other in terms of factors that may influence the outcomes you’re looking at. Ideally, they will only differ in whether they received your digital product (cases) or not (controls).
There are 2 main types of case-control design: matched and unmatched.
Essentially, in an unmatched case-control design, a shared control group is selected for all cases at random given certain attributes. In a matched case-control design, controls are selected case-by-case based on specified characteristics. You should pick characteristics that have an effect on the usage of digital devices and services.
Commonly used matching factors include:
- age
- sex
- socio-economic status
However, think about other characteristics and attributes that might influence the use of your product, and the subsequent outcomes.
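As an illustration, individual matching can be sketched as a simple greedy search over candidate controls. Everything here (the field names, records, and `match_controls` function) is hypothetical; real studies would draw candidates from registries, medical facilities, or surveys, and would use more careful matching procedures.

```python
def match_controls(cases, candidates, keys=("sex", "age_band")):
    """For each case, take the first unused candidate control that agrees
    on every matching key. Cases with no available match are skipped.
    A greedy first-fit pass like this is a deliberate simplification."""
    used = set()
    pairs = []
    for case in cases:
        for i, cand in enumerate(candidates):
            if i not in used and all(case[k] == cand[k] for k in keys):
                used.add(i)
                pairs.append((case, cand))
                break
    return pairs

# Hypothetical records:
cases = [{"id": "case-1", "sex": "F", "age_band": "40-49"},
         {"id": "case-2", "sex": "M", "age_band": "50-59"}]
candidates = [{"id": "ctrl-a", "sex": "M", "age_band": "50-59"},
              {"id": "ctrl-b", "sex": "F", "age_band": "40-49"}]
for case, ctrl in match_controls(cases, candidates):
    print(case["id"], "->", ctrl["id"])
```

Matching on too many characteristics ("over-matching", as the Himmelfarb guidance above warns) shrinks the pool of eligible controls and can mask real exposure differences.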
Confounding variables and biases
Confounding variables (variables other than the one you are interested in that may influence the results) and biases (errors that influence the sample selected and results observed) are important to consider when conducting any research. This is especially important in designs that are non-randomised.
- selection bias can happen when participants are assigned without randomisation
- attrition bias may occur when patients with unfavourable outcomes are less likely to attend follow-ups
Analysing your data
The analysis most commonly used in case-control studies is an odds ratio, which is the chance (odds) of the outcomes occurring in the case group versus the control group.
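A sketch of that calculation, using hypothetical counts and the common Woolf (log) method for an approximate 95% confidence interval (the function name and numbers are illustrative, not from the study below):

```python
import math

def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI via the Woolf (log) method for
    the 2x2 table: a = cases who used the product, b = cases who did not,
    c = controls who used it, d = controls who did not."""
    or_ = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log odds ratio
    lower = math.exp(math.log(or_) - z * se_log)
    upper = math.exp(math.log(or_) + z * se_log)
    return or_, lower, upper

# Hypothetical counts: 30 of 100 cases used the product vs 20 of 100 controls.
or_, lo, hi = odds_ratio_with_ci(30, 70, 20, 80)
print(f"OR = {or_:.2f}, 95% CI ({lo:.2f}, {hi:.2f})")
```

If the confidence interval spans 1, the data are consistent with the product having no effect on the odds of the outcome.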
Example: Can telemedicine help with post-bariatric surgery care? A case-control design
In 2019, Wang and colleagues published a paper entitled Exploring the Effects of Telemedicine on Bariatric Surgery Follow-up: a Matched Case Control Study .
The study showed that people who go through bariatric surgery have better outcomes if they attend their follow-up appointments after surgery in comparison to those who do not. However, attending appointments can be challenging for people who live in remote areas. In Ontario, Canada, telemedicine suites were set up to enable healthcare provider-patient videoconferencing.
The researchers used a matched case-control study to investigate if telemedicine videoconferencing can support post-surgery appointment attendance rates in people who live further away from the hospital sites. They used the existing data from the bariatric surgery hospital programme to identify eligible patients.
All patients attending the bariatric surgery were offered telemedicine services. The cases were the participants who used telemedicine services; they were compared to those who did not (the controls).
Cases and controls were matched on various characteristics, specifically:
- time since bariatric surgery
- body mass index (BMI)
- travel distance from the hospital site

The outcomes compared between the two groups included:
- the percentage of appointments attended
- rates of dropout
- pre- and post-surgery weight and BMI
- various physical and psychological outcomes
They also calculated a rurality index to classify patients into urban, non-urban, and rural areas. These variables were used to compare cases (those who used telemedicine) and controls (those who did not).
During the study period, they identified that 487 of the 1,262 patients who received bariatric surgery used telemedicine services. Of those, 192 agreed to participate in the study.
They found that patients who used telemedicine did as well as patients who attended in person, both in terms of appointment attendance rates and in terms of physical and psychological outcomes.
Moreover, the researchers found that the cases (telemedicine users) came from more rural areas than the controls. The authors argued that this demonstrated that telemedicine can help overcome the known challenges for patients in more rural areas to attend appointments.
Randomising patients to telemedicine or withdrawing the telemedicine would be difficult, undesirable and possibly unethical. Case-control was a good alternative to assess the potential impact on patient outcomes in a service that is already up and running.
More information and resources
A 2003 study by Mann provides an accessible overview of observational research methods, including an explanation of biases and confounding variables.
On the website for Strengthening the Reporting of Observational Studies in Epidemiology ( STROBE ), there is a checklist of items that should be included in reports of case-control studies .
A 2016 study by Pearce offers considerations for the analysis of a matched case-control study.
Examples of case-control studies in digital health
In a 2020 study by Heuvel and others , researchers assessed a new digital health tool to monitor women at increased risk of preeclampsia at home. They investigated if the digital tool allows for fewer antenatal visits without compromising women’s safety, and whether it positively affects pregnancy outcomes. This study used a prospective case group compared to a retrospective control group.
In a 2019 study by Depp and others , the research team examined whether schizophrenia symptoms were associated with mobility (measured using GPS sensors). They compared participants with schizophrenia to healthy controls and they found that less mobility was associated with greater symptoms of schizophrenia.
A Nested Case-Control Study
Retrospective and Prospective Case-Control Studies
Suppose a prospective cohort study were conducted among almost 90,000 women for the purpose of studying the determinants of cancer and cardiovascular disease. After enrollment, the women provide baseline information on a host of exposures, and they also provide baseline blood and urine samples that are frozen for possible future use. The women are then followed, and, after about eight years, the investigators want to test the hypothesis that past exposure to pesticides such as DDT is a risk factor for breast cancer. Eight years have passed since the beginning of the study, and 1,439 women in the cohort have developed breast cancer. Since they froze blood samples at baseline, the investigators have the option of analyzing all of the blood samples in order to ascertain exposure to DDT at the beginning of the study, before any cancers occurred. The problem is that there are almost 90,000 women, and it would cost $20 to analyze each of the blood samples. If the investigators had analyzed all 90,000 samples, they would have found the results shown in the table below.
Table of Breast Cancer Occurrence Among Women With or Without DDT Exposure
While 1,439 breast cancers is a disturbing number, it is only 1.6% of the entire cohort, so the outcome is relatively rare, and it is costing a lot of money to analyze the blood specimens obtained from all of the non-diseased women. There is, however, another more efficient alternative, i.e., to use a case-control sampling strategy. One could analyze all of the blood samples from women who had developed breast cancer, but only a sample of the whole cohort in order to estimate the exposure distribution in the population that produced the cases.
If one were to analyze the blood samples of 2,878 of the non-diseased women (twice as many as the number of cases), one would obtain results that would look something like those in the next table.
Odds of Exposure: 360/1,079 in the cases versus 432/2,446 in the non-diseased controls.
Total samples analyzed = 1,439 + 2,878 = 4,317
Total cost = 4,317 × $20 = $86,340
With this approach a similar estimate of risk was obtained after analyzing blood samples from only a small sample of the entire population at a fraction of the cost with hardly any loss in precision. In essence, a case-control strategy was used, but it was conducted within the context of a prospective cohort study. This is referred to as a case-control study "nested" within a cohort study.
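The arithmetic behind this comparison can be checked in a few lines, using the counts from the example above:

```python
# Counts from the nested case-control example above.
cases_exposed, cases_unexposed = 360, 1_079        # 1,439 breast cancer cases
controls_exposed, controls_unexposed = 432, 2_446  # 2,878 sampled controls
cost_per_sample = 20  # dollars per blood sample analyzed

samples = cases_exposed + cases_unexposed + controls_exposed + controls_unexposed
total_cost = samples * cost_per_sample

# Exposure odds in the cases divided by exposure odds in the controls.
odds_ratio = (cases_exposed * controls_unexposed) / (cases_unexposed * controls_exposed)

print(samples)               # 4317 samples instead of ~90,000
print(total_cost)            # 86340
print(round(odds_ratio, 2))  # ≈ 1.89
```

Analyzing all ~90,000 samples at $20 each would have cost about $1.8 million, so the nested design recovers nearly the same estimate of association for under 5% of the cost.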
Rothman states that one should look upon all case-control studies as being "nested" within a cohort. In other words the cohort represents the source population that gave rise to the cases. With a case-control sampling strategy one simply takes a sample of the population in order to obtain an estimate of the exposure distribution within the population that gave rise to the cases. Obviously, this is a much more efficient design.
It is important to note that, unlike cohort studies, case-control studies do not follow subjects through time. Cases are enrolled at the time they develop disease and controls are enrolled at the same time. The exposure status of each is determined, but they are not followed into the future for further development of disease.
As with cohort studies, case-control studies can be prospective or retrospective. At the start of the study, all cases might have already occurred, making it a retrospective case-control study. Alternatively, none of the cases might have already occurred, and new cases will be enrolled prospectively. Epidemiologists generally prefer the prospective approach because it has fewer biases, but it is more expensive and sometimes not possible. When conducted prospectively, or when nested in a prospective cohort study, it is straightforward to select controls from the population at risk. However, in retrospective case-control studies, it can be difficult to select from the population at risk, and controls are then selected from those in the population who didn't develop disease. Using only the non-diseased to select controls, as opposed to the whole population, means the denominator is not really a measure of disease frequency, but when the disease is rare, the odds ratio using the non-diseased will be very similar to the estimate obtained when the entire population is used to sample for controls. This phenomenon is known as the rare-disease assumption. When case-control studies were first developed, most were conducted retrospectively, and it is sometimes assumed that the rare-disease assumption applies to all case-control studies. However, it actually only applies to those case-control studies in which controls are sampled only from the non-diseased rather than the whole population.
The difference between sampling from the whole population and only the non-diseased is that the whole population contains people both with and without the disease of interest. This means that a sampling strategy that uses the whole population as its source must allow for the fact that people who develop the disease of interest can be selected as controls. Students often have a difficult time with this concept. It is helpful to remember that it seems natural that the population denominator includes people who develop the disease in a cohort study. If a case-control study is a more efficient way to obtain the information from a cohort study, then perhaps it is not so strange that the denominator in a case-control study also can include people who develop the disease. This topic is covered in more detail in EP813 Intermediate Epidemiology.
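A small hypothetical population (invented numbers, purely for illustration) shows how closely the two control-sampling strategies agree when the outcome is rare:

```python
# Hypothetical cohort: 50,000 exposed and 50,000 unexposed people.
exposed_n, unexposed_n = 50_000, 50_000
exposed_cases, unexposed_cases = 100, 50  # a rare outcome (0.1-0.2%)

risk_ratio = (exposed_cases / exposed_n) / (unexposed_cases / unexposed_n)

# Controls sampled from the whole population: the OR equals the risk ratio.
or_whole_population = (exposed_cases / unexposed_cases) / (exposed_n / unexposed_n)

# Controls sampled from the non-diseased only: the OR is slightly inflated,
# but nearly identical because so few people have the disease.
or_non_diseased = (exposed_cases / unexposed_cases) / (
    (exposed_n - exposed_cases) / (unexposed_n - unexposed_cases)
)

print(risk_ratio)                 # 2.0
print(or_whole_population)        # 2.0
print(round(or_non_diseased, 3))  # ≈ 2.002
```

If the outcome were common instead (say 20% of the exposed), the non-diseased denominator would shrink noticeably and the two estimates would diverge, which is exactly why the rare-disease assumption matters.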
Students usually think of case-control studies as being only retrospective, since the investigators enroll subjects who have developed the outcome of interest. However, case-control studies, like cohort studies, can be either retrospective or prospective. In a prospective case-control study, the investigator still enrolls based on outcome status, but must wait for the cases to occur.
Content ©2016. All Rights Reserved. Date last modified: June 7, 2016. Wayne W. LaMorte, MD, PhD, MPH
Case Control Studies
- 1 University of Nebraska Medical Center
- 2 Spectrum Health/Michigan State University College of Human Medicine
- PMID: 28846237
- Bookshelf ID: NBK448143
A case-control study is a type of observational study commonly used to look at factors associated with diseases or outcomes. The case-control study starts with a group of cases, which are the individuals who have the outcome of interest. The researcher then tries to construct a second group of individuals called the controls, who are similar to the case individuals but do not have the outcome of interest. The researcher then looks at historical factors to identify if some exposure(s) is/are found more commonly in the cases than the controls. If the exposure is found more commonly in the cases than in the controls, the researcher can hypothesize that the exposure may be linked to the outcome of interest.
For example, a researcher may want to look at the rare cancer Kaposi's sarcoma. The researcher would find a group of individuals with Kaposi's sarcoma (the cases) and compare them to a group of patients who are similar to the cases in most ways but do not have Kaposi's sarcoma (controls). The researcher could then ask about various exposures to see if any exposure is more common in those with Kaposi's sarcoma (the cases) than those without Kaposi's sarcoma (the controls). The researcher might find that those with Kaposi's sarcoma are more likely to have HIV, and thus conclude that HIV may be a risk factor for the development of Kaposi's sarcoma.
There are many advantages to case-control studies. First, the case-control approach allows for the study of rare diseases. If a disease occurs very infrequently, one would have to follow a large group of people for a long period of time to accrue enough incident cases to study. Such use of resources may be impractical, so a case-control study can be useful for identifying current cases and evaluating historical associated factors. For example, if a disease developed in 1 in 1,000 people per year (0.001/year), then in ten years one would expect about 10 cases of the disease to exist in a group of 1,000 people. If the disease is much rarer, say 1 in 1,000,000 per year (0.000001/year), accruing ten total cases would require following either 1,000,000 people for ten years or 10,000 people for 1,000 years. As it may be impractical to follow 1,000,000 people for ten years or to wait 1,000 years for recruitment, a case-control study allows for a more feasible approach.
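The expected-case arithmetic above is simply incidence × people × years, which is easy to sanity-check:

```python
def expected_cases(incidence_per_year, people, years):
    """Expected number of incident cases under a constant yearly incidence."""
    return incidence_per_year * people * years

# 1-in-1,000 disease: 1,000 people followed for ten years.
print(expected_cases(0.001, 1_000, 10))         # ≈ 10 cases

# 1-in-1,000,000 disease: the cohorts needed to accrue the same ten cases.
print(expected_cases(0.000001, 1_000_000, 10))  # ≈ 10 cases
print(expected_cases(0.000001, 10_000, 1_000))  # ≈ 10 cases
```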
Second, the case-control study design makes it possible to look at multiple risk factors at once. In the example above about Kaposi's sarcoma, the researcher could ask both the cases and controls about exposures to HIV, asbestos, smoking, lead, sunburns, aniline dye, alcohol, herpes, human papillomavirus, or any number of possible exposures to identify those most likely associated with Kaposi's sarcoma.
Case-control studies can also be very helpful when disease outbreaks occur, and potential links and exposures need to be identified. This study mechanism can be commonly seen in food-related disease outbreaks associated with contaminated products, or when rare diseases start to increase in frequency, as has been seen with measles in recent years.
Because of these advantages, case-control studies are commonly used as one of the first studies to build evidence of an association between exposure and an event or disease.
In a case-control study, the investigator can include more controls than cases, such as 2:1 or 4:1 (controls to cases), to increase the power of the study.
Disadvantages and Limitations
The most commonly cited disadvantage of case-control studies is the potential for recall bias: the increased likelihood that those with the outcome will recall and report exposures compared to those without the outcome. In other words, even if both groups had exactly the same exposures, the participants in the case group may report the exposure more often than the controls do, because subjects' memories of past exposures are imperfect. Recall bias may lead to concluding that there are associations between exposure and disease that do not, in fact, exist. If people with Kaposi's sarcoma are asked about exposure history (e.g., HIV, asbestos, smoking, lead, sunburn, aniline dye, alcohol, herpes, human papillomavirus), the individuals with the disease are likely to think harder about these exposures and recall having some of them more readily than the healthy controls do.
Case-control studies, due to their typically retrospective nature, can be used to establish a correlation between exposures and outcomes, but cannot establish causation . These studies simply attempt to find correlations between past events and the current state.
When designing a case-control study, the researcher must find an appropriate control group. Ideally, the case group (those with the outcome) and the control group (those without the outcome) will have almost the same characteristics, such as age, gender, overall health status, and other factors. The two groups should have similar histories and live in similar environments. If, for example, our cases of Kaposi's sarcoma came from across the country but our controls were only chosen from a small community in northern latitudes where people rarely go outside or get sunburns, asking about sunburn may not be a valid exposure to investigate. Similarly, if all of the cases of Kaposi's sarcoma were found to come from a small community outside a battery factory with high levels of lead in the environment, then controls from across the country with minimal lead exposure would not provide an appropriate control group. The investigator must put a great deal of effort into creating a proper control group to bolster the strength of the case-control study as well as enhance their ability to find true and valid potential correlations between exposures and disease states.
Similarly, the researcher must recognize the potential for failing to identify confounding variables or exposures, introducing the possibility of confounding bias, which occurs when a variable that is not being accounted for has a relationship with both the exposure and the outcome. Such an unmeasured variable may differ systematically between the groups, so the study ends up measuring its effect rather than that of the exposure of interest.
Copyright © 2024, StatPearls Publishing LLC.
Guidance for reporting a case-control study
This advice is relevant to reports of case-control studies and is based on the STROBE guidelines. In a case-control study, people with and without some condition (e.g., diabetes) are compared in relation to past events.
1. Title and abstract
Indicate the study’s design with a commonly used term in the title or the abstract.
Readers should be able to easily identify the design that was used from the title or abstract. An explicit, commonly used term for the study design also helps ensure correct indexing of articles in electronic databases.
Leukaemia incidence among workers in the shoe and boot manufacturing industry: a case-control study.
Provide in the abstract an informative and balanced summary of what was done and what was found.
The abstract provides key information that enables readers to understand a study and decide whether to read the article. Typical components include a statement of the research question, a short description of methods and results, and a conclusion. Abstracts should summarize key details of studies and should only present information that is provided in the article. We advise presenting key results in a numerical form that includes numbers of participants, estimates of associations and appropriate measures of variability and uncertainty (e.g., odds ratios with confidence intervals). We regard it insufficient to state only that an exposure is or is not significantly associated with an outcome.
A series of headings pertaining to the background, design, conduct, and analysis of a study may help readers acquire the essential information rapidly. Many journals require such structured abstracts, which tend to be of higher quality and more readily informative than unstructured summaries.
Background: The expected survival of HIV-infected patients is of major public health interest.
Objective: To estimate survival time and age-specific mortality rates of an HIV-infected population compared with that of the general population.
Design: Population-based cohort study.
Setting: All HIV-infected persons receiving care in Denmark from 1995 to 2005.
Patients: Each member of the nationwide Danish HIV Cohort Study was matched with as many as 99 persons from the general population according to sex, date of birth, and municipality of residence.
Measurements: The authors computed Kaplan-Meier life tables with age as the time scale to estimate survival from age 25 years. Patients with HIV infection and corresponding persons from the general population were observed from the date of the patient's HIV diagnosis until death, emigration, or 1 May 2005.
Results: 3,990 HIV-infected patients and 379,872 persons from the general population were included in the study, yielding 22,744 (median, 5.8 years/person) and 2,689,287 (median, 8.4 years/person) person-years of observation. Three percent of participants were lost to follow-up. From age 25 years, the median survival was 19.9 years (95% CI, 18.5 to 21.3) among patients with HIV infection and 51.1 years (CI, 50.9 to 51.5) among the general population. For HIV-infected patients, survival increased to 32.5 years (CI, 29.4 to 34.7) during the 2000 to 2005 period. In the subgroup that excluded persons with known hepatitis C coinfection (16%), median survival was 38.9 years (CI, 35.4 to 40.1) during this same period. The relative mortality rates for patients with HIV infection compared with those for the general population decreased with increasing age, whereas the excess mortality rate increased with increasing age.
Limitations: The observed mortality rates are assumed to apply beyond the current maximum observation time of 10 years.
Conclusions: The estimated median survival is more than 35 years for a young person diagnosed with HIV infection in the late highly active antiretroviral therapy era. However, an ongoing effort is still needed to further reduce mortality rates for these persons compared with the general population.
2. Background / rationale
Explain the scientific background and rationale for the investigation being reported.
The scientific background of the study provides important context for readers. It sets the stage for the study and describes its focus. It gives an overview of what is known on a topic and what gaps in current knowledge are addressed by the study. Background material should note recent pertinent studies and any systematic reviews of pertinent studies.
Concerns about the rising prevalence of obesity in children and adolescents have focused on the well documented associations between childhood obesity and increased cardiovascular risk [1] and mortality in adulthood [2]. Childhood obesity has considerable social and psychological consequences within childhood and adolescence [3], yet little is known about social, socioeconomic, and psychological consequences in adult life.
A recent systematic review found no longitudinal studies on the outcomes of childhood obesity other than physical health outcomes [3] and only two longitudinal studies of the socioeconomic effects of obesity in adolescence. Gortmaker et al found that US women who had been obese in late adolescence in 1981 were less likely to be married and had lower incomes seven years later than women who had not been overweight, while men who had been overweight were less likely to be married [4]. Sargent et al found that UK women, but not men, who had been obese at 16 years in 1974 earned 7.4% less than their non-obese peers at age 23 [5].
The study of adult outcomes of childhood obesity is difficult because obesity often continues into adult life, and therefore poorer socioeconomic and educational outcomes may actually reflect confounding by adult obesity. Yet identifying outcomes related to obesity confined to childhood is important in determining whether people who are obese in childhood and who later lose weight remain at risk for adult adversity and inequalities.
We used longitudinal data from the 1970 British birth cohort to examine the adult socioeconomic, educational, social, and psychological outcomes of childhood obesity. We hypothesised that obesity limited to childhood has fewer adverse adult outcomes than obesity that persists into adult life.
3. Objectives
State specific objectives, including any prespecified hypotheses.
Objectives are the detailed aims of the study. Well crafted objectives specify populations, exposures and outcomes, and parameters that will be estimated. They may be formulated as specific hypotheses or as questions that the study was designed to address. In some situations objectives may be less specific, for example, in early discovery phases. Regardless, the report should clearly reflect the investigators' intentions. For example, if important subgroups or additional analyses were not the original aim of the study but arose during data analysis, they should be described accordingly (see also items 4, 17 and 20).
Our primary objectives were to 1) determine the prevalence of domestic violence among female patients presenting to four community-based, primary care, adult medicine practices that serve patients of diverse socioeconomic background and 2) identify demographic and clinical differences between currently abused patients and patients not currently being abused.
4. Study design
Present key elements of study design early in the paper.
We advise presenting key elements of study design early in the methods section (or at the end of the introduction) so that readers can understand the basics of the study. For example, authors should indicate that the study was a cohort study, which followed people over a particular time period, and describe the group of persons that comprised the cohort and their exposure status. Similarly, if the investigation used a case-control design, the cases and controls and their source population should be described. If the study was a cross-sectional survey, the population and the point in time at which the cross-section was taken should be mentioned. When a study is a variant of the three main study types, there is an additional need for clarity. For instance, for a case-crossover study, one of the variants of the case-control design, a succinct description of the principles was given in the example above .
We recommend that authors refrain from simply calling a study 'prospective' or 'retrospective' because these terms are ill defined. One usage sees cohort and prospective as synonymous and reserves the word retrospective for case-control studies. A second usage distinguishes prospective and retrospective cohort studies according to the timing of data collection relative to when the idea for the study was developed. A third usage distinguishes prospective and retrospective case-control studies depending on whether the data about the exposure of interest existed when cases were selected. Some advise against using these terms, or suggest adopting the alternatives 'concurrent' and 'historical' for describing cohort studies. In STROBE, we do not use the words prospective and retrospective, nor alternatives such as concurrent and historical. We recommend that, whenever authors use these words, they define what they mean. Most importantly, we recommend that authors describe exactly how and when data collection took place.
The first part of the methods section might also be the place to mention whether the report is one of several from a study. If a new report is in line with the original aims of the study, this is usually indicated by referring to an earlier publication and by briefly restating the salient features of the study. However, the aims of a study may also evolve over time. Researchers often use data for purposes for which they were not originally intended, including, for example, official vital statistics that were collected primarily for administrative purposes, items in questionnaires that originally were only included for completeness, or blood samples that were collected for another purpose. For example, the Physicians' Health Study, a randomized controlled trial of aspirin and beta carotene, was later used to demonstrate that a point mutation in the factor V gene was associated with an increased risk of venous thrombosis, but not of myocardial infarction or stroke. The secondary use of existing data is a creative part of observational research and does not necessarily make results less credible or less important. However, briefly restating the original aims might help readers understand the context of the research and possible limitations in the data.
We used a case-crossover design, a variation of a case-control design that is appropriate when a brief exposure (driver's phone use) causes a transient rise in the risk of a rare outcome (a crash). We compared a driver's use of a mobile phone at the estimated time of a crash with the same driver's use during another suitable time period. Because drivers are their own controls, the design controls for characteristics of the driver that may affect the risk of a crash but do not change over a short period of time. As it is important that risks during control periods and crash trips are similar, we compared phone activity during the hazard interval (time immediately before the crash) with phone activity during control intervals (equivalent times during which participants were driving but did not crash) in the previous week.
5. Setting
Describe the setting, locations, and relevant dates, including periods of recruitment, exposure, follow-up, and data collection.
Readers need information on setting and locations to assess the context and generalisability of a study's results. Exposures such as environmental factors and therapies can change over time. Also, study methods may evolve over time. Knowing when a study took place and over what period participants were recruited and followed up places the study in historical context and is important for the interpretation of results.
Information about setting includes recruitment sites or sources (e.g., electoral roll, outpatient clinic, cancer registry, or tertiary care centre). Information about location may refer to the countries, towns, hospitals or practices where the investigation took place. We advise stating dates rather than only describing the length of time periods. There may be different sets of dates for exposure, disease occurrence, recruitment, beginning and end of follow-up, and data collection. Of note, nearly 80% of 132 reports in oncology journals that used survival analysis included the starting and ending dates for accrual of patients, but only 24% also reported the date on which follow-up ended.
The Pasitos Cohort Study recruited pregnant women from Women, Infant and Child (WIC) clinics in Socorro and San Elizario, El Paso County, Texas and maternal-child clinics of the Mexican Social Security Institute (IMSS) in Ciudad Juarez, Mexico from April 1998 to October 2000. At baseline, prior to the birth of the enrolled cohort children, staff interviewed mothers regarding the household environment. In this ongoing cohort study, we target follow-up exams at 6-month intervals beginning at age 6 months.
6a Eligibility criteria
Give the eligibility criteria, and the sources and methods of case ascertainment and control selection. Give the rationale for the choice of cases and controls. For matched studies, give matching criteria and the number of controls per case.
Detailed descriptions of the study participants help readers understand the applicability of the results. Investigators usually restrict a study population by defining clinical, demographic and other characteristics of eligible participants. Typical eligibility criteria relate to age, gender, diagnosis and comorbid conditions. Despite their importance, eligibility criteria often are not reported adequately. In a survey of observational stroke research, 17 of 49 reports (35%) did not specify eligibility criteria .
Eligibility criteria may be presented as inclusion and exclusion criteria, although this distinction is not always necessary or useful. Regardless, we advise authors to report all eligibility criteria and also to describe the group from which the study population was selected (e.g., the general population of a region or country), and the method of recruitment (e.g., referral or self-selection through advertisements).
Knowing details about follow-up procedures, including whether procedures minimized non-response and loss to follow-up and whether the procedures were similar for all participants, informs judgments about the validity of results. For example, in a study that used IgM antibodies to detect acute infections, readers needed to know the interval between blood tests for IgM antibodies so that they could judge whether some infections likely were missed because the interval between blood tests was too long. In other studies where follow-up procedures differed between exposed and unexposed groups, readers might recognize substantial bias due to unequal ascertainment of events or differences in non-response or loss to follow-up. Accordingly, we advise that researchers describe the methods used for following participants and whether those methods were the same for all participants, and that they describe the completeness of ascertainment of variables (see also item 14).
In case-control studies, the choice of cases and controls is crucial to interpreting the results, and the method of their selection has major implications for study validity. In general, controls should reflect the population from which the cases arose. Various methods are used to sample controls, all with advantages and disadvantages: for cases that arise from a general population, population roster sampling, random digit dialling, and neighbourhood or friend controls are used. Neighbourhood or friend controls may present intrinsic matching on exposure. Controls with other diseases may have advantages over population-based controls, in particular for hospital-based cases, because they better reflect the catchment population of a hospital, have greater comparability of recall and ease of recruitment. However, they can present problems if the exposure of interest affects the risk of developing or being hospitalized for the control condition(s) [43,44]. To remedy this problem, often a mixture of the best defensible control diseases is used.
Cutaneous melanoma cases diagnosed in 1999 and 2000 were ascertained through the Iowa Cancer Registry (…). Controls, also identified through the Iowa Cancer Registry, were colorectal cancer patients diagnosed during the same time. Colorectal cancer controls were selected because they are common and have a relatively long survival, and because arsenic exposure has not been conclusively linked to the incidence of colorectal cancer.
6b Eligibility criteria
For matched studies, give matching criteria and the number of controls per case.
Matching is much more common in case-control studies, but occasionally investigators use matching in cohort studies to make groups comparable at the start of follow-up. Matching in cohort studies makes groups directly comparable for potential confounders and presents fewer intricacies than in case-control studies. For example, it is not necessary to take the matching into account for the estimation of the relative risk. Because matching in cohort studies may increase statistical precision, investigators might allow for the matching in their analyses and thus obtain narrower confidence intervals.
In case-control studies matching is done to increase a study's efficiency by ensuring similarity in the distribution of variables between cases and controls, in particular the distribution of potential confounding variables [48,49]. Because matching can be done in various ways, with one or more controls per case, the rationale for the choice of matching variables and the details of the method used should be described. Commonly used forms of matching are frequency matching (also called group matching) and individual matching. In frequency matching, investigators choose controls so that the distribution of matching variables becomes identical or similar to that of cases. Individual matching involves matching one or several controls to each case. Although intuitively appealing and sometimes useful, matching in case-control studies has a number of disadvantages, is not always appropriate, and needs to be taken into account in the analysis.
We aimed to select five controls for every case from among individuals in the study population who had no diagnosis of autism or other pervasive developmental disorders (PDD) recorded in their general practice record and who were alive and registered with a participating practice on the date of the PDD diagnosis in the case. Controls were individually matched to cases by year of birth (up to 1 year older or younger), sex, and general practice. For each of 300 cases, five controls could be identified who met all the matching criteria. For the remaining 994, one or more controls was excluded...
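The individual matching described above can be sketched in code. This is a minimal illustration, not the procedure of the quoted study; the field names, the pool of candidate controls, and the matching function are all invented for the example.

```python
# Hypothetical sketch of individual matching: for each case, draw up to k
# controls matching on sex, practice, and year of birth within +/- 1 year.
import random

def match_controls(case, pool, k=5, seed=0):
    """Return up to k controls from `pool` that meet the case's matching criteria."""
    rng = random.Random(seed)
    eligible = [c for c in pool
                if c["sex"] == case["sex"]
                and c["practice"] == case["practice"]
                and abs(c["birth_year"] - case["birth_year"]) <= 1]
    rng.shuffle(eligible)       # sample without favouring any ordering
    return eligible[:k]

case = {"sex": "M", "practice": "P1", "birth_year": 1990}
pool = [{"sex": "M", "practice": "P1", "birth_year": 1990 + d % 3 - 1}
        for d in range(20)]     # invented candidate controls
controls = match_controls(case, pool)
print(len(controls))  # up to 5 matched controls
```

In practice the pool would be the study's source population and cases without a full set of matches would need to be handled explicitly, as the example above describes.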
7. Variables
Clearly define all outcomes, exposures, predictors, potential confounders, and effect modifiers. Give diagnostic criteria, if applicable.
Authors should define all variables considered for and included in the analysis, including outcomes, exposures, predictors, potential confounders and potential effect modifiers. Disease outcomes require adequately detailed description of the diagnostic criteria. This applies to criteria for cases in a case-control study, disease events during follow-up in a cohort study and prevalent disease in a cross-sectional study. Clear definitions and steps taken to adhere to them are particularly important for any disease condition of primary interest in the study.
For some studies, ‘determinant' or ‘predictor' may be appropriate terms for exposure variables and outcomes may be called ‘endpoints'. In multivariable models, authors sometimes use ‘dependent variable' for an outcome and ‘independent variable' or ‘explanatory variable' for exposure and confounding variables. The latter is not precise as it does not distinguish exposures from confounders.
If many variables have been measured and included in exploratory analyses in an early discovery phase, consider providing a list with details on each variable in an appendix, additional table or separate publication. Of note, the International Journal of Epidemiology recently launched a new section with ‘cohort profiles', that includes detailed information on what was measured at different points in time in particular studies [56,57]. Finally, we advise that authors declare all ‘candidate variables' considered for statistical analysis, rather than selectively reporting only those included in the final models (see also item 16a) [58,59].
Only major congenital malformations were included in the analyses. Minor anomalies were excluded according to the exclusion list of European Registration of Congenital Anomalies (EUROCAT). If a child had more than one major congenital malformation of one organ system, those malformations were treated as one outcome in the analyses by organ system (…) In the statistical analyses, factors considered potential confounders were maternal age at delivery and number of previous parities. Factors considered potential effect modifiers were maternal age at reimbursement for antiepileptic medication and maternal age at delivery.
8. Data sources / measurement
For each variable of interest give sources of data and details of methods of assessment (measurement). Describe comparability of assessment methods if there is more than one group. Give information separately for cases and controls.
The way in which exposures, confounders and outcomes were measured affects the reliability and validity of a study. Measurement error and misclassification of exposures or outcomes can make it more difficult to detect cause-effect relationships, or may produce spurious relationships. Error in measurement of potential confounders can increase the risk of residual confounding [62,63]. It is helpful, therefore, if authors report the findings of any studies of the validity or reliability of assessments or measurements, including details of the reference standard that was used. Rather than simply citing validation studies (as in the first example), we advise that authors give the estimated validity or reliability, which can then be used for measurement error adjustment or sensitivity analyses (see items 12e and 17).
In addition, it is important to know if groups being compared differed with respect to the way in which the data were collected. This may be important for laboratory examinations (as in the second example) and other situations. For instance, if an interviewer first questions all the cases and then the controls, or vice versa, bias is possible because of the learning curve; solutions such as randomising the order of interviewing may avoid this problem. Information bias may also arise if the compared groups are not given the same diagnostic tests or if one group receives more tests of the same kind than another (see also item 9).
Samples pertaining to matched cases and controls were always analyzed together in the same batch and laboratory personnel were unable to distinguish among cases and controls.
9. Bias
Describe any efforts to address potential sources of bias.
Biased studies produce results that differ systematically from the truth (see also Box 3). It is important for a reader to know what measures were taken during the conduct of a study to reduce the potential for bias. Ideally, investigators carefully consider potential sources of bias when they plan their study. At the stage of reporting, we recommend that authors always assess the likelihood of relevant biases. Specifically, the direction and magnitude of bias should be discussed and, if possible, estimated. For instance, in case-control studies information bias can occur, but may be reduced by selecting an appropriate control group, as in the first example. Differences in the medical surveillance of participants were a problem in the second example. Consequently, the authors provide more detail about the additional data they collected to tackle this problem. When investigators have set up quality control programs for data collection to counter a possible "drift" in measurements of variables in longitudinal studies, or to keep variability at a minimum when multiple observers are used, these should be described.
Unfortunately, authors often do not address important biases when reporting their results. Among 43 case-control and cohort studies published from 1990 to 1994 that investigated the risk of second cancers in patients with a history of cancer, medical surveillance bias was mentioned in only 5 articles. A survey of reports of mental health research published during 1998 in three psychiatric journals found that only 13% of 392 articles mentioned response bias. A survey of cohort studies in stroke research found that 14 of 49 (28%) articles published from 1999 to 2003 addressed potential selection bias in the recruitment of study participants and 35 (71%) mentioned the possibility that any type of bias may have affected results.
In most case-control studies of suicide, the control group comprises living individuals but we decided to have a control group of people who had died of other causes (…). With a control group of deceased individuals, the sources of information used to assess risk factors are informants who have recently experienced the death of a family member or close associate - and are therefore more comparable to the sources of information in the suicide group than if living controls were used.
10. Study size
Explain how the study size was arrived at.
A study should be large enough to obtain a point estimate with a sufficiently narrow confidence interval to meaningfully answer a research question. Large samples are needed to distinguish a small association from no association. Small studies often provide valuable information, but wide confidence intervals may indicate that they contribute less to current knowledge in comparison with studies providing estimates with narrower confidence intervals. Also, small studies that show ‘interesting' or ‘statistically significant' associations are published more frequently than small studies that do not have ‘significant' findings. While these studies may provide an early signal in the context of discovery, readers should be informed of their potential weaknesses.
The importance of sample size determination in observational studies depends on the context. If an analysis is performed on data that were already available for other purposes, the main question is whether the analysis of the data will produce results with sufficient statistical precision to contribute substantially to the literature, and sample size considerations will be informal. Formal, a priori calculation of sample size may be useful when planning a new study. Such calculations are associated with more uncertainty than implied by the single number that is generally produced. For example, estimates of the rate of the event of interest or other assumptions central to calculations are commonly imprecise, if not guesswork. The precision obtained in the final analysis often cannot be determined beforehand, because it will be reduced by the inclusion of confounding variables in multivariable analyses, by the limited precision with which key variables can be measured, and by the exclusion of some individuals.
Few epidemiological studies explain or report deliberations about sample size. We encourage investigators to report pertinent formal sample size calculations if they were done. In other situations they should indicate the considerations that determined the study size (e.g., a fixed available sample, as in the first example above). If the observational study was stopped early when statistical significance was achieved, readers should be told. Do not bother readers with post hoc justifications for study size or retrospective power calculations. From the point of view of the reader, confidence intervals indicate the statistical precision that was ultimately obtained. It should be realized that confidence intervals reflect statistical uncertainty only, and not all uncertainty that may be present in a study (see item 20).
A survey of postnatal depression in the region had documented a prevalence of 19.8%. Assuming depression in mothers with normal weight children to be 20% and an odds ratio of 3 for depression in mothers with a malnourished child, we needed 72 case-control sets (one case to one control) with 80% power and 5% significance.
The number of cases in the area during the study period determined the sample size.
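To show the arithmetic behind calculations like the one in the first example, here is a rough sketch using the standard unmatched two-proportion formula (20% exposure among controls, odds ratio 3, 80% power, 5% two-sided significance). The quoted study used a matched design, so this approximation does not reproduce its figure of 72 sets; the function name and hard-coded z values are ours.

```python
# Unmatched two-proportion sample-size approximation (illustrative only).
import math

Z_ALPHA = 1.959964  # standard normal quantile for two-sided 5% significance
Z_BETA = 0.841621   # standard normal quantile for 80% power

def n_per_group(p_control, odds_ratio):
    """Cases (and controls) needed per group, rounded up."""
    odds_case = odds_ratio * p_control / (1 - p_control)
    p_case = odds_case / (1 + odds_case)  # exposure prevalence implied in cases
    num = (Z_ALPHA + Z_BETA) ** 2 * (p_case * (1 - p_case)
                                     + p_control * (1 - p_control))
    return math.ceil(num / (p_case - p_control) ** 2)

print(n_per_group(0.20, 3))  # 61 cases and 61 controls under these assumptions
```

A matched design, as in the example, requires a different formula based on discordant pairs, which is why the study's own figure is larger.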
11. Quantitative variables
Explain how quantitative variables were handled in the analyses. If applicable, describe which groupings were chosen, and why.
Investigators make choices regarding how to collect and analyse quantitative data about exposures, effect modifiers and confounders. For example, they may group a continuous exposure variable to create a new categorical variable. Grouping choices may have important consequences for later analyses. We advise that authors explain why and how they grouped quantitative data, including the number of categories, the cut-points, and category mean or median values. Whenever data are reported in tabular form, the counts of cases, controls, persons at risk, person-time at risk, etc. should be given for each category. Tables should not consist solely of effect-measure estimates or results of model fitting.
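As a minimal illustration of such grouping, the sketch below cuts a continuous exposure at crude quartile boundaries and tabulates counts per category, as advised above. The data and the helper functions are invented for the example.

```python
# Group a continuous exposure into ordered quartile categories and count
# observations per category (illustrative data and crude cut-points).

def quartile_cutpoints(values):
    s = sorted(values)
    n = len(s)
    return [s[n // 4], s[n // 2], s[3 * n // 4]]  # crude quartile boundaries

def categorize(x, cuts):
    return sum(x >= c for c in cuts)  # ordered category 0..3

exposure = [float(i) for i in range(100)]  # invented exposure measurements
cuts = quartile_cutpoints(exposure)
counts = [0, 0, 0, 0]
for x in exposure:
    counts[categorize(x, cuts)] += 1
print(cuts, counts)  # cut-points and the count in each category
```

Reporting the cut-points and per-category counts alongside the results, as this sketch does, is exactly what the item asks for in tabular presentations.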
Investigators might model an exposure as continuous in order to retain all the information. In making this choice, one needs to consider the nature of the relationship of the exposure to the outcome. As it may be wrong to assume a linear relation automatically, possible departures from linearity should be investigated. Authors could mention alternative models they explored during analyses (e.g., using log transformation, quadratic terms or spline functions). Several methods exist for fitting a non-linear relation between the exposure and outcome. Also, it may be informative to present both continuous and grouped analyses for a quantitative exposure of prime interest.
In a recent survey, two thirds of epidemiological publications studied quantitative exposure variables. In 42 of 50 articles (84%) exposures were grouped into several ordered categories, but often without any stated rationale for the choices made. Fifteen articles used linear associations to model continuous exposure but only two reported checking for linearity. In another survey of the psychological literature, dichotomization was justified in only 22 of 110 articles (20%).
Patients with a Glasgow Coma Scale less than 8 are considered to be seriously injured. A GCS of 9 or more indicates less serious brain injury. We examined the association of GCS in these two categories with the occurrence of death within 12 months from injury.
12a Statistical methods
Describe all statistical methods, including those used to control for confounding.
In general, there is no one correct statistical analysis but, rather, several possibilities that may address the same question, but make different assumptions. Regardless, investigators should pre-determine analyses at least for the primary study objectives in a study protocol. Often additional analyses are needed, either instead of, or as well as, those originally envisaged, and these may sometimes be motivated by the data. When a study is reported, authors should tell readers whether particular analyses were suggested by data inspection. Even though the distinction between pre-specified and exploratory analyses may sometimes be blurred, authors should clarify reasons for particular analyses.
If groups being compared are not similar with regard to some characteristics, adjustment should be made for possible confounding variables by stratification or by multivariable regression (see Box 5). Often, the study design determines which type of regression analysis is chosen. For instance, Cox proportional hazards regression is commonly used in cohort studies, whereas logistic regression is often the method of choice in case-control studies [96,97]. Analysts should fully describe specific procedures for variable selection and not only present results from the final model [98,99]. If model comparisons are made to narrow down a list of potential confounders for inclusion in a final model, this process should be described. It is helpful to tell readers if one or two covariates are responsible for a great deal of the apparent confounding in a data analysis. Other statistical analyses such as imputation procedures, data transformation, and calculations of attributable risks should also be described. Non-standard or novel approaches should be referenced and the statistical software used should be reported. As a guiding principle, we advise that statistical methods be described "with enough detail to enable a knowledgeable reader with access to the original data to verify the reported results".
In an empirical study, only 93 of 169 articles (55%) reporting adjustment for confounding clearly stated how continuous and multi-category variables were entered into the statistical model. Another study found that among 67 articles in which statistical analyses were adjusted for confounders, it was mostly unclear how confounders were chosen.
The adjusted relative risk was calculated using the Mantel-Haenszel technique, when evaluating if confounding by age or gender was present in the groups compared. The 95% confidence interval (CI) was computed around the adjusted relative risk, using the variance according to Greenland and Robins and Robins et al.
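The Mantel-Haenszel technique mentioned in the example can be sketched for the odds ratio as follows. This is an illustrative implementation with invented strata, not the analysis of the quoted study.

```python
# Mantel-Haenszel summary odds ratio across strata (e.g., age or sex strata).
# Each stratum is a 2x2 table (a, b, c, d) = (exposed cases, exposed controls,
# unexposed cases, unexposed controls). The data are invented.

def mantel_haenszel_or(strata):
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

strata = [
    (10, 20, 5, 40),  # stratum 1: within-stratum OR = (10*40)/(20*5) = 4
    (8, 16, 4, 32),   # stratum 2: within-stratum OR = 4
]
print(mantel_haenszel_or(strata))  # 4.0
```

Comparing this stratified estimate with the crude (unstratified) odds ratio is one simple way to show readers how much confounding the stratification variable accounted for.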
12b Statistical methods
Describe any methods used to examine subgroups and interactions.
As discussed in detail under item 17, many debate the use and value of analyses restricted to subgroups of the study population [4,104]. Subgroup analyses are nevertheless often done. Readers need to know which subgroup analyses were planned in advance, and which arose while analysing the data. Also, it is important to explain what methods were used to examine whether effects or associations differed across groups (see item 17).
Interaction relates to the situation when one factor modifies the effect of another (it is therefore also called 'effect modification'). The joint action of two factors can be characterized in two ways: on an additive scale, in terms of risk differences; or on a multiplicative scale, in terms of relative risk (see Box 8). Many authors and readers may have their own preference about the way interactions should be analysed. Still, they may be interested to know to what extent the joint effect of exposures differs from the separate effects. There is consensus that the additive scale, which uses absolute risks, is more appropriate for public health and clinical decision making. Whatever view is taken, this should be clearly presented to the reader, as is done in the example above. A layout presenting the separate effects of both exposures as well as their joint effect, each relative to no exposure, might be most informative. It is presented in the example for interaction under item 17, and the calculations on the different scales are explained in Box 8.
Sex differences in susceptibility to the 3 lifestyle-related risk factors studied were explored by testing for biological interaction according to Rothman: a new composite variable with 4 categories (a−b−, a−b+, a+b−, and a+b+) was redefined for sex and a dichotomous exposure of interest where a− and b− denote absence of exposure. RR was calculated for each category after adjustment for age. An interaction effect is defined as departure from additivity of absolute effects, and excess RR caused by interaction (RERI) was calculated:
RERI = RR(a+b+) − RR(a+b−) − RR(a−b+) + 1,
where RR(a+b+) denotes the RR among those exposed to both factors and RR(a−b−) is used as the reference category (RR = 1.0). Ninety-five percent CIs were calculated as proposed by Hosmer and Lemeshow. A RERI of 0 means no interaction.
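The additive-interaction calculation described above can be sketched numerically. The relative risks below are invented; the function simply applies the departure-from-additivity definition of RERI.

```python
# Excess relative risk due to interaction (RERI) on the additive scale,
# from three relative risks each referenced to the doubly unexposed group.

def reri(rr_both, rr_a_only, rr_b_only):
    # Departure from additivity of absolute effects:
    # RERI = RR(a+b+) - RR(a+b-) - RR(a-b+) + 1
    return rr_both - rr_a_only - rr_b_only + 1

print(reri(3.5, 2.0, 1.5))  # 1.0, i.e. positive additive interaction
```

A RERI of 0 corresponds to exact additivity of the two separate effects; values above or below 0 indicate positive or negative additive interaction.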
12c Statistical methods
Explain how missing data were addressed.
Missing data are common in observational research. Questionnaires posted to study participants are not always filled in completely, participants may not attend all follow-up visits and routine data sources and clinical databases are often incomplete. Despite its ubiquity and importance, few papers report in detail on the problem of missing data [5,107]. Investigators may use any of several approaches to address missing data. We describe some strengths and limitations of various approaches in Box 6. We advise that authors report the number of missing values for each variable of interest (exposures, outcomes, confounders) and for each step in the analysis. Authors should give reasons for missing values if possible, and indicate how many individuals were excluded because of missing data when describing the flow of participants through the study (see also item 13). For analyses that account for missing data, authors should describe the nature of the analysis (e.g., multiple imputation) and the assumptions that were made (e.g., missing at random, see Box 6).
Our missing data analysis procedures used missing at random (MAR) assumptions. We used the MICE (multivariate imputation by chained equations) method of multiple multivariate imputation in STATA. We independently analysed 10 copies of the data, each with missing values suitably imputed, in the multivariate logistic regression analyses. We averaged estimates of the variables to give a single mean estimate and adjusted standard errors according to Rubin's rules.
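The pooling step of the example (averaging estimates across imputed datasets and adjusting standard errors by Rubin's rules) can be sketched as follows. The estimates and variances are invented, and this shows only the combination rules, not the imputation itself.

```python
# Rubin's rules: combine m per-imputation estimates and variances into one
# pooled estimate and a total variance that reflects imputation uncertainty.

def rubins_rules(estimates, variances):
    m = len(estimates)
    pooled = sum(estimates) / m
    within = sum(variances) / m                                   # W
    between = sum((e - pooled) ** 2 for e in estimates) / (m - 1) # B
    total_var = within + (1 + 1 / m) * between                    # T = W + (1+1/m)B
    return pooled, total_var

est, var = rubins_rules([1.0, 1.2, 1.1], [0.04, 0.04, 0.04])  # invented values
print(round(est, 3), round(var, 4))  # pooled estimate and total variance
```

The between-imputation component is what inflates the standard error relative to a single imputed dataset, which is the point of reporting the pooling method explicitly.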
12d Statistical methods
If applicable, explain how matching of cases and controls was addressed.
In individually matched case-control studies, a crude analysis of the odds ratio that ignores the matching usually leads to an estimate biased towards unity (see Box 2). A matched analysis is therefore often necessary. This can intuitively be understood as a stratified analysis: each case is seen as one stratum together with his or her set of matched controls. The analysis rests on considering whether the case is more often exposed than the controls, despite having made them alike regarding the matching variables. Investigators can do such a stratified analysis using the Mantel-Haenszel method on a 'matched' 2 by 2 table. In its simplest form, the odds ratio becomes the ratio of pairs that are discordant for the exposure variable. If matching was done for variables like age and sex that are universal attributes, the analysis need not retain the individual, person-to-person matching: a simple analysis in categories of age and sex is sufficient. For other matching variables, such as neighbourhood, sibship, or friendship, however, each matched set should be considered its own stratum.
In individually matched studies, the most widely used method of analysis is conditional logistic regression, in which each case and their matched controls are considered together. The conditional method is necessary when the number of controls varies among cases, and when, in addition to the matching variables, other variables need to be adjusted for. To allow readers to judge whether the matched design was appropriately taken into account in the analysis, we recommend that authors describe in detail the statistical methods used to analyse the data. If taking the matching into account has little effect on the estimates, authors may choose to present an unmatched analysis.
We used McNemar's test, paired t test, and conditional logistic regression analysis to compare dementia patients with their matched controls for cardiovascular risk factors, the occurrence of spontaneous cerebral emboli, carotid disease, and venous to arterial circulation shunt.
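For 1:1 matched pairs, the simplest matched estimate described under this item, the ratio of discordant pairs underlying McNemar's test, can be sketched in a few lines. The pair counts are invented.

```python
# Matched-pairs odds ratio from discordant pairs only: pairs where case and
# control are both exposed or both unexposed carry no information here.

def matched_pairs_or(case_exposed_only, control_exposed_only):
    """OR = pairs with only the case exposed / pairs with only the control exposed."""
    return case_exposed_only / control_exposed_only

print(matched_pairs_or(40, 10))  # 4.0
```

With variable numbers of controls per case, or further covariate adjustment, conditional logistic regression replaces this simple ratio, as the text explains.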
12e Statistical methods
Describe any sensitivity analyses.
Sensitivity analyses are useful to investigate whether or not the main results are consistent with those obtained with alternative analysis strategies or assumptions. Issues that may be examined include the criteria for inclusion in analyses, the definitions of exposures or outcomes, which confounding variables merit adjustment, the handling of missing data [120,123], possible selection bias or bias from inaccurate or inconsistent measurement of exposure, disease and other variables, and specific analysis choices, such as the treatment of quantitative variables (see item 11). Sophisticated methods are increasingly used to simultaneously model the influence of several biases or assumptions [124–126].
In 1959, Cornfield et al. famously showed that a relative risk of 9 for cigarette smoking and lung cancer was extremely unlikely to be due to any conceivable confounder, since the confounder would need to be at least nine times as prevalent in smokers as in non-smokers. This analysis did not rule out the possibility that such a factor was present, but it did identify the prevalence such a factor would need to have. The same approach was recently used to identify plausible confounding factors that could explain the association between childhood leukaemia and living near electric power lines. More generally, sensitivity analyses can be used to identify the degree of confounding, selection bias, or information bias required to distort an association. One important, perhaps under-recognised, use of sensitivity analysis arises when a study shows little or no association between an exposure and an outcome and it is plausible that confounding or other biases toward the null are present.
Because we had a relatively higher proportion of ‘missing' dead patients with insufficient data (38/148=25.7%) as compared to live patients (15/437=3.4%) (…), it is possible that this might have biased the results. We have, therefore, carried out a sensitivity analysis. We have assumed that the proportion of women using oral contraceptives in the study group applies to the whole (19.1% for dead, and 11.4% for live patients), and then applied two extreme scenarios: either all the exposed missing patients used second generation pills or they all used third-generation pills.
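The Cornfield-style reasoning described above lends itself to a small numerical sketch: for an unmeasured binary confounder to fully explain an observed relative risk, its prevalence among the exposed must be at least RR times its prevalence among the unexposed. The function and numbers below are illustrative.

```python
# Minimum prevalence a fully explanatory confounder would need among the
# exposed, given the observed RR and its prevalence among the unexposed.

def min_confounder_prevalence(observed_rr, prevalence_unexposed):
    required = observed_rr * prevalence_unexposed
    return required if required <= 1 else None  # None: no such confounder can exist

print(min_confounder_prevalence(9, 0.05))  # 0.45: such a confounder is conceivable
print(min_confounder_prevalence(9, 0.20))  # None: prevalence would exceed 100%
```

This does not prove the association is unconfounded; it only quantifies how extreme a confounder would have to be, which is the point of the Cornfield argument.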
13a Participants
Report numbers of individuals at each stage of the study (e.g., numbers potentially eligible, examined for eligibility, confirmed eligible, included in the study, completing follow-up, and analysed). Give information separately for cases and controls.
Detailed information on the process of recruiting study participants is important for several reasons. Those included in a study often differ in relevant ways from the target population to which results are applied. This may result in estimates of prevalence or incidence that do not reflect the experience of the target population. For example, people who agreed to participate in a postal survey of sexual behaviour attended church less often, had less conservative sexual attitudes and an earlier age at first sexual intercourse, and were more likely to smoke cigarettes and drink alcohol than people who refused. These differences suggest that postal surveys may overestimate sexual liberalism and activity in the population. Such response bias (see Box 3) can distort exposure-disease associations if associations differ between those eligible for the study and those included in the study. As another example, the association between young maternal age and leukaemia in offspring, which has been observed in some case-control studies [131,132], was explained by differential participation of young women in case and control groups. Young women with healthy children were less likely to participate than those with unhealthy children. Although low participation does not necessarily compromise the validity of a study, transparent information on participation and reasons for non-participation is essential. Also, as there are no universally agreed definitions for participation, response or follow-up rates, readers need to understand how authors calculated such proportions.
Ideally, investigators should give an account of the numbers of individuals considered at each stage of recruiting study participants, from the choice of a target population to the inclusion of participants' data in the analysis. Depending on the type of study, this may include the number of individuals considered to be potentially eligible, the number assessed for eligibility, the number found to be eligible, the number included in the study, the number examined, the number followed up and the number included in the analysis. Information on different sampling units may be required if sampling of study participants is carried out in two or more stages, as in the example above (multistage sampling). In case-control studies, we advise that authors describe the flow of participants separately for case and control groups. Controls can sometimes be selected from several sources, including, for example, hospitalised patients and community dwellers. In this case, we recommend a separate account of the numbers of participants for each type of control group. Olson and colleagues proposed useful reporting guidelines for controls recruited through random-digit dialling and other methods.
A recent survey of epidemiological studies published in 10 general epidemiology, public health and medical journals found that some information regarding participation was provided in 47 of 107 case-control studies (59%), 49 of 154 cohort studies (32%), and 51 of 86 cross-sectional studies (59%). Incomplete or absent reporting of participation and non-participation in epidemiological studies was also documented in two other surveys of the literature [4,5]. Finally, there is evidence that participation in epidemiological studies may have declined in recent decades [137,138], which underscores the need for transparent reporting.
Of the 105 freestanding bars and taverns sampled, 13 establishments were no longer in business and 9 were located in restaurants, leaving 83 eligible businesses. In 22 cases, the owner could not be reached by telephone despite 6 or more attempts. The owners of 36 bars declined study participation. (...) The 25 participating bars and taverns employed 124 bartenders, with 67 bartenders working at least 1 weekly daytime shift. Fifty-four of the daytime bartenders (81%) completed baseline interviews and spirometry; 53 of these subjects (98%) completed follow-up.
13b Participants
Give reasons for non-participation at each stage.
Explaining the reasons why people no longer participated in a study or why they were excluded from statistical analyses helps readers judge whether the study population was representative of the target population and whether bias was possibly introduced. For example, in a cross-sectional health survey, non-participation due to reasons unlikely to be related to health status (for example, the letter of invitation was not delivered because of an incorrect address) will affect the precision of estimates but will probably not introduce bias. Conversely, if many individuals opt out of the survey because of illness, or perceived good health, results may underestimate or overestimate the prevalence of ill health in the population.
The main reasons for non-participation were the participant was too ill or had died before interview (cases 30%, controls < 1%), nonresponse (cases 2%, controls 21%), refusal (cases 10%, controls 29%), and other reasons (refusal by consultant or general practitioner, non-English speaking, mental impairment) (cases 7%, controls 5%).
13c Participants
Consider use of a flow diagram.
An informative and well-structured flow diagram can readily and transparently convey information that might otherwise require a lengthy description, as in the example above. The diagram may usefully include the main results, such as the number of events for the primary outcome. While we recommend the use of a flow diagram, particularly for complex observational studies, we do not propose a specific format for the diagram.
Flow diagram from Hay et al. (https://doi.org/10.1371/journal.pmed.0040297.g001).
14a Descriptive data
Give characteristics of study participants (eg demographic, clinical, social) and information on exposures and potential confounders. Give information separately for cases and controls.
Readers need descriptions of study participants and their exposures to judge the generalisability of the findings. Information about potential confounders, including whether and how they were measured, influences judgments about study validity. We advise authors to summarize continuous variables for each study group by giving the mean and standard deviation, or when the data have an asymmetrical distribution, as is often the case, the median and percentile range (e.g., 25th and 75th percentiles). Variables that make up a small number of ordered categories (such as stages of disease I to IV) should not be presented as continuous variables; it is preferable to give numbers and proportions for each category (see also Box 4). In studies that compare groups, the descriptive characteristics and numbers should be given by group, as in the example above.
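As a minimal sketch of this advice (hypothetical values; Python standard library only), a helper that returns the mean and standard deviation for roughly symmetric variables, and the median with 25th/75th percentiles for skewed ones, might look like:

```python
import statistics

def describe(values, skewed=False):
    """Summarise a continuous variable for a descriptive table.

    For roughly symmetric data report mean and standard deviation;
    for skewed data report the median and 25th/75th percentiles.
    """
    if skewed:
        # statistics.quantiles with n=4 returns the three quartile cut-points
        q1, q2, q3 = statistics.quantiles(values, n=4)
        return {"median": q2, "p25": q1, "p75": q3}
    return {"mean": statistics.mean(values), "sd": statistics.stdev(values)}

# Hypothetical ages, reported separately by group as the text advises
cases = [54, 61, 58, 66, 70, 49, 63]
controls = [50, 57, 55, 62, 68, 47, 60]
for label, group in [("cases", cases), ("controls", controls)]:
    print(label, describe(group))
```

Variables with a small number of ordered categories would instead be tabulated as counts and proportions per category, as noted above.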
Inferential measures such as standard errors and confidence intervals should not be used to describe the variability of characteristics, and significance tests should be avoided in descriptive tables. Also, P values are not an appropriate criterion for selecting which confounders to adjust for in analysis; even small differences in a confounder that has a strong effect on the outcome can be important.
In cohort studies, it may be useful to document how an exposure relates to other characteristics and potential confounders. Authors could present this information in a table with columns for participants in two or more exposure categories, which permits readers to judge the differences in confounders between these categories.
In case-control studies potential confounders cannot be judged by comparing cases and controls. Control persons represent the source population and will usually be different from the cases in many respects. For example, in a study of oral contraceptives and myocardial infarction, a sample of young women with infarction more often had risk factors for that disease, such as high serum cholesterol, smoking and a positive family history, than the control group. This does not influence the assessment of the effect of oral contraceptives, as long as the prescription of oral contraceptives was not guided by the presence of these risk factors—e.g., because the risk factors were only established after the event (see also Box 5). In case-control studies the equivalent of comparing exposed and non-exposed for the presence of potential confounders (as is done in cohorts) can be achieved by exploring the source population of the cases: if the control group is large enough and represents the source population, exposed and unexposed controls can be compared for potential confounders [121,147].
Characteristics of the Study Base at Enrolment, Castellana G (Italy), 1985–1986
14b Descriptive data
Indicate number of participants with missing data for each variable of interest.
As missing data may bias or affect generalisability of results, authors should tell readers amounts of missing data for exposures, potential confounders, and other important characteristics of patients (see also item 12c and Box 6). In a cohort study, authors should report the extent of loss to follow-up (with reasons), since incomplete follow-up may bias findings (see also items 12d and 13). We advise authors to use their tables and figures to enumerate amounts of missing data.
Table. Symptom End Points Used in Survival Analysis
15. Outcome data
Report numbers in each exposure category, or summary measures of exposure. Give information separately for cases and controls.
Before addressing the possible association between exposures (risk factors) and outcomes, authors should report relevant descriptive data. It may be possible and meaningful to present measures of association in the same table that presents the descriptive data (see item 14a). In a cohort study with events as outcomes, report the numbers of events for each outcome of interest. Consider reporting the event rate per person-year of follow-up. If the risk of an event changes over follow-up time, present the numbers and rates of events in appropriate intervals of follow-up or as a Kaplan-Meier life table or plot. It might be preferable to show plots as cumulative incidence that go up from 0% rather than down from 100%, especially if the event rate is lower than, say, 30%. Consider presenting such information separately for participants in different exposure categories of interest. If a cohort study is investigating other time-related outcomes (e.g., quantitative disease markers such as blood pressure), present appropriate summary measures (e.g., means and standard deviations) over time, perhaps in a table or figure.
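A crude event rate per person-year, and the rate ratio between exposure groups, can be computed directly; the sketch below uses invented numbers purely for illustration:

```python
def rate_per_person_year(events, person_years):
    """Crude incidence rate: events divided by total person-time at risk."""
    return events / person_years

# Hypothetical cohort data: events and person-years by exposure group
exposed_rate = rate_per_person_year(18, 600)
unexposed_rate = rate_per_person_year(6, 600)

print(f"exposed:   {exposed_rate * 1000:.0f} per 1,000 person-years")
print(f"unexposed: {unexposed_rate * 1000:.0f} per 1,000 person-years")
print(f"rate ratio: {exposed_rate / unexposed_rate:.1f}")
```

If the event rate changes over follow-up, such crude rates would be presented within appropriate follow-up intervals rather than pooled, as the text advises.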
For cross-sectional studies, we recommend presenting the same type of information on prevalent outcome events or summary measures. For case-control studies, the focus will be on reporting exposures separately for cases and controls as frequencies or quantitative summaries. For all designs, it may be helpful also to tabulate continuous outcomes or exposures in categories, even if the data are not analysed as such.
Table. Exposure among Liver Cirrhosis Cases and Controls https://doi.org/10.1371/journal.pmed.0040297.t006.
16a Main results
Give unadjusted estimates and, if applicable, confounder-adjusted estimates and their precision (eg, 95% confidence interval). Make clear which confounders were adjusted for and why they were included.
In many situations, authors may present the results of unadjusted or minimally adjusted analyses and those from fully adjusted analyses. We advise giving the unadjusted analyses together with the main data, for example the number of cases and controls that were exposed or not. This allows the reader to understand the data behind the measures of association (see also item 15). For adjusted analyses, report the number of persons in the analysis, as this number may differ because of missing values in covariates (see also item 12c). Estimates should be given with confidence intervals.
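For example, an unadjusted odds ratio and its precision can be derived directly from the exposed/unexposed counts for cases and controls; the sketch below uses a Woolf (log-based) confidence interval and hypothetical counts, not data from any study cited here:

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio with a Woolf (log-based) 95% confidence interval.

    a = exposed cases,    b = unexposed cases,
    c = exposed controls, d = unexposed controls.
    """
    or_ = (a * d) / (b * c)
    # Standard error of log(OR): sqrt of summed reciprocal cell counts
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_) - z * se_log_or)
    upper = math.exp(math.log(or_) + z * se_log_or)
    return or_, lower, upper

# Hypothetical 2x2 table: 30/70 cases exposed vs 20/80 controls exposed
or_, lower, upper = odds_ratio_ci(30, 70, 20, 80)
print(f"OR {or_:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Reporting this crude estimate alongside the underlying counts, and then the adjusted estimate with its own interval, lets readers see how much adjustment changed the association.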
Readers can compare unadjusted measures of association with those adjusted for potential confounders and judge by how much, and in what direction, they changed. Readers may think that ‘adjusted' results equal the causal part of the measure of association, but adjusted results are not necessarily free of random sampling error, selection bias, information bias, or residual confounding (see Box 5). Thus, great care should be exercised when interpreting adjusted results, as the validity of results often depends crucially on complete knowledge of important confounders, their precise measurement, and appropriate specification in the statistical model (see also item 20) [157,158].
Authors should explain all potential confounders considered, and the criteria for excluding or including variables in statistical models. Decisions about excluding or including variables should be guided by knowledge, or explicit assumptions, on causal relations. Inappropriate decisions may introduce bias, for example by including variables that are in the causal pathway between exposure and disease (unless the aim is to assess how much of the effect is carried by the intermediary variable). If the decision to include a variable in the model was based on the change in the estimate, it is important to report what change was considered sufficiently important to justify its inclusion. If a ‘backward deletion' or ‘forward inclusion' strategy was used to select confounders, explain that process and give the significance level for rejecting the null hypothesis of no confounding. Of note, we and others do not advise selecting confounders based solely on statistical significance testing [147,159,160].
Recent studies of the quality of reporting of epidemiological studies found that confidence intervals were reported in most articles. However, few authors explained their choice of confounding variables [4,5].
Table. Relative Rates of Rehospitalisation by Treatment in Patients in Community Care after First Hospitalisation due to Schizophrenia and Schizoaffective Disorder https://doi.org/10.1371/journal.pmed.0040297.t008.
16b Main results
Report category boundaries when continuous variables were categorized.
Categorizing continuous data has several important implications for analysis (see Box 4) and also affects the presentation of results. In tables, outcomes should be given for each exposure category, for example as counts of persons at risk, person-time at risk, if relevant separately for each group (e.g., cases and controls). Details of the categories used may aid comparison of studies and meta-analysis. If data were grouped using conventional cut-points, such as body mass index thresholds, group boundaries (i.e., range of values) can be derived easily, except for the highest and lowest categories. If quantile-derived categories are used, the category boundaries cannot be inferred from the data. As a minimum, authors should report the category boundaries; it is helpful also to report the range of the data and the mean or median values within categories.
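A brief illustration with hypothetical measurements: quantile-derived cut-points can be computed and reported alongside the data range, since readers cannot recover them from grouped results alone:

```python
import statistics

def quantile_cutpoints(values, n=4):
    """Cut-points splitting a continuous exposure into n quantile groups.

    These boundaries should be reported explicitly, ideally together
    with the overall range and the median within each category.
    """
    return statistics.quantiles(values, n=n)

# Hypothetical exposure measurements grouped into quartiles
exposure = [0.4, 0.7, 0.9, 1.1, 1.3, 1.6, 2.0, 2.8]
cuts = quantile_cutpoints(exposure)
print("quartile cut-points:", [round(c, 2) for c in cuts])
print("overall range:", (min(exposure), max(exposure)))
```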
Table. Polychlorinated Biphenyls in Cord Serum
16c Main results
If relevant, consider translating estimates of relative risk into absolute risk for a meaningful time period.
The results from studies examining the association between an exposure and a disease are commonly reported in relative terms, as ratios of risks, rates or odds (see Box 8). Relative measures capture the strength of the association between an exposure and disease. If the relative risk is a long way from 1 it is less likely that the association is due to confounding [164,165]. Relative effects or associations tend to be more consistent across studies and populations than absolute measures, but what often tends to be the case may be irrelevant in a particular instance. For example, similar relative risks were obtained for the classic cardiovascular risk factors for men living in Northern Ireland, France, the USA and Germany, despite the fact that the underlying risk of coronary heart disease varies substantially between these countries [166,167]. In contrast, in a study of hypertension as a risk factor for cardiovascular disease mortality, the data were more compatible with a constant rate difference than with a constant rate ratio.
Widely used statistical models, including logistic and proportional hazards (Cox) regression, are based on ratio measures. In these models, only departures from constancy of ratio effect measures are easily discerned. Nevertheless, measures which assess departures from additivity of risk differences, such as the Relative Excess Risk from Interaction (RERI, see item 12b and Box 8), can be estimated in models based on ratio measures.
In many circumstances, the absolute risk associated with an exposure is of greater interest than the relative risk. For example, if the focus is on adverse effects of a drug, one will want to know the number of additional cases per unit time of use (e.g., days, weeks, or years). The example gives the additional number of breast cancer cases per 1000 women who used hormone-replacement therapy for 10 years. Measures such as the attributable risk or population attributable fraction may be useful to gauge how much disease can be prevented if the exposure is eliminated. They should preferably be presented together with a measure of statistical uncertainty (e.g., confidence intervals as in the example). Authors should be aware of the strong assumptions made in this context, including a causal relationship between a risk factor and disease (also see Box 7). Because of the semantic ambiguity and complexities involved, authors should report in detail what methods were used to calculate attributable risks, ideally giving the formulae used.
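One common way to link relative and absolute impact is Levin's formula for the population attributable fraction; the sketch below (hypothetical prevalence and relative risk) makes the calculation explicit, consistent with the advice above to report the formula used:

```python
def population_attributable_fraction(prevalence, relative_risk):
    """Levin's formula: PAF = p(RR - 1) / (1 + p(RR - 1)),
    where p is the prevalence of exposure in the population.

    Interpreting the PAF as preventable disease assumes the
    exposure-disease association is causal.
    """
    excess = prevalence * (relative_risk - 1)
    return excess / (1 + excess)

# Hypothetical: 30% of the population exposed, relative risk of 2.0
paf = population_attributable_fraction(0.30, 2.0)
print(f"PAF = {paf:.1%}")
```

A point estimate alone is not enough; as noted above, a confidence interval around the PAF should accompany it in a report.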
A recent survey of abstracts of 222 articles published in leading medical journals found that absolute risks were given in 62% of abstracts of randomised trials that included a ratio measure, but in only 21% of abstracts of cohort studies. A free text search of Medline 1966 to 1997 showed that 619 items mentioned attributable risks in the title or abstract, compared to 18,955 using relative risk or odds ratio, for a ratio of 1 to 31.
10 years' use of HRT [hormone replacement therapy] is estimated to result in five (95% CI 3–7) additional breast cancers per 1000 users of oestrogen-only preparations and 19 (15–23) additional cancers per 1000 users of oestrogen-progestagen combinations.
17. Other analyses
Report other analyses done—e.g., analyses of subgroups and interactions, and sensitivity analyses.
In addition to the main analysis other analyses are often done in observational studies. They may address specific subgroups, the potential interaction between risk factors, the calculation of attributable risks, or use alternative definitions of study variables in sensitivity analyses.
There is debate about the dangers associated with subgroup analyses, and multiplicity of analyses in general [4,104]. In our opinion, there is too great a tendency to look for evidence of subgroup-specific associations, or effect-measure modification, when overall results appear to suggest little or no effect. On the other hand, there is value in exploring whether an overall association appears consistent across several, preferably pre-specified, subgroups, especially when a study is large enough to have sufficient data in each subgroup. A second area of debate is about interesting subgroups that arose during the data analysis. They might be important findings, but might also arise by chance. Some argue that it is neither possible nor necessary to inform the reader about all subgroup analyses done, as future analyses of other data will tell to what extent the early exciting findings stand the test of time. We advise authors to report which analyses were planned, and which were not (see also items 4, 12b and 20). This will allow readers to judge the implications of multiplicity, taking into account the study's position on the continuum from discovery to verification or refutation.
A third area of debate is how joint effects and interactions between risk factors should be evaluated: on additive or multiplicative scales, or should the scale be determined by the statistical model that fits best (see also item 12b and Box 8)? A sensible approach is to report the separate effect of each exposure as well as the joint effect—if possible in a table, as in the first example above, or in the study by Martinelli et al. Such a table gives the reader sufficient information to evaluate additive as well as multiplicative interaction (how these calculations are done is shown in Box 8). Confidence intervals for separate and joint effects may help the reader to judge the strength of the data. In addition, confidence intervals around measures of interaction, such as the Relative Excess Risk from Interaction (RERI), relate to tests of interaction or homogeneity tests. One recurrent problem is that authors use comparisons of P-values across subgroups, which lead to erroneous claims about an effect modifier. For instance, a statistically significant association in one category (e.g., men), but not in the other (e.g., women), does not in itself provide evidence of effect modification. Similarly, the confidence intervals for each point estimate are sometimes inappropriately used to infer that there is no interaction when intervals overlap. A more valid inference is achieved by directly evaluating whether the magnitude of an association differs across subgroups.
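The additive-scale check described above can be made concrete: given relative risks for each factor alone and for the joint exposure (all against a common reference group), the RERI is a simple arithmetic combination. A minimal sketch with invented relative risks:

```python
def reri(rr_both, rr_a, rr_b):
    """Relative Excess Risk due to Interaction:

        RERI = RR(A and B) - RR(A only) - RR(B only) + 1

    RERI = 0 means the two factors combine additively;
    RERI > 0 indicates super-additive (positive) interaction.
    """
    return rr_both - rr_a - rr_b + 1

# Hypothetical relative risks (reference: exposed to neither factor)
print("RERI =", reri(6.5, 2.0, 3.0))
```

In a report, the RERI would be accompanied by its confidence interval, which is what relates it to a formal test of interaction.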
Sensitivity analyses are helpful to investigate the influence of choices made in the statistical analysis, or to investigate the robustness of the findings to missing data or possible biases (see also item 12b). Judgement is needed regarding the level of reporting of such analyses. If many sensitivity analyses were performed, it may be impractical to present detailed findings for them all. It may sometimes be sufficient to report that sensitivity analyses were carried out and that they were consistent with the main results presented. Detailed presentation is more appropriate if the issue investigated is of major concern, or if effect estimates vary considerably [59,186].
Pocock and colleagues found that 43 out of 73 articles reporting observational studies contained subgroup analyses. The majority claimed differences across groups but only eight articles reported a formal evaluation of interaction (see item 12b).
Table. Analysis of Oral Contraceptive Use, Presence of Factor V Leiden Allele, and Risk for Venous Thromboembolism
Table. Sensitivity of the Rate Ratio for Cardiovascular Outcome to an Unmeasured Confounder https://doi.org/10.1371/journal.pmed.0040297.t010.
18. Key results
Summarise key results with reference to study objectives.
It is good practice to begin the discussion with a short summary of the main findings of the study. The short summary reminds readers of the main findings and may help them assess whether the subsequent interpretation and implications offered by the authors are supported by the findings.
We hypothesized that ethnic minority status would be associated with higher levels of cardiovascular disease (CVD) risk factors, but that the associations would be explained substantially by socioeconomic status (SES). Our hypothesis was not confirmed. After adjustment for age and SES, highly significant differences in body mass index, blood pressure, diabetes, and physical inactivity remained between white women and both black and Mexican American women. In addition, we found large differences in CVD risk factors by SES, a finding that illustrates the high-risk status of both ethnic minority women as well as white women with low SES.
19. Limitations
Discuss limitations of the study, taking into account sources of potential bias or imprecision. Discuss both direction and magnitude of any potential bias.
The identification and discussion of the limitations of a study are an essential part of scientific reporting. It is important not only to identify the sources of bias and confounding that could have affected results, but also to discuss the relative importance of different biases, including the likely direction and magnitude of any potential bias (see also item 9 and Box 3).
Authors should also discuss any imprecision of the results. Imprecision may arise in connection with several aspects of a study, including the study size (item 10) and the measurement of exposures, confounders and outcomes (item 8). The inability to precisely measure true values of an exposure tends to result in bias towards unity: the less precisely a risk factor is measured, the greater the bias. This effect has been described as ‘attenuation' [201,202], or more recently as ‘regression dilution bias'. However, when correlated risk factors are measured with different degrees of imprecision, the adjusted relative risk associated with them can be biased towards or away from unity [204–206].
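For the single error-prone exposure case, the attenuation mechanism can be written down directly; the sketch below (hypothetical variances) shows how the reliability ratio scales the expected slope. As noted above, with several correlated mismeasured variables the bias can go in either direction, so this simple scaling does not apply there:

```python
def attenuated_slope(true_slope, var_true, var_error):
    """Expected observed slope under classical measurement error.

    The slope is attenuated by the reliability ratio
    lambda = var_true / (var_true + var_error),
    the mechanism behind 'regression dilution bias'.
    """
    reliability = var_true / (var_true + var_error)
    return reliability * true_slope

# Hypothetical: error variance equal to the true exposure variance
# gives reliability 0.5, halving the observed association.
print(attenuated_slope(0.8, 1.0, 1.0))
```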
When discussing limitations, authors may compare the study being presented with other studies in the literature in terms of validity, generalisability and precision. In this approach, each study can be viewed as a contribution to the literature, not as a stand-alone basis for inference and action. Surprisingly, the discussion of important limitations of a study is sometimes omitted from published reports. A survey of authors who had published original research articles in The Lancet found that important weaknesses of the study were reported by the investigators in the survey questionnaires, but not in the published article.
Since the prevalence of counseling increases with increasing levels of obesity, our estimates may overestimate the true prevalence. Telephone surveys also may overestimate the true prevalence of counseling. Although persons without telephones have similar levels of overweight as persons with telephones, persons without telephones tend to be less educated, a factor associated with lower levels of counseling in our study. Also, of concern is the potential bias caused by those who refused to participate as well as those who refused to respond to questions about weight. Furthermore, because data were collected cross-sectionally, we cannot infer that counseling preceded a patient's attempt to lose weight.
20. Interpretation
Give a cautious overall interpretation considering objectives, limitations, multiplicity of analyses, results from similar studies, and other relevant evidence.
The heart of the discussion section is the interpretation of a study's results. Over-interpretation is common and human: even when we try hard to give an objective assessment, reviewers often rightly point out that we went too far in some respects. When interpreting results, authors should consider the nature of the study on the discovery to verification continuum and potential sources of bias, including loss to follow-up and non-participation (see also items 9, 12 and 19). Due consideration should be given to confounding (item 16a), the results of relevant sensitivity analyses, and to the issue of multiplicity and subgroup analyses (item 17). Authors should also consider residual confounding due to unmeasured variables or imprecise measurement of confounders. For example, socioeconomic status (SES) is associated with many health outcomes and often differs between groups being compared. Variables used to measure SES (income, education, or occupation) are surrogates for other undefined and unmeasured exposures, and the true confounder will by definition be measured with error. Authors should address the real range of uncertainty in estimates, which is larger than the statistical uncertainty reflected in confidence intervals. The latter do not take into account other uncertainties that arise from a study's design, implementation, and methods of measurement.
To guide thinking and conclusions about causality, some may find criteria proposed by Bradford Hill in 1965 helpful. How strong is the association with the exposure? Did it precede the onset of disease? Is the association consistently observed in different studies and settings? Is there supporting evidence from experimental studies, including laboratory and animal studies? How specific is the exposure's putative effect, and is there a dose-response relationship? Is the association biologically plausible? These criteria should not, however, be applied mechanically. For example, some have argued that relative risks below 2 or 3 should be ignored [210,211]. This is a reversal of the point by Cornfield et al. about the strength of large relative risks (see item 12b). Although a causal effect is more likely with a relative risk of 9, it does not follow that one below 3 is necessarily spurious. For instance, the small increase in the risk of childhood leukaemia after intrauterine irradiation is credible because it concerns an adverse effect of a medical procedure for which no alternative explanations are obvious. Moreover, the carcinogenic effects of radiation are well established. The doubling in the risk of ovarian cancer associated with eating 2 to 4 eggs per week is not immediately credible, since dietary habits are associated with a large number of lifestyle factors as well as SES. In contrast, the credibility of much debated epidemiologic findings of a difference in thrombosis risk between different types of oral contraceptives was greatly enhanced by the differences in coagulation found in a randomised cross-over trial. A discussion of the existing external evidence, from different types of studies, should always be included, but may be particularly important for studies reporting small increases in risk.
Further, authors should put their results in context with similar studies and explain how the new study affects the existing body of evidence, ideally by referring to a systematic review.
Any explanation for an association between death from myocardial infarction and use of second generation oral contraceptives must be conjectural. There is no published evidence to suggest a direct biologic mechanism, and there are no other epidemiologic studies with relevant results. (…) The increase in absolute risk is very small and probably applies predominantly to smokers. Due to the lack of corroborative evidence, and because the analysis is based on relatively small numbers, more evidence on the subject is needed. We would not recommend any change in prescribing practice on the strength of these results.
21. Generalisability
Discuss the generalisability (external validity) of the study results.
Generalisability, also called external validity or applicability, is the extent to which the results of a study can be applied to other circumstances. There is no external validity per se; the term is meaningful only with regard to clearly specified conditions. Can results be applied to an individual, groups or populations that differ from those enrolled in the study with regard to age, sex, ethnicity, severity of disease, and co-morbid conditions? Are the nature and level of exposures comparable, and the definitions of outcomes relevant to another setting or population? Are data that were collected in longitudinal studies many years ago still relevant today? Are results from health services research in one country applicable to health systems in other countries?
The question of whether the results of a study have external validity is often a matter of judgment that depends on the study setting, the characteristics of the participants, the exposures examined, and the outcomes assessed. Thus, it is crucial that authors provide readers with adequate information about the setting and locations, eligibility criteria, the exposures and how they were measured, the definition of outcomes, and the period of recruitment and follow-up. The degree of non-participation and the proportion of unexposed participants in whom the outcome develops are also relevant. Knowledge of the absolute risk and prevalence of the exposure, which will often vary across populations, is helpful when applying results to other settings and populations (see Box 7).
How applicable are our estimates to other HIV-1-infected patients? This is an important question because the accuracy of prognostic models tends to be lower when applied to data other than those used to develop them. We addressed this issue by penalising model complexity, and by choosing models that generalised best to cohorts omitted from the estimation procedure. Our database included patients from many countries from Europe and North America, who were treated in different settings. The range of patients was broad: men and women, from teenagers to elderly people were included, and the major exposure categories were well represented. The severity of immunodeficiency at baseline ranged from not measurable to very severe, and viral load from undetectable to extremely high.
Other Information
22. Funding
Give the source of funding and the role of the funders for the present study and, if applicable, for the original study on which the present article is based.
Some journals require authors to disclose the presence or absence of financial and other conflicts of interest [100,218]. Several investigations show strong associations between the source of funding and the conclusions of research articles [219–222]. Randomised trials funded by for-profit organisations were much more likely (odds ratio 5.3) to recommend the experimental drug as the drug of choice, even after adjustment for the effect size. Other studies document the influence of the tobacco and telecommunication industries on the research they funded [224–227]. There are also examples of undue influence when the sponsor is governmental or a non-profit organisation.
Authors or funders may have conflicts of interest that influence any of the following: the design of the study; the choice of exposures [228,229], outcomes, and statistical methods; and the selective publication of outcomes and studies. Consequently, the role of the funders should be described in detail: in which parts of the study they took direct responsibility (e.g., design, data collection, analysis, drafting of manuscript, decision to publish). Other sources of undue influence include employers (e.g., university administrators for academic researchers and government supervisors, especially political appointees, for government researchers), advisory committees, litigants, and special interest groups.
To acknowledge this checklist in your methods, please state "We used the STROBE case-control checklist when writing our report [citation]". Then cite this checklist as von Elm E, Altman DG, Egger M, Pocock SJ, Gotzsche PC, Vandenbroucke JP. The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) Statement: guidelines for reporting observational studies.
The STROBE checklist is distributed under the terms of the Creative Commons Attribution License CC-BY
In case-control studies, investigators compare exposures between people with a particular disease outcome (cases) and people without that outcome (controls). Investigators aim to collect cases and controls that are representative of an underlying cohort or a cross-section of a population. That population can be defined geographically, but also more loosely as the catchment area of health care facilities. The case sample may be 100% or a large fraction of available cases, while the control sample usually is only a small fraction of the people who do not have the pertinent outcome. Controls represent the cohort or population of people from which the cases arose. Investigators calculate the ratio of the odds of exposures to putative causes of the disease among cases and controls (see Box 7). Depending on the sampling strategy for cases and controls and the nature of the population studied, the odds ratio obtained in a case-control study is interpreted as the risk ratio, rate ratio or (prevalence) odds ratio [16,17]. The majority of published case-control studies sample open cohorts and so allow direct estimations of rate ratios.
A classic case-control study is Smoking and Carcinoma of the Lung by Richard Doll and A. Bradford Hill, which suggested a link between smoking and lung cancer.
Biochem Med (Zagreb), v.24(2), June 2014
Observational and interventional study design types; an overview
The appropriate choice of study design is essential for the successful execution of biomedical and public health research. There are many study designs to choose from within two broad categories of observational and interventional studies. Each design has its own strengths and weaknesses, and understanding these limitations is necessary to arrive at correct study conclusions.
Observational study designs, also called epidemiologic study designs, are often retrospective and are used to assess potential causation in exposure-outcome relationships and therefore influence preventive methods. Observational study designs include ecological, cross-sectional, case-control, case-crossover, and retrospective and prospective cohort designs. An important subset of observational studies is diagnostic study designs, which evaluate the accuracy of diagnostic procedures and tests as compared to other diagnostic measures. These include diagnostic accuracy designs, diagnostic cohort designs, and diagnostic randomized controlled trials.
Interventional studies are often prospective and are specifically tailored to evaluate direct impacts of treatment or preventive measures on disease. Each study design has specific outcome measures that rely on the type and quality of data utilized. Additionally, each study design has potential limitations that vary in severity and need to be addressed in the design phase of the study. This manuscript is meant to provide an overview of study design types and the strengths and weaknesses of common observational and interventional study designs.
Study design plays an important role in the quality, execution, and interpretation of biomedical and public health research ( 1 – 12 ). Each study design has its own inherent strengths and weaknesses, and while a general hierarchy of study designs can be described, no hierarchy can be applied uniformly across study design types ( 3 , 5 , 6 , 9 ). Epidemiological and interventional research studies include three elements: 1) definition and measurement of exposure in two or more groups, 2) measurement of health outcome(s) in these same groups, and 3) statistical comparison between groups to assess potential relationships between the exposure and outcome, all of which are defined by the researcher ( 1 – 4 , 8 , 13 ). The measure of exposure may be tobacco use (“Yes” vs . “No”) to define the two groups in an epidemiologic study, or the treatment (active drug vs . placebo) in an interventional study. Health outcome(s) can be the development of a disease or symptom (e.g. lung cancer) or the curing of a disease or symptom (e.g. reduction of pain). Descriptive studies, which are neither epidemiological nor interventional, lack one or more of these elements and have limited application. High quality epidemiological and interventional studies contain detailed information on the design, execution, and interpretation of results, with methodology clearly written and reproducible by other researchers.
Research is generally classified as primary or secondary. Primary research relies upon data gathered from original research expressly for that purpose ( 1 , 3 , 5 ). Secondary research draws on one or more existing data sources that were not collected for that specific research purpose ( 14 , 15 ); it includes meta-analyses and best practice guidelines for treatments. This paper focuses on the study designs of primary research and their strengths, weaknesses, and common statistical outcomes.
The choice of a study design hinges on many factors, including prior research, availability of study participants, funding, and time constraints. One common decision point is the desire to suggest causation. The most commonly used causation criteria are those proposed by Hill ( 16 ). Of these, demonstrating temporality is the only mandatory criterion for suggesting causation. Therefore, prospective studies that follow study participants forward through time, including prospective cohort studies and interventional studies, are best suited for suggesting causation. Causal conclusions cannot be proven from an observational study. Additionally, causation between an exposure and an outcome cannot be proven by one study alone; multiple studies across different populations should be considered when making causation assessments ( 17 ).
Primary research has been categorized in different ways. Common categorization schema include temporal nature of the study design (retrospective or prospective), usability of the study results (basic or applied), investigative purpose (descriptive or analytical), purpose (prevention, diagnosis or treatment), or role of the investigator (observational or interventional). This manuscript categorizes study designs by observational and interventional criteria, however, other categorization methods are described as well.
Observational and interventional studies
Within primary research there are observational studies and interventional studies. Observational studies, also called epidemiological studies, are those where the investigator is not acting upon study participants, but instead observing natural relationships between factors and outcomes. Diagnostic studies are classified as observational studies, but are a unique category and will be discussed independently. Interventional studies, also called experimental studies, are those where the researcher intercedes as part of the study design.

Additionally, study designs may be classified by the role that time plays in the data collection, either retrospective or prospective. Retrospective studies are those where data are collected from the past, either through records created at that time or by asking participants to remember their exposures or outcomes. Retrospective studies cannot demonstrate temporality as easily and are more prone to different biases, particularly recall bias. Prospective studies follow participants forward through time, collecting data in the process. Prospective studies are less prone to some types of bias and can more easily demonstrate that the exposure preceded the disease, thereby more strongly suggesting causation.

Table 1 describes the broad categories of observational studies: the disease measures applicable to each, the appropriate measures of risk, and the temporality of each study design. Epidemiologic measures include point prevalence, the proportion of participants with disease at a given point in time; period prevalence, the proportion of participants with disease within a specified time frame; and incidence, the accumulation of new cases over time. Measures of risk fall into two groups: those that only demonstrate an association, such as the odds ratio (and some other measures), and those that demonstrate temporality and therefore suggest causation, such as the hazard ratio.
Table 2 outlines the strengths and weaknesses of each observational study design.
Table 1. Observational study design measures of disease, measures of risk, and temporality.
Table 2. Observational study design strengths and weaknesses.
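The basic epidemiologic measures named above (point prevalence, period prevalence, and cumulative incidence) can be sketched in a few lines. This is an illustrative example only; all counts are invented for the sake of demonstration.

```python
# Sketch of the basic epidemiologic measures described above.
# All counts are hypothetical, for illustration only.

def point_prevalence(cases_now, population):
    """Proportion of the population diseased at one point in time."""
    return cases_now / population

def period_prevalence(cases_in_period, population):
    """Proportion diseased at any time during a specified period."""
    return cases_in_period / population

def cumulative_incidence(new_cases, population_at_risk):
    """New cases accumulating over follow-up, among those initially at risk."""
    return new_cases / population_at_risk

# Example: 50 existing cases among 1000 people -> point prevalence 5%;
# 20 new cases among the 950 initially disease-free -> incidence ~2.1%.
print(point_prevalence(50, 1000))     # 0.05
print(cumulative_incidence(20, 950))  # ≈ 0.021
```

Note that prevalence counts existing disease at a moment (or over a window), while incidence counts only new cases among those initially at risk; the denominators therefore differ.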
Ecological study design
The most basic observational study is an ecological study. This study design compares clusters of people, usually grouped based on their geographical location or temporal associations ( 1 , 2 , 6 , 9 ). Ecological studies assign one exposure level for each distinct group and can provide a rough estimation of prevalence of disease within a population. Ecological studies are generally retrospective. An example of an ecological study is the comparison of the prevalence of obesity in the United States and France. The geographic area is considered the exposure and the outcome is obesity. There are inherent potential weaknesses with this approach, including loss of data resolution and potential misclassification ( 10 , 11 , 13 , 18 , 19 ). This type of study design also has additional weaknesses. Typically these studies derive their data from large databases that are created for purposes other than research, which may introduce error or misclassification ( 10 , 11 ). Quantification of both the number of cases and the total population can be difficult, leading to error or bias. Lastly, due to the limited amount of data available, it is difficult to control for other factors that may mask or falsely suggest a relationship between the exposure and the outcome. However, ecological studies are generally very cost effective and are a starting point for hypothesis generation.
Proportional mortality ratio study design
Proportional mortality ratio (PMR) studies utilize the well-defined, well-recorded outcome of death and the records subsequently maintained regarding the decedent ( 1 , 6 , 8 , 20 ). By using records, this study design is able to identify potential relationships between exposures, such as geographic location, occupation, or age, and cause of death. The epidemiological outcomes of this study design are the proportional mortality ratio and the standardized mortality ratio. In general these are the ratio of the proportion of cause-specific deaths out of all deaths between exposure categories ( 20 ). As an example, these studies can address questions about the higher proportion of cardiovascular deaths among different ethnic and racial groups ( 21 ). A significant drawback of the PMR study design is that these studies are limited to death as an outcome ( 3 , 5 , 22 ). Additionally, the reliance on death records makes it difficult to control for individual confounding factors, variables that either conceal or falsely demonstrate associations between the exposure and outcome. An example of a confounder is tobacco use confounding the relationship between coffee intake and cardiovascular disease. Historically, people often smoked and drank coffee while on coffee breaks. If researchers ignored smoking, they would inaccurately find a strong relationship between coffee use and cardiovascular disease, when some of the risk is actually due to smoking. There are also concerns regarding the accuracy of death certificate data. Strengths of the study design include the well-defined outcome of death, the relative ease and low cost of obtaining data, and the uniformity of collection of these data across different geographical areas.
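The PMR described above is the proportion of cause-specific deaths among all deaths in an exposed group, divided by the same proportion in a reference group. A minimal sketch, with invented counts:

```python
# Hypothetical illustration of a proportional mortality ratio (PMR).
# Counts are invented; this is not data from any real study.

def proportional_mortality_ratio(cause_deaths_exp, all_deaths_exp,
                                 cause_deaths_ref, all_deaths_ref):
    """Proportion of cause-specific deaths in the exposed group,
    divided by the corresponding proportion in the reference group."""
    return (cause_deaths_exp / all_deaths_exp) / (cause_deaths_ref / all_deaths_ref)

# Example: cardiovascular deaths are 25% of all deaths in the exposed
# group but 12.5% in the reference group -> PMR = 2.0.
print(proportional_mortality_ratio(250, 1000, 125, 1000))  # 2.0
```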
Cross-sectional study design
Cross-sectional studies are also called prevalence studies because one of the main measures available is study population prevalence ( 1 – 12 ). These studies consist of assessing a population, as represented by the study sample, at a single point in time. A common cross-sectional study type is the diagnostic accuracy study, which is discussed later. Cross-sectional study samples are selected based on their exposure status, without regard for their outcome status. Outcome status is obtained after participants are enrolled. Ideally, a wider distribution of exposure will allow for a higher likelihood of finding an association between the exposure and outcome if one exists ( 1 – 3 , 5 , 8 ). Cross-sectional studies are retrospective in nature. An example of a cross-sectional study would be enrolling participants who are either current smokers or never smokers, and assessing whether or not they have respiratory deficiencies. Random sampling of the population being assessed is more important in cross-sectional studies than in other observational study designs; selection bias from non-random sampling may result in flawed measures of prevalence and calculations of risk. The study sample is assessed for both exposure and outcome at a single point in time. Because both are assessed at the same time, temporality cannot be demonstrated, i.e. it cannot be shown that the exposure preceded the disease ( 1 – 3 , 5 , 8 ). Point prevalence and period prevalence can be calculated in cross-sectional studies. Measures of risk for the exposure-outcome relationship that can be calculated in a cross-sectional study design are the odds ratio, prevalence odds ratio, prevalence ratio, and prevalence difference. Cross-sectional studies are relatively inexpensive and collect data at the individual level, which allows for more complete control of confounding. Additionally, cross-sectional studies allow for multiple outcomes to be assessed simultaneously.
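The cross-sectional measures of risk listed above can all be computed from a 2x2 table of exposure by outcome. A hedged sketch with hypothetical counts:

```python
# Cross-sectional measures of risk from a hypothetical 2x2 table.
# Counts are invented for illustration.
#
#                 disease   no disease
#   exposed          a           b
#   unexposed        c           d

def prevalence_ratio(a, b, c, d):
    """Prevalence among the exposed divided by prevalence among the unexposed."""
    return (a / (a + b)) / (c / (c + d))

def prevalence_odds_ratio(a, b, c, d):
    """Odds of disease among the exposed divided by odds among the unexposed."""
    return (a * d) / (b * c)

def prevalence_difference(a, b, c, d):
    """Absolute difference in prevalence between exposed and unexposed."""
    return a / (a + b) - c / (c + d)

# Example: 40/100 exposed and 20/100 unexposed participants have the disease.
a, b, c, d = 40, 60, 20, 80
print(prevalence_ratio(a, b, c, d))       # 2.0
print(prevalence_odds_ratio(a, b, c, d))  # ≈ 2.67
print(prevalence_difference(a, b, c, d))  # 0.2
```

Because exposure and outcome are measured simultaneously, none of these measures establishes that the exposure preceded the disease; they quantify association only.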
Case-control study design
Case-control studies were traditionally referred to as retrospective studies, due to the nature of the study design and execution ( 1 – 12 , 23 , 24 ). In this study design, researchers identify study participants based on their case status, i.e. diseased or not diseased. Quantifying the number of exposed individuals among the cases and the controls allows statistical associations between exposure and outcome to be established ( 1 – 3 , 5 , 8 ). An example of a case-control study is analysing the relationship between obesity and knee replacement surgery. Cases are participants who have had knee replacement surgery, controls are a random sample of those who have not, and the comparison is the odds of being obese among cases relative to controls. Matching on one or more potential confounders minimizes those factors as potential confounders in the exposure-outcome relationship ( 1 – 3 , 5 , 8 ). However, case-control studies are at increased risk for bias, particularly recall bias, due to the known case status of study participants ( 1 – 3 , 5 , 8 ). Other points of consideration that carry specific weight in case-control studies include the appropriate selection of controls that balances generalizability and minimizes bias, the minimization of survivor bias, and the potential for length time bias ( 25 ). The largest strength of case-control studies is that this design is the most efficient for rare diseases. Additional strengths include low cost, relatively fast execution compared to cohort studies, the ability to collect individual participant-specific data, the ability to control for multiple confounders, and the ability to assess multiple exposures of interest. The measure of risk calculated in case-control studies is the odds ratio: the odds of having the exposure among those with the disease relative to those without. Other measures of risk are not applicable to case-control studies.
Any measure of prevalence and associated measures, such as prevalence odds ratio, in a case-control study is artificial because the researcher arbitrarily sets the proportion of cases to non-cases in this study design. Temporality can be suggested, however, it is rarely definitively demonstrated because it is unknown if the development of the disease truly preceded the exposure. It should be noted that for certain outcomes, particularly death, the criteria for demonstrating temporality in that specific exposure-outcome relationship are met and the use of relative risk as a measure of risk may be justified.
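The case-control odds ratio is the classic ad/bc cross-product of the 2x2 table. A minimal sketch with invented counts (note that no prevalence can be read off this table, since the researcher fixes the case-to-control ratio):

```python
# The case-control odds ratio from a hypothetical 2x2 table.
# Counts are invented for illustration.
#
#                 cases   controls
#   exposed         a        b
#   unexposed       c        d

def odds_ratio(a, b, c, d):
    """Odds of exposure among cases divided by odds of exposure
    among controls: (a/c) / (b/d) = ad/bc."""
    return (a * d) / (b * c)

# Example: 30 of 100 cases were exposed vs. 10 of 100 controls.
print(odds_ratio(30, 10, 70, 90))  # (30*90)/(10*70) ≈ 3.86
```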
Case-crossover study design
A case-crossover study relies upon an individual to act as their own control for comparison purposes, thereby minimizing some potential confounders ( 1 , 5 , 12 ). This study design should not be confused with the crossover study design, which is an interventional study type and is described below. In case-crossover studies, cases are assessed for their exposure status immediately prior to the time they became a case, and then compared to their own exposure at a prior point where they did not become a case. The prior point for comparison is often chosen at random or relies upon a mean measure of exposure over time. Case-crossover studies are always retrospective. An example of a case-crossover study would be evaluating the exposure of talking on a cell phone and being involved in an automobile crash. Cases are drivers involved in a crash, and the comparison is the same driver at a random timeframe in which they were not involved in a crash; the exposure is cell phone use during both periods, before the crash and during the control period. These types of studies are particularly good for exposure-outcome relationships where the outcome is acute and well defined, e.g. electrocutions, lacerations, automobile crashes, etc. ( 1 , 5 ). Exposure-outcome relationships assessed using case-crossover designs should have health outcomes that do not have a subclinical or undiagnosed period prior to becoming a “case” in the study ( 12 ). Additionally, the reliance upon prior exposure time requires that the exposure not have an additive or cumulative effect over time ( 1 , 5 ). Case-crossover study designs are at higher risk of recall bias than other study designs ( 12 ): study participants are more likely to remember an exposure prior to becoming a case than prior to not becoming a case.
Retrospective and prospective cohort study design
Cohort studies involve identifying study participants based on their exposure status and either following them forward through time to identify which participants develop the outcome(s) of interest, or looking back at data that were created in the past, prior to the development of the outcome. Prospective cohort studies are considered the gold standard of observational research ( 1 – 3 , 5 , 8 , 10 , 11 ). These studies begin with a cross-sectional study to categorize exposure and identify cases at baseline. Disease-free participants are then followed and cases are measured as they develop. Retrospective cohort studies also begin with a cross-sectional study to categorize exposure and identify cases. Exposures are then measured based on records created at the time. Additionally, in an ideal retrospective cohort, case status is also tracked using historical data that were created at that point in time. Occupational groups, particularly those with regular surveillance or certifications, such as commercial truck drivers, are particularly well positioned for retrospective cohort studies because records of both exposure and outcome are created for commercial and regulatory purposes ( 8 ). These types of studies can demonstrate temporality and therefore identify true risk factors, rather than the merely associated factors identified by other study types.
Cohort studies are the only observational studies that can calculate incidence, both cumulative incidence and an incidence rate ( 1 , 3 , 5 , 6 , 10 , 11 ). Also, because the inception of a cohort study is identical to a cross-sectional study, both point prevalence and period prevalence can be calculated. There are many measures of risk that can be calculated from cohort study data. The measures available to cross-sectional studies (odds ratio, prevalence odds ratio, prevalence ratio, and prevalence difference) can be calculated in cohort studies as well. Measures of risk that leverage a cohort study’s ability to calculate incidence include the incidence rate ratio, relative risk, risk ratio, and hazard ratio. Because these measures demonstrate temporality, they are considered stronger measures for demonstrating causation and identifying risk factors.
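The cohort-specific measures above follow directly from the ability to count new cases over follow-up. A hedged sketch with hypothetical counts and person-years:

```python
# Cohort-only measures of risk, using invented counts for illustration.
# Exposed and unexposed groups are followed forward and new cases accrue.

def cumulative_incidence(new_cases, at_risk):
    """New cases over follow-up among those initially at risk."""
    return new_cases / at_risk

def relative_risk(cases_exp, n_exp, cases_unexp, n_unexp):
    """Cumulative incidence in the exposed divided by that in the unexposed."""
    return cumulative_incidence(cases_exp, n_exp) / cumulative_incidence(cases_unexp, n_unexp)

def incidence_rate_ratio(cases_exp, py_exp, cases_unexp, py_unexp):
    """Rates per person-time (here person-years, py) rather than per person."""
    return (cases_exp / py_exp) / (cases_unexp / py_unexp)

# Example: 50/400 exposed vs. 25/400 unexposed develop the disease.
print(relative_risk(50, 400, 25, 400))  # 2.0
```

Unlike the odds ratio, the relative risk compares actual risks, which is only meaningful when incidence can be measured, i.e. in cohort (or interventional) designs.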
Diagnostic testing and evaluation study designs
A specific study design is the diagnostic accuracy study, which is often used as part of the clinical decision making process. Diagnostic accuracy study designs are those that compare a new diagnostic method with the current “gold standard” diagnostic procedure in a cross-section of both diseased and healthy study participants. Gold standard diagnostic procedures are the current best practice for diagnosing a disease. An example is comparing a new rapid test for a cancer with the gold standard method of biopsy. There are many intricacies to diagnostic testing study designs that should be considered. The proper selection of the gold standard evaluation is important for defining the true measures of accuracy for the new diagnostic procedure. Evaluations of diagnostic test results should be blinded to the case status of the participant. Similar to the intention-to-treat concept discussed later for interventional studies, diagnostic tests have an analysis procedure called intention to diagnose (ITD), in which participants are analysed in the diagnostic category to which they were assigned, regardless of the process by which a diagnosis was obtained. Performing analyses according to an a priori defined protocol, called per protocol analyses (PP or PPA), is another potential strength of diagnostic testing studies. Many measures of the new diagnostic procedure, including accuracy, sensitivity, specificity, positive predictive value, negative predictive value, positive likelihood ratio, negative likelihood ratio, and diagnostic odds ratio, can be calculated. These measures allow for comparison with other diagnostic tests and aid the clinician in determining which test to utilize.
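All of the diagnostic accuracy measures just listed derive from a single 2x2 table of new-test results against the gold standard. A sketch with invented counts:

```python
# Diagnostic accuracy measures from a hypothetical 2x2 table of a new
# test vs. the gold standard. tp/fp/fn/tn counts are invented.

def diagnostic_measures(tp, fp, fn, tn):
    sens = tp / (tp + fn)  # sensitivity: diseased correctly flagged
    spec = tn / (tn + fp)  # specificity: healthy correctly cleared
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": tp / (tp + fp),                      # positive predictive value
        "npv": tn / (tn + fn),                      # negative predictive value
        "lr_positive": sens / (1 - spec),           # positive likelihood ratio
        "lr_negative": (1 - sens) / spec,           # negative likelihood ratio
        "diagnostic_or": (tp * tn) / (fp * fn),     # diagnostic odds ratio
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }

# Example: of 100 diseased participants the new test flags 90 (tp=90,
# fn=10); of 100 healthy participants it clears 80 (tn=80, fp=20).
m = diagnostic_measures(tp=90, fp=20, fn=10, tn=80)
print(m["sensitivity"], m["specificity"])  # 0.9 0.8
```

Note that sensitivity and specificity are properties of the test itself, while the predictive values also depend on disease prevalence in the sample; in a diagnostic accuracy study the cross-sectional sampling makes the predictive values interpretable for that population.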
Interventional study designs
Interventional study designs, also called experimental study designs, are those where the researcher intervenes at some point throughout the study. The most common and strongest interventional study design is a randomized controlled trial, however, there are other interventional study designs, including pre-post study design, non-randomized controlled trials, and quasi-experiments ( 1 , 5 , 13 ). Experimental studies are used to evaluate study questions related to either therapeutic agents or prevention. Therapeutic agents can include prophylactic agents, treatments, surgical approaches, or diagnostic tests. Prevention can include changes to protective equipment, engineering controls, management, policy or any element that should be evaluated as to a potential cause of disease or injury.
Pre-post study design
A pre-post study measures the occurrence of an outcome before and again after a particular intervention is implemented. A good example is comparing deaths from motor vehicle crashes before and after the enforcement of a seat-belt law. Pre-post studies may be single arm, with one group measured before and again after the intervention, or multiple arms, where there is a comparison between groups. Often there is an arm with no intervention, which acts as the control group in a multi-arm pre-post study. These studies have the strength of temporality to suggest that the outcome is impacted by the intervention; however, pre-post studies have no control over other elements that change at the same time the intervention is implemented. Therefore, changes in disease occurrence during the study period cannot be fully attributed to the specific intervention. Outcomes measured for pre-post intervention studies may be binary health outcomes, such as incidence or prevalence, or mean values of a continuous outcome, such as systolic blood pressure. The analytic methods of pre-post studies depend on the outcome being measured. If there are multiple treatment arms, the difference from beginning to end within each treatment arm is typically also analysed.
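For a two-arm pre-post study with a continuous outcome, the within-arm changes can be computed and then compared across arms (a difference-in-differences). A hedged sketch; all blood pressure values are hypothetical:

```python
# Two-arm pre-post analysis sketch: compute the mean change from
# baseline within each arm, then compare the changes across arms.
# All values are invented for illustration.

def mean(values):
    return sum(values) / len(values)

def pre_post_change(pre, post):
    """Mean change from baseline within one arm."""
    return mean(post) - mean(pre)

# Example: systolic blood pressure before and after an intervention.
intervention_pre, intervention_post = [150, 160, 155], [135, 145, 140]
control_pre, control_post = [150, 156, 150], [148, 154, 148]

delta_intervention = pre_post_change(intervention_pre, intervention_post)  # -15.0
delta_control = pre_post_change(control_pre, control_post)                 # -2.0
print(delta_intervention - delta_control)  # -13.0
```

The control arm's change (-2.0) stands in for everything else happening over the study period, which is exactly the confounding a single-arm pre-post design cannot separate from the intervention's effect.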
Non-randomized trial study design
Non-randomized trials are interventional study designs that compare a group where an intervention was performed with a group where there was no intervention. These are convenient study designs that are most often performed prospectively and can suggest possible relationships between the intervention and the outcome. However, these study designs are often subject to many types of bias and error and are not considered a strong study design.
Randomized controlled trial study design
Randomized controlled trials (RCTs) are the most common type of interventional study, and can have many modifications ( 26 – 28 ). These trials take a homogenous group of study participants and randomly divide them into two separate groups. If the randomization is successful then these two groups should be the same in all respects, both measured confounders and unmeasured factors. The intervention is then implemented in one group and not the other and comparisons of intervention efficacy between the two groups are analysed. Theoretically, the only difference between the two groups through the entire study is the intervention. An excellent example is the intervention of a new medication to treat a specific disease among a group of patients. This randomization process is arguably the largest strength of an RCT ( 26 – 28 ). Additional methodological elements are utilized among RCTs to further strengthen the causal implication of the intervention’s impact. These include allocation concealment, blinding, measuring compliance, controlling for co-interventions, measuring dropout, analysing results by intention to treat, and assessing each treatment arm at the same time point in the same manner.
Crossover randomized controlled trial study design
A crossover RCT is a type of interventional study design where study participants intentionally “cross over” to the other treatment arm. This should not be confused with the observational case-crossover design. A crossover RCT begins the same as a traditional RCT; however, after the end of the first treatment phase, each participant is re-allocated to the other treatment arm. There is often a wash-out period between treatment periods. This design has many strengths, including demonstrating reversibility, compensating for unsuccessful randomization, and improving study efficiency, since each participant contributes to both arms and less time is spent recruiting subjects.
Allocation concealment theoretically guarantees that the implementation of the randomization is free from bias. This is done by ensuring that the randomization scheme is concealed from all individuals involved ( 26 – 30 ). A third party who is not involved in the treatment or assessment of the trial creates the randomization schema and study participants are randomized according to that schema. By concealing the schema, there is a minimization of potential deviation from that randomization, either consciously or otherwise by the participant, researcher, provider, or assessor. The traditional method of allocation concealment relies upon sequentially numbered opaque envelopes with the treatment allocation inside. These envelopes are generated before the study begins using the selected randomization scheme. Participants are then allocated to the specific intervention arm in the pre-determined order dictated by the schema. If allocation concealment is not utilized, there is the possibility of selective enrolment into an intervention arm, potentially with the outcome of biased results.
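A randomization schema of the kind a third party might generate before sealing assignments into sequentially numbered opaque envelopes can be sketched as follows. This is an illustrative example, not a procedure from the paper; permuted-block randomization is one common scheme that keeps arm sizes balanced throughout enrolment:

```python
# Illustrative sketch of a pre-specified, permuted-block randomization
# schema. Arm names, block size, and seed are hypothetical choices.
import random

def block_randomization_schema(n_participants, block_size=4, seed=42):
    """Generate balanced permuted blocks; each block contains equal
    numbers of 'treatment' and 'control' assignments in random order."""
    assert block_size % 2 == 0, "block size must be even for a 1:1 ratio"
    rng = random.Random(seed)  # fixed seed so the schema is reproducible
    schema = []
    while len(schema) < n_participants:
        block = ["treatment"] * (block_size // 2) + ["control"] * (block_size // 2)
        rng.shuffle(block)
        schema.extend(block)
    return schema[:n_participants]

# The schema is created once, before enrolment, and then concealed;
# participants are allocated in the pre-determined envelope order.
schema = block_randomization_schema(20)
print(schema.count("treatment"), schema.count("control"))  # 10 10
```

Blocking guarantees near-equal arm sizes even if the trial stops early, while concealment of the block size and sequence prevents staff from predicting the next assignment.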
Blinding in an RCT is withholding the treatment arm from individuals involved in the study. This can be done through use of placebo pills, deactivated treatment modalities, or sham therapy. Sham therapy is a comparison procedure or treatment which is identical to the investigational intervention except it omits a key therapeutic element, thus rendering the treatment ineffective. An example is a sham cortisone injection, where saline solution of the same volume is injected instead of cortisone. This helps ensure that patients do not know if they are receiving the active or control treatment. The process of blinding is utilized to help ensure equal treatment of the different groups, therefore continuing to isolate the difference in outcome between groups to only the intervention being administered ( 28 – 31 ). Blinding within an RCT includes patient blinding, provider blinding, or assessor blinding. In some situations it is difficult or impossible to blind one or more of the parties involved, but an ideal study would have all parties blinded until the end of the study ( 26 – 28 , 31 , 32 ).
Compliance is the degree of how well study participants adhere to the prescribed intervention. Compliance or non-compliance to the intervention can have a significant impact on the results of the study ( 26 – 29 ). If there is a differentiation in the compliance between intervention arms, that differential can mask true differences, or erroneously conclude that there are differences between the groups when one does not exist. The measurement of compliance in studies addresses the potential for differences observed in intervention arms due to intervention adherence, and can allow for partial control of differences either through post hoc stratification or statistical adjustment.
Co-interventions, interventions that impact the outcome other than the primary intervention of the study, can also allow for erroneous conclusions in clinical trials ( 26 – 28 ). If there are differences between treatment arms in the amount or type of additional therapeutic elements then the study conclusions may be incorrect ( 29 ). For example, if a placebo treatment arm utilizes more over-the-counter medication than the experimental treatment arm, both treatment arms may have the same therapeutic improvement and show no effect of the experimental treatment. However, the placebo arm improvement is due to the over-the-counter medication and if that was prohibited, there may be a therapeutic difference between the two treatment arms. The exclusion or tracking and statistical adjustment of co-interventions serves to strengthen an RCT by minimizing this potential effect.
Participants drop out of a study for multiple reasons, but if there are differential dropout rates between intervention arms, or high overall dropout rates, the data may be biased and the study conclusions erroneous ( 26 – 28 ). A commonly accepted dropout threshold is 20%; however, even studies with dropout rates below 20% may reach erroneous conclusions ( 29 ). Common methods for minimizing dropout include incentivizing study participation or shortening study duration; however, these may in turn reduce generalizability or validity.
Intention-to-treat (ITT) analysis is a method of analysis that quantitatively addresses deviations from random allocation ( 26 – 28 ). This method analyses individuals based on their allocated intervention, regardless of whether or not that intervention was actually received due to protocol deviations, compliance concerns or subsequent withdrawal. By maintaining individuals in their allocated intervention for analyses, the benefits of randomization will be captured ( 18 , 26 – 29 ). If analysis of actual treatment is solely relied upon, then some of the theoretical benefits of randomization may be lost. This analysis method relies on complete data. There are different approaches regarding the handling of missing data and no consensus has been put forth in the literature. Common approaches are imputation or carrying forward the last observed data from individuals to address issues of missing data ( 18 , 19 ).
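An intention-to-treat analysis with last-observation-carried-forward imputation, one of the missing-data approaches mentioned above, can be sketched as follows. The data structure and all values are hypothetical:

```python
# Hedged sketch of intention-to-treat (ITT) analysis with
# last-observation-carried-forward (LOCF) imputation.
# Participant records and outcome values are invented.

def locf(measurements):
    """Replace missing values (None) with the last observed value."""
    filled, last = [], None
    for m in measurements:
        if m is not None:
            last = m
        filled.append(last)
    return filled

def itt_means(participants):
    """Mean final outcome by *allocated* arm, regardless of compliance
    or withdrawal -- the core of intention-to-treat analysis."""
    arms = {}
    for p in participants:
        final = locf(p["visits"])[-1]  # impute dropouts via LOCF
        arms.setdefault(p["allocated"], []).append(final)
    return {arm: sum(vals) / len(vals) for arm, vals in arms.items()}

participants = [
    {"allocated": "treatment", "visits": [140, 130, None]},   # dropped out
    {"allocated": "treatment", "visits": [150, 135, 125]},
    {"allocated": "control",   "visits": [145, 144, 146]},
    {"allocated": "control",   "visits": [150, None, None]},  # dropped out
]
print(itt_means(participants))
```

The key point is the grouping key: each participant is analysed under the arm they were randomized to, so the comparability created by randomization is preserved even when the protocol was not followed.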
Assessment timing can play an important role in the impact of interventions, particularly if intervention effects are acute and short lived ( 26 – 29 , 33 ). The specific timing of assessments are unique to each intervention, however, studies that allow for meaningfully different timing of assessments are subject to erroneous results. For example, if assessments occur differentially after an injection of a particularly fast acting, short-lived medication the difference observed between intervention arms may be due to a higher proportion of participants in one intervention arm being assessed hours after the intervention instead of minutes. By tracking differences in assessment times, researchers can address the potential scope of this problem, and try to address it using statistical or other methods ( 26 – 28 , 33 ).
Randomized controlled trials are the principle method for improving treatment of disease, and there are some standardized methods for grading RCTs, and subsequently creating best practice guidelines ( 29 , 34 – 36 ). Much of the current practice of medicine lacks moderate or high quality RCTs to address what treatment methods have demonstrated efficacy and much of the best practice guidelines remains based on consensus from experts ( 28 , 37 ). The reliance on high quality methodology in all types of studies will allow for continued improvement in the assessment of causal factors for health outcomes and the treatment of diseases.
Standards of research and reporting
There are many published standards for the design, execution, and reporting of biomedical research, which can be found in Table 3 . The purpose of these standards and guidelines is to improve the quality of biomedical research, providing sound conclusions on which to base medical decision making. There are published standards for categories of study designs such as observational studies (e.g. STROBE), interventional studies (e.g. CONSORT), diagnostic studies (e.g. STARD, QUADAS), and systematic reviews and meta-analyses (e.g. PRISMA), as well as others. The aim of these standards and guidelines is to systematize and elevate the quality of biomedical research design, execution, and reporting.
Table 3. Published standards for study design and reporting.
- Consolidated Standards Of Reporting Trials (CONSORT, www.consort-statement.org ) are interventional study standards, a 25 item checklist and flowchart specifically designed for RCTs to standardize reporting of key elements including design, analysis and interpretation of the RCT.
- Strengthening the Reporting of Observational studies in Epidemiology (STROBE, www.strobe-statement.org ) is a collection of guidelines specifically for standardization and improvement of the reporting of observational epidemiological research. There are specific subsets of the STROBE statement including molecular epidemiology (STROBE-ME), infectious diseases (STROBE-ID) and genetic association studies (STREGA).
- Standards for Reporting Studies of Diagnostic Accuracy (STARD, www.stard-statement.org ) is a 25-item checklist and flow diagram specifically designed for the reporting of diagnostic accuracy studies.
- Quality Assessment of Diagnostic Accuracy Studies (QUADAS, www.bris.ac.uk/quadas ) is a tool for assessing the methodological quality of diagnostic accuracy studies.
- Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA, www.prisma-statement.org ) is a 27-item checklist and multiphase flow diagram to improve the quality of reporting of systematic reviews and meta-analyses. It replaces the QUOROM statement.
- Consolidated criteria for reporting qualitative research (COREQ) is a 32-item checklist designed for the reporting of qualitative data from interviews and focus groups.
- Statistical Analyses and Methods in the Published Literature (SAMPL) is a guideline for reporting statistical methods and analyses in all types of biomedical research.
- Consensus-based Clinical Case Reporting Guideline Development (CARE, www.care-statement.org ) is a 13-item checklist designed specifically for case reports.
- Standards for Quality Improvement Reporting Excellence (SQUIRE, www.squire-statement.org ) is a 19-item publication guideline for authors reporting quality improvement work in health care.
- Consolidated Health Economic Evaluation Reporting Standards (CHEERS) is a 24-item checklist of reporting practices for economic evaluations of health interventions.
- Enhancing transparency in reporting the synthesis of qualitative research (ENTREQ) is a guideline specifically for standardizing and improving the reporting of syntheses of qualitative research.
When designing or evaluating a study, it may be helpful to review the applicable standards prior to executing and publishing the study. All published standards and guidelines are available on the web and are updated based on current best practices as biomedical research evolves. Additionally, the network “Enhancing the QUAlity and Transparency Of health Research” (EQUATOR, www.equator-network.org ) hosts guidelines and checklists for all standards reported in Table 3 and is continually updated with new study-design-specific and specialty-specific standards.
The appropriate selection of a study design is only one element of successful research. The selection of a study design should weigh costs, access to cases, identification of the exposure, the epidemiologic measures required, and the level of evidence currently published on the specific exposure-outcome relationship being assessed. Reviewing the appropriate published standards when designing a study can substantially strengthen its execution and the interpretation of its results.
Potential conflict of interest