
Case-control and Cohort studies: A brief overview

Posted on 6th December 2017 by Saul Crandon


Introduction

Case-control and cohort studies are observational studies that lie near the middle of the hierarchy of evidence. These types of studies, along with randomised controlled trials, constitute analytical studies, whereas case reports and case series are descriptive studies (1). Although they are not ranked as highly as randomised controlled trials, they can provide strong evidence if designed appropriately.

Case-control studies

Case-control studies are retrospective. They clearly define two groups at the start: one with the outcome/disease and one without it. They then look back to assess whether there is a statistically significant difference in the rates of exposure to a defined risk factor between the groups. Such a difference can suggest an association between the risk factor and development of the disease in question, although no definitive causality can be drawn. The main outcome measure in case-control studies is the odds ratio (OR). See Figure 1 for a pictorial representation of a case-control study design.


Figure 1. Case-control study design.
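As a rough illustration of the odds ratio described above, the sketch below computes an OR and an approximate 95% confidence interval from a 2x2 table. The counts are hypothetical and the function names are ours, not taken from the article; the interval uses the common log-based (Woolf) approximation, which is only one of several possible methods.

```python
import math

def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    odds_in_cases = exposed_cases / unexposed_cases
    odds_in_controls = exposed_controls / unexposed_controls
    return odds_in_cases / odds_in_controls

def odds_ratio_ci(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls, z=1.96):
    """Approximate confidence interval from the standard error of log(OR)."""
    log_or = math.log(
        odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls)
    )
    se = math.sqrt(
        1 / exposed_cases + 1 / unexposed_cases
        + 1 / exposed_controls + 1 / unexposed_controls
    )
    return math.exp(log_or - z * se), math.exp(log_or + z * se)

# Hypothetical counts: 100 cases (40 exposed) and 100 controls (20 exposed).
print(odds_ratio(40, 60, 20, 80))
print(odds_ratio_ci(40, 60, 20, 80))
```

An OR above 1 whose interval excludes 1 would be read as an association between exposure and disease, not as proof of causation.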

Cases should be selected based on objective inclusion and exclusion criteria from a reliable source such as a disease registry. An inherent issue with selecting cases is that a certain proportion of those with the disease would not have a formal diagnosis, may not present for medical care, may be misdiagnosed or may have died before getting a diagnosis. Regardless of how the cases are selected, they should be representative of the broader disease population that you are investigating to ensure generalisability.

Case-control studies should include two groups that are identical EXCEPT for their outcome / disease status.

As such, controls should also be selected carefully. It is possible to match controls to the cases selected on the basis of various factors (e.g. age, sex) to ensure these do not confound the study results. Choosing up to three or four controls per case may even increase statistical power and study precision (2).
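The diminishing return from adding controls can be sketched with a commonly quoted approximation: with m controls per case, the efficiency relative to an unlimited number of controls is about m / (m + 1). This formula is an assumption brought in for illustration (it is usually attributed to work such as reference 2 and holds only under simplifying conditions); the function name is ours.

```python
def relative_efficiency(controls_per_case):
    """Approximate efficiency relative to unlimited controls: m / (m + 1)."""
    return controls_per_case / (controls_per_case + 1)

# Efficiency climbs quickly at first, then flattens beyond about four controls per case.
for m in range(1, 7):
    print(m, round(relative_efficiency(m), 3))
```

Under this approximation, moving from one to two controls per case gains far more than moving from four to five, which is why ratios above about 4:1 are rarely worth the extra recruitment.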

Case-control studies can provide fast results and they are cheaper to perform than most other studies. Because the analysis is retrospective, rare diseases or diseases with long latency periods can be investigated. Furthermore, you can assess multiple exposures to get a better understanding of possible risk factors for the defined outcome / disease.

Nevertheless, as case-control studies are retrospective, they are more prone to bias. One of the main examples is recall bias. Case-control studies often require the participants to self-report their exposure to a certain factor. Recall bias is the systematic difference in how the two groups may recall past events. For example, in a study investigating stillbirth, a mother who experienced this may recall the possible contributing factors a lot more vividly than a mother who had a healthy birth.

A summary of the pros and cons of case-control studies is provided in Table 1.


Table 1. Advantages and disadvantages of case-control studies.

Cohort studies

Cohort studies can be retrospective or prospective. Retrospective cohort studies are NOT the same as case-control studies.

In retrospective cohort studies, the exposure and outcomes have already happened. They are usually conducted on data that already exist (from prospective studies), and the exposures are defined before looking at the existing outcome data to see whether exposure to a risk factor is associated with a statistically significant difference in the rate at which the outcome develops.

Prospective cohort studies are more common. People are recruited into cohort studies regardless of their exposure or outcome status. This is one of their important strengths. People are often recruited because of their geographical area or occupation, for example, and researchers can then measure and analyse a range of exposures and outcomes.

The study then follows these participants for a defined period to assess the proportion that develop the outcome/disease of interest. Cohort studies are therefore good for assessing prognosis, risk factors and harm. The outcome measure in cohort studies is usually a risk ratio / relative risk (RR). See Figure 2 for a pictorial representation of a cohort study design.


Figure 2. Cohort study design.
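The risk ratio described above can be sketched as the risk of the outcome in the exposed group divided by the risk in the unexposed group. The counts below are hypothetical and the function name is ours; this is a minimal illustration rather than a full analysis.

```python
def risk_ratio(exposed_events, exposed_total, unexposed_events, unexposed_total):
    """Risk in the exposed group divided by risk in the unexposed group."""
    risk_exposed = exposed_events / exposed_total
    risk_unexposed = unexposed_events / unexposed_total
    return risk_exposed / risk_unexposed

# Hypothetical cohort: 30 of 200 exposed and 15 of 300 unexposed develop the outcome.
print(risk_ratio(30, 200, 15, 300))
```

Unlike a case-control study, a cohort study follows a defined population forward, so the risks in each group (and hence the RR) can be estimated directly.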

Cohort studies should include two groups that are identical EXCEPT for their exposure status.

As a result, both exposed and unexposed groups should be recruited from the same source population. Another important consideration is attrition. If a significant number of participants are not followed up (lost, died, dropped out) then this may impact the validity of the study. Not only does it decrease the study’s power, but there may be attrition bias: a significant difference between the groups in those that did not complete the study.

Cohort studies can assess a range of outcomes allowing an exposure to be rigorously assessed for its impact in developing disease. Additionally, they are good for rare exposures, e.g. contact with a chemical radiation blast.

Whilst cohort studies are useful, they can be expensive and time-consuming, especially if a long follow-up period is chosen or the disease itself is rare or has a long latency.

A summary of the pros and cons of cohort studies is provided in Table 2.


The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) Statement

STROBE provides a checklist of important steps for conducting these types of studies, as well as acting as best-practice reporting guidelines (3). Both case-control and cohort studies are observational, with varying advantages and disadvantages. However, the most important factor in the quality of evidence these studies provide is their methodological quality.

  1. Song J, Chung K. Observational Studies: Cohort and Case-Control Studies. Plastic and Reconstructive Surgery. 2010 Dec;126(6):2234-2242.
  2. Ury HK. Efficiency of case-control studies with multiple controls per case: Continuous or dichotomous data. Biometrics. 1975 Sep;31(3):643-649.
  3. von Elm E, Altman DG, Egger M, Pocock SJ, Gøtzsche PC, Vandenbroucke JP; STROBE Initiative. The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: guidelines for reporting observational studies. Lancet. 2007 Oct;370(9596):1453-1457. PMID: 18064739.


Case Cohort Study


In a case-cohort study, cases are defined as those participants of the cohort who developed the disease of interest, but controls are identified before the cases develop. This means that controls are randomly chosen from all cohort participants regardless of whether they have the disease of interest or not, and that baseline data can be collected early in the study.

Case-cohort studies are very similar to nested case-control studies. The main difference between a nested case-control study and a case-cohort study is the way in which controls are chosen. Generally, the main advantage of the case-cohort design over the nested case-control design is that the same control group can be used for comparison with different case groups. The main disadvantages of the case-cohort design are that it requires a more complicated statistical analysis and that it can be less efficient than a nested case-control study under some circumstances (e. g.,...


Editor information

Editors and affiliations.

Network EUROlifestyle Research Association Public Health Saxony-Saxony Anhalt e.V. Medical Faculty, University of Technology, Fiedlerstr. 27, 01307, Dresden, Germany

Wilhelm Kirch (Professor Dr. Dr.)


Copyright information

© 2008 Springer-Verlag

About this entry

Cite this entry.

(2008). Case Cohort Study. In: Kirch, W. (eds) Encyclopedia of Public Health. Springer, Dordrecht. https://doi.org/10.1007/978-1-4020-5614-7_323

DOI: https://doi.org/10.1007/978-1-4020-5614-7_323

Publisher Name: Springer, Dordrecht

Print ISBN: 978-1-4020-5613-0

Online ISBN: 978-1-4020-5614-7

eBook Packages: Medicine Reference Module Medicine


Observational studies: cohort and case-control studies

Affiliation.

  • 1 Ann Arbor, Mich. From the Section of Plastic Surgery, Department of Surgery, University of Michigan Health System.
  • PMID: 20697313
  • PMCID: PMC2998589
  • DOI: 10.1097/PRS.0b013e3181f44abc

Observational studies constitute an important category of study designs. To address some investigative questions in plastic surgery, randomized controlled trials are not always indicated or ethical to conduct. Instead, observational studies may be the next best method of addressing these types of questions. Well-designed observational studies have been shown to provide results similar to those of randomized controlled trials, challenging the belief that observational studies are second rate. Cohort studies and case-control studies are two primary types of observational studies that aid in evaluating associations between diseases and exposures. In this review article, the authors describe these study designs and methodologic issues, and provide examples from the plastic surgery literature.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Case-Control Studies*
  • Cohort Studies*
  • Evidence-Based Medicine*
  • Plastic Surgery Procedures / methods*
  • Plastic Surgery Procedures / standards*
  • Prospective Studies
  • Randomized Controlled Trials as Topic
  • Retrospective Studies

Grants and funding

  • F32 AR058105/AR/NIAMS NIH HHS/United States
  • K24 AR053120/AR/NIAMS NIH HHS/United States
  • K24 AR053120-01A2/AR/NIAMS NIH HHS/United States


Observational research methods. Research design II: cohort, cross sectional, and case-control studies

Volume 20, Issue 1

  • Department of Accident and Emergency Medicine, Taunton and Somerset Hospital, Taunton, Somerset, UK
  • Correspondence to: Dr C J Mann; tonygood{at}doctors.org.uk

Cohort, cross sectional, and case-control studies are collectively referred to as observational studies. Often these studies are the only practicable method of studying various problems, for example, studies of aetiology, instances where a randomised controlled trial might be unethical, or if the condition to be studied is rare. Cohort studies are used to study incidence, causes, and prognosis. Because they measure events in chronological order they can be used to distinguish between cause and effect. Cross sectional studies are used to determine prevalence. They are relatively quick and easy but do not permit distinction between cause and effect. Case controlled studies compare groups retrospectively. They seek to identify possible predictors of outcome and are useful for studying rare diseases or outcomes. They are often used to generate hypotheses that can then be studied via prospective cohort or other studies.

  • research methods
  • cohort study
  • case-control study
  • cross sectional study

https://doi.org/10.1136/emj.20.1.54


Cohort, cross sectional, and case-control studies are often referred to as observational studies because the investigator simply observes. No interventions are carried out by the investigator. With the recent emphasis on evidence based medicine and the formation of the Cochrane Database of randomised controlled trials, such studies have been somewhat glibly maligned. However, they remain important because many questions can be efficiently answered by these methods and sometimes they are the only methods available.

The objective of most clinical studies is to determine one of the following: prevalence, incidence, cause, prognosis, or effect of treatment. It is therefore useful to remember which type of study is most commonly associated with each objective (table 1).

While an appropriate choice of study design is vital, it is not sufficient. The hallmark of good research is the rigor with which it is conducted. A checklist of the key points in any study irrespective of the basic design is given in box 1.

Study purpose

The aim of the study should be clearly stated.

The sample should accurately reflect the population from which it is drawn.

The source of the sample should be stated.

The sampling method should be described and the sample size should be justified.

Entry criteria and exclusions should be stated and justified.

The number of patients lost to follow up should be stated and explanations given.

Control group

The control group should be easily identifiable.

The source of the controls should be explained—are they from the same population as the sample?

Are the controls matched or randomised to minimise bias and confounding?

Quality of measurements and outcomes

Validity—are the measurements used regarded as valid by other investigators?

Reproducibility—can the results be repeated or is there a reason to suspect they may be a “one off”?

Blinded—were the investigators or subjects aware of their subject/control allocation?

Quality control—has the methodology been rigorously adhered to?

Completeness

Compliance—did all patients comply with the study?

Drop outs—how many failed to complete the study?

Missing data—how much are unavailable and why?

Distorting influences

Extraneous treatments—other interventions that may have affected some but not all of the subjects.

Confounding factors—Are there other variables that might influence the results?

Appropriate analysis—Have appropriate statistical tests been used?

All studies should be internally valid. That is, the conclusions can be logically drawn from the results produced by an appropriate methodology. For a study to be regarded as valid it must be shown that it has indeed demonstrated what it says it has. A study that is not internally valid should not be published because the findings cannot be accepted.

The question of external validity relates to the value of the results of the study to other populations—that is, the generalisability of the results. For example, a study showing that 80% of the Swedish population has blond hair, might be used to make a sensible prediction of the incidence of blond hair in other Scandinavian countries, but would be invalid if applied to most other populations.

Every published study should contain sufficient information to allow the reader to analyse the data with reference to these key points.

In this article each of the three important observational research methods will be discussed with emphasis on their strengths and weaknesses. In so doing it should become apparent why a given study used a particular research method and which method might best answer a particular clinical problem.

COHORT STUDIES

These are the best method for determining the incidence and natural history of a condition. The studies may be prospective or retrospective and sometimes two cohorts are compared.

Prospective cohort studies

A group of people is chosen who do not have the outcome of interest (for example, myocardial infarction). The investigator then measures a variety of variables that might be relevant to the development of the condition. Over a period of time the people in the sample are observed to see whether they develop the outcome of interest (that is, myocardial infarction).

In single cohort studies those people who do not develop the outcome of interest are used as internal controls.

Where two cohorts are used, one group has been exposed to or treated with the agent of interest and the other has not, thereby acting as an external control.

Retrospective cohort studies

These use data already collected for other purposes. The methodology is the same but the study is performed post hoc. The cohort is “followed up” retrospectively. The study period may be many years but the time to complete the study is only as long as it takes to collate and analyse the data.

Advantages and disadvantages

The use of cohorts is often mandatory as a randomised controlled trial may be unethical; for example, you cannot deliberately expose people to cigarette smoke or asbestos. Thus research on risk factors relies heavily on cohort studies.

As cohort studies measure potential causes before the outcome has occurred the study can demonstrate that these “causes” preceded the outcome, thereby avoiding the debate as to which is cause and which is effect.

A further advantage is that a single study can examine various outcome variables. For example, cohort studies of smokers can simultaneously look at deaths from lung, cardiovascular, and cerebrovascular disease. This contrasts with case-control studies as they assess only one outcome variable (that is, whatever outcome the cases have entered the study with).

Cohorts permit calculation of the effect of each variable on the probability of developing the outcome of interest (relative risk). However, where a certain outcome is rare then a prospective cohort study is inefficient. For example, studying 100 A&E attenders with minor injuries for the outcome of diabetes mellitus will probably produce only one patient with the outcome of interest. The efficiency of a prospective cohort study increases as the incidence of any particular outcome increases. Thus a study of patients with a diagnosis of deliberate self harm in the 12 months after initial presentation would be efficiently studied using a cohort design.

Another problem with prospective cohort studies is the loss of some subjects to follow up. This can significantly affect the outcome. Taking incidence analysis as an example (incidence = new cases per period of time), it can be seen that the loss of a few cases will seriously affect the numerator and hence the calculated incidence. The rarer the condition, the more significant this effect.
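The effect of lost cases on incidence can be sketched numerically. The example below is a simplified illustration with hypothetical numbers: it assumes a fixed follow-up period and that subjects lost to follow up stay in the denominator while their outcomes go uncounted. The function names are ours.

```python
def cumulative_incidence(new_cases, people_at_risk):
    """New cases over a fixed follow-up period divided by the people at risk."""
    return new_cases / people_at_risk

def relative_underestimate(true_cases, cases_lost, people_at_risk):
    """Fraction by which incidence is underestimated when some cases go uncounted."""
    true_incidence = cumulative_incidence(true_cases, people_at_risk)
    observed_incidence = cumulative_incidence(true_cases - cases_lost, people_at_risk)
    return (true_incidence - observed_incidence) / true_incidence

# Hypothetical cohort of 1000 people; two cases are lost to follow up in each scenario.
print(relative_underestimate(true_cases=5, cases_lost=2, people_at_risk=1000))    # rare outcome
print(relative_underestimate(true_cases=200, cases_lost=2, people_at_risk=1000))  # common outcome
```

Losing the same two cases distorts the estimate for the rare outcome far more than for the common one, which is the point made in the paragraph above.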

Retrospective studies are much cheaper as the data have already been collected. One advantage of such a study design is the lack of bias because the outcome of current interest was not the original reason for the data to be collected. However, because the cohort was originally constructed for another purpose it is unlikely that all the relevant information will have been rigorously collected.

Retrospective cohorts also suffer the disadvantage that people with the outcome of interest are more likely to remember certain antecedents, or exaggerate or minimise what they now consider to be risk factors (recall bias).

Where two cohorts are compared one will have been exposed to the agent of interest and one will not. The major disadvantage is the inability to control for all other factors that might differ between the two groups. These factors are known as confounding variables.

A confounding variable is independently associated with both the variable of interest and the outcome of interest. For example, lung cancer (outcome) is less common in people with asthma (variable). However, it is unlikely that asthma in itself confers any protection against lung cancer. It is more probable that the incidence of lung cancer is lower in people with asthma because fewer asthmatics smoke cigarettes (confounding variable). There are a virtually infinite number of potential confounding variables that, however unlikely, could just explain the result. In the past this has been used to suggest that there is a genetic influence that makes people want to smoke and also predisposes them to cancer.
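The asthma example can be made concrete with a small stratified calculation. All counts below are invented purely for illustration: lung cancer risk is set to depend only on smoking, and asthmatics are set to smoke less often. The data layout and function names are ours.

```python
def risk_ratio(cases_a, total_a, cases_b, total_b):
    """Risk in group A divided by risk in group B."""
    return (cases_a / total_a) / (cases_b / total_b)

# Hypothetical (lung cancer cases, people) per group, split by smoking status.
strata = {
    "smokers": {"asthma": (10, 100), "no_asthma": (50, 500)},
    "non_smokers": {"asthma": (9, 900), "no_asthma": (5, 500)},
}

def crude_risk_ratio(strata):
    """Risk ratio for asthma vs no asthma, ignoring smoking status."""
    asthma_cases = sum(s["asthma"][0] for s in strata.values())
    asthma_total = sum(s["asthma"][1] for s in strata.values())
    other_cases = sum(s["no_asthma"][0] for s in strata.values())
    other_total = sum(s["no_asthma"][1] for s in strata.values())
    return risk_ratio(asthma_cases, asthma_total, other_cases, other_total)

def stratum_risk_ratios(strata):
    """Risk ratio for asthma vs no asthma within each smoking stratum."""
    return {
        name: risk_ratio(*s["asthma"], *s["no_asthma"])
        for name, s in strata.items()
    }

print(crude_risk_ratio(strata))     # well below 1: asthma looks protective
print(stratum_risk_ratios(strata))  # about 1 in each stratum: no effect once smoking is held fixed
```

In these made-up data the apparent protection vanishes within each smoking stratum, which is the signature of confounding. Stratification only handles confounders that were measured.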

The only way to eliminate all possibility of a confounding variable is via a prospective randomised controlled study. In this type of study each type of exposure is assigned by chance and so confounding variables should be present in equal numbers in both groups.

Finally, problems can arise as a result of bias. Bias can occur in any research and reflects the potential that the sample studied is not representative of the population it was drawn from and/or the population at large. A classic example is using employed people, as employed people generally have better health than unemployed people. Similarly, people who respond to questionnaires tend to be fitter and more motivated than those who do not. People attending A&E departments should not be presumed to be representative of the population at large.

How to run a cohort study

If the data are readily available then a retrospective design is the quickest method. If high quality, reliable data are not available a prospective study will be required.

The first step is the definition of the sample group. Each subject must have the potential to develop the outcome of interest (for example, circumcised men should not be included in a cohort designed to study paraphimosis). Furthermore, the sample population must be representative of the general population if the study is primarily looking at the incidence and natural history of the condition (descriptive).

If however the aim is to analyse the relation between predictor variables and outcomes (analytical) then the sample should contain as many patients likely to develop the outcome as possible, otherwise much time and expense will be spent collecting information of little value.

Cohort studies

Cohort studies describe incidence or natural history.

They analyse predictors (risk factors) thereby enabling calculation of relative risk.

Cohort studies measure events in temporal sequence thereby distinguishing causes from effects.

Retrospective cohorts where available are cheaper and quicker.

Confounding variables are the major problem in analysing cohort studies.

Subject selection and loss to follow up are major potential causes of bias.

Each variable studied must be accurately measured. Variables that are relatively fixed, for example, height need only be recorded once. Where change is more probable, for example, drug misuse or weight, repeated measurements will be required.

To minimise the potential for missing a confounding variable all probable relevant variables should be measured. If this is not done the study conclusions can be readily criticised. All patients entered into the study should also be followed up for the duration of the study. Losses can significantly affect the validity of the results. To minimise this as much information about the patient (name, address, telephone, GP, etc) needs to be recorded as soon as the patient is entered into the study. Regular contact should be made; it is hardly surprising if the subjects have moved or lost interest and become lost to follow up if they are only contacted at 10 year intervals!

Beware, follow up is usually easier in people who have been exposed to the agent of interest and this may lead to bias.

There are many famous examples of cohort studies, including the Framingham heart study, 2 the UK study of doctors who smoke, 3 and Professor Neville Butler's studies on British children born in 1958. 4 A recent example of a prospective cohort study by Davey Smith et al was published in the BMJ, 5 and a retrospective cohort design was used to assess the use of A&E departments by people with diabetes. 6

CROSS SECTIONAL STUDIES

These are primarily used to determine prevalence. Prevalence equals the number of cases in a population at a given point in time. All the measurements on each person are made at one point in time. Prevalence is vitally important to the clinician because it influences considerably the likelihood of any particular diagnosis and the predictive value of any investigation. For example, knowing that ascending cholangitis in children is very rare enables the clinician to look for other causes of abdominal pain in this patient population.
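The link between prevalence and the predictive value of an investigation follows from Bayes' theorem. The sketch below uses a hypothetical test with 90% sensitivity and 90% specificity; these figures and the function name are illustrative assumptions, not values from the article.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Probability of disease given a positive test result (Bayes' theorem)."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * (1 - specificity)
    return true_positives / (true_positives + false_positives)

# The same hypothetical test applied where the condition is rare and where it is common.
for prevalence in (0.01, 0.10, 0.50):
    ppv = positive_predictive_value(prevalence, sensitivity=0.9, specificity=0.9)
    print(prevalence, round(ppv, 3))
```

When the condition is rare, most positive results are false positives even for a fairly accurate test, which is why knowing the prevalence shapes how a clinician interprets a result.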

Cross sectional studies are also used to infer causation.

At one point in time the subjects are assessed to determine whether they were exposed to the relevant agent and whether they have the outcome of interest. Some of the subjects will not have been exposed nor have the outcome of interest. This clearly distinguishes this type of study from the other observational studies (cohort and case controlled) where reference to either exposure and/or outcome is made.

The advantage of such studies is that subjects are neither deliberately exposed, treated, or not treated and hence there are seldom ethical difficulties. Only one group is used, data are collected only once and multiple outcomes can be studied; thus this type of study is relatively cheap.

Many cross sectional studies are done using questionnaires. Alternatively each of the subjects may be interviewed. Table 2 lists the advantages and disadvantages of each.

Any study with a low response rate can be criticised because it can miss significant differences between the responders and non-responders. At its most extreme, all the non-responders could be dead! Strenuous efforts must be made to maximise the numbers who do respond. The use of volunteers is also problematic because they too are unlikely to be representative of the general population. A good way to produce a valid sample would be to randomly select people from the electoral roll and invite them to complete a questionnaire. In this way the response rate is known and non-responders can be identified. However, the electoral roll itself is not an entirely accurate reflection of the general population. A census is another example of a cross sectional study.
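Drawing a random sample from a register and tracking the response rate can be sketched as follows. The register, the identifiers and the pattern of replies are all hypothetical, and the fixed seed is there only to make the sketch repeatable.

```python
import random

def draw_random_sample(register, sample_size, seed=2024):
    """Simple random sample without replacement from a sampling frame."""
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    return rng.sample(register, sample_size)

def response_rate(invited, responders):
    """Fraction of invited people who returned the questionnaire."""
    return len(responders) / len(invited)

# Hypothetical register of 10,000 anonymised identifiers.
register = [f"voter-{i:05d}" for i in range(10_000)]
invited = draw_random_sample(register, 500)

# Hypothetical outcome: suppose the first 320 people invited replied.
responders = invited[:320]
responder_ids = set(responders)
non_responders = [person for person in invited if person not in responder_ids]

print(response_rate(invited, responders))
print(len(non_responders))
```

Because the invited list is known in advance, the non-responders can be identified and, where possible, compared with the responders.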

Market research organisations often use cross sectional studies (for example, opinion polls). This entails a system of quotas to ensure the sample is representative of the age, sex, and social class structure of the population being studied. However, to be commercially viable they are convenience samples—only people available can be questioned. This technique is insufficiently rigorous to be used for medical research.

How to run a cross sectional study

Formulate the research question(s) and choose the sample population. Then decide what variables of the study population are relevant to the research question. A method for contacting sample subjects must be devised and then implemented. In this way the data are collected and can then be analysed.

The most important advantage of cross sectional studies is that in general they are quick and cheap. As there is no follow up, fewer resources are required to run the study.

Cross sectional studies are the best way to determine prevalence and are useful at identifying associations that can then be more rigorously studied using a cohort study or randomised controlled study.
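As a rough illustration of how a prevalence estimate is obtained from a cross sectional sample, the sketch below divides the number of people found to have the condition by the number surveyed. The counts (45 of 1500) are invented. The Wilson confidence interval is an addition that the text does not discuss; it is included only to show that a single survey gives an estimate with some uncertainty attached.

```python
from math import sqrt

def point_prevalence(cases, sample_size):
    """Cases present at the time of the survey divided by the number surveyed."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return cases / sample_size

def wilson_interval(cases, sample_size, z=1.96):
    """Approximate 95% Wilson score interval for a proportion (z = 1.96)."""
    p = point_prevalence(cases, sample_size)
    denom = 1 + z**2 / sample_size
    centre = (p + z**2 / (2 * sample_size)) / denom
    half_width = z * sqrt(
        p * (1 - p) / sample_size + z**2 / (4 * sample_size**2)
    ) / denom
    return centre - half_width, centre + half_width

# Hypothetical survey: 45 of 1500 respondents have the condition.
cases, surveyed = 45, 1500
low, high = wilson_interval(cases, surveyed)
print(f"Prevalence: {point_prevalence(cases, surveyed):.1%}")
print(f"Approximate 95% interval: {low:.1%} to {high:.1%}")
```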

The most important problem with this type of study is differentiating cause and effect from simple association. For example, a study finding an association between low CD4 counts and HIV infection does not demonstrate whether HIV infection lowers CD4 levels or low CD4 levels predispose to HIV infection. Moreover, male homosexuality is associated with both but causes neither. (Another example of a confounding variable).

Often there are a number of plausible explanations. For example, if a study shows a negative relation between height and age it could be concluded that people lose height as they get older, younger generations are getting taller, or that tall people have a reduced life expectancy when compared with short people. Cross sectional studies do not provide an explanation for their findings.

Rare conditions cannot efficiently be studied using cross sectional studies because even in large samples there may be no one with the disease. In this situation it is better to study a cross sectional sample of patients who already have the disease (a case series). In this way it was found in 1983 that of 1000 patients with AIDS, 727 were homosexual or bisexual men and 236 were intravenous drug abusers. 6 The conclusion that individuals in these two groups had a higher relative risk was inescapable. The natural history of HIV infection was then studied using cohort studies and efficacy of treatments via case controlled studies and randomised clinical trials.

An example of a cross sectional study was the prevalence study of skull fractures in children admitted to hospital in Edinburgh from 1983 to 1989. 7 Note that although the study period was seven years it was not a longitudinal or cohort study because information about each subject was recorded at a single point in time.

A questionnaire based cross sectional study explored the relation between A&E attendance and alcohol consumption in elderly persons. 9

A recent example can be found in the BMJ, in which the prevalence of serious eye disease in a London population was evaluated. 10

Cross sectional studies

Cross sectional studies are the best way to determine prevalence

Are relatively quick

Can study multiple outcomes

Do not themselves differentiate between cause and effect or the sequence of events

CASE-CONTROL STUDIES

In contrast with cohort and cross sectional studies, case-control studies are usually retrospective. People with the outcome of interest are matched with a control group who do not. Retrospectively the researcher determines which individuals were exposed to the agent or treatment or the prevalence of a variable in each of the study groups. Where the outcome is rare, case-control studies may be the only feasible approach.

As some of the subjects have been deliberately chosen because they have the disease in question case-control studies are much more cost efficient than cohort and cross sectional studies—that is, a higher percentage of cases per study.

Case-control studies determine the relative importance of a predictor variable in relation to the presence or absence of the disease. Case-control studies are retrospective and cannot therefore be used to calculate the relative risk; this requires a prospective cohort study. Case-control studies can however be used to calculate odds ratios, which in turn usually approximate to the relative risk when the outcome is rare.
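The relation between the odds ratio and the relative risk can be seen with a small numerical sketch. The two 2x2 tables below are invented, and the function names are hypothetical. The relative risk is computed here only because the made-up tables represent whole cohorts; in a real case-control study only the odds ratio would be available.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio from a 2x2 table.

    a: exposed with disease      b: exposed without disease
    c: unexposed with disease    d: unexposed without disease
    """
    return (a * d) / (b * c)

def relative_risk(a, b, c, d):
    """Risk in the exposed divided by risk in the unexposed.

    Only meaningful when a..d are counts from a whole cohort, not from
    cases and controls sampled separately.
    """
    return (a / (a + b)) / (c / (c + d))

# Hypothetical rare outcome: 30 of 10 000 exposed and 10 of 10 000 unexposed fall ill.
rare = (30, 9970, 10, 9990)
# Hypothetical common outcome: 60 of 100 exposed and 20 of 100 unexposed fall ill.
common = (60, 40, 20, 80)

for label, table in (("rare", rare), ("common", common)):
    print(f"{label}: RR = {relative_risk(*table):.2f}, OR = {odds_ratio(*table):.2f}")
```

In the rare-outcome table the two measures are almost identical (both close to 3), whereas in the common-outcome table the odds ratio (6) is twice the relative risk (3). This is why the approximation holds only when the outcome is uncommon.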

How to run a case-control study

Decide on the research question to be answered. Formulate an hypothesis and then decide what will be measured and how. Specify the characteristics of the study group and decide how to construct a valid control group. Then compare the “exposure” of the two groups to each variable.

When conditions are uncommon, case-control studies generate a lot of information from relatively few subjects. When there is a long latent period between an exposure and the disease, case-control studies are the only feasible option. Consider the practicalities of a cohort study or cross sectional study in the assessment of new variant CJD and possible aetiologies. With fewer than 300 confirmed cases a cross sectional study would need about 200 000 subjects to include one symptomatic patient. Given a postulated latency of 10 to 30 years a cohort study would require both a vast sample size and take a generation to complete.
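The figure of 200 000 subjects can be reproduced with simple arithmetic. The sketch below assumes a source population of about 60 million (roughly the UK population; this figure is an assumption and is not stated in the text) and treats all 300 confirmed cases as if they were present at the time of the survey.

```python
import math

def subjects_per_expected_case(population, existing_cases):
    """Sample size at which one case is expected on average."""
    return math.ceil(population / existing_cases)

def prob_at_least_one_case(sample_size, population, existing_cases):
    """Chance that a simple random sample contains one or more cases."""
    prevalence = existing_cases / population
    return 1 - (1 - prevalence) ** sample_size

population = 60_000_000   # assumption: approximate UK population
confirmed_cases = 300     # upper bound quoted in the text

n = subjects_per_expected_case(population, confirmed_cases)
print(f"Subjects needed to expect one case: {n}")
print(f"Chance such a sample contains any case: "
      f"{prob_at_least_one_case(n, population, confirmed_cases):.0%}")
```

Even a sample of that size would contain a case only about 63% of the time, which underlines how poorly a cross sectional design suits very rare conditions.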

In case-control studies comparatively few subjects are required so more resources are available for studying each. In consequence a huge number of variables can be considered. This type of study is therefore useful for generating hypotheses that can then be tested using other types of study.

This flexibility of the variables studied comes at the expense of the restricted outcomes studied. The only outcome is the presence or absence of the disease or whatever criterion was chosen to select the cases.

The major problems with case-control studies are the familiar ones of confounding variables (see above) and bias. Bias may take two major forms.

Sampling bias

The patients with the disease may be a biased sample (for example, patients referred to a teaching hospital) or the controls may be biased (for example, volunteers, different ages, sex or socioeconomic group).

Observation and recall bias

As the study assesses predictor variables retrospectively there is great potential for a biased assessment of their presence and significance by the patient or the investigator, or both.

Overcoming sampling bias

Ideally the cases studied should be a random sample of all the patients with the disease. This is not only very difficult but in many instances is impossible because many cases may not have been diagnosed or have been misdiagnosed. For example, many cases of non-insulin dependent diabetes will not have sought medical attention and therefore be undiagnosed. Conversely many psychiatric diseases may be differently labelled in different countries and even by different doctors in the same country. As a result they will be misdiagnosed for the purposes of the study. However, in reality you are often left studying a sample of those patients who it is possible to recruit. Selecting the controls is often a more difficult problem.

To enable the controls to represent the same population as the cases, one of four techniques may be used.

A convenience sample—sampled in the same way as the cases, for example, attending the same outpatient department. While this is certainly convenient it may reduce the external validity of the study.

Matching—the controls may be a matched or unmatched random sample from the unaffected population. Again the problems of controlling for unknown influences is present but if the controls are too closely matched they may not be representative of the general population. “Over matching” may cause the true difference to be underestimated.

The advantage of matching is that it allows a smaller sample size for any given effect to be statistically significant.

Using two or more control groups. If the study demonstrates a significant difference between the patients with the outcome of interest and those without, even when the latter have been sampled in a number of different ways (for example, outpatients, in patients, GP patients) then the conclusion is more robust.

Using a population based sample for both cases and controls. It is possible to take a random sample of all the patients with a particular disease from specific registers. The control group can then be constructed by selecting age and sex matched people randomly selected from the same population as the area covered by the disease register.
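Age and sex matching of the kind described above could be sketched as follows. This is a minimal illustration under stated assumptions: the records are invented, "age matched" is taken to mean falling in the same five-year age band, and each control is used at most once. Real studies may use different matching rules.

```python
import random
from collections import defaultdict

def match_controls(cases, pool, per_case=1, age_band=5, seed=None):
    """Pick up to `per_case` unused controls per case from the same
    sex and age band (hypothetical matching rule)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in pool:
        strata[(person["sex"], person["age"] // age_band)].append(person)
    for members in strata.values():
        rng.shuffle(members)  # random selection within each stratum

    matches = {}
    for case in cases:
        available = strata[(case["sex"], case["age"] // age_band)]
        take = min(per_case, len(available))
        # pop() removes the control so it cannot be matched twice
        matches[case["id"]] = [available.pop() for _ in range(take)]
    return matches

# Invented cases (e.g. from a disease register) and unaffected population sample.
cases = [
    {"id": "case-1", "sex": "F", "age": 62},
    {"id": "case-2", "sex": "M", "age": 47},
]
pool = [
    {"id": f"ctl-{i}", "sex": sex, "age": age}
    for i, (sex, age) in enumerate(
        [("F", 60), ("F", 64), ("F", 71), ("M", 45), ("M", 49), ("M", 30)]
    )
]

matches = match_controls(cases, pool, per_case=2, seed=0)
for case_id, controls in matches.items():
    print(case_id, "->", sorted(c["id"] for c in controls))
```

Narrowing the age band or adding further matching variables makes suitable controls harder to find and moves the design towards the "over matching" problem mentioned earlier.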

Overcoming observation and recall bias

Overcoming retrospective recall bias can be achieved by using data recorded, for other purposes, before the outcome had occurred and therefore before the study had started. The success of this strategy is limited by the availability and reliability of the data collected. Another technique is blinding, where neither the subject nor the observer knows whether the subject is a case or a control, nor are they aware of the study hypothesis. In practice this is often difficult or impossible and only partial blinding is practicable. It is usually possible to blind the subjects and observers to the study hypothesis by asking spurious questions. Observers can also be easily blinded to the case or control status of the patient where the relevant observation is not of the patient themselves but a laboratory test or radiograph.

Case-control studies

Case-control studies are simple to organise

Retrospectively compare two groups

Aim to identify predictors of an outcome

Permit assessment of the influence of predictors on outcome via calculation of an odds ratio

Useful for hypothesis generation

Can only look at one outcome

Bias is a major problem

Blinding cases to their case or control status is usually impracticable as they already know that they have a disease or illness. Similarly observers can hardly be blinded to the presence of physical signs, for example, cyanosis or dyspnoea.

As a result of the problems of matching, bias and confounding, case-control studies are often flawed. They are however useful for generating hypotheses. These hypotheses can then be tested more rigorously by other methods: randomised controlled trials or cohort studies.

Case-control studies are very common. They are particularly useful for studying infrequent events, for example, cot death, survival from out of hospital cardiac arrest, and toxicological emergencies.

A recent example was the study of atrial fibrillation in middle aged men during exercise. 11

USING DATABASES FOR RESEARCH (SECONDARY DATA)

Pre-existing databases provide an excellent and convenient source of data. There are a host of such databases and the increasing archiving of information on computers means that this is an enlarging area for obtaining data. Table 3 lists some common examples of potentially useful databases.

Such databases enable vast numbers of people to be entered into a study prospectively or retrospectively. They can be used to construct a cohort, to produce a sample for a cross sectional study, or to identify people with certain conditions or outcomes and produce a sample for a case controlled study. A recent study used census data from 11 countries to look at the relation between social class and mortality in middle aged men. 12

These types of data are ordinarily collected by people other than the researcher and independently of any specific hypothesis. The opportunity for observer bias is thus diminished. The use of previously collected data is efficient and comparatively inexpensive and moreover the data are collected in a very standardised way, permitting comparisons over time and between different countries. However, because the data are collected for other purposes they may not be ideally suited to the testing of the current hypothesis; additionally they may be incomplete. This may result in sampling bias. For example, the electoral roll depends upon registration by each individual. Many homeless, mentally ill, and chronically sick people will not be registered. Similarly the notification of certain communicable diseases is a statutory responsibility for doctors in the UK: while it is probable that most cases of cholera are reported it is highly unlikely that most cases of food poisoning are.

Causes and associations

Because observational studies are not experiments (as are randomised controlled trials) it is difficult to control many external variables. In consequence when faced with a clear and significant association between some form of illness or cause of death and some environmental influence a judgement has to be made as to whether this is a causal link or simply an association. Table 4 outlines the points to be considered when making this judgement. 13

None of these judgements can provide indisputable evidence of cause and effect, but taken together they do permit the investigator to answer the fundamental questions “is there any other way to explain the available evidence?” and “is there any other more likely than cause and effect?”

Qualitative studies can produce high quality information but all such studies can be influenced by known and unknown confounding variables. Appropriate use of observational studies permits investigation of prevalence, incidence, associations, causes, and outcomes. Where there is little evidence on a subject they are cost effective ways of producing and investigating hypotheses before larger and more expensive study designs are embarked upon. In addition they are often the only realistic choice of research methodology, particularly where a randomised controlled trial would be impractical or unethical.

Cohort studies look forwards in time by following up each subject

Subjects are selected before the outcome of interest is observed

They establish the sequence of events

Numerous outcomes can be studied

They are the best way to establish the incidence of a disease

They are a good way to determine causes of diseases

The principal summary statistic of cohort studies is the relative risk ratio

If prospective, they are expensive and often take a long time for sufficient outcome events to occur to produce meaningful results

Cross sectional studies look at each subject at one point in time only

Subjects are selected without regard to the outcome of interest

Less expensive

They are the best way to determine prevalence

The principal summary statistic of cross sectional studies is the odds ratio

Weaker evidence of causality than cohort studies

Inaccurate when studying rare conditions

Case-control studies look back at what has happened to each subject

Subjects are selected specifically on the basis of the outcome of interest

Efficient (small sample sizes)

Produce odds ratios that approximate to relative risks for each variable studied

Prone to sampling bias and retrospective analysis bias

Only one outcome is studied

GLOSSARY OF TERMS

Bias

The inclusion of subjects or methods such that the results obtained are not truly representative of the population from which it is drawn

Blinding

The process by which the researcher and/or the subject is ignorant of which intervention or exposure has occurred.

Cochrane database

An international collaborative project collating peer reviewed prospective randomised clinical trials.

Cohort

Is a component of a population identified so that one or more characteristics can be studied as it ages through time.

Confounding variable

A variable that is associated with both the exposure and outcome of interest that is not the variable being studied.

Controls

A group of people without the condition of interest, or unexposed to or not treated with the agent of interest.

False positive

A test result that suggests that the subject has a specific disease or condition when in fact the subject does not.

Incidence

Is a rate and therefore is always related either explicitly or by implication to a time period. With regard to disease it can be defined as the number of new cases that develop during a specified time interval.

Latent period

A period of time between exposure to an agent and the development of symptoms, signs, or other evidence of changes associated with that exposure.

Matching

The process by which each case is matched with one or more controls, which have been deliberately chosen to be as similar as possible to the test subjects in all regards other than the variable being studied.

Observational study

A study in which no intervention is made (in contrast with an experimental study). Such studies provide estimates and examine associations of events in their natural settings without recourse to experimental intervention.

Odds ratio

The odds of an event are the ratio of the probability of it occurring to the probability of non-occurrence. In a clinical setting the odds ratio would be equivalent to the odds of a condition occurring in the exposed group divided by the odds of it occurring in the non-exposed group.

Prevalence

Is not defined by a time interval and is therefore not a rate. It may be defined as the number of cases of a disease that exist in a defined population at a specified point in time.

Randomised controlled trial

Subjects are assigned by statistically randomised methods to two or more groups. In doing so it is assumed that all variables other than the proposed intervention are evenly distributed between the groups. In this way bias is minimised.

Relative risk

This is the ratio of the probability of developing the condition if exposed to a certain variable compared with the probability if not exposed.

Response rate

The proportion of subjects who respond to either a treatment or a questionnaire.

Risk factor

A variable associated with a specific disease or outcome.

Validity—internal

The rigour with which a study has been designed and executed—that is, can the conclusion be relied upon?

Validity—external

The usefulness of the findings of a study with respect to other populations.

Variable

A value or quality that can vary between subjects and/or over time

Study design for cohort studies.

Study design for cross sectional studies

Study design for case-control studies.

  • Fowkes F, Fulton P. Critical appraisal of published research: introductory guidelines. BMJ 1991;302:1136–40.
  • Lerner DJ, Kannel WB. Patterns of coronary heart disease morbidity and mortality in the sexes: a 26 year follow-up of the Framingham population. Am Heart J 1986;111:383–90.
  • Doll R, Peto H. Mortality in relation to smoking. 40 years observation on female British doctors. BMJ 1989;208:967–73.
  • Alberman ED, Butler NR, Sheridan MD. Visual acuity of a national sample (1958 cohort) at 7 years. Dev Med Child Neurol 1971;13:9–14.
  • Davey Smith G, Hart C, Blane D, et al. Adverse socioeconomic conditions in childhood and cause specific mortality: prospective observational study. BMJ 1998;316:1631–5.
  • Goyder EC, Goodacre SW, Botha JL, et al. How do individuals with diabetes use the accident and emergency department? J Accid Emerg Med 1997;14:371–4.
  • Jaffe HW, Bregman DJ, Selik RM. Acquired immune deficiency in the US: the first 1000 cases. J Inf Dis 1983;148:339–45.
  • Johnstone AJ, Zuberi SH, Scobie WH. Skull fractures in children: a population study. J Accid Emerg Med 1996;13:386–9.
  • van der Pol V, Rodgers H, Aitken P, et al. Does alcohol contribute to accident and emergency department attendance in elderly people? J Accid Emerg Med 1996;13:258–60.
  • Reidy A, Minassian DC, Vafadis G, et al. BMJ 1998;316:1643–7.
  • Karjaleinen, Kujala U, Kaprio J, et al. BMJ 1998;316:1784–5.
  • Kunst A, Groenhof F, Mackenbach J. BMJ 1998;316:1636–42.
  • Hill AB, Hill ID. Bradford Hills principles of medical statistics. 12th edn. London: Edward Arnold, 1991.

THE CDC FIELD EPIDEMIOLOGY MANUAL

Designing and Conducting Analytic Studies in the Field

Brendan R. Jackson and Patricia M. Griffin

Analytic studies can be a key component of field investigations, but beware of an impulse to begin one too quickly. Studies can be time- and resource-intensive, and a hastily constructed study might not answer the correct questions. For example, in a foodborne disease outbreak investigation, if the culprit food is not on your study’s questionnaire, you probably will not be able to implicate it. Analytic studies typically should be used to test hypotheses, not generate them. However, in certain situations, collecting data quickly about patients and a comparison group can be a way to explore multiple hypotheses. In almost all situations, generating hypotheses before designing a study will help you clarify your study objectives and ask better questions.

  • Generating Hypotheses
  • Study Designs for Testing Hypotheses
  • Types of Observational Studies for Testing Hypotheses
  • Selection of Controls in Case–Control Studies
  • Matching in Case–Control Studies
  • Example: Using an Analytic Study to Solve an Outbreak at a Church Potluck Dinner (But Not That Church Potluck)
  • Outbreaks with Universal Exposure

The initial steps of an investigation, described in previous chapters, are some of your best sources of hypotheses. Key activities include the following:

  • By examining the sex distribution among persons in outbreaks, US enteric disease investigators have learned to suspect a vegetable as the source when most patients are women. (Of course, generalizations do not always hold true!)
  • In an outbreak of bloodstream infections caused by Serratia marcescens among patients receiving parenteral nutrition (food administered through an intravenous catheter), investigators had a difficult time finding the source until they noted that none of the 19 cases were among children. Further investigation of the parenteral nutrition administered to adults but not children in that hospital identified contaminated amino acid solution as the source ( 1 ).
  • Focus on outliers. Give extra attention to the earliest and latest cases on an epidemic curve and to persons who recently visited the neighborhood where the outbreak is occurring. Interviews with these patients can yield important clues (e.g., by identifying the index case, secondary case, or a narrowed list of common exposures).
  • Determine sources of similar outbreaks. Consult health department records, review the literature, and consult experts to learn about previous sources. Be mindful that new sources frequently occur, given ever-changing social, behavioral, and commercial trends.
  • Conduct a small number of in-depth, open-ended interviews. When a likely source is not quickly evident, conducting in-depth (often >1 hour), open-ended interviews with a subset of patients (usually 5 to 10) or their caregivers can be the best way to identify possible sources. It helps to begin with a semistructured list of questions designed to help the patient recall the events and exposures of every day during the incubation period. The interview can end with a “shotgun” questionnaire (see activity 6) ( Box 7.1 ). A key component of this technique is that one investigator ideally conducts, or at least participates in, as many interviews as possible (five or more) because reading notes from others’ interviews is no substitute for soliciting and hearing the information first-hand. For example, in a 2009 Escherichia coli O157 outbreak, investigators were initially unable to find the source through general and targeted questionnaires. During open-ended interviews with five patients, the interviewer noted that most reported having eaten strawberries, a particular type of candy, and uncooked prepackaged cookie dough. An analytic study was then conducted that included questions about these exposures; it confirmed cookie dough as the source ( 3 ).
  • Ask patients what they think. Patients can have helpful thoughts about the source of their illness. However, be aware that patients often associate their most recent food exposure (e.g., a meal) with illness, whereas the inciting exposure might have been long before.
  • Consider administering a shotgun questionnaire. Such questionnaires, which typically ask about hundreds of possible exposures, are best used on a limited number of patients as part of hypothesis-generating interviews. After generating hypotheses, investigators can create a questionnaire targeted to that investigation. Although not an ideal method, shotgun questionnaires can be used by multiple interviewers to obtain data about large numbers of patients ( Box 7.1 ).

In November 2014, a US surveillance system for foodborne diseases (PulseNet) detected a cluster (i.e., a possible outbreak) of listeriosis cases based on similar-appearing Listeria monocytogenes isolates by pulsed-field gel electrophoresis of the isolates. No suspected foods were identified through routine patient interviews by using a Listeria -specific questionnaire with approximately 40 common food sources of listeriosis (e.g., soft cheese and deli meat). The outbreak’s descriptive epidemiology offered no clear leads: the sex distribution was nearly even, the age spectrum was wide, and the case-fatality rate of approximately 20% was typical. Notably, however, 3 of the 35 cases occurred among previously healthy school-aged children, which is highly unusual for listeriosis. Most cases occurred during late October and early November.

Investigators began reinterviewing patients by using a hypothesis-generating shotgun questionnaire with more than 500 foods, but it did not include caramel apples. By comparing the first nine patient responses with data from a published survey of food consumption, strawberries and ice cream emerged as hypotheses. However, several interviewed patients denied having eaten these foods during the month before illness. An investigator then conducted lengthy, open-ended interviews with patients and their family members. During one interview, he asked about special foods eaten during recent holidays, and the patient’s wife replied that her husband had eaten prepackaged caramel apples around Halloween. Although produce items had been implicated in past listeriosis outbreaks, caramel apples seemed an unlikely source. However, the interviewer took note of this connection because he had previously interviewed another patient who reported having eaten caramel apples. This event underscores the importance of one person conducting multiple interviews because that person might make subtle mental connections that may be missed when reviewing other interviewers’ notes. In fact, several other investigators listening to the interview noted this exposure—among hundreds of others—but thought little of it.

In this investigation, the finding of high strawberry and ice cream consumption among patients, coupled with the timing of the outbreak during a holiday period, helped make a sweet food (i.e., caramel apples) seem more plausible as the possible source.

To explore the caramel apple hypothesis, investigators asked five other patients about this exposure, and four reported having eaten them. On the basis of these initial results, investigators designed and administered a targeted questionnaire to patients involved in the outbreak, as well as to patients infected with unrelated strains of L. monocytogenes (i.e., a case–case study). This study, combined with testing of apples and the apple packing facility, confirmed that caramel apples were the source (2). Had a single interviewer performed multiple open-ended interviews to generate hypotheses before the shotgun questionnaire, the outbreak might have been solved sooner.
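The comparison of patient responses with a published food consumption survey, mentioned in the account above, is often framed as a binomial question: if patients ate a food no more often than the general population, how likely is it that so many of them would report it? The source does not say which calculation the investigators used, so the sketch below is only one plausible approach, and the counts and the 30% background rate are invented.

```python
from math import comb

def binomial_tail(k, n, p):
    """P(X >= k) when X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Hypothetical figures: 7 of 9 interviewed patients report eating a food
# that 30% of the surveyed general population reports eating.
patients_reporting = 7
patients_interviewed = 9
background_rate = 0.30

p_value = binomial_tail(patients_reporting, patients_interviewed, background_rate)
print(f"Chance of {patients_reporting} or more reports by coincidence: {p_value:.4f}")
```

A small probability flags a food as worth pursuing, but it generates a hypothesis rather than confirming one; in the account above, foods flagged this way turned out not to be the source.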

As evident in public health and clinical guidelines, randomized controlled trials (e.g., trials of drugs, vaccines, and community-level interventions) are the reference standard for epidemiology, providing the highest level of evidence. However, such studies are not possible in certain situations, including outbreak investigations. Instead, investigators must rely on observational studies, which can provide sufficient evidence for public health action. In observational studies, the epidemiologist documents rather than determines the exposures, quantifying the statistical association between exposure and disease. Here again, the key when designing such studies is to obtain a relevant comparison group for the patients ( Box 7.2 ).

Because field analytic studies are used to quantify the association between exposure and disease, defining what is meant by exposure and disease is essential. Exposure is used broadly, meaning demographic characteristics, genetic or immunologic makeup, behaviors, environmental exposures, and other factors that might influence a person’s risk for disease. Because precise information can help accurately estimate an exposure’s effect on disease, exposure measures should be as objective and standard as possible. Developing a measure of exposure can be conceptually straightforward for an exposure that is a relatively discrete event or characteristic—for example, whether a person received a spinal injection with steroid medication compounded at a specific pharmacy or whether a person received a typhoid vaccination during the year before international travel. Although these exposures might be straightforward in theory, they can be subject to interpretation in practice. Should a patient injected with a medication from an unknown pharmacy be considered exposed? Whatever decision is made should be documented and applied consistently.

Additionally, exposures often are subject to the whims of memory. Memory aids (e.g., restaurant menus, vaccination cards, credit card receipts, and shopper cards) can be helpful. More than just a binary yes or no, the dose of an exposure can also be enlightening. For example, in an outbreak of fungal bloodstream infections linked to contaminated intravenous saline flushes administered at an oncology clinic, affected patients had received a greater number of flushes than unaffected patients ( 4 ). Similarly, in an outbreak of Listeria monocytogenes infections, the association with deli meat became apparent only when the exposure evaluated was consumption of deli meat more than twice a week ( 5 ).

Defining disease (e.g., does a person have botulism?) might sound simple, but often it is not; read more about making and applying disease case definitions in Chapter 3 .

Three types of observational studies are commonly used in the field. All are best performed by using a standard questionnaire specific for that investigation, developed on the basis of hypothesis-generating interviews.

Observational Study Type 1: Cohort

In concept, a cohort study, like an experimental study, begins with a group of persons without the disease under study, but with different exposure experiences, and follows them over time to find out whether they experience the disease or health condition of interest. However, in a cohort study, each person’s exposure is merely recorded rather than assigned randomly by the investigator. Then the occurrence of disease among persons with different exposures is compared to assess whether the exposures are associated with increased risk for disease. Cohort studies can be prospective or retrospective.

Prospective Cohort Studies

A prospective cohort study enrolls participants before they experience the disease or condition of interest. The enrollees are then followed over time for occurrence of the disease or condition. The unexposed or lowest exposure group serves as the comparison group, providing an estimate of the baseline or expected amount of disease. An example of a prospective cohort study is the Framingham Heart Study. By assessing the exposures of an original cohort of more than 5,000 adults without cardiovascular disease (CVD), beginning in 1948 and following them over time, the study was the first to identify common CVD risk factors ( 6 ). Each case of CVD identified after enrollment was counted as an incident case. Incidence was then quantified as the number of cases divided by the sum of time that each person was followed (incidence rate) or as the number of cases divided by the number of participants being followed (attack rate, risk, or incidence proportion). In field epidemiology, prospective cohort studies also often involve a group of persons who have had a known exposure (e.g., survived the World Trade Center attack on September 11, 2001 [ 7 ]) and who are then followed to examine the risk for subsequent illnesses with long incubation or latency periods.
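The two ways of quantifying incidence described above differ only in the denominator. The sketch below uses five invented follow-up records to show both; the function names are hypothetical.

```python
def incidence_proportion(new_cases, persons_followed):
    """New cases divided by the number of participants followed."""
    return new_cases / persons_followed

def incidence_rate(new_cases, person_time):
    """New cases divided by the total time participants were followed."""
    return new_cases / person_time

# Hypothetical records: (years followed, developed the disease?)
follow_up = [(10.0, False), (4.5, True), (10.0, False), (7.0, True), (2.5, False)]

cases = sum(1 for _, ill in follow_up if ill)
person_years = sum(years for years, _ in follow_up)

print(f"Incidence proportion: {incidence_proportion(cases, len(follow_up)):.2f}")
print(f"Incidence rate: {1000 * incidence_rate(cases, person_years):.1f} "
      f"per 1000 person-years")
```

The incidence rate accounts for participants who were followed for different lengths of time, which the incidence proportion ignores.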

Retrospective Cohort Studies

A retrospective cohort study enrolls a defined participant group after the disease or condition of interest has occurred. In field epidemiology, these studies are more common than prospective studies. The population affected is often well-defined (e.g., banquet attendees, a particular school’s students, or workers in a certain industry). Investigators elicit exposure histories and compare disease incidence among persons with different exposures or exposure levels.

Observational Study Type 2: Case–Control

In a case–control study, the investigator must identify a comparison group of control persons who have had similar opportunities for exposure as the case-patients. Case–control studies are commonly performed in field epidemiology when a cohort study is impractical (e.g., no defined cohort or too many non-ill persons in the group to interview). Whereas a cohort study proceeds conceptually from exposure to disease or condition, a case–control study begins conceptually with the disease or condition and looks backward at exposures. Excluding controls by symptoms alone might not guarantee that they do not have mild cases of the illness under investigation. Table 7.1 presents selected key differences between a case–control and retrospective cohort study.

Observational Study Type 3: Case–Case

In case–case studies, a group of patients with the same or similar disease serve as a comparison group (8). This method might require molecular subtyping of the suspected pathogen to distinguish outbreak-associated cases from other cases and is especially useful when relevant controls are difficult to identify. For example, controls for an investigation of Listeria illnesses typically are patients with immunocompromising conditions (e.g., cancer or corticosteroid use) who might be difficult to identify among the general population. Patients with Listeria isolates of a different subtype than the outbreak strain can serve as comparisons to help reduce bias when comparing food exposures. However, patients with similar illnesses can have similar exposures, which can introduce a bias, making identifying the source more difficult. Moreover, other considerations should influence the choice of a comparison group. If most outbreak-associated case-patients are from a single neighborhood or are of a certain race/ethnicity, other patients with listeriosis from across the country will serve as an inadequate comparison group.

Considerations for Selecting Controls

Selecting relevant controls is one of the most important considerations when designing a case–control study. Several key considerations are presented here; consult other resources for in-depth discussion ( 9,10 ). Ideally, controls should

  • Thoroughly reflect the source population from which case-patients arose, and
  • Provide a good estimate of the level of exposure one would expect from that population. Sometimes the source population is not so obvious, and a case–control study using controls from the general population might be needed to implicate a general exposure (e.g., visiting a specific clinic, restaurant, or fair). The investigation can then focus on specific exposures among persons with the general exposure (see also next section).

Controls should be chosen independently of any specific exposure under evaluation. If you select controls on the basis of lack of exposure, you are likely to find an association between illness and that exposure regardless of whether one exists. Also important is selecting controls from a source population in a way that minimizes confounding (see Chapter 8 ), which is the existence of a factor (e.g., annual income) that, by being associated with both exposure and disease, can affect the associations you are trying to examine.

When trying to enroll controls who reflect the source population, try to avoid overmatching (i.e., enrolling controls who are too similar to case-patients, resulting in fewer differences among case-patients and controls than ought to exist and decreased ability to identify exposure–disease associations). When conducting case–control studies in hospitals and other healthcare settings, ensure that controls do not have other diseases linked to the exposure under study.

Commonly Used Control Selection Methods

When an outbreak does not affect a defined population (e.g., potluck dinner attendees) but rather the community at large, a range of options can be used to determine how to select controls from a large group of persons.

  • Random-digit dialing . This method, which involves selecting controls by using a system that randomly selects telephone numbers from a directory, has been a staple of US outbreak investigations. In recent years, however, declining response rates because of increasing use of caller identification and cellular phones and lack of readily available directory listings of cellular phone numbers by geographic area have made this method increasingly difficult. Even when this method was most useful, often 50 or more numbers needed to be dialed to reach one household or person who both answered and provided a usable match for the case-patient. Commercial databases that include cellular phone numbers have been used successfully to partially address this problem, but the method remains time-consuming ( 11 ).
  • Random or systematic sampling from a list. For investigations in settings where a roster is available (e.g., attendees at a resort on certain dates), controls can be selected by either random or systematic sampling. Government records (e.g., motor vehicle, voter, or tax records) can provide lists of possible controls, but they might not be representative of the population being studied (11). For random sampling, a table or computer-generated list of random numbers can be used to select persons to contact. For systematic sampling, every nth person on the list is contacted (e.g., every 12th or 13th).
  • Neighborhood. Recruiting controls from the same neighborhood as case-patients (i.e., neighborhood matching) has commonly been used during case–control studies, particularly in low- and middle-income countries. For example, during an outbreak of typhoid fever in Tajikistan (12), investigators recruited controls by going door-to-door down a street, starting at a case-patient’s house; a study of cholera in Haiti used a similar method (13). Typically, the immediately neighboring households are skipped to prevent overmatching.
  • Patients’ friends or relatives . Using friends and relatives as controls can be an effective technique when the characteristics of case-patients (e.g., very young children) make finding controls by a random method difficult. Typically, the investigator interviews a patient or his or her parent, then asks for the names and contact information for more friends or relatives who are needed as controls. One advantage is that the friends of an ill person are usually willing to participate, knowing their cooperation can help solve the puzzle. However, because they can have similar personal habits and preferences as patients, their exposures might be similar. Such overmatching can decrease the likelihood of finding the source of the illness or condition.
  • Databases of persons with exposure information . Sources of data on persons with exposure information include survey data (e.g., FoodNet Population Survey [ 14 ]), public health databases of patients with other illnesses or a different subtype of the same illness, and previous studies. ( Chapter 4 describes additional sources.)
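As an illustration of sampling controls from a list, the sketch below contrasts simple random sampling with systematic sampling. The roster, the function names, and the sampling interval are hypothetical; a real investigation would draw from the actual roster.

```python
import random


def simple_random_sample(roster, n, seed=None):
    """Draw n distinct persons from the roster using computer-generated random numbers."""
    rng = random.Random(seed)
    return rng.sample(roster, n)


def systematic_sample(roster, interval, seed=None):
    """Pick a random starting point, then take every `interval`-th person on the list."""
    rng = random.Random(seed)
    start = rng.randrange(interval)
    return roster[start::interval]


# Hypothetical roster of 120 resort attendees.
roster = [f"guest_{i:03d}" for i in range(1, 121)]

random_controls = simple_random_sample(roster, 10, seed=1)
systematic_controls = systematic_sample(roster, 12, seed=1)

print(random_controls)
print(systematic_controls)
```

Systematic sampling is easy to carry out by hand, but it can produce a skewed sample if the list is ordered in a way that aligns with the sampling interval.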

When considering outside data sources, investigators must determine whether those data provide an appropriate comparison group. For example, persons in surveys might differ from case-patients in ways that are impossible to determine. Other patients might be so similar to case-patients that risky exposures are unidentifiable, or they might be so different that exposures identified as risks are not true risks.

To help control for confounding, controls can be matched to case-patients on characteristics specified by investigators, including age group, sex, race/ethnicity, and neighborhood. Such matching does not itself reduce confounding, but it enables greater efficiency when matched analyses are performed that do ( 15 ). When deciding to match, however, be judicious. Matching on too many characteristics can make controls difficult to find (making a tough process even harder). Imagine calling hundreds of random telephone numbers trying to find a man of a particular ethnicity aged 50–54 years who is then willing to answer your questions. Also, remember not to match on the exposure of interest or on any other characteristic you wish to examine. Matched case–control study data typically necessitate a matched analysis (e.g., conditional logistic regression) ( 15 ).

Matching Types

The two main types of matching are pair matching and frequency matching.

Pair Matching

In pair matching, each control is matched to a specific case-patient. This method can be helpful logistically because it allows matching by friends or relatives, neighborhood, or telephone exchange, but finding controls who meet specific criteria can be burdensome.
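For 1:1 pair-matched data with a single yes/no exposure, the matched (conditional) odds ratio estimate reduces to a ratio of discordant pairs: pairs in which only the case-patient was exposed divided by pairs in which only the control was exposed. The sketch below shows that calculation. The function name and the pair counts are hypothetical, and a full analysis with several covariates would use conditional logistic regression instead.

```python
def matched_pair_odds_ratio(pairs):
    """Estimate the matched odds ratio from 1:1 pairs.

    `pairs` holds (case_exposed, control_exposed) booleans, one tuple per pair.
    Concordant pairs (both exposed or both unexposed) do not contribute.
    """
    pairs = list(pairs)
    case_only = sum(1 for case, control in pairs if case and not control)
    control_only = sum(1 for case, control in pairs if control and not case)
    if control_only == 0:
        raise ValueError("No pairs with only the control exposed; odds ratio is undefined.")
    return case_only / control_only


# Hypothetical data: 30 matched pairs.
pairs = (
    [(True, False)] * 12    # only the case-patient exposed
    + [(False, True)] * 3   # only the control exposed
    + [(True, True)] * 5    # both exposed (concordant)
    + [(False, False)] * 10 # neither exposed (concordant)
)

print(matched_pair_odds_ratio(pairs))
```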

Frequency Matching

In frequency matching, also called category matching , controls are matched to case-patients in proportion to the distribution of a characteristic among case-patients. For example, if 20% of case-patients are children aged 5–18 years, 50% are adults aged 19–49 years, and 30% are adults 50 years or older, controls should be enrolled in similar proportions. This method works best when most case-patients have been identified before control selection begins. It is more efficient than pair matching because a person identified as a possible control who might not meet the criteria for matching a particular case-patient might meet criteria for one of the case-patient groups.
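The proportional allocation described above can be sketched as follows. The age-group labels follow the example in the text, but the case counts, the total number of controls, and the function name are hypothetical.

```python
def frequency_matched_targets(case_counts, total_controls):
    """Allocate controls to groups in proportion to the case-patient distribution.

    Uses integer arithmetic and gives any leftover controls to the groups
    with the largest remainders, so the targets always sum to total_controls.
    """
    total_cases = sum(case_counts.values())
    targets = {}
    remainders = {}
    for group, n_cases in case_counts.items():
        targets[group], remainders[group] = divmod(total_controls * n_cases, total_cases)
    leftover = total_controls - sum(targets.values())
    for group in sorted(remainders, key=remainders.get, reverse=True)[:leftover]:
        targets[group] += 1
    return targets


# Hypothetical outbreak: 40 case-patients distributed 20% / 50% / 30% by age group.
case_counts = {"5-18 years": 8, "19-49 years": 20, "50+ years": 12}
targets = frequency_matched_targets(case_counts, total_controls=80)
print(targets)
```

With 80 controls, the targets preserve the 20% / 50% / 30% split seen among the case-patients.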

Number of Controls

Most field case–control studies use control-to-case-patient ratios of 1:1, 2:1, or 3:1. Enrolling more than one control per case-patient can increase study power, which might be needed to detect a statistically significant difference in exposure between case-patients and controls, particularly when an outbreak involves a limited number of cases. The incremental gain of adding more controls beyond three or four is small because study power begins to plateau. Note that not all case-patients need to have the same number of controls. Sample size calculations can help in estimating a target number of controls to enroll, although sample sizes in certain field investigations are limited more by time and resource constraints. Still, estimating study power under a range of scenarios is wise because an analytic study might not be worth doing if you have little chance of detecting a statistically significant association. Sample size calculators for unmatched case–control studies are available at http://www.openepi.com and in the StatCalc function of Epi Info ( https://www.cdc.gov/epiinfo ).
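The plateau can be illustrated with a standard approximation that is not stated in the text above: when the true odds ratio is close to 1, a study with r controls per case-patient has roughly r / (r + 1) of the statistical efficiency of a study with an unlimited number of controls. The sketch below tabulates that ratio. It is a rule of thumb under that assumption, not a substitute for a formal sample size calculation.

```python
def relative_efficiency(controls_per_case):
    """Approximate efficiency relative to unlimited controls: r / (r + 1).

    Assumes an odds ratio near 1; treat the result as a rough guide only.
    """
    return controls_per_case / (controls_per_case + 1)


for r in range(1, 7):
    print(f"{r} control(s) per case-patient: {relative_efficiency(r):.0%} of maximum efficiency")
```

Under this approximation, moving from one to two controls per case-patient adds far more than moving from four to five, which matches the diminishing returns described above.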

More than One Control Group

Sometimes the choice of a control group is so vexing that investigators decide to use more than one type of control group (e.g., a hospital-based group and a community group). If the two control groups provide similar results and conclusions about risk factors for disease, the credibility of the findings is increased. In contrast, if the two control groups yield conflicting results, interpretation becomes more difficult.

Since the 1940s, field epidemiology students have studied a classic outbreak of gastrointestinal illness at a church potluck dinner in Oswego, New York ( 16 ). However, the case study presented here, used to illustrate study designs, is a different potluck dinner.

In April 2015, an astute neurologist in Lancaster, Ohio, contacted the local health department about a patient in the emergency department with a suspected case of botulism. Within 2 hours, four more patients arrived with similar symptoms, including blurred vision and shortness of breath. Health officials immediately recognized this as a botulism outbreak.

  • If the source is a widely distributed commercial product, then the population to study is persons across the United States and possibly abroad.
  • If the source is airborne, then the population to study is residents of a single city or area.
  • If the source is food from a restaurant, then the population to study is predominantly local residents and some travelers.
  • If the source is a meal at a workplace or social setting, then the population to study is meal attendees.
  • If the source is a meal at home, then the population to study is household members and any guests.

Descriptive epidemiology and questioning of the case-patients revealed that all had eaten at the same church potluck dinner and had no other common exposures, making the potluck the likely exposure site and attendees the likely source population. Thus, an analytic study would be targeted at potluck attendees, although investigators must remain alert to case-patients among nonattendees. As initial interviews were conducted, more cases of botulism were being diagnosed, quickly increasing to more than 25. The source of the outbreak needed to be identified rapidly to halt further exposure and illness.

  • List of foods served at the potluck.
  • Approximate number of attendees.
  • A case definition.
  • Information from 5–10 hypothesis-generating interviews with a few case-patients or their family members.
  • A cohort study would be a reasonable option because a defined group exists (i.e., a cohort) of exposed persons who could be interviewed in a reasonable amount of time. The study would be retrospective because the outcome (i.e., botulism) has already occurred, and investigators could assess exposures retrospectively (i.e., foods eaten at the potluck) by interviewing attendees.
  • In a cohort study, investigators can calculate the attack rate for botulism among potluck attendees who reported having eaten each food and for those who had not. For example, if 20 of the 30 attendees who had eaten a particular food (e.g., potato salad) had botulism, you would calculate the attack rate by dividing 20 (corresponding to cell a in Handout 7.1) by 30 (total exposed, or a + b), yielding approximately 67%. If 5 of the 45 attendees who had not eaten potato salad had botulism, the attack rate among the unexposed (5/45, corresponding to c/(c + d)) would be approximately 11%. The risk ratio would be 6, which is calculated by dividing the attack rate among the exposed (67%) by the attack rate among the unexposed (11%).
  • A case–control study would be the most feasible option because the entire cohort could not be identified and because the large number of attendees could make interviewing them all difficult. Rather than interview all non-ill persons, a subset could be interviewed as control subjects.
  • The method of control subject selection should be considered carefully. If all attendees are not interviewed, determining the risk for botulism among the exposed and unexposed is impossible because investigators would not know the exposures for all non-ill attendees. Instead of risk, investigators calculate the odds of exposure, which can approximate risk. For example, if 20 (80%) of 25 case-patients had eaten potato salad, the odds of potato salad exposure among case-patients would be 20/5 = 4 (exposed/unexposed, or a/c in Handout 7.2). If 10 (20%) of 50 selected controls had eaten potato salad, the odds of exposure among control subjects would be 10/40 = 0.25 (or b/d in Handout 7.2). Dividing the odds of exposure among the case-patients (a/c) by the odds of exposure among control subjects (b/d) yields an odds ratio of 16 (4/0.25). The odds ratio is not a true measure of risk, but it can be used to implicate a food. An odds ratio can approximate a risk ratio when the outcome or disease is rare (e.g., roughly <5% of a population). In such cases, a/b is similar to a/(a + b). The odds ratio is typically higher than the risk ratio when >5% of exposed persons in the analysis have the illness.
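The two worked examples above can be reproduced in a few lines. The sketch below uses the potato salad counts from the text; the function names are hypothetical, and the cell labels follow the usage in the text (a = exposed and ill or case, b = exposed and not ill or control, c = unexposed and ill or case, d = unexposed and not ill or control).

```python
def risk_ratio(a, b, c, d):
    """Attack rate among the exposed divided by attack rate among the unexposed."""
    attack_rate_exposed = a / (a + b)
    attack_rate_unexposed = c / (c + d)
    return attack_rate_exposed / attack_rate_unexposed


def odds_ratio(a, b, c, d):
    """Odds of exposure among cases (a/c) divided by odds among controls (b/d)."""
    return (a * d) / (b * c)


# Cohort example: 20 of 30 exposed attendees ill, 5 of 45 unexposed attendees ill.
cohort_rr = risk_ratio(a=20, b=10, c=5, d=40)

# Case-control example: 20 of 25 case-patients exposed, 10 of 50 controls exposed.
case_control_or = odds_ratio(a=20, b=10, c=5, d=40)

print(f"Risk ratio: {cohort_rr:.1f}")
print(f"Odds ratio: {case_control_or:.1f}")
```

The two examples happen to fill the cells with the same numbers, so the output also shows how far an odds ratio can exceed the risk ratio when the illness is common among exposed persons.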

In the actual outbreak, 29 (38%) of 77 potluck attendees had botulism. The investigators performed a cohort study, interviewing 75 of the 77 attendees about 52 foods served ( 17 ). The attack rate among persons who had eaten potato salad was significantly and substantially higher than the attack rate among those who had not, with a risk ratio of 14 (95% confidence interval 5–42). One of the potato salads served was made with incorrectly home-canned potatoes (a known source of botulinum toxin), and samples of discarded potato salad tested positive for botulinum toxin, supporting the findings of the analytic study. (Of note, persons often blame potato salad for causing illness when, in fact, it rarely is a source. This outbreak was a notable exception.)

In field epidemiology, the link between exposure and illness is often so strong that it is evident despite such inherent study limitations as small sample size and exposure misclassification. In this outbreak, a few of the patients with botulism reported not having eaten potato salad, and some of the attendees without botulism reported having eaten it. In epidemiologic studies, you rarely find 100% concordance between exposure and outcome for various reasons, including incomplete or erroneous recall because remembering everything eaten is difficult. Here, cross-contamination of potato salad with other foods might have helped explain cases among patients who had not eaten potato salad because only a small amount of botulinum toxin is needed to produce illness.

Two-by-Two Table to Calculate the Relative Risk, or Risk Ratio, in Cohort Studies

Two-by-two tables are covered in more detail in Chapter 8.

\[ \text{Risk ratio} = \frac{\text{Incidence in exposed}}{\text{Incidence in unexposed}} = \frac{a/(a+b)}{c/(c+d)} \]

Two-by-Two Table to Calculate the Odds Ratio in Case–Control Studies

A risk ratio cannot be calculated from a case–control study because true attack rates cannot be calculated.

\[ \text{Odds ratio} = \frac{\text{Odds of exposure in cases}}{\text{Odds of exposure in controls}} = \frac{a/c}{b/d} = \frac{ad}{bc} \]

What kind of study would you design if your hypothesis-generating interviews lead you to believe that everyone, or nearly everyone, was exposed to the same suspected infection source? How would you test hypotheses if all barbecue attendees, ill and non-ill, had eaten the chicken or if all town residents had drunk municipal tap water, and no unexposed group exists for comparison? A few factors that might be of help are the exposure timing (e.g., a particularly undercooked batch of barbeque), the exposure place (e.g., a section of the water system more contaminated than others), and the exposure dose (e.g., number of chicken pieces eaten or glasses of water drunk). Including questions about the time, place, and frequency of highly suspected exposures in a questionnaire can improve the chances of detecting a difference ( 18 ).

Cohort, case–control, and case–case studies are the types of analytic studies that field epidemiologists use most often. They are best used as mechanisms for evaluating—quantifying and testing—hypotheses identified in earlier phases of the investigation. Cohort studies, which are oriented conceptually from exposure to disease, are appropriate in settings in which an entire population is well-defined and available for enrollment (e.g., guests at a wedding reception). Cohort studies are also appropriate when well-defined groups can be enrolled by exposure status (e.g., employees working in different parts of a manufacturing plant). Case–control studies, in contrast, are useful when the population is less clearly defined. Case–control studies, oriented from disease to exposure, identify persons with disease and a comparable group of persons without disease (controls). Then the exposure experiences of the two groups are compared. Case–case studies are similar to case–control studies, except that controls have an illness not linked to the outbreak. Case–control studies are probably the type most often appropriate for field investigations. Although conceptually straightforward, the design of an effective epidemiologic study requires many careful decisions. Taking the time needed to develop good hypotheses can result in a questionnaire that is useful for identifying risk factors. The choice of an appropriate comparison group, how many controls per case-patient to enroll, whether to match, and how best to avoid potential biases are all crucial decisions for a successful study.

This chapter relies heavily on the work of Richard C. Dicker, who authored this chapter in the previous edition.

  1. Gupta N, Hocevar SN, Moulton-Meissner HA, et al. Outbreak of Serratia marcescens bloodstream infections in patients receiving parenteral nutrition prepared by a compounding pharmacy. Clin Infect Dis. 2014;59:1–8.
  2. Angelo K, Conrad A, Saupe A, et al. Multistate outbreak of Listeria monocytogenes infections linked to whole apples used in commercially produced, prepackaged caramel apples: United States, 2014–2015. Epidemiol Infect. 2017;145:848–56.
  3. Neil KP, Biggerstaff G, MacDonald JK, et al. A novel vehicle for transmission of Escherichia coli O157:H7 to humans: multistate outbreak of E. coli O157:H7 infections associated with consumption of ready-to-bake commercial prepackaged cookie dough—United States, 2009. Clin Infect Dis. 2012;54:511–8.
  4. Vasquez AM, Lake J, Ngai S, et al. Notes from the field: fungal bloodstream infections associated with a compounded intravenous medication at an outpatient oncology clinic—New York City, 2016. MMWR. 2016;65:1274–5.
  5. Gottlieb SL, Newbern EC, Griffin PM, et al. Multistate outbreak of listeriosis linked to turkey deli meat and subsequent changes in US regulatory policy. Clin Infect Dis. 2006;42:29–36.
  6. Framingham Heart Study: A Project of the National Heart, Lung, and Blood Institute and Boston University. Framingham, MA: Framingham Heart Study; 2017. https://www.framinghamheartstudy.org/
  7. Jordan HT, Brackbill RM, Cone JE, et al. Mortality among survivors of the Sept 11, 2001, World Trade Center disaster: results from the World Trade Center Health Registry cohort. Lancet. 2011;378:879–87.
  8. McCarthy N, Giesecke J. Case–case comparisons to study causation of common infectious diseases. Int J Epidemiol. 1999;28:764–8.
  9. Rothman KJ, Greenland S. Modern epidemiology. 3rd ed. Philadelphia: Lippincott Williams & Wilkins; 2008.
  10. Wacholder S, McLaughlin JK, Silverman DT, Mandel JS. Selection of controls in case–control studies. I. Principles. Am J Epidemiol. 1992;135:1019–28.
  11. Chintapalli S, Goodman M, Allen M, et al. Assessment of a commercial searchable population directory as a means of selecting controls for case–control studies. Public Health Rep. 2009;124:378–83.
  12. Centers for Disease Control and Prevention. Epidemiologic case studies: typhoid in Tajikistan. http://www.cdc.gov/epicasestudies/classroom_typhoid.html
  13. Dunkle SE, Mba-Jonas A, Loharikar A, Fouche B, Peck M, Ayers T. Epidemic cholera in a crowded urban environment, Port-au-Prince, Haiti. Emerg Infect Dis. 2011;17:2143–6.
  14. Centers for Disease Control and Prevention. Foodborne Diseases Active Surveillance Network (FoodNet): population survey. http://www.cdc.gov/foodnet/surveys/population.html
  15. Pearce N. Analysis of matched case–control studies. BMJ. 2016;352:1969.
  16. Centers for Disease Control and Prevention. Case studies in applied epidemiology: Oswego: an outbreak of gastrointestinal illness following a church supper. http://www.cdc.gov/eis/casestudies.html
  17. McCarty CL, Angelo K, Beer KD, et al. Notes from the field: large outbreak of botulism associated with a church potluck meal—Ohio, 2015. MMWR. 2015;64:802–3.
  18. Tostmann A, Bousema JT, Oliver I. Investigation of outbreaks complicated by universal exposure. Emerg Infect Dis. 2012;18:1717–22.



7.2.1 - Case-Cohort Study Design

A case-cohort study is similar to a nested case-control study in that the cases and non-cases are within a parent cohort; cases and non-cases are identified at time \(t_1\), after baseline. In a case-cohort study, the cohort members were assessed for risk factors at any time prior to \(t_1\). Non-cases are randomly selected from the parent cohort, forming a subcohort. No matching is performed.

Advantages of Case-Cohort Study:

Similar to nested case-control study design:

  • Efficient – not all members of the parent cohort require diagnostic testing
  • Flexible – allows testing hypotheses not anticipated when the cohort was drawn \((t_0)\)
  • Reduces selection bias – cases and non-cases are sampled from the same population
  • Reduces information bias – risk factor exposure can be assessed with the investigator blind to case status

Other advantages, as compared to nested case-control study design:

  • The subcohort can be used to study multiple outcomes
  • Risk can be measured at any time up to \(t_1\) (e.g. elapsed time from a variable event, such as menopause, birth)
  • Subcohort can be used to calculate person-time risk

Disadvantages of Case-Cohort Study:

As compared to nested case-control study design:

  • subcohort may have been established after \(t_0\)
  • exposure information collected at different times (e.g. potential for sample deterioration)

Statistical Analysis for Case-Cohort Study:

Weighted Cox proportional hazards regression model (we will look at proportional hazards regression later in this course)

  • Open access
  • Published: 27 May 2024

Hemodynamic parameters and diabetes mellitus in community-dwelling middle-aged adults and elders: a community-based study

Tzu-Wei Wu, Yih-Jer Wu, Chao-Liang Chou, Chun-Fang Cheng, Shu-Xin Lu & Li-Yu Wang

Scientific Reports volume 14, Article number: 12032 (2024)


  • Prognostic markers

Hemodynamic parameters have been correlated with stroke, hypertension, and arterial stenosis, but only a few small studies have examined the link between hemodynamics and diabetes mellitus (DM). This case-control study enrolled 417 DM patients and 3475 non-DM controls from a community-based cohort. Peak systolic velocity (PSV), end-diastolic velocity (EDV), mean blood flow velocity (MFV), pulsatility index (PI), and resistance index (RI) of the common carotid arteries were measured by color Doppler ultrasonography. Generalized linear regression analyses showed that, compared with the non-DM controls, the DM cases had age-sex-adjusted means of PSV, EDV, and MFV that were lower by 3.28 cm/sec, 1.94 cm/sec, and 2.38 cm/sec, respectively, and age-sex-adjusted means of RI and PI that were higher by 0.013 and 0.0061, respectively (all p-values < 0.0005). Compared with the lowest quartiles, the multivariable-adjusted ORs of DM for the highest quartiles of PSV, EDV, MFV, RI, and PI were 0.59 (95% confidence interval [CI] 0.41–0.83), 0.45 (95% CI 0.31–0.66), 0.53 (95% CI 0.37–0.77), 1.61 (95% CI 1.15–2.25), and 1.58 (95% CI 1.12–2.23), respectively. More importantly, adding EDV significantly improved the predictive ability of the regression models for DM. Compared with the model containing conventional CVD risk factors alone, the area under the receiver operating characteristic curve (AUROC) increased by 1.00% (95% CI 0.29–1.73%; p = 0.0059) and 0.80% (95% CI 0.15–1.46%; p = 0.017) for models that added EDV on continuous and quartile scales, respectively. Adding PSV or MFV also significantly improved the predictive ability of the regression models (all 0.01 < p-value < 0.05). This study reveals a significant correlation between DM and altered hemodynamic parameters. Understanding this relationship could help identify individuals at higher risk of DM and facilitate targeted preventive strategies to reduce cardiovascular complications in DM patients.


Atherosclerosis is a chronic disease that causes the occlusion of arteries by the accumulation of plaques within the arterial intima 1 . These plaques consist of lipids, predominantly low-density lipoprotein (LDL), and inflammatory cells, such as macrophages that transform into foam cells after phagocytosing lipids 2 , 3 . Atherosclerosis advances gradually and often asymptomatically, but it can be aggravated by other factors such as hypertension 4 . As the plaques enlarge, they can impair blood flow and induce shear stress in the vessel wall. This can provoke the erosion of vulnerable plaques and the generation of thrombi that can occlude the artery or embolize other organs 5 . Atherosclerosis can result in severe cardiovascular complications such as myocardial infarction and stroke, which are among the leading causes of mortality worldwide 6 , 7 . Atherosclerosis is especially common in developed countries, but it is also increasing in developing countries 8 . In Taiwan, for instance, five of the top ten causes of mortality are associated with atherosclerosis 9 .

Hemodynamics is the study of blood flow and the forces acting on the blood vessels and the heart. The relationship between atherosclerosis and hemodynamics is complex and bidirectional. On one hand, hemodynamic shear stress can influence the development and progression of atherosclerosis by modulating the phenotype and function of endothelial cells and smooth muscle cells, and by promoting or inhibiting inflammation, oxidative stress, lipid accumulation, and matrix remodeling in the arterial wall 10 , 11 , 12 . On the other hand, atherosclerosis can alter the geometry and elasticity of the arteries, which can affect hemodynamic patterns and parameters such as pressure, flow, velocity, and shear stress 13 . These changes can further influence the stability and rupture risk of atherosclerotic plaques. Key hemodynamic parameters include peak systolic velocity (PSV), end-diastolic velocity (EDV), and mean blood flow velocity (MFV) measured by Doppler ultrasonography. Pulsatility index (PI) and resistance index (RI) are secondary parameters calculated from these velocities 14 , 15 and are accepted methods of examining the microcirculation, with a variety of clinical applications 16 . PI is defined as the difference between PSV and EDV, divided by MFV, and RI is defined as the difference between PSV and EDV, divided by PSV.
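The two index definitions above translate directly into code. The following is a minimal sketch; the function names and the example velocities are hypothetical and are not values from this study.

```python
def pulsatility_index(psv, edv, mfv):
    """PI = (PSV - EDV) / MFV, with all velocities in the same units."""
    return (psv - edv) / mfv


def resistance_index(psv, edv):
    """RI = (PSV - EDV) / PSV, with both velocities in the same units."""
    return (psv - edv) / psv


# Hypothetical common carotid artery velocities in cm/sec.
psv, edv, mfv = 80.0, 20.0, 40.0

print(f"PI = {pulsatility_index(psv, edv, mfv):.2f}")
print(f"RI = {resistance_index(psv, edv):.2f}")
```

Both indices are ratios of velocities, so they are unitless.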

Diabetes mellitus (DM) is a metabolic disorder characterized by chronic hyperglycemia that induces polyuria, polydipsia, and polyphagia. DM results from inadequate insulin secretion and/or impaired insulin action in the target tissues 17 . There are two main types of diabetes: type 1 and type 2. Type 1 diabetes is an autoimmune disease that causes β-cell destruction in the pancreatic islets. It typically manifests in children and adolescents and necessitates exogenous insulin therapy. Type 2 diabetes is more prevalent and involves insulin resistance that worsens as β-cell function deteriorates 18 . DM affects over 450 million people worldwide and accounts for 4.2 million deaths annually 19 . DM is diagnosed by assessing fasting and post-load plasma glucose levels.

Clinically, DM is associated with increased risks of vascular events, including carotid artery diseases 20 , 21 . Our previous study demonstrated that the prevalence of DM is significantly associated with the development and severity of carotid atherosclerosis 22 . Later, we identified 9 DM SNPs showing promising associations with the presence of carotid plaque in a community-based case-control study 23 . Associations of hemodynamics and carotid pulsatility with DM have been noted in a few previous studies 24 , 25 , 26 . However, this clinical correlation has not been fully explored. In this community-based case-control study, the relationship between DM and hemodynamic parameters was investigated in more than 3800 subjects, including 417 DM patients and 3475 non-DM controls, from the northern coast of Taiwan.

Study subjects

The study subjects were recruited from our two previous community-based cohort studies that enrolled 40–74-year-old middle-aged adults and elders residing in the five districts in the northern coastal area of Taiwan for at least six months 22 , 27 . Cohort I and II enrolled study subjects from September 2010 to May 2011 and from September 2014 to May 2020, respectively. During each period, well-informed invitation letters describing the objective and protocols of the study were sent to households with eligible subject(s), and recruitment sites were set up at the local health stations, schools, or community activity centers. Residents who were willing to complete a structured questionnaire regarding personal health information and willing to provide blood samples were recruited. A total of 4102 residents voluntarily provided informed consent and were enrolled. Subjects who had a positive history of physician-diagnosed myocardial infarction or had ever received a cardiac catheter or stent (n = 165) and who were without a proper flow pattern sample (n = 45) were excluded, leaving a total of 3892 middle-aged adults and elders in this study. The study complied with the 1975 Helsinki Declaration on ethics in medical research and was reviewed and approved by the institutional review boards of MacKay Medical College (No. P990001) and MacKay Memorial Hospital (No. 14MMHIS075).

Anthropometric and biochemical measurements

The measurements of baseline anthropometric and clinical characteristics were described previously 27 , 28 . In brief, blood pressure was measured three times for each participant, at intervals of ≥ 3 minutes, with a digital system (UDEX-Twin; ELK Co., Daejon, Korea) in the morning after 10 min of rest. The averages of the repeated measurements of systolic blood pressure (SBP) and diastolic blood pressure (DBP) were used for analyses.

A venous blood sample was collected from each participant for blood lipids and glucose analyses after at least 10 hours of fasting. We used an autoanalyzer (Toshiba TBA c16000; Toshiba Medical System, Holliston, MA, USA) to determine the blood levels of lipids, including total cholesterol (TCHO), high-density lipoprotein cholesterol (HDL-C), low-density lipoprotein cholesterol (LDL-C), and triglycerides (FTG), and glucose (FPG) with commercial kits (Denka Seiken, Tokyo, Japan).

In this study, DM was defined as FPG ≥ 126 mg/dL or the use of insulin or other hypoglycemic agents. Hypertension was defined as SBP ≥ 140 mmHg, DBP ≥ 90 mmHg, or a history of taking antihypertensive medications. Current cigarette smoking was defined as having smoked cigarettes at least 4 days per week during the past month before enrollment. Current alcohol drinking was defined as having drunk alcohol-containing beverages at least 4 days per week during the past month before enrollment.
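The case definitions above translate directly into two boolean rules. The following is a minimal sketch, assuming each participant record carries the fasting glucose, mean blood pressures, and medication flags; the `Participant` class and its field names are hypothetical and are not taken from the study's data dictionary.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    """Hypothetical record layout; field names are illustrative only."""
    fpg_mg_dl: float            # fasting plasma glucose, mg/dL
    on_glucose_lowering: bool   # insulin or other hypoglycemic agents
    sbp_mmhg: float             # mean systolic blood pressure, mmHg
    dbp_mmhg: float             # mean diastolic blood pressure, mmHg
    on_antihypertensives: bool  # history of antihypertensive medication


def has_dm(p: Participant) -> bool:
    """DM: FPG >= 126 mg/dL or use of insulin/hypoglycemic agents."""
    return p.fpg_mg_dl >= 126 or p.on_glucose_lowering


def has_hypertension(p: Participant) -> bool:
    """Hypertension: SBP >= 140, DBP >= 90, or antihypertensive use."""
    return p.sbp_mmhg >= 140 or p.dbp_mmhg >= 90 or p.on_antihypertensives


# Made-up participant, not study data.
example = Participant(fpg_mg_dl=131.0, on_glucose_lowering=False,
                      sbp_mmhg=128.0, dbp_mmhg=92.0, on_antihypertensives=False)
print(has_dm(example), has_hypertension(example))  # True True
```

Because each rule is a logical OR, a treated participant with well-controlled glucose or blood pressure is still classified as a case.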

Ultrasonographic measurements of carotid blood flow

Blood flow velocities of the extracranial carotid arteries, including PSV, EDV, and MFV, were measured at the middle segment of the bilateral common carotid arteries by color Doppler ultrasonography. The ultrasonographic systems (GE Healthcare Logie E, Vivid 7, and Vivid E9; General Electric Company, Milwaukee, USA), which were equipped with a multi-frequency linear array transducer, were operated by two experienced technicians who were blinded to the participants’ clinical profiles. Each participant was examined in the supine position with his/her head turned 45° from the site being measured. An insonation angle equal to or less than 60° and a sample volume size covering 1/2–2/3 of the arterial lumen were maintained for all Doppler measurements. A proper flow pattern sample was defined as a recording with at least 3 waveforms of similar pattern. Each subject’s PI and RI were calculated as (PSV-EDV)/MFV and (PSV-EDV)/PSV, respectively. The averages of the measurements of the right and left common carotid arteries were used for statistical analyses.
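The two index formulas can be sketched in a few lines. The velocity readings below are hypothetical values chosen for illustration, and computing each index per side before averaging is an assumption; the text states only that right and left measurements were averaged.

```python
def pulsatility_index(psv: float, edv: float, mfv: float) -> float:
    """PI = (PSV - EDV) / MFV; all velocities in cm/sec."""
    return (psv - edv) / mfv


def resistance_index(psv: float, edv: float) -> float:
    """RI = (PSV - EDV) / PSV; all velocities in cm/sec."""
    return (psv - edv) / psv


# Hypothetical readings (cm/sec) for one participant, not study data.
right = {"psv": 90.0, "edv": 25.0, "mfv": 45.0}
left = {"psv": 80.0, "edv": 20.0, "mfv": 40.0}

# Assumption: indices are computed per side and then averaged.
pi = (pulsatility_index(**right) + pulsatility_index(**left)) / 2
ri = (resistance_index(right["psv"], right["edv"])
      + resistance_index(left["psv"], left["edv"])) / 2

print(round(pi, 3), round(ri, 3))  # 1.472 0.736
```

Both indices are dimensionless ratios, so they do not depend on the velocity unit.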

Statistical analyses

Student's t-test and one-way analysis of variance were used to test the significance of differences in the means of continuous measurements among groups. Logarithmic transformation was performed for continuous random variables with positive skewness. The Chi-square test was used to test the significance of the associations between DM status and categorical variables. The effects of age, sex, and DM on the carotid hemodynamic parameters were assessed by generalized linear regression analyses. The odds ratio (OR), estimated by the unconditional logistic regression model, was used as the indicator of the strength of association between carotid hemodynamic parameters and DM status. To assess the independent effects of carotid hemodynamic parameters on DM, we used multivariable logistic regression analyses to control for the confounding effects of other conventional cardiovascular risk factors. The area under the receiver operating characteristic curve (AUROC) was used as the indicator of the predictability of the regression model on DM. To explore whether there were interactions between hemodynamic biomarkers and other significant factors on the likelihood of having DM, we carried out stratified analyses. For continuous variables, the values close to the medians in the non-DM subjects were used as the cut-points. We used the statistical method proposed by Clogg et al. 29 to test the significance of the difference in regression coefficients between two groups. All statistical analyses were performed using SAS 9.4 (SAS Institute Inc., Cary, NC, USA).

Ethics approval and consent to participate

The study complied with the 1975 Helsinki Declaration on ethics in medical research and was reviewed and approved by the institutional review boards of MacKay Medical College (No. P990001, granted date: 2010/7/5) and MacKay Memorial Hospital (No. 14MMHIS075, Granted date: 2014/5/23).

Among the 3892 participants, 417 (10.7%) fulfilled the DM definition and were regarded as cases. Table 1 shows that all baseline anthropometric and biochemical measurements, except for alcohol drinking, were significantly different between DM cases and non-DM controls. As compared to the non-DM controls, DM cases had significantly higher means of age, body mass index (BMI), waist circumference (WC), hip circumference (HIP), waist-to-hip ratio (WHR), blood pressure, and log(TG) and higher proportions of the male sex, hypertension, schooling years < 12 years, and cigarette smoking. The means of TCHO, LDL-C, and HDL-C of DM cases were significantly lower than those of the non-DM controls.

Multivariable logistic regression analyses of the conventional cardiovascular risk factors showed that older age, hypertension, fewer schooling years, cigarette smoking, higher BMI, higher WHR, and higher TG were associated with significantly higher ORs of having DM (Table 2 ). TCHO and HDL-C levels were significantly inversely associated with the odds of having DM. The multivariable-adjusted ORs per 1.0 SD increase in BMI, WHR, TCHO, HDL-C, and log(TG) were 1.24 (95% CI 1.11–1.40), 1.30 (95% CI 1.14–1.48), 0.77 (95% CI 0.67–0.87), 0.84 (95% CI 0.71–0.99), and 1.98 (95% CI 1.56–2.51), respectively.

The effects of age, sex, and DM on carotid blood flows, RI, and PI are shown in Table 3 . As compared to female subjects, male subjects had significantly higher means of PSV, PI, and RI and significantly lower means of EDV and MFV (all p -values < 0.0001). The means of these five carotid hemodynamic parameters were all significantly different among seven age groups (all p -values < 0.0001). The means (SD) of PSV, EDV, and MFV for subjects aged 40–44 years were 95.1 (17.6) cm/sec, 26.1 (5.6) cm/sec, and 44.4 (8.0) cm/sec, respectively, for subjects aged 55–59 years were 84.5 (17.4) cm/sec, 24.4 (5.4) cm/sec, and 41.4 (7.9) cm/sec, respectively, and for subjects aged 70–74 years were 75.9 (16.7) cm/sec, 18.2 (4.6) cm/sec, and 34.5 (7.2) cm/sec, respectively. The means of RI and PI were lower for subjects aged 45–54 years and were higher for elderly subjects. Table 3 also shows that DM cases had significantly lower means of PSV, EDV, and MFV and significantly higher means of RI and PI as compared to the non-DM controls (all p -values < 0.0001).

The results of generalized linear regression analyses are also shown in Table 3 . The age trends for PSV, EDV, and MFV were significantly negative, while those for RI and PI were significantly positive. The adjusted regression coefficients of PSV, EDV, MFV, RI, and PI per 5.0-year increase in age at enrollment were − 3.17 cm/sec, − 1.17 cm/sec, − 1.51 cm/sec, 0.0038, and 0.0079, respectively (all p -values < 0.005). As compared to female subjects, male subjects had significantly higher adjusted means for PSV, RI, and PI and significantly lower adjusted means for EDV and MFV (all p -values < 0.0001). After adjustment for the effects of age and sex, the effects of DM status on all five carotid hemodynamic parameters remained statistically significant. The adjusted means of PSV, EDV, and MFV of the DM cases were 3.28 cm/sec ( p  = 0.0003), 1.94 cm/sec ( p  < 0.0001), and 2.38 cm/sec ( p  < 0.0001) lower, respectively, than those of the non-DM controls. The age-sex-adjusted means of RI and PI of the DM cases were 0.013 and 0.0061 higher, respectively, than those of the non-DM controls (both p -values < 0.0001).

Table 4 shows that the prevalence rates of DM were negatively correlated with increased levels of PSV, EDV, and MFV and were positively correlated with increased levels of RI and PI. The prevalence rates of DM for subjects whose carotid blood flows were of the lowest quartile (Q1) and the highest quartile (Q4) ranged from 14.1 to 17.5% and from 4.9 to 5.1%, respectively. The prevalence rates of DM for subjects who had Q1 levels of RI or PI were approximately 7.0% and for Q4 levels of RI or PI were approximately 16.0%. As compared to subjects who had Q1 levels of carotid blood flows, subjects who had Q4 levels of PSV, EDV, and MFV had significantly decreased ORs of having DM. The corresponding age-sex-adjusted ORs were 0.51 (95% CI 0.37–0.72), 0.37 (95% CI 0.26–0.54), and 0.40 (95% CI 0.28–0.57), respectively. The age-sex-adjusted ORs were significantly increased for subjects who had Q3 and Q4 levels of RI and PI as compared to those who had Q1 levels of RI and PI.

The results of multivariable analyses showed that the multivariable-adjusted ORs of having DM remained statistically significant for subjects who had Q4 levels of PSV, EDV, MFV, RI, and PI, relative to those with Q1 levels (Table 4 ). The corresponding multivariable-adjusted ORs of having DM were 0.59 (95% CI 0.41–0.83), 0.45 (95% CI 0.31–0.66), 0.53 (95% CI 0.37–0.77), 1.61 (95% CI 1.15–2.25), and 1.58 (95% CI 1.12–2.23), respectively. As compared to those who had a Q1 level of EDV, subjects who had a Q3 level of EDV also had a significantly lower OR (0.63; 95% CI 0.46–0.87). The multivariable-adjusted ORs of having DM per 5.0 cm/sec increase in PSV, EDV, and MFV were 0.95 (95% CI 0.92–0.98), 0.74 (95% CI 0.66–0.83), and 0.86 (95% CI 0.80–0.93), respectively. Increased PI and RI were significantly positively correlated with the likelihood of DM. The multivariable-adjusted OR of having DM per 0.1 increase in RI was 1.52 (95% CI 1.21–1.91), and that per 1.0 increase in PI was 1.49 (95% CI 1.05–2.12).
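Because a logistic model is linear on the log-odds scale, an OR reported per 5.0 cm/sec can be re-expressed for any other increment. The sketch below illustrates that arithmetic with the reported EDV estimate; the function name is hypothetical, and the results inherit the rounding of the published values.

```python
import math


def rescale_odds_ratio(or_value: float, from_units: float, to_units: float) -> float:
    """Convert an OR per `from_units` into an OR per `to_units`."""
    beta_per_unit = math.log(or_value) / from_units  # log-odds per 1 unit
    return math.exp(beta_per_unit * to_units)


# Reported OR of DM per 5.0 cm/sec increase in EDV: 0.74 (95% CI 0.66 to 0.83).
# Print the point estimate and both CI bounds re-expressed per 1.0 cm/sec.
for value in (0.74, 0.66, 0.83):
    print(round(rescale_odds_ratio(value, 5.0, 1.0), 3))
```

The confidence limits rescale in the same way as the point estimate, since they are also exponentiated linear terms.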

The predictabilities of the regression models that contained different carotid hemodynamic parameters are compared in Table 5 . The AUROC for the basic model, i.e., the most predictive model selected from the regression analyses, which contained all significant conventional cardiovascular risk factors, was 0.7578 (95% CI 0.7346–0.7809). The results of multivariable logistic regression analyses showed that EDV was the most significant independent predictor of DM. The AUROCs were 0.7658 (95% CI 0.7430–0.7885) and 0.7678 (95% CI 0.7453–0.7904) for models adding EDV as a continuous or a categorical variable, respectively. The additions of PSV and MFV also significantly increased the predictabilities of DM status but with smaller gains in AUROC (Table 5 ).
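For readers unfamiliar with the AUROC, it equals the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control, with ties counted as one half. The sketch below computes it by direct pairwise comparison on made-up predicted probabilities; it illustrates the definition only and is not the SAS procedure used in the study.

```python
def auroc(case_scores: list[float], control_scores: list[float]) -> float:
    """Fraction of case-control pairs in which the case scores higher."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5  # ties count as half a win
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical predicted probabilities of DM, not study data.
cases = [0.9, 0.6, 0.4]
controls = [0.5, 0.3, 0.2, 0.1]
print(round(auroc(cases, controls), 4))  # 0.9167 (11 of 12 pairs ranked correctly)
```

An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from controls.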

To explore whether there were interactive effects between EDV and conventional CVD risk factors on the likelihood of having DM, we carried out stratified analyses. Table 6 shows that increased EDV was correlated with significantly decreased ORs of having DM in all strata. The regression coefficient (SE) per 5 cm/sec increase in EDV for subjects aged < 55 years was not significantly different from that of subjects aged ≥ 55 years (− 0.229 (0.096) vs. − 0.414 (0.069), p  = 0.12). Similarly, there was no significant difference in the regression coefficients between the two strata of any other factor.
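The age-stratum comparison can be reproduced approximately from the reported coefficients. The sketch below assumes the independent-samples z statistic, z = (b1 − b2) / sqrt(SE1² + SE2²), a form commonly attributed to Clogg et al.; whether the authors used exactly this form is an assumption, and the function names are illustrative.

```python
import math


def compare_coefficients(b1: float, se1: float, b2: float, se2: float):
    """z-test for the difference between two independent coefficients."""
    z = (b1 - b2) / math.sqrt(se1 ** 2 + se2 ** 2)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))  # standard normal tails
    return z, p_two_sided


def odds_ratio(beta: float) -> float:
    """Odds ratio implied by a logistic regression coefficient."""
    return math.exp(beta)


# Reported coefficients (SE) per 5 cm/sec increase in EDV, by age stratum.
z, p = compare_coefficients(-0.229, 0.096, -0.414, 0.069)
print(round(z, 2), round(p, 2))  # 1.56 0.12

# Stratum-specific ORs implied by those coefficients.
print(round(odds_ratio(-0.229), 2), round(odds_ratio(-0.414), 2))  # 0.8 0.66
```

Under this assumption the computed p-value agrees with the reported p = 0.12.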

We conducted a community-based case-control study in which we enrolled approximately 4000 subjects aged 40–74 years residing in the northern coastal area of Taiwan. Large numbers of DM cases and non-DM controls received color Doppler ultrasonographic measurements, including PSV, EDV, MFV, PI, and RI. We found significant age and sex effects on these hemodynamic parameters. After adjustment for the effects of age and sex, all five carotid hemodynamic parameters remained significantly associated with DM status. As compared to the non-DM controls, the adjusted means of PSV, EDV, and MFV were significantly lower and the adjusted means of RI and PI were significantly higher for the DM cases. We also found that, after controlling for the effects of other conventional CVD risk factors, the multivariable-adjusted ORs of having DM were negatively correlated with PSV, EDV, and MFV and positively correlated with PI and RI. More importantly, the additions of PSV, EDV, and MFV, either on categorical or continuous scales, significantly improved the predictabilities of the regression models on DM status, and among them EDV was the most significant independent predictor.

Pulsatility is a crucial aspect of the cardiovascular system, linked to artery elasticity. The natural pressure pulsations from each left ventricular contraction are reduced by the elasticity of large arteries. The aorta's expansion stores part of the stroke volume, lessening pulsatile stress on the microvasculature 30 . However, with the loss of elastic fibers with age and with metabolic disorders such as hyperlipidemia or DM, arterial walls become progressively stiffer, resulting in a gradual increase in blood pressure and eventually affecting global cardiovascular health 16 . Pulsatile hemodynamics can be measured with invasive or non-invasive methods. Inserting an intraarterial catheter is the most accurate method of assessing pulsatile hemodynamics; however, multiple studies have indicated that non-invasive methods can be reasonable surrogates for invasive ones 31 , 32 , 33 . Hemodynamic parameters, including blood velocities such as PSV, EDV, and MFV as well as PI and RI, have been used to study clinical correlations with different cardiovascular conditions, including but not limited to stroke 34 , 35 , 36 , 37 , 38 , 39 , hypertension 40 , 41 , 42 , and arterial stenosis 43 , 44 , 45 , 46 .

Prolonged hyperglycemia in patients with DM can damage the vascular endothelium, leading to an increase in vascular stiffness and likely a change in hemodynamics 47 . The increase in the stiffness of large vessels can result in increased pulsation and microvascular complications 48 . Several studies have shown possible applications of hemodynamic parameters in predicting and preventing microvascular complications. In 2000, Lee et al. first studied 56 type 2 DM patients and 70 controls and measured their flow velocities and PI of the middle cerebral artery (MCA), extracranial internal carotid artery (ICA), and basilar artery (BA) 25 . They found that PIs of the MCA and ICA were closely correlated with the duration of DM. Some of these studies lacked sex- and age-matched controls 49 , 50 , 51 , 52 , while others were designed to test the effect of drugs in DM patients only 53 , 54 . In studies with sex- and age-matched controls, Agha et al. measured the velocity and PI of BA, ICA, and MCA in 141 DM patients and 132 controls 55 ; Dikanovic et al. measured the velocity and PI of MCA in 100 type 2 DM patients and 100 controls 26 ; Park et al. measured the velocity and PI of MCA in 90 type 2 DM patients and 45 controls 56 ; Zou et al. measured the velocity, PI and RI of dorsalis pedis artery and plantar digital artery in 56 type 2 DM patients and 50 controls 57 . All of these studies came to the same conclusion as we did, namely that hemodynamic parameters including velocities, PI, and RI can be useful indicators and predictors of DM. However, none of them was conducted on as large a scale as the present study.

In a previous study, we included 4073 participants from the same study area, with prevalence rates of carotid plaque and DM at 35.4% and 11.3%, respectively 22 . The study found statistically significant linear trends between the likelihood of having DM and the total number of carotid plaques, maximum carotid stenosis, or severity of carotid atherosclerosis. The multivariate-adjusted odds ratio (OR) for DM was 1.57 (1.25–1.98), indicating a significantly higher risk for subjects with carotid plaques compared to those without observable plaque images. Furthermore, a greater number of carotid plaques, increased maximum carotid atherosclerosis, and more severe carotid atherosclerosis were associated with significantly higher ORs for DM. The prevalence rate of carotid plaque in the prevalent DM group was also significantly higher than in the incident DM group. In our most recent case-control study, we enrolled 309 carotid plaque-positive subjects and 439 carotid plaque-negative subjects from a community-based cohort 23 . Multivariable analyses of anthropometric attributes and biochemical profiles revealed that DM was a significant independent predictor in the best-fit regression model for the presence of carotid plaque. Among the 43 tested DM SNPs, 9 showed promising associations with carotid atherosclerosis, controlling for age, cigarette smoking, and hypertension. Although not all of these promising SNPs demonstrated significant independent effects in the multivariable analyses, a notable linear trend between their composite indicator 9-GCS and the risks of carotid atherosclerosis was observed. We identified four SNPs (rs9937354, rs10842993, rs7180016, and rs4383154) that exhibited significant independent effects with carotid atherosclerosis. Genes that are closely associated with these SNPs include FTO, PRC1, GP2, and KLHL42.

Several potential mechanisms of increased arterial stiffness and altered hemodynamics in DM have been implicated including the formation of advanced glycation endproducts (AGEs) and the dysregulation of nitric oxide (NO) 58 . The formation of AGE involves multiple reversible and irreversible steps, ultimately leading to the pathological binding of collagen molecules within the arterial vessel wall 59 . Numerous studies have linked AGEs to the acceleration of age-related vascular changes and the development of cardiovascular events in both diabetic and non-diabetic populations 60 . The presence of AGE-induced cross-links can make collagen highly resistant to enzymatic breakdown, resulting in a reduced degradation rate. This, in turn, contributes to the increased collagen content observed in arterial walls, which is a characteristic of aging and is further accelerated in conditions such as DM 61 . Research has shown a positive correlation between carotid-femoral pulse wave velocity and collagen crosslinking 62 . Moreover, the levels of specific AGEs in aortic tissue have been found to correlate with aortic stiffness in individuals with and without DM 63 . NO possesses various beneficial properties, including vasodilation, anti-platelet activity, anti-inflammatory effects, and antioxidant properties 64 . However, in the state of insulin resistance, the activation of NO synthase is impaired, and there is an increase in the production of superoxide. These factors together contribute to a decrease in the availability of NO 65 . In individuals with diabetes, particularly those with microvascular disease, basal levels of NO are reduced compared to those without such complications. Furthermore, the severity of microvascular disease correlates with a further decline in NO levels 66 . 
Further mechanistic studies, including gene-association studies based on our current findings, will provide insight into therapeutic targets for atherosclerosis and related complications in DM patients.

The findings of this study highlight a noteworthy association between DM and changes in hemodynamic parameters. Adding hemodynamic parameters enhanced the predictabilities of the regression models on DM status. Gaining a deeper understanding of this relationship can aid in identifying individuals who are at a heightened risk of DM. Future follow-up and mechanistic studies will clarify the factors that contribute to the development of vascular complications in DM patients.

Data availability

The datasets used and/or analyzed during the current study are available from the corresponding author upon reasonable request.

Abbreviations

AUROC: Area under the receiver operating characteristic curve

BMI: Body mass index

CCA: Common carotid artery

CHD: Coronary heart disease

TCHO: Total cholesterol

CI: Confidence interval

DBP: Diastolic blood pressure

DM: Diabetes mellitus

EDV: End-diastolic velocity

FPG: Fasting plasma glucose

FTG: Fasting triglycerides

HDL-C: High-density lipoprotein cholesterol

HIP: Hip circumference

LDL-C: Low-density lipoprotein cholesterol

MFV: Mean blood flow velocity

PI: Pulsatility index

PSV: Peak systolic velocity

RI: Resistance index

SBP: Systolic blood pressure

SD: Standard deviation

SE: Standard error

WC: Waist circumference

WHR: Waist-to-hip ratio

Wohlschlaeger, J., Bertram, S., Theegarten, D., Hager, T. & Baba, H. A. Coronary atherosclerosis and progression to unstable plaques : Histomorphological and molecular aspects. Herz 40 (6), 837–844 (2015).

Falk, E. Pathogenesis of atherosclerosis. J. Am. Coll. Cardiol. 47 (8 Suppl), C7-12 (2006).

Summerhill, V. I., Grechko, A. V., Yet, S. F., Sobenin, I. A. & Orekhov, A. N. The atherogenic role of circulating modified lipids in atherosclerosis. Int. J. Mol. Sci. 20 (14), 3561 (2019).

Vasdev, S., Gill, V. & Singal, P. Role of advanced glycation end products in hypertension and atherosclerosis: Therapeutic implications. Cell Biochem. Biophys. 49 (1), 48–63 (2007).

Galis, Z. S., Sukhova, G. K., Lark, M. W. & Libby, P. Increased expression of matrix metalloproteinases and matrix degrading activity in vulnerable regions of human atherosclerotic plaques. J. Clin. Invest. 94 (6), 2493–2503 (1994).

Robinson, J. G., Fox, K. M., Bullano, M. F., Grandy, S. & Group, S. S. Atherosclerosis profile and incidence of cardiovascular events: A population-based survey. BMC Cardiovasc. Disord. 9 , 1–8 (2009).

Mozaffarian, D. et al. Executive summary: Heart disease and stroke statistics–2016 update: A report from the American heart association. Circulation 133 (4), 447–454 (2016).

Bortnick, A. E. et al. Biomarkers of mineral metabolism and progression of aortic valve and mitral annular calcification: The Multi-Ethnic Study of Atherosclerosis. Atherosclerosis 285 , 79–86 (2019).

Li, Y. H. et al. 2017 Taiwan lipid guidelines for high risk patients. J. Formos. Med. Assoc. 116 (4), 217–248 (2017).

Hastings, N. E., Simmers, M. B., McDonald, O. G., Wamhoff, B. R. & Blackman, B. R. Atherosclerosis-prone hemodynamics differentially regulates endothelial and smooth muscle cell phenotypes and promotes pro-inflammatory priming. Am. J. Physiol. Cell Physiol. 293 (6), C1824-1833 (2007).

Glagov, S., Zarins, C., Giddens, D. P. & Ku, D. N. Hemodynamics and atherosclerosis. Insights and perspectives gained from studies of human arteries. Arch. Pathol. Lab. Med. 112 (10), 1018–1031 (1988).

Malek, A. M., Alper, S. L. & Izumo, S. Hemodynamic shear stress and its role in atherosclerosis. JAMA 282 (21), 2035–2042 (1999).

Wong, K. K. L., Wu, J., Liu, G., Huang, W. & Ghista, D. N. Coronary arteries hemodynamics: Effect of arterial geometry on hemodynamic parameters causing atherosclerosis. Med. Biol. Eng. Comput. 58 (8), 1831–1843 (2020).

Gosling, R. G. & King, D. H. Arterial assessment by doppler-shift ultrasound. Proc. R. Soc. Med. 67 (6 Pt 1), 447–449 (1974).

George, P., Pourcelot, L., Fourcade, C., Guillaud, C. & Descotes, J. The Doppler effect and measurement of the blood flow. C R Acad. Hebd. Seances Acad. Sci. D 261 (1), 253–256 (1965).

Wielicka, M., Neubauer-Geryk, J., Kozera, G. & Bieniaszewski, L. Clinical application of pulsatility index. Med. Res. J. 5 (3), 201–210 (2020).

Kharroubi, A. T. & Darwish, H. M. Diabetes mellitus: The epidemic of the century. World J. Diabetes 6 (6), 850–867 (2015).

Stancakova, A. et al. Changes in insulin sensitivity and insulin release in relation to glycemia and glucose tolerance in 6414 Finnish men. Diabetes 58 (5), 1212–1221 (2009).

Saeedi, P. et al. Global and regional diabetes prevalence estimates for 2019 and projections for 2030 and 2045: Results from the International Diabetes Federation Diabetes Atlas, 9(th) edition. Diabetes Res. Clin. Pract. 157 , 107843 (2019).

Grant, P. J. & Cosentino, F. The 2019 ESC Guidelines on diabetes, pre-diabetes, and cardiovascular diseases developed in collaboration with the EASD. Eur. Heart J. 40 (39), 3215–3217 (2019).

Katsiki, N. & Mikhailidis, D. P. Diabetes and carotid artery disease: A narrative review. Ann. Transl. Med. 8 (19), 1280 (2020).

Wu, T. W., Chou, C. L., Cheng, C. F., Lu, S. X. & Wang, L. Y. Prevalences of diabetes mellitus and carotid atherosclerosis and their relationships in middle-aged adults and elders: A community-based study. J. Formos. Med. Assoc. 121 (6), 1133–1140 (2022).

Wu, T. W. et al. Associations of genetic markers of diabetes mellitus with carotid atherosclerosis: A community-based case-control study. Cardiovasc. Diabetol. 22 (1), 51 (2023).

Lau, K. K. et al. Age and sex-specific associations of carotid pulsatility with small vessel disease burden in transient ischemic attack and ischemic stroke. Int. J. Stroke 13 (8), 832–839 (2018).

Lee, K. Y., Sohn, Y. H., Baik, J. S., Kim, G. W. & Kim, J. S. Arterial pulsatility as an index of cerebral microangiopathy in diabetes. Stroke 31 (5), 1111–1115 (2000).

Dikanovic, M. et al. Transcranial Doppler ultrasound assessment of intracranial hemodynamics in patients with type 2 diabetes mellitus. Ann. Saudi Med. 25 (6), 486–488 (2005).

Wu, T. W. et al. Differential patterns of effects of age and sex on metabolic syndrome in Taiwan: implication for the inadequate internal consistency of the current criteria. Diabetes Res. Clin. Pract. 105 (2), 239–244 (2014).

Chou, C. L. et al. Segment-specific prevalence of carotid artery plaque and stenosis in middle-aged adults and elders in Taiwan: A community-based study. J. Formos Med. Assoc. 118 (1 Pt 1), 64–71 (2019).

Clogg, C. C., Petkova, E. & Haritou, A. Statistical methods for comparing regression coefficients between models. Am. J. Sociol. 100 (5), 1261–1293 (1995).

Climie, R. E. et al. Measuring the interaction between the macro- and micro-vasculature. Front. Cardiovasc. Med. 6 , 169 (2019).

Kang, J. et al. Relationship between brachial-ankle pulse wave velocity and invasively measured aortic pulse pressure. J. Clin. Hypertens (Greenwich) 20 (3), 462–468 (2018).

Weber, T., Wassertheurer, S., Hametner, B., Parragh, S. & Eber, B. Noninvasive methods to assess pulse wave velocity: Comparison with the invasive gold standard and relationship with organ damage. J. Hypertens. 33 (5), 1023–1031 (2015).

Yamashina, A. et al. Validity, reproducibility, and clinical significance of noninvasive brachial-ankle pulse wave velocity measurement. Hypertens. Res. 25 (3), 359–364 (2002).

Chuang, S. Y. et al. Blood pressure, carotid flow pulsatility, and the risk of stroke: A community-based study. Stroke 47 (9), 2262–2268 (2016).

van Sloten, T. T. et al. Carotid stiffness is associated with incident stroke: A systematic review and individual participant data meta-analysis. J. Am. Coll. Cardiol. 66 (19), 2116–2125 (2015).

Chuang, S. Y. et al. Common carotid end-diastolic velocity and intima-media thickness jointly predict ischemic stroke in Taiwan. Stroke 42 (5), 1338–1344 (2011).

Chuang, S. Y. et al. Common carotid artery end-diastolic velocity is independently associated with future cardiovascular events. Eur. J. Prev. Cardiol. 23 (2), 116–124 (2016).

Asil, T., Uzunca, I., Utku, U. & Berberoglu, U. Monitoring of increased intracranial pressure resulting from cerebral edema with transcranial Doppler sonography in patients with middle cerebral artery infarction. J. Ultrasound Med. 22 (10), 1049–1053 (2003).

Hitsumoto, T. Relationships between the cardio-ankle vascular index and pulsatility index of the common carotid artery in patients with cardiovascular risk factors. J. Clin. Med. Res. 11 (8), 593–599 (2019).

Cho, S. J., Sohn, Y. H., Kim, G. W. & Kim, J. S. Blood flow velocity changes in the middle cerebral artery as an index of the chronicity of hypertension. J. Neurol. Sci. 150 (1), 77–80 (1997).

Bardelli, M., Jensen, G., Volkmann, R. & Aurell, M. Non-invasive ultrasound assessment of renal artery stenosis by means of the Gosling pulsatility index. J. Hypertens. 10 (9), 985–989 (1992).

Sasaki, N., Yamamoto, H., Ozono, R., Maeda, R. & Kihara, Y. Association of common carotid artery measurements with n-terminal pro b-type natriuretic peptide in elderly participants. Intern. Med. 59 (7), 917–925 (2020).

Rustempasic, N. & Gengo, M. Assesment of carotid stenosis with CT angiography and color doppler ultrasonography. Med. Arch. 73 (5), 321–325 (2019).

Vigen, T. et al. Carotid atherosclerosis is associated with middle cerebral artery pulsatility index. J. Neuroimaging 30 (2), 233–239 (2020).

Wong, N. D. et al. Atherosclerotic cardiovascular disease risk assessment: An American Society for Preventive Cardiology clinical practice statement. Am. J. Prev. Cardiol. 10 , 100335 (2022).

Bytyci, I., Shenouda, R., Wester, P. & Henein, M. Y. Carotid atherosclerosis in predicting coronary artery disease: A systematic review and meta-analysis. Arterioscler. Thromb. Vasc. Biol. 41 (4), e224–e237 (2021).

Dec-Gilowska, M. et al. Circulating endothelial microparticles and aortic stiffness in patients with type 2 diabetes mellitus. Medicina 55 (9), 596 (2019).

Climie, R. E. D. et al. Pulsatile interaction between the macro-vasculature and micro-vasculature: Proof-of-concept among patients with type 2 diabetes. Eur. J. Appl. Physiol. 118 (11), 2455–2463 (2018).

Soyoye, D. O. et al. Relationship between renal doppler indices and biochemical indices of renal function in type 2 diabetes mellitus. West Afr. J. Med. 35 (3), 189–194 (2018).

Fukuhara, T. & Hida, K. Pulsatility index at the cervical internal carotid artery as a parameter of microangiopathy in patients with type 2 diabetes. J. Ultrasound Med. 25 (5), 599–605 (2006).

Janssen, A. Pulsatility index is better than ankle-brachial doppler index for non-invasive detection of critical limb ischaemia in diabetes. Vasa 34 (4), 235–241 (2005).

Kozera, G. M. et al. Cerebral and skin microcirculatory dysfunction in type 1 diabetes. Postepy Dermatol. Alergol. 36 (1), 44–50 (2019).

Onmez, A., Gokosmanoglu, F., Baycelebi, G. & Arikan, A. A. Carotid Doppler ultrasonographic findings of dapagliflozin use in type 2 diabetic patients. Aging Male 23 (5), 1246–1250 (2020).

Park, J. S. et al. The effects of pioglitazone on cerebrovascular resistance in patients with type 2 diabetes mellitus. Metabolism 56 (8), 1081–1086 (2007).

Agha, M. S. & Alboudi, A. Arterial pulsatility as an index of cerebral microangiopathy in diabetes type 2. East Mediterr. Health J. 19 (Suppl 3), S198-203 (2014).

Park, J. S. et al. Cerebral arterial pulsatility and insulin resistance in type 2 diabetic patients. Diabetes Res. Clin. Pract. 79 (2), 237–242 (2008).

Zou, C. et al. Differences between healthy adults and patients with type 2 diabetes mellitus in reactivity of toe microcirculation by ultrasound combined with a warm bath test. Med. (Baltimore) 96 (22), e7035 (2017).

Prenner, S. B. & Chirinos, J. A. Arterial stiffness in diabetes mellitus. Atherosclerosis 238 (2), 370–379 (2015).

Powell, J. T., Vine, N. & Crossman, M. On the accumulation of D-aspartate in elastin and other proteins of the ageing aorta. Atherosclerosis 97 (2–3), 201–208 (1992).

Sell, D. R. & Monnier, V. M. Molecular basis of arterial stiffening: role of glycation - a mini-review. Gerontology 58 (3), 227–237 (2012).

Schnider, S. L. & Kohn, R. R. Effects of age and diabetes mellitus on the solubility and nonenzymatic glucosylation of human skin collagen. J. Clin. Invest. 67 (6), 1630–1635 (1981).

Monnier, V. M. et al. Relation between complications of type I diabetes mellitus and collagen-linked fluorescence. N. Engl. J. Med. 314 (7), 403–408 (1986).

Sims, T. J., Rasmussen, L. M., Oxlund, H. & Bailey, A. J. The role of glycation cross-links in diabetic vascular stiffening. Diabetologia 39 (8), 946–951 (1996).

Kawashima, S. The two faces of endothelial nitric oxide synthase in the pathophysiology of atherosclerosis. Endothelium 11 (2), 99–107 (2004).

Du, X. et al. Insulin resistance reduces arterial prostacyclin synthase and eNOS activities by increasing endothelial fatty acid oxidation. J. Clin. Invest. 116 (4), 1071–1080 (2006).

Brillante, D. G., O’Sullivan, A. J., Johnstone, M. T. & Howes, L. G. Arterial stiffness and haemodynamic response to vasoactive medication in subjects with insulin-resistance syndrome. Clin. Sci. (Lond) 114 (2), 139–147 (2008).

Acknowledgements

We thank the staff in the district health station of Tamsui District, Sanzhi District, and Shimen District, New Taipei City, for their administrative support.

This work was supported by research grants from the Council of Science and Technology of Taiwan (MOST 111-2314-B-715-007 & NSTC 112-2314-B-715-007-MY3) and MacKay Medical College (MMC-RD-110-1B-P010 & MMC-RD-111-1B-P007). The funding agencies played no role in the research.

Author information

Authors and affiliations.

Department of Medicine, MacKay Medical College, No. 46, Sec. 3, Jhong-Jheng Rd., San-Jhih District, New Taipei City, Taiwan

Tzu-Wei Wu, Yih-Jer Wu, Chao-Liang Chou & Li-Yu Wang

Institute of Biomedical Sciences, MacKay Medical College, New Taipei City, Taiwan

Cardiovascular Center, Department of Internal Medicine, MacKay Memorial Hospital, Taipei, Taiwan

Department of Medical Research, MacKay Memorial Hospital, Taipei, Taiwan

Department of Neurology, MacKay Memorial Hospital, New Taipei City, Taiwan

Chao-Liang Chou & Shu-Xin Lu

Tamsui Health Station, Department of Health, New Taipei City Government, New Taipei City, Taiwan

Chun-Fang Cheng

Contributions

T.W.W. developed the study design, analyzed and interpreted data, and wrote the manuscript. Y.J.W. interpreted the results, contributed to the discussion, and revised the manuscript. C.L.C., C.F.C., and S.X.L. contributed to the study design, the interpretation of results, and the discussion. L.Y.W. developed the study design, analyzed and interpreted data, and wrote and revised the manuscript. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Tzu-Wei Wu or Li-Yu Wang .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

Reprints and permissions

About this article

Cite this article.

Wu, TW., Wu, YJ., Chou, CL. et al. Hemodynamic parameters and diabetes mellitus in community-dwelling middle-aged adults and elders: a community-based study. Sci Rep 14 , 12032 (2024). https://doi.org/10.1038/s41598-024-62866-7

Received : 11 February 2024

Accepted : 22 May 2024

Published : 27 May 2024

DOI : https://doi.org/10.1038/s41598-024-62866-7

  • Carotid blood flow
  • Case-control study
  • Community-based
  • Hemodynamics

  • Open access
  • Published: 17 July 2017

Clarifying the distinction between case series and cohort studies in systematic reviews of comparative studies: potential impact on body of evidence and workload

  • Tim Mathes 1 &
  • Dawid Pieper 1  

BMC Medical Research Methodology volume 17, Article number: 107 (2017)

Distinguishing cohort studies from case series is difficult.

We propose a conceptualization of cohort studies in systematic reviews of comparative studies. The main aim of this conceptualization is to clarify the distinction between cohort studies and case series. We discuss the potential impact of the proposed conceptualization on the body of evidence and workload.

All studies with exposure-based sampling that gather multiple exposures (i.e., at least two different exposures or levels of exposure) and enable calculation of relative risks should be considered cohort studies in systematic reviews that include non-randomized studies. The term “enables/can” means that a predefined analytic comparison (i.e., reported absolute risks per group and/or a risk ratio) is not a prerequisite. Instead, all studies for which sufficient data are available for reanalysis to compare different exposures (e.g., sufficient data in the publication) are classified as cohort studies.

There are possibly large numbers of studies without a comparison for the exposure of interest but that do provide the necessary data to calculate effect measures for a comparison. Consequently, more studies could be included in a systematic review. Therefore, on the one hand, the outlined approach can increase the confidence in effect estimates and the strengths of conclusions. On the other hand, the workload would increase (e.g., additional data extraction and risk of bias assessment, as well as reanalyses).

Systematic reviews that include non-randomized studies often consider different observational study designs [ 1 ]. However, the distinction between different non-randomized study designs is difficult. One key design feature to classify observational study designs is to distinguish comparative from non-comparative studies [ 2 , 3 ]. The lack of a comparison group is of particular importance for distinguishing cohort studies from case series because in many definitions, they share a main design feature of having a follow-up period examining the exposed individuals over time [ 2 , 3 ]. The only difference between cohort studies and case series in many definitions is that cohort studies compare different groups (i.e., examine the association between exposure and outcome), while case series are uncontrolled [ 3 , 4 , 5 ]. Table 1 shows an example definition [ 3 ]. The problem with this definition is that vague terms, such as comparison and examination of association, might be interpreted as an analytic comparison of at least two exposures (i.e., interventions, risk factors or prognostic factors).

For example, imagine a study of 20 consecutive patients with a certain disease that can be treated in two different ways. A study that divides the 20 patients into two groups according to the treatment received and compares the outcomes of these groups (e.g., provides aggregated absolute risks per group or a risk ratio) would probably be classified as a cohort study (this example is denoted “study 1” in the following sections). A sample of this study type is illustrated in Fig. 1 and Table 2 .

Fig. 1. Cohort study (vague definition)

In contrast, a publication that describes the interventions received and outcomes for each patient/case separately would probably be classified as a case series (the example in the following sections is denoted “study 2”). An example of this study type is illustrated in Fig. 2 and Table 3 . In the medical literature, the data on exposure and outcomes are usually provided in either running text or spreadsheet formats [ 6 , 7 , 8 , 9 , 10 , 11 , 12 , 13 , 14 , 15 , 16 , 17 , 18 , 19 , 20 , 21 ]. A good example is the study by Wong et al. [ 10 ]. In this study, information on placental invasion (exposure) and blood loss (outcome) is separately provided for 40 pregnant women in a table. The study by Cheng et al. is an example of a study providing information in the running text (i.e., anticoagulation management [exposure] and recovery [outcome] for paediatric stroke) [ 6 ].

Fig. 2. Case series (vague definition)

These examples illustrate that distinguishing between cohort studies and case series is difficult. Vague definitions are probably the reason for the common confusion between study designs. A recent study found that approximately 72% of cohort studies are mislabelled as case series [ 22 ]. Many systematic reviews of non-randomized studies included cohort studies but excluded case series (see examples in [ 23 , 24 , 25 , 26 , 27 , 28 ]). Therefore, the unclear distinction between case series and cohort studies can result in inconsistent study selection and unjustified exclusions from a systematic review. The risk of misclassification is particularly high because study authors also often mislabel their study or studies are not classified by their authors at all (see examples in [ 6 , 7 , 8 , 9 , 10 , 11 , 12 , 13 , 14 , 15 , 16 , 17 , 18 , 19 , 20 , 21 ]).

We propose a conceptualization of cohort studies in systematic reviews of comparative studies. The main objective of this conceptualization is to clarify the distinction between cohort studies and case series in systematic reviews, including non-randomized comparative studies. We discuss the potential impact of the proposed conceptualization on the body of evidence and workload.

Clarifying the distinction between case series and cohort studies (the solution)

In the following, we propose a conceptualization of cohort studies and case series for systematic reviews that include comparative non-randomized studies. Our proposal is based on a recent conceptualization of cohort studies and case series by Dekkers et al. [ 29 ]. The main feature of this conceptualization is that it is exclusively based on inherent design features (e.g., sampling) and is not affected by the analysis.

Cohort studies of one exposure/one group

Dekkers et al. [ 29 ] defined cohort studies with one exposure as studies with exposure-based sampling that enable calculation of absolute effect measures for the risk of an outcome. This definition means that “the absence of a control group in an exposure-based study does not define a case series” [ 29 ]. The definition of cohort studies according to Dekkers et al. [ 29 ] is summarized in Table 4 .

Cohort studies of multiple exposures/more than one group

This idea can be easily extended to studies with more than one exposure. In this case, all studies with exposure-based sampling gathering multiple exposures (i.e., at least two different exposures, manifestations of exposures or levels of exposures) can be considered as (comparative) cohort studies (Fig. 3 ). The sampling is based on exposure, and there are different groups. Consequently, relative risks can be calculated [ 29 ]. The term “enables/can” implies that a predefined analytic comparison is not a prerequisite but that all studies with sufficient data to enable a reanalysis (e.g., in the publication, study reports, and supplementary material) would be classified as cohort studies.

Fig. 3. Cohort study (deduced from Dekkers et al. [ 29 ])

In short, all studies that enable calculation of a relative risk to quantify a difference in outcomes between different groups should be considered cohort studies.
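As a reminder of what "enables calculation of a relative risk" requires in practice, the standard definitions can be written out as follows. The symbols are notation introduced here for illustration and do not appear in the article: $a$ and $c$ are the numbers of patients with the outcome in the two exposure groups, and $n_1$ and $n_0$ are the group sizes.

```latex
\[
  R_1 = \frac{a}{n_1}, \qquad
  R_0 = \frac{c}{n_0}, \qquad
  \mathrm{RR} = \frac{R_1}{R_0} = \frac{a / n_1}{c / n_0}
\]
```

Both denominators are known only when sampling is based on exposure, which is why the sampling scheme, rather than the reported analysis, determines whether these quantities can be computed.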

Case series

According to Dekkers et al. [ 29 ], sampling in a case series is based either on exposure and outcome (e.g., all patients are treated and have an adverse event) or on outcome alone, so that patients with a certain outcome are included regardless of exposure (see Fig. 4 ). Consequently, neither absolute risks nor relative effect measures for an outcome can be calculated in a case series. Note that sampling in a case series does not need to be consecutive. Consecutiveness would increase the quality of the case series, but a non-consecutive series is also a case series [ 29 ].

Fig. 4. Case series (Dekkers et al. [ 29 ])

In short, neither absolute risks nor risk ratios can be calculated for a case series. Consequently, a case series cannot be comparative. The definition of a case series by Dekkers et al. [ 29 ] is summarized in Table 4 .

It is noteworthy that the conceptualization also ensures a clear distinction of case series from other study designs that apply outcome-based sampling. Case series, case-control studies (including case-time-control), and self-controlled case-control designs (e.g., case-crossover) all have outcome-based sampling in common [ 29 ].

Case series have no control at all because only patients with a certain manifestation of the outcome are sampled (e.g., individuals with a disease or deceased individuals). In contrast, all case-control designs as well as self-controlled case-control designs have a control. In case-control studies, the control group consists of individuals with another manifestation of the outcome (e.g., healthy individuals or survivors). A case-control study can therefore be regarded as two case series (i.e., a case group and a non-case group).

Self-controlled case-control studies are characterized by an intra-individual comparison (each individual is their own control) [ 30 ]. Information is also sampled for periods in which patients are not exposed. Therefore, case-control designs as well as self-controlled case-control studies enable the calculation of risk ratios. This is not possible for a case series.
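The distinctions drawn in this section can be condensed into a small decision rule. The following Python sketch is only an illustration of the conceptualization described above: the function name, the argument names and the string labels are hypothetical, and real classification will often need judgement that a three-argument function cannot capture.

```python
def classify_design(sampling, exposure_levels=1, has_control=False):
    """Classify a study from inherent design features only.

    sampling        -- "exposure", "outcome" or "exposure_and_outcome"
                       (hypothetical labels for the basis of sampling)
    exposure_levels -- number of different exposures or exposure levels gathered
    has_control     -- for outcome-based sampling: is there a control group
                       or an intra-individual (self-controlled) comparison?
    """
    if sampling == "exposure":
        # Exposure-based sampling: absolute risks can be calculated;
        # with two or more exposure levels, relative risks can be too.
        if exposure_levels >= 2:
            return "comparative cohort study"
        return "cohort study (one exposure)"
    if sampling == "outcome" and has_control:
        return "case-control or self-controlled design"
    if sampling in ("outcome", "exposure_and_outcome"):
        # No denominator for risk is available, so no comparison is possible.
        return "case series"
    raise ValueError(f"unknown sampling basis: {sampling!r}")


# "Study 2" from the example: 20 consecutive patients, two treatments,
# data reported per patient without an analytic comparison.
print(classify_design("exposure", exposure_levels=2))
```

Note that the reported analysis does not appear among the arguments: under this conceptualization, only sampling and the number of exposure levels matter.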

Illustrating example

Above, we illustrated that, under a vague definition, the classification of a study design might be influenced by the preparation and analysis of the study data. The proposed conceptualization is exclusively based on the inherent design features (e.g., sampling, exposure). When the example studies are reconsidered using the proposed conceptualization, both would be classified as cohort studies because the relative risk can be calculated. This becomes clear when looking at Table 2 and Table 3 . If the patients in Table 3 are rearranged according to the exposure and the data are reanalysed (i.e., calculation of absolute risk per group and relative risks to compare groups), Table 3 can be converted into Table 2 (and also, Fig. 2 can be converted to Fig. 3 ). In the study by Wong et al. [ 10 ], the mean blood loss in the group with placental invasion and in the group without placental invasion can be calculated and compared (e.g., relative risk with 95% confidence limits). In this study, the data on gestational age are also provided in the table. Therefore, it is even possible to adjust the results for gestational age (e.g., using a logistic regression).
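The reanalysis described above (rearranging per-patient records by exposure and then calculating risks per group and a relative risk) can be sketched in a few lines of Python. The patient records below are invented for illustration and are not taken from Wong et al. or any other cited study. The confidence interval uses the common Wald approximation on the log scale, which is one reasonable choice rather than the method prescribed by the article, and it assumes at least one event in each group.

```python
import math


def regroup(records):
    """Tally (events, group size) per exposure from per-patient records."""
    table = {}
    for exposure, event in records:
        events, total = table.get(exposure, (0, 0))
        table[exposure] = (events + int(event), total + 1)
    return table


def risk_ratio(events_1, n_1, events_0, n_0, z=1.96):
    """Risk ratio of group 1 versus group 0 with a Wald interval on the log scale.

    Requires at least one event in each group (otherwise 1 / events is undefined).
    """
    rr = (events_1 / n_1) / (events_0 / n_0)
    se_log_rr = math.sqrt(1 / events_1 - 1 / n_1 + 1 / events_0 - 1 / n_0)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical "case series" of 20 patients, listed one by one as
# (treatment received, outcome event occurred).
patients = (
    [("A", True)] * 2 + [("A", False)] * 8
    + [("B", True)] * 5 + [("B", False)] * 5
)

table = regroup(patients)  # per-patient listing -> grouped counts
rr, lower, upper = risk_ratio(*table["A"], *table["B"])

for exposure, (events, total) in sorted(table.items()):
    print(f"Treatment {exposure}: {events}/{total} = {events / total:.2f}")
print(f"RR (A vs B) = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

With only 20 patients the interval is wide, which illustrates the point made later in the discussion: reanalysis makes such studies usable, but it does not make them precise.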

Discussion (the impact)

Influence on the body of evidence

The proposed conceptualization is exclusively based on inherent study design features; therefore, there is less room for misinterpretation compared to existing conceptualizations because analysis features, presentation of data and labelling of the study do not determine the classification. Thus, the conceptualization ensures consistent study selection for systematic reviews.

The prerequisite of an analytical comparison in the publication can lead to the unjustified exclusion of relevant studies from a systematic review. Study 1 would likely be included, and Study 2 would be excluded from the systematic review. The only differences between Study 1 and Study 2 are the analysis and preparation of data. If the data source (e.g., chart review) is the same and the reanalysis (calculation of effect measures and statistical tests) to compare the intervention and control group in Study 2 is performed with exactly the same approach as the existing analysis in Study 1, there can be no difference in the effect estimates between studies, and the studies are at the same risk of bias. Thus, the inclusion of Study 1 and the exclusion of Study 2 contradict the requirement that systematic reviews identify all available evidence [ 31 ].

Considering that more studies would be eligible for inclusion and that the hierarchical paradigm of the levels of evidence is not valid per se, the proposed conceptualization can potentially enrich bodies of evidence and increase confidence in effect estimates.

Influence on workload

The additional inclusion of all studies that enable calculating relative risk for the comparison of interest might impact the workload of systematic reviews. There might be a considerable number of studies that do not perform a comparison themselves but provide sufficient data for reanalysis. Usually the electronic search strategy for systematic reviews of non-randomized studies is not limited to certain study types because there are no sensitive search filters available yet [ 32 ]. Therefore, the search results usually already include the cohort studies discussed above. However, in many abstracts it would not be directly clear whether sufficient data for recalculation are reported in the full-text article (e.g., a table like Table 3 ). Consequently, many additional potentially relevant full-text studies have to be screened. Additionally, studies often assess various exposures (e.g., different baseline characteristics), and it might thus be difficult to identify relevant exposures. Considering the large number of wrongly labelled studies, this approach can lead to additional screening effort [ 22 ].

As a result, more studies would be included in systematic reviews. All articles that provide potentially relevant data would have to be assessed in detail to decide whether reanalysis is feasible. For these studies, data extraction and a risk of bias assessment would have to be performed. Challenges in the risk of bias assessment would arise because most assessment tools are constructed to assess a predefined control group [ 33 ]. For example, items regarding the adequacy of analysis (e.g., adjustment for confounders) can no longer be assessed. Effect measures must be calculated (e.g., risks by group and relative risk with a 95% confidence limit), and further analyses (e.g., adjustments for confounders) might eventually be necessary for studies that provide sufficient data. Moreover, advanced biometrical expertise would be necessary to judge the feasibility of a reanalysis (i.e., determining whether relative risks can be calculated and whether there are sufficient data to adjust for confounders) and to conduct it.

Promising areas of application

In the medical literature, mislabelled cohort studies are more likely to be retrospective (comparison planned after data collection) and based on routinely collected data (e.g., chart review, review of radiology databases) than prospectively planned (i.e., comparison planned before data collection). Thus, it can be assumed that the wrongly labelled studies tend to have lower methodological quality than studies that already include a comparison. This aspect should be considered in decisions about including studies that must be reanalysed. In research areas in which randomized controlled trials or large, prospectively planned and well-conducted cohort studies can be expected (e.g., risk factors for widespread diseases), the approach is less promising for enriching the body of evidence. Consequently, in these areas, the additional effort might not be worthwhile.

In contrast, the conceptualization is particularly promising in research areas in which evidence is sparse because studies are difficult to conduct, populations are small or event rates are low. These areas include rare diseases, adverse events/complications, sensitive groups (e.g., children or individuals with cognitive deficiencies) and rarely used interventions (e.g., costly innovations). In these areas, there might be no well-conducted studies at all [ 34 , 35 ]. Therefore, the conceptualization proposed in this report has great potential to increase confidence in effect estimates.

We proposed a conceptualization for cohort studies with multiple exposures that ensures a clear distinction from case series. In this conceptualization, all studies that contain sufficient data to conduct a reanalysis and not only studies with a pre-existing analytic comparison are classified as cohort studies and are considered appropriate for inclusion in systematic reviews. To the best of our knowledge, no systematic reviews exist that reanalyse (mislabelled) case series to create cohort studies. The outlined approach is a method that can potentially enrich the body of evidence and subsequently enhance confidence in effect estimates and the strengths of conclusions. However, the enrichment of the body of evidence should be balanced against the additional workload.

Ijaz S, Verbeek JH, Mischke C, Ruotsalainen J. Inclusion of nonrandomized studies in Cochrane systematic reviews was found to be in need of improvement. J Clin Epidemiol. 2014;67(6):645–53.

von Elm E, Altman DG, Egger M, Pocock SJ, Gøtzsche PC, Vandenbroucke JP. Strengthening the reporting of observational studies in epidemiology (STROBE) statement: guidelines for reporting observational studies. BMJ. 2007;335(7624):806–8.

Reeves BC, Deeks JJ, Higgins JP. 13 including non-randomized studies. Cochrane Handbook Syst Rev Interventions. 2008;1:391.

Hartling L, Bond K, Santaguida PL, Viswanathan M, Dryden DM. Testing a tool for the classification of study designs in systematic reviews of interventions and exposures showed moderate reliability and low accuracy. J Clin Epidemiol. 2011;64(8):861–71.

EPOC-specific resources for review authors: What study designs should be included in an EPOC review and what should they be called? [ http://epoc.cochrane.org/resources/epoc-resources-review-authors ]. Accessed 12 July 2017.

Cheng WW, Ko CH, Chan AK. Paediatric stroke: case series. Hong Kong Med J. 2002;8(3):216–20.

Hernot S, Wadhera R, Kaintura M, Bhukar S, Pillai DS, Sehrawat U, George JS. Tracheocutaneous fistula closure: comparison of rhomboid flap repair with Z Plasty repair in a case series of 40 patients. Aesthet Plast Surg. 2016.

Stacchiotti S, Provenzano S, Dagrada G, Negri T, Brich S, Basso U, Brunello A, Grosso F, Galli L, Palassini E, et al. Sirolimus in advanced Epithelioid Hemangioendothelioma: a retrospective case-series analysis from the Italian rare cancer network database. Ann Surg Oncol. 2016;23(9):2735–44.

Sofiah S, Fung LYC. Placenta accreta: clinical risk factors, accuracy of antenatal diagnosis and effect on pregnancy outcome. Med J Malays. 2009;64(4):298–302.

Wong HS, Hutton J, Zuccollo J, Tait J, Pringle KC. The maternal outcome in placenta accreta: the significance of antenatal diagnosis and non-separation of placenta at delivery. N Z Med J. 2008;121(1277):30–8.

Mayorandan S, Meyer U, Gokcay G, Segarra NG, de Baulny HO, van Spronsen F, Zeman J, de Laet C, Spiekerkoetter U, Thimm E, et al. Cross-sectional study of 168 patients with hepatorenal tyrosinaemia and implications for clinical practice. Orphanet J Rare Dis. 2014;9(1):107.

Bartlett DC, Lloyd C, McKiernan PJ, Newsome PN. Early nitisinone treatment reduces the need for liver transplantation in children with tyrosinaemia type 1 and improves post-transplant renal function. J Inherit Metab Dis. 2014;37(5):745–52.

El-Karaksy H, Fahmy M, El-Raziky M, El-Koofy N, El-Sayed R, Rashed MS, El-Kiki H, El-Hennawy A, Mohsen N. Hereditary tyrosinemia type 1 from a single center in Egypt: clinical study of 22 cases. World J Pediatr. 2011;7(3):224–31.

Zeybek AC, Kiykim E, Soyucen E, Cansever S, Altay S, Zubarioglu T, Erkan T, Aydin A. Hereditary tyrosinemia type 1 in Turkey: twenty year single-center experience. Pediatr Int. 2015;57(2):281–9.

Helmy N, Akl Y, Kaddah S, El Hafiz HA, El Makhzangy H. A case series: Egyptian experience in using chemical pleurodesis as an alternative management in refractory hepatic hydrothorax. Arch Med Sci. 2010;6(3):336–42.

Niesen AD, Sprung J, Prakash YS, Watson JC, Weingarten TN. Case series: anesthetic management of patients with spinal and bulbar muscular atrophy (Kennedy's disease). Can J Anaesth. 2009;56(2):136–41.

de Mauroy JC, Journe A, Gagaliano F, Lecante C, Barral F, Pourret S. The new Lyon ARTbrace versus the historical Lyon brace: a prospective case series of 148 consecutive scoliosis with short time results after 1 year compared with a historical retrospective case series of 100 consecutive scoliosis; SOSORT award 2015 winner. Scoliosis. 2015;10:26.

Forner D, Phillips T, Rigby M, Hart R, Taylor M, Trites J. Submental island flap reconstruction reduces cost in oral cancer reconstruction compared to radial forearm free flap reconstruction: a case series and cost analysis. J Otolaryngol Head Neck Surg. 2016;45:11.

Kuhnt D, Bauer MHA, Sommer J, Merhof D, Nimsky C. Optic radiation fiber Tractography in Glioma patients based on high angular resolution diffusion imaging with compressed sensing compared with diffusion tensor imaging - initial experience. PLoS One. 2013;8(7):e70973.

Naesens R, Vlieghe E, Verbrugghe W, Jorens P, Ieven M. A retrospective observational study on the efficacy of colistin by inhalation as compared to parenteral administration for the treatment of nosocomial pneumonia associated with multidrug-resistant Pseudomonas Aeruginosa. BMC Infect Dis. 2011;11:317.


Acknowledgements

There was no external funding for the research or publication of this article.

Availability of data and materials

Not applicable.

Author information

Authors and affiliations.

Institute for Research in Operative Medicine, Chair of Surgical Research, Faculty of Health, School of Medicine, Witten/Herdecke University, Ostmerheimer Str. 200, 51109, Cologne, Germany

Tim Mathes & Dawid Pieper


Contributions

All authors have made substantial contributions to the work. Both authors read and approved the final manuscript.

Corresponding author

Correspondence to Tim Mathes .

Ethics declarations

Ethics approval and consent to participate.

Not applicable. No human data involved.

Consent for publication

Not applicable. The manuscript contains no individual person’s data.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License ( http://creativecommons.org/licenses/by/4.0/ ), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated.


About this article

Cite this article.

Mathes, T., Pieper, D. Clarifying the distinction between case series and cohort studies in systematic reviews of comparative studies: potential impact on body of evidence and workload. BMC Med Res Methodol 17 , 107 (2017). https://doi.org/10.1186/s12874-017-0391-8


Received : 17 January 2017

Accepted : 10 July 2017

Published : 17 July 2017

DOI : https://doi.org/10.1186/s12874-017-0391-8


  • Defined Study Cohort
  • Systematic Review
  • Extract Additional Data
  • Bias Assessment
  • Placental Invasion

BMC Medical Research Methodology

ISSN: 1471-2288


medRxiv

Misleading and avoidable: design-induced biases in observational studies evaluating cancer screening—the example of site-specific effectiveness of screening colonoscopy

Malte Braitmaier, Sarina Schwarz, Vanessa Didelez, Ulrike Haug

Objective Observational studies evaluating the effectiveness of cancer screening are often biased due to an inadequate design where I) the assessment of eligibility, II) the assignment to screening vs. no screening and III) the start of follow-up are not aligned at time zero (baseline). Such flaws can entail misleading results but are avoidable by designing the study following the principle of target trial emulation (TTE). We aimed to illustrate this by addressing the research question whether screening colonoscopy is more effective in the distal vs. the proximal colon.

Methods Based on a large German health care database (20% population coverage), we assessed the effect of screening colonoscopy in preventing distal and proximal CRC over 12 years of follow-up in 55–69-year-old persons at average CRC risk. We applied four different study designs and compared the results: cohort study with / without alignment at time zero, case control study with / without alignment at time zero.

Results In both analyses with alignment at time zero, screening colonoscopy showed a similar effectiveness in reducing the incidence of distal and proximal CRC (cohort analysis: 32% (95% CI: 27% - 37%) vs. 28% (95% CI: 20% - 35%); case-control analysis: 27% vs. 33%). Both analyses without alignment at time zero suggested a difference in site-specific performance: Incidence reduction regarding distal and proximal CRC, respectively, was 65% (95% CI: 61% - 68%) vs. 37% (95% CI: 31% - 43%) in the cohort analysis and 77% (95% CI: 67% - 84%) vs. 46% (95% CI: 25% - 61%) in the case-control analysis.

Conclusions Our study demonstrates that violations of basic design principles can substantially bias the results of observational studies on cancer screening. In our example, it falsely suggested a much stronger preventive effect of colonoscopy in the distal vs. the proximal colon. The difference disappeared when the same data were analyzed using a TTE approach, which is known to avoid such design-induced biases.

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

BIPS intramural funding

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

In Germany, the utilisation of health insurance data for scientific research is regulated by the Code of Social Law. All involved health insurance providers as well as the German Federal Office for Social Security and the Senator for Health, Women and Consumer Protection in Bremen as their responsible authorities approved the use of GePaRD data for this study. Informed consent for studies based on claims data is required by law unless obtaining consent appears unacceptable and would bias results, which was the case in this study. According to the Ethics Committee of the University of Bremen studies based on GePaRD are exempt from institutional review board review.

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

The following points were changed in this revision: 1) Figure 1 showed results belonging to a sensitivity analysis instead of the main analysis. This was corrected with this revision. 2) An acknowledgement statement was included.


Subject Area

  • Epidemiology
  • Open access
  • Published: 28 May 2024

The interrelation between microbial immunoglobulin coating, vaginal microbiota, ethnicity, and preterm birth

  • H. J. Schuster 1 , 2 , 3 , 4 ,
  • A. C. Breedveld 2 , 5 ,
  • S. P. F. Matamoros 1 , 2 ,
  • R. van Eekelen 6 ,
  • R. C. Painter 4 , 7 ,
  • M. Kok 3 , 4 ,
  • P. J. Hajenius 3 , 4 ,
  • P. H. M. Savelkoul 1 , 2 , 8 ,
  • M. van Egmond 2 , 5 , 9 &
  • R. van Houdt 1 , 2  

Microbiome volume 12, Article number: 99 (2024)


Vaginal microbiota composition is associated with spontaneous preterm birth (sPTB), depending on ethnicity. Host-microbiota interactions are thought to play an important underlying role in this association between ethnicity, vaginal microbiota and sPTB.

In a prospective cohort of nulliparous pregnant women, we assessed vaginal microbiota composition, vaginal immunoglobulins (Igs), and local inflammatory markers. We performed a nested case–control study with 19 sPTB cases, matched based on ethnicity and midwifery practice to 19 term controls.

Of the 294 included participants, 23 pregnancies ended in sPTB. We demonstrated that Lactobacillus iners -dominated microbiota, diverse microbiota, and ethnicity were all independently associated with sPTB. Microbial Ig coating was associated with both microbiota composition and ethnicity, but a direct association with sPTB was lacking. Microbial IgA and IgG coating were lowest in diverse microbiota, especially in women of any ethnic minority. When correcting for microbiota composition, increased microbial Ig coating correlated with increased inflammation.

In these nulliparous pregnant women, vaginal microbiota composition is strongly associated with sPTB. Our results support the view that vaginal mucosal Igs might play a pivotal role in microbiota composition, microbiota-related inflammation, and vaginal community disparity within and between ethnicities. This study provides insight into host-microbe interaction, suggesting that vaginal mucosal Igs play an immunomodulatory role similar to that in the intestinal tract.


An estimated 15 million babies are born preterm each year worldwide [ 1 ]. Preterm birth (PTB) is defined as birth before 37 completed weeks of gestation and is a major cause for perinatal mortality and neonatal morbidity [ 2 ]. Currently, the prevalence ranges from 5 to 18% across various countries [ 3 ]. PTB is usually specified based on its onset, which is either a spontaneous onset of labor (sPTB) or induction of labor or primary caesarean section for maternal or fetal indications [ 2 , 4 ]. The etiology of sPTB is multifactorial and remains poorly understood. The most important risk factor is a previous sPTB, but most sPTB occur in nulliparous women who lack obstetric history [ 5 ]. Several other risk factors have been identified including maternal characteristics such as ethnicity, socio-economic status (SES), body mass index, and maternal smoking, as well as characteristics of the pregnancy like fetal sex, a short mid-trimester cervical length, and intra-amniotic infection [ 6 , 7 , 8 , 9 ].

Vaginal microbiota play an important role during pregnancy, as their composition and dynamics are hypothesized to be associated with sPTB [ 10 , 11 , 12 , 13 ]. Vaginal microbiota are either of low diversity, mainly dominated by a single Lactobacillus species, or consist of a diverse range of (facultative) anaerobic bacteria [ 14 ]. Higher diversity of the vaginal microbiota is related to bacterial vaginosis, a disease characterized by increased vaginal discharge and increased susceptibility to invading pathogens [ 15 , 16 ]. Infection and accompanying inflammatory responses are important risk factors for preterm labor or preterm prelabor rupture of membranes, and about 40% of sPTB is associated with infection [ 17 , 18 , 19 ].

Ethnicity is significantly associated with the vaginal microbiota composition. While both low and high diversity vaginal microbiota are found in women of all ethnicities, Lactobacillus crispatus -dominant vaginal microbiota is present more often in White European women while diverse vaginal microbiota is present more often in women with a sub-Saharan African descent [ 20 ]. The etiology of this association is thus far unknown.

Immunoglobulins (Igs) in the mucosal tissue of the female genital tract are key mediators of mucosal immunology and are important in the defense against infections in the reproductive tract [ 21 ]. IgA, the predominant antibody in the intestinal tract, can influence gut microbiota composition [ 22 , 23 , 24 ]. Deviations in Ig coating of intestinal bacteria have been associated with inflammatory bowel diseases [ 25 , 26 ]. In the vaginal tract, there is more IgG than IgA, in contrast to other mucosal surfaces [ 27 ]. The etiology of the differences in abundance and type of Igs between mucosal sites is not well understood. In a previous study, our group demonstrated increased microbial IgA coating of L. crispatus- dominant vaginal microbiota [ 28 ]. Ig coating of vaginal microbiota might play a role in the disparity in microbial community composition between ethnicities and might be associated with gynecological and obstetric diseases.

In this prospective cohort study, we collected vaginal swabs of nulliparous healthy pregnant women at antenatal booking in the first trimester to investigate vaginal microbiota composition and microbial immunoglobulin coating and studied the associations with ethnicity and sPTB. Furthermore, we performed a nested case–control study matching participants with sPTB to participants with uncomplicated term birth. For this subset, we measured unbound Igs and a broad set of inflammatory cytokines and chemokines in vaginal fluid.

Study design and participants

For this study, we used data and vaginal swab material from women included in the PROPELLOR cohort. The study protocol and methods are described in a previous publication [ 29 ]. In short, the study included nulliparous women ≥ 18 years who received antenatal care at participating midwifery practices in the Netherlands before 24 weeks of gestation and had a low-risk singleton pregnancy at their first visit. For this study, all participants for whom a vaginal swab and pregnancy outcome were available (no loss to follow-up) were included. Nulliparity was defined as never having had a pregnancy progress beyond 16 weeks of gestation [ 30 ]. At the first prenatal visit, usually between 8 and 12 weeks of pregnancy, women were approached by their midwife to participate in this study. A self-administered vaginal swab (eSwab, Copan Diagnostics Inc., Murrieta, USA) was collected at inclusion. Swabs in their original medium were stored at − 20 °C and shipped at − 20 °C to the central storage facility, where they were kept at − 80 °C; storage duration was 4–6 years. Written informed consent was obtained from each participant. The study received ethical clearance through the institutional review board of the Academic Medical Center in Amsterdam, the Netherlands (registration number NL43414.018.13).

We analyzed the microbiota composition and microbial bound Igs for all participants. Due to limited funding, we did a nested case control selection of subjects to assess the role of additional inflammatory markers. The additional inflammatory markers included unbound immunoglobulins, cytokines, chemokines, and anti-microbial proteins. Because of financial constraints and missing meta-data, 19 out of 23 women with sPTB were matched based on their ethnicity and midwifery practice to 19 women with term birth. If there was not an appropriate control available within the same midwifery practice, a control was chosen from a midwifery practice in a similar socio-economic region.

Definitions

The primary outcome measure was sPTB, defined as spontaneous onset of labor or spontaneous preterm prelabor rupture of membranes. We differentiated between sPTB between 23 and 37 weeks of gestation and late second trimester loss between 16 and 22 weeks of gestation. Ethnicity was based on participant self-identification. SES was based on status scores from the Netherlands Institute for Social Research [ 31 ].

Vaginal microbiota and bioinformatics analysis

We pre-treated the vaginal swabs with lysozyme, mutanolysin (Sigma Aldrich, St. Louis, USA), lysostaphin (AMBI, New York, USA), proteinase K, and RNAse A (Thermo Fisher, Waltham, USA). DNA was extracted using the NucliSENS EasyMAG platform, according to manufacturer protocol (BioMérieux, Marcy l’Etoile, France). We used dual indexed universal primers (319F and 806R) for PCR amplification of the V3–V4 regions of the 16S rRNA genes, as described by Fadrosh et al. [ 32 ]. PCR products were normalized and pooled. We purified the samples with Agencourt AMPure XP magnetic beads (BeckmanCoulter, Fullerton, USA). Paired-end sequencing was performed on the Illumina MiSeq platform, according to manufacturer protocol (Illumina, San Diego, USA).

We de-multiplexed raw sequences and removed adapters, barcodes, and heterogeneity spacers using Cutadapt 3.5 [ 33 ]. Processing of de-multiplexed sequence data and taxonomic classification was performed using the software QIIME 2 version 2021.8 [ 34 ]. Forward and reverse reads were trimmed to 260 and 210 basepairs respectively based on a visible drop in average quality beyond these points (visualization performed with https://view.qiime2.org/ ). We included two unsampled swabs as negative controls. We deemed samples with a read count < 200 reads too similar to the controls and excluded these from analysis. Amplicon sequence variants (ASVs) were generated with DADA2 [ 35 ]. We used a pre-trained Naive Bayes classifier of the Silva v138 reference database to assign genus and species (if possible) to each ASV [ 36 ]. Because the Silva v138 reference database does not include Lactobacillus crispatus , one of the most abundant vaginal species, all Lactobacillus sequences without species assignment were further refined using the Nucleotide BLAST (BLASTn) function on the National Center for Biotechnology Information (NCBI) website. We manually identified sequences belonging to Candidatus Lachnocurva vaginae (formerly Bacterial Vaginosis Associated Bacterium (BVAB) 1), BVAB2 and TM7-H1. The data were not normalized or rarefied. The code for bioinformatics analyses is available in Supplementary material. Raw sequence reads were dehumanized for publication reasons. We grouped samples based on microbiota profile using the VALENCIA centroid classification tool [ 37 ]. This tool divides vaginal microbiota into community state types (CSTs) based on their taxonomic composition, by calculating their similarity to a set of reference centroids. It identifies seven CSTs, four dominated by lactobacilli (CST I by L. crispatus , CST II by L. gasseri , CST III by L. iners , CST V by L. jensenii ), and three depleted of lactobacilli (CST IV-A with majority Candidatus Lachnocurva vaginae and Gardnerella vaginalis , CST IV-B with majority G. vaginalis and Atopobium vaginae , CST IV-C with low abundance of G. vaginalis and Candidatus Lachnocurva vaginae). For statistical power, we reduced these to four groups, combining CST II and CST V into one group and combining CST IV-A, CST IV-B, and CST IV-C into one group.
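The collapsing of seven CSTs into four analysis groups can be sketched as a simple lookup. This is only an illustration of the grouping rule described above, not the authors' code; the function name `collapse_cst` and the accepted label spellings (for example "CST IV-B" or "V") are assumptions made for this example.

```python
# Hypothetical helper (not from the study): map a CST label to one of the
# four analysis groups described in the text.
GROUP_BY_ROMAN = {
    "I": "CST I",        # L. crispatus-dominated
    "II": "CST II/V",    # L. gasseri-dominated, merged with CST V
    "V": "CST II/V",     # L. jensenii-dominated, merged with CST II
    "III": "CST III",    # L. iners-dominated
    "IV": "CST IV",      # diverse microbiota (IV-A, IV-B, IV-C merged)
}


def collapse_cst(label):
    """Return the four-group label for a CST such as 'CST IV-B' or 'v'."""
    key = label.strip().upper()
    if key.startswith("CST"):
        key = key[3:].strip()
    roman = key.split("-")[0]  # drop a sub-type suffix such as '-A' or '-C2'
    if roman not in GROUP_BY_ROMAN:
        raise ValueError("unknown CST label: %r" % label)
    return GROUP_BY_ROMAN[roman]
```

For example, `collapse_cst("CST IV-A")` and `collapse_cst("IV-C")` both fall into the diverse group, while `collapse_cst("II")` and `collapse_cst("V")` share one group.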

Microbial immunoglobulin coating and inflammatory markers

We determined microbial Ig coating using flow cytometry, as described in a previous publication [ 28 ]. We calculated the coating index by multiplying the percentage of bacteria with bound immunoglobulin by the median fluorescence intensity (MFI). We determined total IgA, IgA1, IgA2, secretory IgA (SIgA), and IgG levels in vaginal swabs by enzyme-linked immunosorbent assay (ELISA), as described in a previous publication [ 28 ]. We measured vaginal cytokines, chemokines, and anti-microbial peptides with a Multiplex assay using a Bio-Plex 200 according to the manufacturer’s instructions, and human beta defensin-2 (HBD-2) in a separate assay according to the manufacturer’s protocol with minor adaptations. We measured total vaginal protein concentration to correct for inter-participant variation according to the manufacturer’s protocol. Additional methods and manufacturers can be found in Supplementary methods .
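As a minimal sketch of the coating index calculation, the product can be written as below. The function name and the convention that the percentage is given on a 0 to 100 scale are assumptions for illustration; the text does not state the scale or units the authors used.

```python
def coating_index(percent_coated, median_fluorescence_intensity):
    """Coating index = percentage of Ig-bound bacteria x MFI.

    Assumption: percent_coated is on a 0-100 scale; the source does not
    specify whether a percentage or a fraction was used.
    """
    if not 0 <= percent_coated <= 100:
        raise ValueError("percent_coated must be between 0 and 100")
    if median_fluorescence_intensity < 0:
        raise ValueError("MFI cannot be negative")
    return percent_coated * median_fluorescence_intensity
```

With these made-up inputs, a sample in which 40% of bacteria carry bound IgA at an MFI of 250 would get an index of 40 × 250 = 10,000. Because the index is a product, a sample can score low either because few bacteria are coated or because the coated bacteria carry little immunoglobulin.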

Statistical analysis

If necessary, data were log-transformed to derive normal distributions. Missing data were handled using multiple imputation, creating 10 imputation datasets, except for the main determinants, vaginal microbiota profile and immunoglobulin coating. All variables, including microbiota composition, microbial bound Igs, and midwifery practice, were used for imputation. Numerical results are based on pooled estimates over 10 imputation sets using Rubin’s rules [ 38 ].
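Rubin's rules combine the per-dataset estimates into one pooled estimate and a total variance that reflects both within-imputation and between-imputation uncertainty. The sketch below shows the standard formulas in plain Python; it is illustrative only (the authors used the R packages mice and miceadds), and the function name and example numbers are hypothetical.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m imputed-data estimates (e.g. log odds ratios) with Rubin's rules.

    estimates: point estimate from each imputed dataset
    variances: squared standard error from each imputed dataset
    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two estimates and matching variances")
    q_bar = mean(estimates)               # pooled point estimate
    within = mean(variances)              # average within-imputation variance
    between = variance(estimates)         # between-imputation variance (m - 1)
    total = within + (1 + 1 / m) * between
    return q_bar, math.sqrt(total)
```

For odds ratios, pooling would normally be done on the log scale and the result exponentiated afterwards, for example `math.exp(pool_rubin(log_ors, ses_squared)[0])` with hypothetical inputs.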

For analyses on ethnicity, the variable was dichotomised into White European and non-White European. SES was divided into low and middle/high. We performed Firth-corrected logistic regression analysis for sPTB, calculating ORs and 95% CIs. We defined a priori which associations to estimate and which confounders to use and, due to the limited number of events, accounted for the single most important confounder in multivariable logistic regression. Even so, overfitting (overestimation of associations due to the small number of events) could be an issue. To reduce overfitting because of the low incidence of sPTB, we applied Firth’s correction for all ORs. Firth’s correction uses penalized likelihood, which aims to shrink estimated associations that are overly optimistic [ 39 ]. For the association between individual taxa and sPTB, the linear discriminant analysis effect size (LEfSe) algorithm was used [ 40 ]. For this analysis, only bacterial taxa with > 1% of total read count were included. This algorithm calculates the median relative abundance of all taxa and compares this between participants with and without sPTB. It uses the factorial Kruskal–Wallis rank-sum test to detect differential abundances of bacterial taxa between these groups. The estimated effect size of the differentially abundant taxa was calculated using linear discriminant analysis, with a minimum threshold of 2.0. For the remaining analyses of the entire cohort, we used Student’s t -test and ANOVA with post hoc Bonferroni correction for continuous variables and the Chi-square test for categorical variables. For the nested case–control study, we calculated standardized β-coefficients by linear regression with adjustment for microbiota composition and post hoc Bonferroni correction.
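To make the odds ratio calculation concrete, the sketch below computes an unadjusted OR with a Wald 95% CI from a 2×2 table. It is a simplified stand-in, not the Firth penalized-likelihood model used in the study: when any cell is zero it applies the simpler Haldane-Anscombe correction (adding 0.5 to every cell) so that the estimate stays finite. The function name and all counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald CI from a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    If any cell is zero, 0.5 is added to every cell (Haldane-Anscombe),
    a simple continuity correction, NOT Firth's penalized likelihood.
    """
    if min(a, b, c, d) < 0:
        raise ValueError("cell counts must be non-negative")
    if 0 in (a, b, c, d):
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return (math.exp(log_or),
            math.exp(log_or - z * se),
            math.exp(log_or + z * se))
```

With hypothetical counts, `odds_ratio_ci(10, 20, 5, 40)` gives an OR of 4.0. A table with no events in one group, such as `odds_ratio_ci(0, 19, 4, 135)`, still returns a finite but very imprecise estimate, which illustrates why an OR for a group without events is hard to interpret.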

Statistical analyses were performed using IBM SPSS statistics (version 28) and R version 3.3.2 (R Core Team (2016)) with the mice , miceadds , and logistf packages. A p -value of < 0.05 was considered statistically significant. Data were visualized using GraphPad Prism (version 9).

Role of the funding source

The study sponsors had no role in the study design, collection, analysis, and interpretation of the data, the writing of the report, and decision to submit this paper for publication.

Study population

A total of 294 participants were included in this study. Key demographics are shown in Table  1 . Of the participants, 189 (73.8%) self-identified as White European and 92 (31.3%) had a low SES. sPTB occurred in 23 (7.8%) participants: in 18 (6.1%) between 23 and 37 weeks of gestation and in five (1.7%) between 16 and 22 weeks of gestation. A miscarriage < 16 weeks of gestation occurred in one (0.3%) participant, and in eight (2.7%) participants birth was induced < 37 weeks of gestation for maternal or fetal indications. Key demographics of participants included in the nested case–control study are shown in Table S 1 .

Of the 294 samples, three samples were excluded because of a read count below the threshold. The remaining samples had an average of 28,195 reads per sample. 16S rDNA sequence analysis identified community state types (CSTs), grouped into four clusters: dominated by L. crispatus (CST I, n  = 139), dominated by L. gasseri or L. jensenii (CST II/V, n  = 19), dominated by L. iners (CST III, n  = 70), and diverse microbiota (CST IV, n  = 63) (Fig.  1 A and Table S 2 ). L. iners- dominated (CST III) and diverse microbiota communities (CST IV) were the most common vaginal microbiota communities found in women experiencing sPTB, and logistic regression revealed an increased odds ratio (OR) for L. iners -dominated (CST III) and diverse microbiota (CST IV) compared to women with microbiota dominated by L. crispatus (CST I) (OR 5.2, 95% confidence interval (CI) 1.6–16.5 and OR 5.2, 95% CI 1.6–16.9, respectively, Fig.  1 B and Table  2 ). No sPTB occurred in participants with L. gasseri / L. jensenii- dominated microbiota (CST II/V), resulting in an inaccurate OR. With linear discriminant analysis effect size (LEfSe), the individual taxa L. iners , Finegoldia , and Prevotella amnii were associated with sPTB (Fig.  1 C).

Figure 1. Vaginal microbiota and their association with spontaneous preterm birth (sPTB). A Distribution of vaginal community state types (CSTs) of all participants. Numbers represent total participants per group. B Distribution of community state types of participants with and without sPTB. Numbers represent total participants per group. C The linear discriminant analysis (LDA) score, calculated with the linear discriminant analysis effect size (LEfSe) algorithm, of the association of individual taxa with sPTB. The bar represents the effect size of the taxa associated with sPTB

Bacteria bound immunoglobulins

We measured immunoglobulin coating levels using flow cytometry, calculated the coating index of IgA and IgG, and divided these into quartiles. IgA and IgG coating indices were not associated with sPTB (Table  2 ). However, immunoglobulin coating was associated with microbiota composition (IgA p  < 0.001, IgG p  < 0.001) (Fig.  2 ). IgA and IgG coating indices were statistically significantly lower in diverse microbiota (CST IV) compared to L. crispatus (CST I) and L. iners (CST III)-dominated microbiota (IgA L. crispatus /CST I p  < 0.001 and L. iners /CST III p  = 0.005, IgG L. crispatus /CST I p  < 0.001 and L. iners /CST III p  < 0.001). The two samples with the lowest IgA and IgG coating both had diverse microbiota composition (CST IV), as depicted in Fig.  2 . In one sample, G. vaginalis was the predominant species and the other was dominated by Enterobacteriaceae. The sample with G. vaginalis as predominant species also had a lower than average read count (323 reads). Analysis without this sample showed attenuated, but still statistically significant results for IgA coating index (diverse microbiota/CST IV compared to L. crispatus /CST I p  < 0.001 and L. iners /CST III p  = 0.013). For IgG coating, the additional analyses showed the same results as the analysis with all participants. We further investigated these results by dividing the samples into sub-CSTs (Figure S1, Table S 3 ). Within the group of diverse microbiota, CST IV-A with high to moderate levels of Candidatus Lachnocurva vaginae and G. vaginalis had IgA and IgG levels similar to Lactobacillus- dominated CSTs. CST IV-C2 dominated by Enterococcus spp. showed the lowest IgA and IgG coating. Statistical tests were not possible due to small groups.

Figure 2. Microbial immunoglobulin coating in different community state types. Bars represent mean with standard deviation. ** p  < 0.01, *** p  < 0.001

Self-identified ethnicity was associated with sPTB, with an increased risk of sPTB for non-White European women (OR 3.8, 95% CI 1.5–9.4, Table  2 ). When combining both ethnicity and microbiota composition in a regression model, having L. iners- dominated (CST III) or diverse vaginal microbiota (CST IV) and having a non-White European ethnicity remained statistically significantly associated with sPTB, with slightly attenuated adjusted ORs (aORs) (aOR 3.9, 95% CI 1.2–12.9, aOR 3.9, 95% CI 1.2–13.2; and aOR 2.6, 95% CI 1.0–6.5 respectively, Table  2 ). Several baseline and pregnancy characteristics were associated with sPTB (Table  2 ). After adjusting for ethnicity and microbiota profile, urinary tract infection during pregnancy and vaginal blood loss in the 1st or 2nd trimester remained associated with sPTB (aOR 4.0, 95% CI 1.3–12.9, and aOR 3.2, 95% CI 1.2–8.7, respectively).

Ethnicity was also associated with microbiota composition and microbial IgA coating. White European women most often had L. crispatus- dominated (CST I) microbiota ( n  = 121, 57.1%) and non-White European women most often had L. iners- dominated (CST III) microbiota ( n  = 30, 38.0%) ( p  < 0.001, Fig.  3 A). Non-White European participants had lower IgA coating compared to White European women when diverse microbiota was present ( p  < 0.001, Fig.  3 B). Analysis without the sample with low read count showed the same results.

Figure 3. Associations with ethnicity. A Distribution of community state types (CSTs) in White European and non-White European participants. Numbers represent total participants per group. B Microbial IgA coating in White European and non-White European participants within various vaginal CSTs. Bars represent mean with standard deviation. *** p  < 0.001

Cytokines, chemokines, and anti-inflammatory peptides

In the nested case–control study, we determined the association between the measured unbound immunoglobulins, cytokines, chemokines, or peptides and sPTB, while adjusting for vaginal microbiota composition (Table S 4 ). None was statistically significantly associated with sPTB. We investigated whether microbial Ig coating and unbound Igs were associated with inflammation. Inflammatory cytokines and chemokines IL-1α, IL-1β, IL-2, IL-6, IL-8, CCL4, and CCL5 were positively associated with one or more of the microbial bound and unbound Igs (Table  3 ). Also, microbial bound and unbound Igs showed positive correlation (Table S 5 ).

Our study recapitulated well-known associations between ethnicity and vaginal microbiota composition, with non-White European women having more L. iners-dominated (CST III) and diverse microbiota (CST IV), and the association between diverse microbiota and sPTB [ 11 , 14 , 41 , 42 ]. The association between L. iners and sPTB has been described previously, but an association in the opposite direction, with term birth, has also been described [ 43 , 44 , 45 ]. Our study adds to the suspicion that L. iners is more foe than friend during pregnancy in nulliparous women [ 46 ]. Our study shows that diverse microbiota have decreased microbial IgA and IgG coating compared to Lactobacillus-dominated microbiota, and that non-White European women with diverse microbiota had lower microbial IgA coating than White European women with the same vaginal microbiota profile. With this triad of associations, we anticipated finding an association between decreased IgA coating and sPTB, as diverse microbiota profiles and non-White European participants were over-represented in sPTB cases. The absence of this association was therefore remarkable, and could be reconciled by our finding that lower microbial Ig coating was also associated with lower local inflammation (cytokines and chemokines). Taken together, our findings show that vaginal microbiota and the local immunomodulatory properties of immunoglobulins each play a part in the pathophysiology of preterm birth.

Several studies on sPTB and vaginal microbiota showed similar associations with diverse vaginal microbiota, L. iners, and Prevotella spp. [ 10 , 11 , 12 , 42 , 47 , 48 , 49 ]. Unique to our study is that the study population comprises only nulliparous women with a low risk for sPTB. Parity is associated with both microbiota composition and sPTB: vaginal L. iners and Gardnerella dominance is associated with previous birth, and the risk for recurrent sPTB is 30% [ 50 , 51 , 52 ]. Therefore, associations between vaginal microbiota and sPTB might differ between nulliparous and multiparous women. This is corroborated by the absence of such associations in two recent studies investigating only women at risk for recurrent sPTB [ 53 , 54 ]. What makes this especially interesting is that the association between vaginal microbiota composition and sPTB is present in the first trimester. Risk stratification for sPTB early in pregnancy is limited, especially for nulliparous women. Because obstetric history is a strong prognostic factor, prediction models have limited effect in nulliparous women, while this is the largest group at risk [ 5 , 55 ]. Several treatments can reduce the risk for sPTB, but these are mainly available to women with an increased risk based on obstetric history. Previous studies illustrated the additional value of vaginal microbiota markers for the prediction of sPTB [ 12 , 56 , 57 ]. Our study confirms this, in particular by identifying a very low risk for sPTB in White European women with L. crispatus-dominated (CST I) vaginal microbiota. While this requires further research, determining vaginal microbiota composition in pregnancy could help to provide treatment to nulliparous women with an increased risk for sPTB.

Another strength is that our cohort is large enough to include samples of various vaginal microbiota profiles. This allowed us to investigate the association between immunoglobulin coating and microbiota profile further than in our previous study [ 28 ]. Also, our study population is ethnically diverse, reflecting the population in large Dutch urban areas, which gave us the possibility to investigate ethnic disparities.

A limitation of our study is that despite the size, there were only few sPTB cases, limiting our statistical power. Another limitation is that we did not sample over time and thus longitudinal research was not possible. Also, we found no differences in pro-inflammatory mediators in women with sPTB compared to women delivered at term.

This contrasts with recent studies that showed increased levels of pro-inflammatory mediators in vaginal fluid of women with sPTB, including IL-1β, IL-2, IL-6, IL-8, eotaxin, CCL4, and CCL5 [ 12 , 53 , 58 , 59 ]. The most likely explanation for this discrepancy is the gestational age at collection of the vaginal fluid. In our study, material was collected in early pregnancy (8–12 weeks), while other studies only found statistical differences in samples collected later in pregnancy (> 20 weeks). In one study with multiple sampling moments, no differences were found in samples collected at a first time point between 12 and 16 weeks of gestation, while an increase in several pro-inflammatory mediators between the first and second time point (between 20 and 24 weeks of gestation) was associated with sPTB [ 53 ]. These results suggest that inflammatory markers increase during pregnancy and that clear deviations are not yet found early in pregnancy.

Our study revealed remarkable results concerning microbial bound and unbound Igs. A previous study from our group demonstrated higher IgA coating in L. crispatus-dominated vaginal microbiota in non-pregnant healthy women. Owing to the larger sample size of the current study, we were able to further elucidate the association between microbial Ig coating and vaginal microbiota composition. We demonstrated that both IgA and IgG coating are increased not only in L. crispatus-dominated but in all Lactobacillus-dominated microbiota. The previous longitudinal study focussed on the changes over time and showed higher microbial IgA and IgG coating during menses. As pregnancy is hormonally very different without regular vaginal bleeding, we were interested in the vaginal microbial immunoglobulin coating during pregnancy. Unfortunately, we could not study this longitudinally during pregnancy in the current study, but we did confirm that microbial IgA and IgG coating is also high during pregnancy. Also, the increased sample size made it possible to study differences in microbial Ig coating between women of different ethnicities.

The association between high microbial coating and high inflammation seems to contradict the association between diverse microbiota and low microbial coating. In previous studies, diverse Lactobacillus- depleted microbiota are associated with increased inflammation [ 12 , 53 ]. Therefore, one would expect to find an association between low microbial coating and high inflammation. However, these results are similar to earlier data from the gut, with IgA both associated with healthy, diversified microbiota, and with inflammatory diseases [ 25 , 60 , 61 , 62 , 63 ]. It has been suggested that low-affinity IgA contributes to healthy gut microbiota, while high-affinity IgA is involved in pathogen clearance [ 64 ]. Based on our results, we hypothesize that a similar regulation can take place in the female genital tract. The role of microbial IgG coating remains unclear, as IgG levels are very low in the intestinal tract, and research on vaginal microbial IgG coating especially in relation to Lactobacillus spp. is limited. A recent study demonstrated that unbound vaginal IgG levels were highest in diverse microbiota and increasing IgG levels during pregnancy were associated with sPTB [ 53 ]. It remains to be elucidated what the exact role of microbial bound and unbound Igs is in vaginal mucosa and genital tract related health outcomes.

Vaginal microbiota and related sPTB risk are associated with ethnicity [ 10 , 11 ]. Our results imply that ethnicity is also associated with immunoglobulin levels and microbial immunoglobulin binding in the vaginal mucosa. In serum, differences in immunoglobulin levels between ethnic groups were identified in studies performed several decades ago. Increased levels of IgG and IgA have been found in Black compared to White populations [ 65 , 66 , 67 ]. Also, vaginal cytokine levels in White and Black women have been reported to differ and to be differentially influenced by vaginal microbiota composition [ 12 , 68 , 69 ]. The mechanisms underlying different immunoglobulin levels and their relation to vaginal microbiota in ethnically diverse women have been understudied and remain open for further investigation, both in the circulation and at mucosal surfaces.

Conclusions

In conclusion, while microbial immunoglobulin coating is associated with vaginal microbiota composition and ethnicity, it is not associated with sPTB. We did find a strong association between L. iners- dominated and diverse vaginal microbiota and sPTB in nulliparous women. In addition, we further explored the association between microbial Ig coating and vaginal microbiota composition, showing that diverse vaginal microbiota have lower IgA and IgG coating than Lactobacillus- dominated microbiota. Further research should investigate whether microbial immunoglobulin coating plays a role in maintaining a Lactobacillus- dominated microbiota profile and whether it is involved in the ethnic disparities of vaginal microbiota composition.

Availability of data and materials

Due to enhanced privacy legislation regarding the presence of human DNA sequences in publicly available datasets, we cannot make the raw sequencing data used in the analyses in this study publicly available. For publication purposes, human DNA reads were removed from the sequence files using the software HoCoRT and the human genome assembly GRCh38.p14 as reference [ 70 ]. Of note, the amount of human DNA detected in each file was lower than 1% and should not affect the outcome of any subsequent analysis. The cleaned sequencing data are available under study accession number PRJEB71956, sample accession numbers ERS17760025 to ERS17760318. The read count of individual ASVs per sample is available in Table S2. The data that support the findings of this study, including the raw sequencing data, are available from the corresponding author on reasonable request. Extensive data and material availability is described in a previous publication [ 29 ].

Howson CP, Kinney MV, McDougall L, Lawn JE, Born Too Soon Preterm Birth Action G. Born too soon: preterm birth matters. Reprod Health. 2013;10 Suppl 1:S1.

Goldenberg RL, Culhane JF, Iams JD, Romero R. Epidemiology and causes of preterm birth. Lancet. 2008;371(9606):75–84.

Blencowe H, Cousens S, Oestergaard MZ, Chou D, Moller AB, Narwal R, et al. National, regional, and worldwide estimates of preterm birth rates in the year 2010 with time trends since 1990 for selected countries: a systematic analysis and implications. Lancet. 2012;379(9832):2162–72.

Stout MJ, Busam R, Macones GA, Tuuli MG. Spontaneous and indicated preterm birth subtypes: interobserver agreement and accuracy of classification. Am J Obstet Gynecol. 2014;211(5):530 e1-4.

Schaaf JM, Ravelli AC, Mol BW, Abu-Hanna A. Development of a prognostic model for predicting spontaneous singleton preterm birth. Eur J Obstet Gynecol Reprod Biol. 2012;164(2):150–5.

Iams JD, Goldenberg RL, Meis PJ, Mercer BM, Moawad A, Das A, et al. The length of the cervix and the risk of spontaneous premature delivery. National Institute of Child Health and Human Development Maternal Fetal Medicine Unit Network. N Engl J Med. 1996;334(9):567–72.

Liu P, Xu L, Wang Y, Zhang Y, Du Y, Sun Y, Wang Z. Association between perinatal outcomes and maternal pre-pregnancy body mass index. Obes Rev. 2016;17(11):1091–102.

Soneji S, Beltran-Sanchez H. Association of maternal cigarette smoking and smoking cessation with preterm birth. JAMA Netw Open. 2019;2(4):e192514.

Peelen M, Kazemier BM, Ravelli ACJ, de Groot CJM, van der Post JAM, Mol BWJ, et al. Ethnic differences in the impact of male fetal gender on the risk of spontaneous preterm birth. J Perinatol. 2021;41(9):2165–72.

Callahan BJ, DiGiulio DB, Goltsman DSA, Sun CL, Costello EK, Jeganathan P, et al. Replication and refinement of a vaginal microbial signature of preterm birth in two racially distinct cohorts of US women. Proc Natl Acad Sci U S A. 2017;114(37):9966–71.

Elovitz MA, Gajer P, Riis V, Brown AG, Humphrys MS, Holm JB, Ravel J. Cervicovaginal microbiota and local immune response modulate the risk of spontaneous preterm delivery. Nat Commun. 2019;10(1):1305.

Fettweis JM, Serrano MG, Brooks JP, Edwards DJ, Girerd PH, Parikh HI, et al. The vaginal microbiome and preterm birth. Nat Med. 2019;25(6):1012–21.

Peelen MJ, Luef BM, Lamont RF, de Milliano I, Jensen JS, Limpens J, et al. The influence of the vaginal microbiota on preterm birth: a systematic review and recommendations for a minimum dataset for future research. Placenta. 2019;79:30–9.

Ravel J, Gajer P, Abdo Z, Schneider GM, Koenig SS, McCulle SL, et al. Vaginal microbiome of reproductive-age women. Proc Natl Acad Sci U S A. 2011;108(Suppl 1):4680–7.

Bayigga L, Kateete DP, Anderson DJ, Sekikubo M, Nakanjako D. Diversity of vaginal microbiota in sub-Saharan Africa and its effects on HIV transmission and prevention. Am J Obstet Gynecol. 2019;220(2):155–66.

Wiesenfeld HC, Hillier SL, Krohn MA, Landers DV, Sweet RL. Bacterial vaginosis is a strong predictor of Neisseria gonorrhoeae and Chlamydia trachomatis infection. Clin Infect Dis. 2003;36(5):663–8.

Agrawal V, Hirsch E. Intrauterine infection and preterm labor. Semin Fetal Neonatal Med. 2012;17(1):12–9.

Green ES, Arck PC. Pathogenesis of preterm birth: bidirectional inflammation in mother and fetus. Semin Immunopathol. 2020;42(4):413–29.

Romero R, Espinoza J, Goncalves LF, Kusanovic JP, Friel LA, Nien JK. Inflammation in preterm and term labour and delivery. Semin Fetal Neonatal Med. 2006;11(5):317–26.

Borgdorff H, van der Veer C, van Houdt R, Alberts CJ, de Vries HJ, Bruisten SM, et al. The association between ethnicity and vaginal microbiota composition in Amsterdam, the Netherlands. PLoS One. 2017;12(7):e0181135.

Chen A, McKinley SA, Wang S, Shi F, Mucha PJ, Forest MG, Lai SK. Transient antibody-mucin interactions produce a dynamic molecular shield against viral invasion. Biophys J. 2014;106(9):2028–36.

Breedveld A, van Egmond M. IgA and FcαRI: pathological roles and therapeutic opportunities. Front Immunol. 2019;10(553).

Bunker JJ, Erickson SA, Flynn TM, Henry C, Koval JC, Meisel M, et al. Natural polyreactive IgA antibodies coat the intestinal microbiota. Science. 2017;358(6361).

Okai S, Usui F, Yokota S, Hori IY, Hasegawa M, Nakamura T, et al. High-affinity monoclonal IgA regulates gut microbiota and prevents colitis in mice. Nat Microbiol. 2016;1(9):16103.

Palm NW, de Zoete MR, Cullen TW, Barry NA, Stefanowski J, Hao L, et al. Immunoglobulin A coating identifies colitogenic bacteria in inflammatory bowel disease. Cell. 2014;158(5):1000–10.

Harmsen HJ, Pouwels SD, Funke A, Bos NA, Dijkstra G. Crohn’s disease patients have more IgG-binding fecal bacteria than controls. Clin Vaccine Immunol. 2012;19(4):515–21.

Usala SJ, Usala FO, Haciski R, Holt JA, Schumacher GF. IgG and IgA content of vaginal fluid during the menstrual cycle. J Reprod Med. 1989;34(4):292–4.

Breedveld AC, Schuster HJ, van Houdt R, Painter RC, Mebius RE, van der Veer C, et al. Enhanced IgA coating of bacteria in women with Lactobacillus crispatus-dominated vaginal microbiota. Microbiome. 2022;10(1):15.

Schuster HJ, Peelen M, Hajenius PJ, van Beukering MDM, van Eekelen R, Schonewille M, et al. Risk factors for spontaneous preterm birth among healthy nulliparous pregnant women in the Netherlands, a prospective cohort study. Health Sci Rep. 2022;5(3):e585.

Heijneman MJ, Evers JLH, Massuger LFAG, Steegers EAP. Obstetrie en gynaecologie. De voortplanting van de mens. Maarssen: Elsevier; 2008.

Knol FA. Van hoog naar laag: van laag naar hoog. Sociaal en Cultureel planbureau. 1998.

Fadrosh DW, Ma B, Gajer P, Sengamalay N, Ott S, Brotman RM, Ravel J. An improved dual-indexing approach for multiplexed 16S rRNA gene sequencing on the Illumina MiSeq platform. Microbiome. 2014;2(1):6.

Martin M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J. 2011;17(1):10–2.

Bolyen E, Rideout JR, Dillon MR, Bokulich NA, Abnet CC, Al-Ghalith GA, et al. Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2. Nat Biotechnol. 2019;37(8):852–7.

Callahan BJ, McMurdie PJ, Rosen MJ, Han AW, Johnson AJ, Holmes SP. DADA2: High-resolution sample inference from Illumina amplicon data. Nat Methods. 2016;13(7):581–3.

Wang Q, Garrity GM, Tiedje JM, Cole JR. Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Appl Environ Microbiol. 2007;73(16):5261–7.

France MT, Ma B, Gajer P, Brown S, Humphrys MS, Holm JB, et al. VALENCIA: a nearest centroid classification method for vaginal microbial communities based on composition. Microbiome. 2020;8(1):166.

Rubin D. Multiple imputation for nonresponse in surveys. New York: Wiley; 2004.

Wang X. Firth logistic regression for rare variant association tests. Front Genet. 2014;5:187.

Segata N, Izard J, Waldron L, Gevers D, Miropolsky L, Garrett WS, Huttenhower C. Metagenomic biomarker discovery and explanation. Genome Biol. 2011;12(6):R60.

Chang DH, Shin J, Rhee MS, Park KR, Cho BK, Lee SK, Kim BC. Vaginal microbiota profiles of native Korean women and associations with high-risk pregnancy. J Microbiol Biotechnol. 2020;30(2):248–58.

Tabatabaei N, Eren AM, Barreiro LB, Yotova V, Dumaine A, Allard C, Fraser WD. Vaginal microbiome in early pregnancy and subsequent risk of spontaneous preterm birth: a case-control study. BJOG. 2019;126(3):349–58.

Kindinger LM, Bennett PR, Lee YS, Marchesi JR, Smith A, Cacciatore S, et al. The interaction between vaginal microbiota, cervical length, and vaginal progesterone treatment for preterm birth risk. Microbiome. 2017;5(1):6.

Kumar M, Murugesan S, Singh P, Saadaoui M, Elhag DA, Terranegra A, et al. Vaginal microbiota and cytokine levels predict preterm delivery in Asian women. Front Cell Infect Microbiol. 2021;11:639665.

Payne MS, Newnham JP, Doherty DA, Furfaro LL, Pendal NL, Loh DE, Keelan JA. A specific bacterial DNA signature in the vagina of Australian women in midpregnancy predicts high risk of spontaneous preterm birth (the Predict1000 study). Am J Obstet Gynecol. 2021;224(2):206 e1-e23.

Petrova MI, Reid G, Vaneechoutte M, Lebeer S. Lactobacillus iners: Friend or Foe? Trends Microbiol. 2017;25(3):182–91.

Brown RG, Al-Memar M, Marchesi JR, Lee YS, Smith A, Chan D, et al. Establishment of vaginal microbiota composition in early pregnancy and its association with subsequent preterm prelabor rupture of the fetal membranes. Transl Res. 2019;207:30–43.

DiGiulio DB, Callahan BJ, McMurdie PJ, Costello EK, Lyell DJ, Robaczewska A, et al. Temporal and spatial variation of the human microbiota during pregnancy. Proc Natl Acad Sci U S A. 2015;112(35):11060–5.

Freitas AC, Bocking A, Hill JE, Money DM, Group VR. Increased richness and diversity of the vaginal microbiota and spontaneous preterm birth. Microbiome. 2018;6(1):117.

Nasioudis D, Forney LJ, Schneider GM, Gliniewicz K, France M, Boester A, et al. Influence of pregnancy history on the vaginal microbiome of pregnant women in their first trimester. Sci Rep. 2017;7(1):10201.

Kervinen K, Holster T, Saqib S, Virtanen S, Stefanovic V, Rahkonen L, et al. Parity and gestational age are associated with vaginal microbiota composition in term and late term pregnancies. 2021.

Phillips C, Velji Z, Hanly C, Metcalfe A. Risk of recurrent spontaneous preterm birth: a systematic review and meta-analysis. BMJ Open. 2017;7(6):e015402.

Chan D, Bennett PR, Lee YS, Kundu S, Teoh TG, Adan M, et al. Microbial-driven preterm labour involves crosstalk between the innate and adaptive immune response. Nat Commun. 2022;13(1):975.

Goodfellow L, Verwijs MC, Care A, Sharp A, Ivandic J, Poljak B, et al. Vaginal bacterial load in the second trimester is associated with early preterm birth recurrence: a nested case-control study. BJOG. 2021;128(13):2061–72.

Meertens LJE, van Montfort P, Scheepers HCJ, van Kuijk SMJ, Aardenburg R, Langenveld J, et al. Prediction models for the risk of spontaneous preterm birth based on maternal characteristics: a systematic review and independent external validation. Acta Obstet Gynecol Scand. 2018;97(8):907–20.

Flaviani F, Hezelgrave NL, Kanno T, Prosdocimi EM, Chin-Smith E, Ridout AE, et al. Cervicovaginal microbiota and metabolome predict preterm birth risk in an ethnically diverse cohort. JCI Insight. 2021;6(16).

Park S, Oh D, Heo H, Lee G, Kim SM, Ansari A, et al. Prediction of preterm birth based on machine learning using bacterial risk score in cervicovaginal fluid. Am J Reprod Immunol. 2021;86(3):e13435.

Amabebe E, Reynolds S, He X, Wood R, Stern V, Anumba DOC. Infection/inflammation-associated preterm delivery within 14 days of presentation with symptoms of preterm labour: a multivariate predictive model. PLoS ONE. 2019;14(9):e0222455.

Ashford K, Chavan NR, Wiggins AT, Sayre MM, McCubbin A, Critchfield AS, O’Brien J. Comparison of serum and cervical cytokine levels throughout pregnancy between preterm and term births. AJP Rep. 2018;8(2):e113–20.

Huus KE, Bauer KC, Brown EM, Bozorgmehr T, Woodward SE, Serapio-Palacios A, et al. Commensal bacteria modulate immunoglobulin A binding in response to host nutrition. Cell Host Microbe. 2020;27(6):909-21 e5.

Kau AL, Planer JD, Liu J, Rao S, Yatsunenko T, Trehan I, et al. Functional characterization of IgA-targeted bacterial taxa from undernourished Malawian children that produce diet-dependent enteropathy. Sci Transl Med. 2015;7(276):276ra24.

Hansen IS, Hoepel W, Zaat SAJ, Baeten DLP, den Dunnen J. Serum IgA immune complexes promote proinflammatory cytokine production by human macrophages, monocytes, and Kupffer cells through FcalphaRI-TLR cross-talk. J Immunol. 2017;199(12):4124–31.

Sterlin D, Fadlallah J, Adams O, Fieschi C, Parizot C, Dorgham K, et al. Human IgA binds a diverse array of commensal bacteria. J Exp Med. 2020;217(3).

Jackson MA, Pearson C, Ilott NE, Huus KE, Hegazy AN, Webber J, et al. Accurate identification and quantification of commensal microbiota bound by host immunoglobulins. Microbiome. 2021;9(1):33.

Maddison SE, Stewart CC, Farshy CE, Reimer CB. The relationship of race, sex, and age to concentrations of serum immunoglobulins expressed in international units in healthy adults in the USA. Bull World Health Organ. 1975;52(2):179–85.

Shulman G, Gilich GC, Andrew MJ. Serum immunoglobulins G, A and M in White and Black adults on the Witwatersrand. S Afr Med J. 1975;49(29):1160–4.

Grundbacher FJ. Heritability estimates and genetic and environmental correlations for the human immunoglobulins G, M, and A. Am J Hum Genet. 1974;26(1):1–12.

Ryckman KK, Williams SM, Krohn MA, Simhan HN. Racial differences in cervical cytokine concentrations between pregnant women with and without bacterial vaginosis. J Reprod Immunol. 2008;78(2):166–71.

Lennard K, Dabee S, Barnabas SL, Havyarimana E, Blakney A, Jaumdally SZ, et al. Microbial composition predicts genital tract inflammation and persistent bacterial vaginosis in South African adolescent females. Infect Immun. 2018;86(1).

Rumbavicius I, Rounge TB, Rognes T. HoCoRT: host contamination removal tool. BMC Bioinformatics. 2023;24(1):371.

This study was financed by ZonMw, the Netherlands Organization of Health Research and Development, project number VICI 91814650, and Amsterdam Reproduction and Development (AR&D 2016). The study sponsors had no role in the study design, collection, analysis, and interpretation of the data, the writing of the report, and decision to submit this paper for publication.

Author information

Authors and affiliations.

Amsterdam UMC location Vrije Universiteit Amsterdam, Medical Microbiology and Infection Control, Boelelaan 1117, Amsterdam, The Netherlands

H. J. Schuster, S. P. F. Matamoros, P. H. M. Savelkoul & R. van Houdt

Amsterdam institute for Immunology and Infectious Diseases, Amsterdam, The Netherlands

H. J. Schuster, A. C. Breedveld, S. P. F. Matamoros, P. H. M. Savelkoul, M. van Egmond & R. van Houdt

Amsterdam UMC location University of Amsterdam, Obstetrics and Gynecology, Meibergdreef 9, Amsterdam, The Netherlands

H. J. Schuster, M. Kok & P. J. Hajenius

Amsterdam Reproduction and Development, Amsterdam, The Netherlands

H. J. Schuster, R. C. Painter, M. Kok & P. J. Hajenius

Amsterdam UMC location Vrije Universiteit Amsterdam, Molecular Cell Biology and Immunology, Boelelaan 1117, Amsterdam, The Netherlands

A. C. Breedveld & M. van Egmond

Amsterdam UMC location Vrije Universiteit Amsterdam, Epidemiology and Data Science, Boelelaan 1117, Amsterdam, The Netherlands

R. van Eekelen

Amsterdam UMC location Vrije Universiteit Amsterdam, Obstetrics and Gynaecology, Boelelaan 1117, Amsterdam, The Netherlands

R. C. Painter

Maastricht University Medical Center+, Medical Microbiology, School of Nutrition and Translational Research in Metabolism (NUTRIM), Maastricht, The Netherlands

P. H. M. Savelkoul

Amsterdam UMC location Vrije Universiteit Amsterdam, Surgery, Boelelaan 1117, Amsterdam, The Netherlands

M. van Egmond

Contributions

M.K., P.J.H., and R.C.P. conceived and led the clinical study. M.v.E. and R.v.H. led immunology and microbiome data generation. H.J.S. and A.C.B. generated immunology data. H.J.S. and R.v.H. generated microbiome data. H.J.S., A.C.B., S.M., and R.v.E. performed data processing and formal analyses. H.J.S., A.C.B., R.C.P, P.H.M.S., M.v.E., and R.v.H. performed data interpretation. H.J.S. and A.C.B. wrote the first draft of the manuscript. All authors critically reviewed, read, and approved the final manuscript.

Corresponding author

Correspondence to H. J. Schuster .

Ethics declarations

Ethics approval and consent to participate.

The study received ethical clearance through the institutional review board of the Academic Medical Center in Amsterdam, the Netherlands (registration number NL43414.018.13). All participants provided written informed consent.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Material 1.

Supplementary methods.

Supplementary Material 2.

Table S1. Characteristics of participants with short cervix ( n =136). SD: standard deviation. IQR: interquartile range. *including miscarriage.

Supplementary Material 3.

Table S2. Read count and community state type per sample. 

Supplementary Material 4.

Table S3. Sexually transmitted infections. Multiple samples are available per participant and pathogens can be detected in some, but not all, samples from the participant. Therefore, results are presented as sexually transmitted infections (STIs) per participant in total and per sample individually.

Supplementary Material 5.

Table S4. Associations with spontaneous preterm birth.

Supplementary Material 6.

Table S5. Correlation between microbial bound and unbound immunoglobulins.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Schuster, H.J., Breedveld, A.C., Matamoros, S.P.F. et al. The interrelation between microbial immunoglobulin coating, vaginal microbiota, ethnicity, and preterm birth. Microbiome 12 , 99 (2024). https://doi.org/10.1186/s40168-024-01787-z

Received : 22 December 2022

Accepted : 01 March 2024

Published : 28 May 2024

DOI : https://doi.org/10.1186/s40168-024-01787-z

  • Vaginal microbiota
  • Spontaneous preterm birth
  • Immunoglobulins
  • Host-microbiota interaction
  • Nulliparous women

ISSN: 2049-2618

Clarifying the distinction between case series and cohort studies in systematic reviews of comparative studies: potential impact on body of evidence and workload

Institute for Research in Operative Medicine, Chair of Surgical Research, Faculty of Health, School of Medicine, Witten/Herdecke University, Ostmerheimer Str. 200, 51109 Cologne, Germany

Dawid Pieper

Associated data.

Not applicable.

Distinguishing cohort studies from case series is difficult.

We propose a conceptualization of cohort studies in systematic reviews of comparative studies. The main aim of this conceptualization is to clarify the distinction between cohort studies and case series. We discuss the potential impact of the proposed conceptualization on the body of evidence and workload.

All studies with exposure-based sampling that gather multiple exposures (with at least two different exposures or levels of exposure) and enable the calculation of relative risks should be considered cohort studies in systematic reviews that include non-randomized studies. The term “enables/can” means that a predefined analytic comparison (i.e., absolute risks per group and/or a risk ratio being provided) is not a prerequisite. Instead, all studies for which sufficient data are available for reanalysis to compare different exposures (e.g., sufficient data in the publication) are classified as cohort studies.

There are possibly large numbers of studies without a comparison for the exposure of interest but that do provide the necessary data to calculate effect measures for a comparison. Consequently, more studies could be included in a systematic review. Therefore, on the one hand, the outlined approach can increase the confidence in effect estimates and the strengths of conclusions. On the other hand, the workload would increase (e.g., additional data extraction and risk of bias assessment, as well as reanalyses).

Systematic reviews that include non-randomized studies often consider different observational study designs [ 1 ]. However, the distinction between different non-randomized study designs is difficult. One key design feature to classify observational study designs is to distinguish comparative from non-comparative studies [ 2 , 3 ]. The lack of a comparison group is of particular importance for distinguishing cohort studies from case series because in many definitions, they share a main design feature of having a follow-up period examining the exposed individuals over time [ 2 , 3 ]. The only difference between cohort studies and case series in many definitions is that cohort studies compare different groups (i.e., examine the association between exposure and outcome), while case series are uncontrolled [ 3 – 5 ]. Table 1 shows an example definition [ 3 ]. The problem with this definition is that vague terms, such as comparison and examination of association, might be interpreted as an analytic comparison of at least two exposures (i.e., interventions, risk factors or prognostic factors).

Table 1. Example definitions of cohort studies and case series [ 2 ]

For example, imagine a study of 20 consecutive patients with a certain disease that can be treated in two different ways. A study that divides the 20 patients into two groups according to the treatment received and compares the outcomes of these groups (e.g., provides aggregated absolute risks per group or a risk ratio) would probably be classified as a cohort study (this example is denoted “study 1” in the following sections). An example of this study type is illustrated in Fig. 1 and Table 2.

Fig. 1. Cohort study (vague definition)

Table 2. Possible presentation of a study with a pre-existing exposure-based comparison (cohort study not requiring a reanalysis)

In contrast, a publication that describes the interventions received and outcomes for each patient/case separately would probably be classified as a case series (this example is denoted “study 2” in the following sections). An example of this study type is illustrated in Fig. 2 and Table 3. In the medical literature, the data on exposure and outcomes are usually provided in either running text or spreadsheet formats [ 6 – 21 ]. A good example is the study by Wong et al. [ 10 ]. In this study, information on placental invasion (exposure) and blood loss (outcome) is provided separately for each of 40 pregnant women in a table. The study by Cheng et al. is an example of a study providing information in the running text (i.e., anticoagulation management [exposure] and recovery [outcome] for paediatric stroke) [ 6 ].

Fig. 2. Case series (vague definition)

Table 3. Possible presentation of a study without a pre-existing exposure-based comparison (cohort study requiring a reanalysis)

These examples illustrate that distinguishing between cohort studies and case series is difficult. Vague definitions are probably the reason for the common confusion between study designs. A recent study found that approximately 72% of cohort studies are mislabelled as case series [ 22 ]. Many systematic reviews of non-randomized studies included cohort studies but excluded case series (see examples in [ 23 – 28 ]). Therefore, the unclear distinction between case series and cohort studies can result in inconsistent study selection and unjustified exclusions from a systematic review. The risk of misclassification is particularly high because study authors often mislabel their own studies or do not classify them at all (see examples in [ 6 – 21 ]).

We propose a conceptualization of cohort studies in systematic reviews of comparative studies. The main objective of this conceptualization is to clarify the distinction between cohort studies and case series in systematic reviews, including non-randomized comparative studies. We discuss the potential impact of the proposed conceptualization on the body of evidence and workload.

Clarifying the distinction between case series and cohort studies (the solution)

In the following report, we propose a conceptualization for cohort studies and case series (e.g., sampling) for systematic reviews, including comparative non-randomized studies. Our proposal is based on a recent conceptualization of cohort studies and case series by Dekkers et al. [ 29 ]. The main feature of this conceptualization is that it is exclusively based on inherent design features and is not affected by the analysis.

Cohort studies of one exposure/one group

Dekkers et al. [ 29 ] defined cohort studies with one exposure as studies with exposure-based sampling that enable the calculation of absolute effect measures for the risk of an outcome. This definition means that “the absence of a control group in an exposure-based study does not define a case series” [ 29 ]. The definition of cohort studies according to Dekkers et al. [ 29 ] is summarized in Table 4.

Table 4. Summary of the distinction proposed by Dekkers et al. [ 29 ]

Cohort studies of multiple exposures/more than one group

This idea can easily be extended to studies with more than one exposure. In this case, all studies with exposure-based sampling that gather multiple exposures (i.e., at least two different exposures, manifestations of exposures or levels of exposures) can be considered (comparative) cohort studies (Fig. 3). The sampling is based on exposure, and there are different groups. Consequently, relative risks can be calculated [ 29 ]. The term “enables/can” implies that a predefined analytic comparison is not a prerequisite but that all studies with sufficient data to enable a reanalysis (e.g., in the publication, study reports, and supplementary material) would be classified as cohort studies.

Fig. 3. Cohort study (deduced from Dekkers et al. [ 29 ])

In short, all studies that enable calculation of a relative risk to quantify a difference in outcomes between different groups should be considered cohort studies.
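As a reminder of the quantity involved, the relative risk for two exposure groups can be written as follows. The notation is introduced here purely for illustration and does not come from the original tables: $a$ events among $n_1$ individuals with exposure 1, and $b$ events among $n_0$ individuals with exposure 0. The interval shown is the common large-sample log-method approximation, which is only one of several possible choices and can be unreliable for very small groups.

```latex
\[
R_1 = \frac{a}{n_1}, \qquad
R_0 = \frac{b}{n_0}, \qquad
\mathrm{RR} = \frac{R_1}{R_0} = \frac{a / n_1}{b / n_0}
\]
\[
95\%\ \mathrm{CI} \approx
\exp\!\left( \ln \mathrm{RR} \pm 1.96
\sqrt{\frac{1}{a} - \frac{1}{n_1} + \frac{1}{b} - \frac{1}{n_0}} \right)
\]
```

Any publication that reports $a$, $n_1$, $b$ and $n_0$, or per-patient data from which they can be counted, provides enough information for this calculation.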

Case series

According to Dekkers et al. [ 29 ], sampling in a case series is either based on both exposure and outcome (e.g., all patients are treated and have an adverse event) or based on a certain outcome regardless of exposure (see Fig. 4). Consequently, neither absolute risks nor relative effect measures for an outcome can be calculated in a case series. Note that sampling in a case series does not need to be consecutive. Consecutiveness would increase the quality of the case series, but a non-consecutive series is also a case series [ 29 ].

Fig. 4. Case series (Dekkers et al. [ 29 ])

In short, for a case series, neither absolute risks nor risk ratios can be calculated. Consequently, a case series cannot be comparative. The definition of a case series by Dekkers et al. [ 29 ] is summarized in Table 4.

It is noteworthy that the conceptualization also ensures a clear distinction of case series from other study designs that apply outcome-based sampling. Case series, case-control studies (including case-time-control), and self-controlled case-control designs (e.g., case-crossover) all have outcome-based sampling in common [ 29 ].

Case series have no control at all because only patients with a certain manifestation of the outcome are sampled (e.g., individuals with a disease or deceased individuals). In contrast, all case-control designs as well as self-controlled case-control designs have a control group. In case-control studies, the control group consists of individuals with another manifestation of the outcome (e.g., healthy individuals or survivors). A case-control study can therefore be considered as two case series (i.e., a case group and a non-case group).

Self-controlled case-control studies are characterized by an intra-individual comparison (each individual is their own control) [ 30 ]. Information is also sampled for periods when patients are not exposed. Therefore, case-control designs as well as self-controlled case-control studies enable the calculation of relative effect measures (e.g., odds ratios). This is not possible for a case series.
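The distinctions drawn in this section can be read as a small decision rule. The following Python sketch is one simplified reading of the text, not a validated classification tool: the names `Sampling` and `classify_design` and all parameters are hypothetical, and the question of whether a publication reports enough data for a reanalysis is deliberately left out.

```python
from enum import Enum


class Sampling(Enum):
    """Basis on which study participants were sampled (illustrative labels)."""
    EXPOSURE = "exposure"
    OUTCOME = "outcome"
    EXPOSURE_AND_OUTCOME = "exposure and outcome"


def classify_design(sampling, exposure_levels=1, has_control=False,
                    self_controlled=False):
    """Classify a study from design features only, ignoring how it was analysed."""
    if sampling is Sampling.EXPOSURE:
        # Exposure-based sampling: absolute risks can be calculated;
        # with two or more exposures, relative risks can be calculated too.
        if exposure_levels >= 2:
            return "comparative cohort study"
        return "cohort study (one exposure)"
    if sampling is Sampling.EXPOSURE_AND_OUTCOME:
        # Everyone is exposed and has the outcome: no risks can be calculated.
        return "case series"
    # Remaining case: outcome-based sampling.
    if self_controlled:
        return "self-controlled case-control design"
    if has_control:
        return "case-control study"
    return "case series"


# Twenty patients sampled by disease and treated in two different ways.
print(classify_design(Sampling.EXPOSURE, exposure_levels=2))
# Only patients with an adverse event, no controls.
print(classify_design(Sampling.OUTCOME))
```

Note that the function never asks whether the publication already contains a comparison; under the proposed conceptualization, that is a property of the analysis, not of the design.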

Illustrating example

Above, we illustrated that, under a vague definition, the classification of a study design might be influenced by the preparation and analysis of the study data. The proposed conceptualization is exclusively based on the inherent design features (e.g., sampling, exposure). When the example studies are considered again using the proposed conceptualization, all of them would be classified as cohort studies because the relative risk can be calculated. This becomes clear when looking at Table 2 and Table 3. If the patients in Table 3 are rearranged according to the exposure and the data are reanalysed (i.e., calculation of absolute risk per group and relative risks to compare groups), Table 3 can be converted into Table 2 (and, likewise, Fig. 2 can be converted to Fig. 3). In the study by Wong et al. [ 10 ], the mean blood loss in the group with placental invasion and in the group without placental invasion can be calculated and compared (e.g., relative risk with 95% confidence limits). In this study, the data on gestational age are also provided in the table. Therefore, it is even possible to adjust the results for gestational age (e.g., using a logistic regression).
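The rearrangement described above can be sketched in a few lines of Python. The 20 patient records below are invented for illustration and are not data from any of the cited studies; the helper names `risks_by_exposure` and `relative_risk` are likewise hypothetical. The confidence interval uses the large-sample log method, which is only a rough approximation with groups this small.

```python
import math
from collections import Counter


def risks_by_exposure(records):
    """Group per-patient (exposure, outcome) pairs into {exposure: (events, total)}."""
    totals = Counter()
    events = Counter()
    for exposure, outcome in records:
        totals[exposure] += 1
        if outcome:
            events[exposure] += 1
    return {exposure: (events[exposure], totals[exposure]) for exposure in totals}


def relative_risk(a, n1, b, n0, z=1.96):
    """Relative risk of group 1 versus group 0 with a log-method interval."""
    if min(a, b) == 0:
        raise ValueError("log-method interval needs at least one event per group")
    rr = (a / n1) / (b / n0)
    se = math.sqrt(1 / a - 1 / n1 + 1 / b - 1 / n0)
    return rr, rr * math.exp(-z * se), rr * math.exp(z * se)


# Hypothetical case-series style listing: one (treatment, event) pair per patient.
records = (
    [("A", True)] * 6 + [("A", False)] * 4 +
    [("B", True)] * 3 + [("B", False)] * 7
)

table = risks_by_exposure(records)   # per-patient listing -> grouped counts
a, n1 = table["A"]
b, n0 = table["B"]
rr, lower, upper = relative_risk(a, n1, b, n0)

for exposure, (ev, total) in table.items():
    print(f"Treatment {exposure}: {ev}/{total} events, risk {ev / total:.2f}")
print(f"RR (A vs B) = {rr:.2f}, approximate 95% CI {lower:.2f} to {upper:.2f}")
```

The per-patient listing and the grouped table carry the same information; only the presentation differs, which is the point the example studies are meant to make.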

Discussion (the impact)

Influence on the body of evidence

The proposed conceptualization is exclusively based on inherent study design features; therefore, there is less room for misinterpretation than with existing conceptualizations because analysis features, presentation of data and labelling of the study do not determine the classification. Thus, the conceptualization ensures consistent study selection for systematic reviews.

The prerequisite of an analytical comparison in the publication can lead to the unjustified exclusion of relevant studies from a systematic review. Study 1 would likely be included, and Study 2 would be excluded from the systematic review. The only differences between Study 1 and Study 2 are the analysis and preparation of data. If the data source (e.g., chart review) and the reanalysis (calculation of effect measures and statistical tests) to compare the intervention and control group in Study 2 are performed exactly with the same approach as the existing analysis in Study 1, there can be no difference in the effect estimates between studies, and the studies are at the same risk of bias. Thus, the inclusion of Study 1 and the exclusion of Study 2 are contradictory to the requirement that systematic reviews identify all available evidence [ 31 ].

Considering that more studies would be eligible for inclusion and that the hierarchical paradigm of the levels of evidence is not valid per se, the proposed conceptualization can potentially enrich bodies of evidence and increase confidence in effect estimates.

Influence on workload

The additional inclusion of all studies that enable the calculation of a relative risk for the comparison of interest might affect the workload of systematic reviews. There might be a considerable number of studies that do not report a comparison but provide sufficient data for reanalysis. Usually the electronic search strategy for systematic reviews of non-randomized studies is not limited to certain study types because there are no sensitive search filters available yet [ 32 ]. Therefore, the search results usually already include cohort studies as discussed above. However, in many abstracts it would not be immediately clear whether sufficient data for recalculations are reported in the full-text article (e.g., a table like Table 3). Consequently, many additional potentially relevant full-text studies have to be screened. Additionally, studies often assess various exposures (e.g., different baseline characteristics), and it might thus be difficult to identify relevant exposures. Considering the large number of wrongly labelled studies, this approach can lead to additional screening effort [ 22 ].

As a result, more studies would be included in systematic reviews. All articles that provide potentially relevant data would have to be assessed in detail to decide whether reanalysis is feasible. For these studies, data extraction and a risk of bias assessment would have to be performed. Challenges in the risk of bias assessment would arise because most assessment tools are constructed to assess a predefined control group [ 33 ]. For example, items regarding the adequacy of the analysis (e.g., adjustment for confounders) can no longer be assessed. Effect measures must be calculated (e.g., risks by group and relative risk with 95% confidence limits), and further analyses (e.g., adjustments for confounders) might also be necessary for studies that provide sufficient data. Moreover, advanced biometrical expertise would be necessary to judge the feasibility of a reanalysis (i.e., to determine whether relative risks can be calculated and whether there are sufficient data to adjust for confounders) and to conduct it.

Promising areas of application

In the medical literature, mislabelled cohort studies are more likely to be retrospective (i.e., comparison planned after data collection) and based on routinely collected data (e.g., chart review, review of radiology databases) than prospectively planned (i.e., comparison planned before data collection). Thus, it can be assumed that the wrongly labelled studies tend to have lower methodological quality than studies that already include a comparison. This aspect should be considered in decisions about including studies that must be reanalysed. In research areas in which randomized controlled trials or large, prospectively planned and well-conducted cohort studies can be expected (e.g., risk factors for widespread diseases), the approach is less promising for enriching the body of evidence. Consequently, in these areas, the additional effort might not be worthwhile.

In contrast, the conceptualization is particularly promising in research areas in which evidence is sparse because studies are difficult to conduct, populations are small or event rates are low. These areas include rare diseases, adverse events/complications, sensitive groups (e.g., children or individuals with cognitive deficiencies) or rarely used interventions (e.g., costly innovations). In these areas, there might be no well-conducted studies at all [ 34 , 35 ]. Therefore, in these areas, the proposed conceptualization has great potential to increase confidence in effect estimates.

We proposed a conceptualization for cohort studies with multiple exposures that ensures a clear distinction from case series. In this conceptualization, all studies that contain sufficient data to conduct a reanalysis, and not only studies with a pre-existing analytic comparison, are classified as cohort studies and are considered appropriate for inclusion in systematic reviews. To the best of our knowledge, no systematic reviews exist that reanalyse (mislabelled) case series to create cohort studies. The outlined approach can potentially enrich the body of evidence and subsequently enhance confidence in effect estimates and the strength of conclusions. However, the enrichment of the body of evidence should be balanced against the additional workload.

Acknowledgements

There was no external funding for the research or publication of this article.

Availability of data and materials

Authors’ contributions

All authors have made substantial contributions to the work. Both authors read and approved the final manuscript.

Ethics approval and consent to participate

Not applicable. No human data involved.

Consent for publication

Not applicable. The manuscript contains no individual person’s data.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Tim Mathes, Phone: +0049 221 98957-43.

Dawid Pieper, Phone: +0049 221 98957-40.
