Cochrane Methods Bias

The Cochrane Handbook for Systematic Reviews of Interventions

The Cochrane Handbook provides guidance for authors on how to conduct a systematic review (including Cochrane Reviews). The Handbook covers all aspects of the process, including preparing a review, searching for studies, assessing risk of bias in included studies, analysing data and undertaking meta-analyses, and interpreting results and drawing conclusions.

The current version is available online.

RoB 2: a revised tool for assessing risk of bias in randomised trials

  • Jonathan A C Sterne , professor 1 2 ,
  • Jelena Savović , senior research fellow 1 3 ,
  • Matthew J Page , research fellow 4 ,
  • Roy G Elbers , senior research associate 1 ,
  • Natalie S Blencowe 1 2 ,
  • Isabelle Boutron , professor 5 6 7 ,
  • Christopher J Cates , senior clinical research fellow 8 ,
  • Hung-Yuan Cheng 1 2 ,
  • Mark S Corbett , research fellow 9 ,
  • Sandra M Eldridge , professor 10 ,
  • Jonathan R Emberson , professor 11 ,
  • Miguel A Hernán , professor 12 ,
  • Sally Hopewell , associate professor 13 ,
  • Asbjørn Hróbjartsson , professor 14 15 16 ,
  • Daniela R Junqueira , research associate 17 ,
  • Peter Jüni , professor 18 ,
  • Jamie J Kirkham , professor 19 ,
  • Toby Lasserson , senior editor 20 ,
  • Tianjing Li , associate professor 21 ,
  • Alexandra McAleenan , senior research associate 1 ,
  • Barnaby C Reeves , professorial research fellow 2 22 ,
  • Sasha Shepperd , professor 23 ,
  • Ian Shrier , investigator 24 ,
  • Lesley A Stewart , professor 9 ,
  • Kate Tilling , professor 1 2 25 ,
  • Ian R White , professor 26 ,
  • Penny F Whiting , associate professor 1 3 ,
  • Julian P T Higgins , professor 1 2 3
  • 1 Population Health Sciences, Bristol Medical School, University of Bristol, Bristol BS8 2BN, UK
  • 2 NIHR Bristol Biomedical Research Centre, Bristol, UK
  • 3 NIHR CLAHRC West, University Hospitals Bristol NHS Foundation Trust, Bristol, UK
  • 4 School of Public Health and Preventive Medicine, Monash University, Melbourne, Australia
  • 5 METHODS team, Epidemiology and Biostatistics Centre, INSERM UMR 1153, Paris, France
  • 6 Paris Descartes University, Paris, France
  • 7 Cochrane France, Paris, France
  • 8 Population Health Research Institute, St George’s, University of London, London, UK
  • 9 Centre for Reviews and Dissemination, University of York, York, UK
  • 10 Pragmatic Clinical Trials Unit, Centre for Primary Care and Public Health, Queen Mary University of London, UK
  • 11 MRC Population Health Research Unit, Clinical Trial Service Unit and Epidemiological Studies Unit, Nuffield Department of Population Health, University of Oxford, Oxford, UK
  • 12 Departments of Epidemiology and Biostatistics, Harvard T H Chan School of Public Health, Harvard-MIT Division of Health Sciences and Technology, Boston, MA, USA
  • 13 Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, UK
  • 14 Centre for Evidence-Based Medicine Odense, Odense University Hospital, Odense, Denmark
  • 15 Department of Clinical Research, University of Southern Denmark, Odense, Denmark
  • 16 Open Patient data Explorative Network, Odense University Hospital, Odense, Denmark
  • 17 Department of Emergency Medicine, Faculty of Medicine and Dentistry, University of Alberta, Edmonton, Alberta, Canada
  • 18 Applied Health Research Centre, Li Ka Shing Knowledge Institute, St Michael’s Hospital, Department of Medicine and Institute of Health Policy, Management and Evaluation, University of Toronto, Toronto, Ontario, Canada
  • 19 Centre for Biostatistics, University of Manchester, Manchester, UK
  • 20 Editorial and Methods Department, Cochrane Central Executive, London, UK
  • 21 Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA
  • 22 Translational Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
  • 23 Nuffield Department of Population Health, University of Oxford, Oxford, UK
  • 24 Centre for Clinical Epidemiology, Lady Davis Institute, Jewish General Hospital, McGill University, Montreal, Quebec, Canada
  • 25 MRC Integrative Epidemiology Unit, University of Bristol, Bristol, UK
  • 26 MRC Clinical Trials Unit, University College London, London, UK
  • Correspondence to: J A C Sterne jonathan.sterne@bristol.ac.uk (or @jonathanasterne on Twitter)
  • Accepted 25 June 2019

Assessment of risk of bias is regarded as an essential component of a systematic review on the effects of an intervention. The most commonly used tool for randomised trials is the Cochrane risk-of-bias tool. We updated the tool to respond to developments in understanding how bias arises in randomised trials, and to address user feedback on and limitations of the original tool.

An evaluation of the risk of bias in each study included in a systematic review documents potential flaws in the evidence summarised and contributes to the certainty in the overall evidence. 1 The Cochrane tool for assessing risk of bias in randomised trials (RoB tool) 2 has been widely used in both Cochrane and other systematic reviews, with over 40 000 citations in Google Scholar.

Many innovative characteristics of the original RoB tool have been widely accepted. It replaced the notion of assessing study quality with that of assessing risk of bias (we define bias as a systematic deviation from the effect of intervention that would be observed in a large randomised trial without any flaws). Quality is not well defined and can include study characteristics (such as performing a sample size calculation) that are not inherently related to bias in the study’s results. The RoB tool considers biases arising at different stages of a trial (known as bias domains), which were chosen on the basis of both empirical evidence and theoretical considerations. Assessments of risk of bias are supported by quotes from sources describing the trial (eg, trial protocol, registration record, results report) or by justifications written by the assessor.

After nearly a decade of experience of using the RoB tool, potential improvements have been identified. A formal evaluation found some bias domains to be confusing at times, with assessment of bias due to incomplete outcome data and selective reporting of outcomes causing particular difficulties, and confusion over whether studies that were not blinded should automatically be considered to be at high risk of bias. 3 More guidance on incorporating risk-of-bias assessments into meta-analyses and review conclusions is also needed. 4 5 A review of comments and user practice found that both Cochrane and non-Cochrane systematic reviews often implemented the RoB tool in non-standard ways. 6 Few trials are assessed as at low risk of bias, and judgments of unclear risk of bias are common. 6 7 Empirical studies have found only moderate reliability of risk-of-bias judgments. 8

We developed a revised risk-of-bias assessment tool to address these issues, incorporate advances in assessment of risk of bias used in other recently developed tools, 9 10 and integrate recent developments in estimation of intervention effects from randomised trials. 11

Summary points

Assessment of risk of bias is regarded as an essential component of a systematic review on the effects of an intervention; the most commonly used tool for assessing risk of bias in randomised trials is the Cochrane risk-of-bias tool, which was introduced in 2008

Potential improvements to the Cochrane risk-of-bias tool were identified on the basis of reviews of the literature, user experience and feedback, approaches used in other risk-of-bias tools, and recent developments in estimation of intervention effects from randomised trials

We developed and piloted a revised tool for assessing risk of bias in randomised trials (RoB 2)

Bias is assessed in five distinct domains. Within each domain, users of RoB 2 answer one or more signalling questions. These answers lead to judgments of “low risk of bias,” “some concerns,” or “high risk of bias”

The judgments within each domain lead to an overall risk-of-bias judgment for the result being assessed, which should enable users of RoB 2 to stratify meta-analyses according to risk of bias

Development of the revised RoB tool

We followed the principles adopted for the development of the original RoB tool and for the ROBINS-I tool for assessing risk of bias in non-randomised studies of interventions. 2 9 A core group coordinated development of the tool, including recruitment of collaborators, preparation and revision of documents, and administrative support.

Preliminary work included a review of how the original tool was used in practice, 6 a systematic review and meta-analysis of meta-epidemiological studies of empirical evidence for biases associated with characteristics of randomised trials, 12 and a cross sectional study of how selective outcome reporting was assessed in Cochrane reviews. 13 We also drew on a systematic review of the theoretical and conceptual literature on types of bias in epidemiology, which sought papers and textbooks presenting classifications or definitions of biases, and organised these into a coherent framework (paper in preparation).

The core group developed an initial proposal and presented it, together with the latest empirical evidence of biases in randomised trials, at a meeting in August 2015 attended by 24 contributors. Meeting participants agreed on the methodological principles underpinning the new tool and the bias domains to be addressed, and formed working groups for each domain. The groups were tasked with developing signalling questions (reasonably factual questions with yes/no answers that inform risk-of-bias judgments), together with guidance for answering these questions and broad considerations for how to judge the risk of bias for the domain.

The materials prepared by the working groups were assembled and edited by the core team, and the resulting draft was piloted by experienced and novice systematic reviewers during a three day event in February 2016, with 17 participants present and 10 participants contributing remotely. Issues identified in the pilot were recorded and addressed in a new draft discussed at a second development meeting in April 2016, also attended by 24 contributors. Subsequently, working groups developed criteria for reaching domain-level risk-of-bias judgments based on answers to signalling questions, and expanded the guidance. The core team designed algorithms to match the criteria, which were checked by the working groups. The resulting revision was tested in another round of piloting by 10 systematic review authors in mid-2016.

A complete draft of version 2 of the RoB tool (RoB 2), together with detailed guidance, was posted at www.riskofbias.info in October 2016, coinciding with the Cochrane Colloquium in Seoul, South Korea. Feedback was invited through direct contact with the development group. Several review teams subsequently piloted the draft tool and provided feedback. Further modifications—particularly improvements in wording and clarity, splitting compound signalling questions, adding new questions, and addressing methodological issues—were made on the basis of feedback from training events (including webinars) conducted between 2016 and 2019, as well as individual feedback from users worldwide.

Version 2 of the Cochrane tool for assessing risk of bias in randomised trials (RoB 2)

RoB 2 provides a framework for assessing the risk of bias in a single estimate of an intervention effect reported from a randomised trial. The effect assessed is a comparison of two interventions, which we refer to as the experimental and comparator interventions, for a specific outcome or endpoint. The process of making a RoB 2 assessment is summarised in figure 1 . Preliminary considerations (box 1) include specifying which result is being assessed, specifying how this result is being interpreted (see “The intervention effect of interest” below), and listing the sources of information used to inform the assessment. Review authors should contact trial authors in order to obtain information that is omitted from published and online sources, so far as this is feasible. Note that risk-of-bias assessments might be needed for results relating to multiple outcomes from the included trials.

Fig 1

Summary of the process of assessing risk of bias in a systematic review of randomised trials, using version 2 of the Cochrane risk-of-bias tool

RoB 2 tool: preliminary considerations

For the purposes of this assessment, define the interventions being compared:

Experimental intervention:

Comparator intervention:

Specify which outcome is being assessed for risk of bias

Specify the numerical result being assessed. If multiple alternative analyses are presented, give either the numerical result (eg, risk ratio 1.52 (95% confidence interval 0.83 to 2.77)) or a reference (eg, to a table, figure, or paragraph) that uniquely defines the result being assessed.

Is the review team’s aim for this result (check one):

To assess the effect of assignment to intervention (the intention-to-treat effect)?

To assess the effect of adhering to intervention (the per protocol effect)?

If the aim is to assess the effect of adhering to intervention, select the deviations from intended intervention that should be addressed (at least one must be checked):

Occurrence of non-protocol interventions

Failures in implementing the intervention that could have affected the outcome

Non-adherence to their assigned intervention by trial participants

Which of the following sources were obtained to help inform the risk-of-bias assessment?

Journal article(s)

Trial protocol

Statistical analysis plan

Non-commercial trial registry record (eg, ClinicalTrials.gov record)

Company owned trial registry record (eg, GlaxoSmithKline Clinical Study Register record)

Grey literature (eg, unpublished thesis)

Conference abstract(s) about the trial

Regulatory document (eg, clinical study report, drug approval package)

Research ethics application

Grant database summary (eg, NIH RePORTER or Research Councils UK Gateway to Research)

Personal communication with triallist

Personal communication with the sponsor
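
The preliminary considerations above amount to a small structured record attached to each result being assessed. The following sketch is purely illustrative (the field names and example values are our own assumptions, not part of RoB 2 or any official implementation), but it shows how such a record might be captured for later use alongside the domain assessments.

```python
# Illustrative only: field names and example values are assumptions, not an official RoB 2 format.
from dataclasses import dataclass, field
from typing import List

@dataclass
class PreliminaryConsiderations:
    experimental_intervention: str
    comparator_intervention: str
    outcome: str
    numerical_result: str                 # eg "risk ratio 1.52 (95% CI 0.83 to 2.77)" or a reference to a table or figure
    effect_of_interest: str               # "assignment" (intention-to-treat) or "adherence" (per protocol)
    deviations_addressed: List[str] = field(default_factory=list)  # completed only if effect_of_interest == "adherence"
    information_sources: List[str] = field(default_factory=list)   # eg journal article, protocol, registry record

example = PreliminaryConsiderations(
    experimental_intervention="experimental drug",
    comparator_intervention="placebo",
    outcome="all cause mortality at 12 months",
    numerical_result="risk ratio 1.52 (95% confidence interval 0.83 to 2.77)",
    effect_of_interest="assignment",
    information_sources=["journal article", "trial registry record"],
)
```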

RoB 2 is structured into five bias domains, listed in table 1. The domains were selected to address all important mechanisms by which bias can be introduced into the results of a trial, based on a combination of empirical evidence and theoretical considerations. We did not include domains for features that would be expected to operate indirectly, through the included bias domains. 14 15 For this reason, we excluded some trial features, such as funding source and single centre versus multicentre status, which have been associated empirically with intervention effect estimates from trials.

Table 1 Version 2 of the Cochrane risk-of-bias assessment tool for randomised trials: bias domains, signalling questions, response options, and risk-of-bias judgments

We label the domains using descriptions of the causes of bias addressed, avoiding terms used in the original RoB tool (such as “selection bias” and “performance bias”) because they are used inconsistently or not known by many people outside Cochrane. 16 Each domain is mandatory, and no others can be added, although we have developed versions of RoB 2 that deal with additional issues that arise in trials with cluster randomised or crossover designs ( www.riskofbias.info ). Within each domain, the assessment comprises:

• A series of signalling questions

• A judgment about risk of bias for the domain, facilitated by an algorithm that maps responses to signalling questions to a proposed judgment

• Free text boxes to justify responses to the signalling questions and risk-of-bias judgments

• Optional free text boxes to predict (and explain) the likely direction of bias.
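
As a hedged illustration of the structure just listed (the names and types below are ours, not an official RoB 2 schema), a single domain assessment could be represented as follows.

```python
# Illustrative sketch mirroring the bullet list above; names and types are assumptions,
# not a data format defined by RoB 2.
from dataclasses import dataclass
from typing import Dict, Optional

@dataclass
class DomainAssessment:
    domain: str                              # eg "bias arising from the randomisation process"
    signalling_answers: Dict[str, str]       # question id -> "yes" / "probably yes" / "probably no" / "no" / "no information"
    proposed_judgment: str                   # algorithm output: "low risk of bias", "some concerns", or "high risk of bias"
    final_judgment: str                      # assessor's judgment, which may override the proposal (with a documented reason)
    justification: str = ""                  # free text, ideally supported by direct quotes from trial documents
    direction_of_bias: Optional[str] = None  # optional prediction, eg "favours experimental" or "favours comparator"
```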

Table 2 lists the most important changes made in RoB 2, compared with the original Cochrane RoB tool.

Table 2 Major changes in version 2 of the Cochrane risk-of-bias assessment tool, compared with the original tool

Signalling questions

Signalling questions aim to elicit information relevant to an assessment of risk of bias ( table 1 ). The questions seek to be reasonably factual in nature. The response options are “yes,” “probably yes,” “probably no,” “no,” and “no information.” To maximise their simplicity and clarity, signalling questions are phrased such that a yes answer might indicate either lower or higher risk of bias, depending on the most natural way to ask the question. The online supplementary material in the web appendix includes elaborations providing guidance on how to answer each question.

Responses of “yes” and “probably yes” have the same implications for risk of bias, as do responses of “no” and “probably no.” “Yes” and “no” typically imply that firm evidence is available; the “probably” responses typically imply that a judgment has been made. Where there is a need to distinguish between “some concerns” and “high risk of bias,” this is dealt with by using an additional signalling question, rather than by making a distinction between responses “probably yes” and “yes,” or between “probably no” and “no.” The “no information” response should be used only when insufficient details are available to allow a different response, and when, in the absence of these details, it would be unreasonable to respond “probably yes” or “probably no.” For example, in the context of a large trial run by an experienced clinical trials unit, absence of specific information about generation of the randomisation sequence, in a paper published in a journal with rigorously enforced word count limits, is likely to result in a response of “probably yes” rather than “no information” to the signalling question about sequence generation (the rationale for the response should be provided in the free text box). Some signalling questions are answered only if the response to a previous question indicates that they are required.
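
The equivalence of the firm and "probably" responses, and the restricted role of "no information", can be made concrete with a small helper. This is our own sketch for illustration only, not code issued with RoB 2.

```python
# Collapse a signalling-question response to the three states the domain algorithms
# actually distinguish: affirmative, negative, or no information.
def collapse_response(response: str) -> str:
    response = response.strip().lower()
    if response in ("yes", "probably yes"):
        return "Y"    # firm or judged affirmative: identical implications downstream
    if response in ("no", "probably no"):
        return "N"    # firm or judged negative: identical implications downstream
    if response == "no information":
        return "NI"   # use only when it would be unreasonable to answer "probably yes" or "probably no"
    raise ValueError(f"Unexpected response: {response!r}")
```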

The intervention effect of interest

Assessments for the domain “bias due to deviations from intended interventions” differ according to whether review authors are interested in quantifying the effect of assignment to the interventions at baseline regardless of whether the interventions are received during follow-up (intention-to-treat effect), or the effect of adhering to intervention as specified in the trial protocol (per protocol effect). These effects will differ if some patients do not receive their assigned intervention or deviate from the assigned intervention after baseline. Each effect might be of interest. 11 For example, the effect of assignment to intervention might be appropriate to inform a health policy question about whether to recommend an intervention (eg, a screening programme) in a particular health system, whereas the effect of adhering to intervention more directly informs a care decision by an individual patient (eg, whether to be screened). Changes to an intervention that are consistent with the trial protocol (even if not explicitly discussed in the protocol), such as cessation of a drug because of toxicity or switch to second line chemotherapy because of progression of cancer, do not cause bias and should not be considered to be deviations from intended intervention.

The effect of assignment to intervention should be estimated by an intention-to-treat analysis that includes all randomised participants. 17 However, estimates of per protocol effects commonly used in reports of randomised trials are problematic and might be seriously biased. 18 These estimates include those from naive per protocol analyses restricted to individuals who adhered to their assigned intervention, and as-treated analyses in which participants are analysed according to the intervention they received, even if their assigned group is different. These approaches are problematic because prognostic factors could influence whether individuals receive their allocated intervention. Data from a randomised trial can be used to derive an unbiased estimate of the effect of adhering to intervention. 19 20 However, the validity of appropriate methods depends on strong assumptions, and published applications are relatively rare to date. For trials comparing interventions that are sustained over time, appropriate methods also require measurement of and adjustment for the values of prognostic factors, both before and after randomisation, that predict deviations from intervention. 11 For these reasons, most systematic reviews are likely to estimate the effect of assignment rather than adherence to intervention.
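
The reason naive per protocol analyses are problematic can be illustrated with a small simulation (entirely our own, with invented numbers): when adherence depends on a prognostic factor, the adherent subgroup in the intervention arm is no longer comparable with the comparator arm, so a spurious effect appears even though the intervention does nothing.

```python
# Simulation sketch: treatment truly has no effect on the outcome, yet the naive
# per protocol estimate departs from 1 because frail participants tend not to adhere.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
frail = rng.random(n) < 0.4                              # prognostic factor: frailty raises event risk
assigned = rng.random(n) < 0.5                           # randomised assignment
adheres = np.where(assigned & frail, rng.random(n) < 0.4, True)   # frail treated participants often deviate
event = rng.random(n) < np.where(frail, 0.30, 0.10)      # outcome depends on frailty only

def risk_ratio(numerator_mask, denominator_mask):
    return event[numerator_mask].mean() / event[denominator_mask].mean()

print("Effect of assignment (intention to treat):", round(risk_ratio(assigned, ~assigned), 2))            # about 1.0
print("Naive per protocol (adherers only):       ", round(risk_ratio(assigned & adheres, ~assigned), 2))  # below 1.0, biased
```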

Risk-of-bias judgments

The risk-of-bias judgments for each domain are “low risk of bias,” “some concerns,” or “high risk of bias.” Judgments are based on, and summarise, the answers to signalling questions. Review authors should interpret “risk of bias” as “risk of material bias”: concerns should be expressed only about issues likely to have a notable effect on the result being assessed.

An important innovation in RoB 2 is the inclusion of algorithms that map responses to signalling questions to a proposed risk-of-bias judgment for each domain (see online supplementary material in the web appendix). Review authors can override these proposed judgments if they feel it is appropriate to do so.

Free text boxes alongside the signalling questions and judgments allow assessors to provide support for the responses. Brief direct quotations from the texts of the study reports (including trial protocols) should be used whenever possible, supplemented by any information obtained from authors when contacted. Reasons for any judgments that do not follow the algorithms should be provided. RoB 2 includes optional judgments of the direction of the bias for each domain and overall. If review authors do not have a clear rationale for judging the likely direction of the bias, they should not guess it.

Overall risk of bias for the result

The response options for an overall risk-of-bias judgment are the same as for individual domains. Table 3 shows the approach to mapping bias judgments within domains to an overall judgment for the result. The overall risk of bias generally corresponds to the worst risk of bias in any of the domains. However, if a study is judged to have “some concerns” about risk of bias for multiple domains, it might be judged as at high risk of bias overall. Figure 2 shows a forest plot that displays domain specific risk of bias and overall risk of bias, with the meta-analysis stratified by overall risk of bias.

Table 3 Approach to reaching an overall risk-of-bias judgment for a specific result

Fig 2

Example forest plot showing results of a risk-of-bias assessment in a systematic review of randomised trials, using version 2 of the Cochrane risk-of-bias tool. Studies are stratified by overall risk of bias
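
The aggregation rule described above (the overall judgment generally equals the worst domain judgment, with possible elevation when several domains raise some concerns) can be sketched as a short function. RoB 2 leaves the elevation decision to the assessor, so the threshold used here is an illustrative assumption, not part of the tool.

```python
# Simplified, illustrative aggregation of domain judgments into an overall judgment.
LOW, SOME_CONCERNS, HIGH = "low risk of bias", "some concerns", "high risk of bias"

def overall_risk_of_bias(domain_judgments: list[str], some_concerns_threshold: int = 3) -> str:
    if HIGH in domain_judgments:
        return HIGH                        # the worst domain judgment dominates
    n_concerns = domain_judgments.count(SOME_CONCERNS)
    if n_concerns >= some_concerns_threshold:
        return HIGH                        # many domains with some concerns may substantially lower confidence
    if n_concerns > 0:
        return SOME_CONCERNS
    return LOW

print(overall_risk_of_bias([LOW, SOME_CONCERNS, LOW, LOW, LOW]))  # -> "some concerns"
```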

We have substantially revised the Cochrane tool for assessing risk of bias in the results of randomised trials, in order to address limitations identified since it was published in 2008 and to incorporate improvements that aim to increase the reliability of assessments. RoB 2 is based on wide consultation within and outside Cochrane, extensive piloting, and integration of feedback based on user experience. Assessments are made in five bias domains, within which answers to signalling questions address a broader range of issues than in the original RoB tool. These issues include whether post-randomisation deviations from intervention caused bias in trials in which blinding was either not feasible or not implemented and whether outcome data were missing for reasons likely to lead to bias. Assessment of selective reporting is focused on a reported result for an outcome, rather than selective non-reporting of other outcomes that were measured in the trial. RoB 2 also incorporates recent developments in estimation of intervention effects from randomised trials: we distinguish bias in the effect of assignment to interventions from bias in the effect of adhering to intervention as specified in the trial protocol. 11

RoB 2 assessments relate to the risk of bias in a single estimate of intervention effect for a single outcome or endpoint, rather than for a whole trial. This specificity is because the risk of bias is outcome specific for domains such as bias in measurement of the outcome, and could be specific to a particular estimate (eg, when both intention-to-treat and per protocol analyses have been conducted). We recommend that overall RoB 2 judgments of risk of bias for individual results should be the primary means of distinguishing stronger from weaker evidence in the context of a meta-analysis (or other synthesis) of randomised trials. The overall judgments should also influence the strength of conclusions drawn from a systematic review (potentially as part of a GRADE assessment). 21 We strongly encourage stratification by overall risk-of-bias judgment as a default meta-analysis strategy, as shown in figure 2 . To facilitate this, we suggest that software for systematic review preparation provides data fields for risk-of-bias assessments. We are preparing an interactive web tool for completing RoB 2 assessments, which we hope will interface well with other systematic review software.
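
Stratifying a meta-analysis by overall risk of bias, as recommended above, can be done with standard pooling methods. The sketch below uses fixed-effect inverse-variance pooling of log risk ratios on invented data; it is illustrative only and not tied to any particular systematic review software.

```python
# Minimal sketch of a meta-analysis stratified by overall risk of bias.
import math
from collections import defaultdict

studies = [
    # (study, log risk ratio, standard error, overall risk-of-bias judgment)
    ("Trial A", -0.22, 0.10, "low risk of bias"),
    ("Trial B", -0.05, 0.15, "low risk of bias"),
    ("Trial C", -0.40, 0.20, "some concerns"),
    ("Trial D", -0.55, 0.18, "high risk of bias"),
]

by_stratum = defaultdict(list)
for name, log_rr, se, rob in studies:
    by_stratum[rob].append((log_rr, se))

for stratum, results in by_stratum.items():
    weights = [1.0 / se ** 2 for _, se in results]            # inverse-variance weights
    pooled = sum(w * est for (est, _), w in zip(results, weights)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    lo, hi = pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se
    print(f"{stratum}: pooled RR {math.exp(pooled):.2f} "
          f"(95% CI {math.exp(lo):.2f} to {math.exp(hi):.2f})")
```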

In RoB 2, judgments about risk of bias are derived by algorithms on the basis of answers to specific signalling questions. The added structure provided by the signalling questions aims to make the assessment easier and more efficient to use, as well as to improve agreement between assessors. We believe this approach to be more straightforward than the direct judgments about risk of bias required in the original RoB tool. The algorithms include explicit mappings for situations where there is no information to answer a signalling question, which do not necessarily map to a negative assessment of the trial. For example, when randomisation methods are described and are adequate, the response to the signalling question about baseline imbalances between intervention groups leads to low risk of bias either when such imbalances are compatible with chance, or when there is no information about baseline imbalances. We removed the option of an “unclear” judgment in favour of a graded set of response options (from “low” to “some concerns” to “high”). We envisage that systematic reviews will report the domain level judgments and overall risk-of-bias judgments in tables or figures contained in the main review text. In addition, we encourage reporting of answers to signalling questions, together with direct quotes from papers and free text justification of the answers, in an appendix.

We expect the refinements we have made to the RoB tool to lead to a greater proportion of trial results being assessed as at low risk of bias, because our algorithms map some circumstances to a low risk of bias when users of the previous tool would typically have assessed them to be at unclear (or even high) risk of bias. This potential difference in judgments in RoB 2 compared with the original tool is particularly the case for unblinded trials, where risk of bias in the effect of assignment to intervention due to deviations from intended interventions might be low despite many users of the original RoB tool assigning a high risk of bias in the corresponding domain. We believe that judgments of low risk of bias should be readily achievable for a randomised trial, a study design that is scientifically strong, well understood, and often well implemented in practice. We hope that RoB 2 will be useful to systematic review authors and those making use of reviews, by providing a coherent framework for understanding and identifying trials at risk of bias. This framework might also help those designing, conducting, and reporting randomised trials to achieve the most reliable findings possible.

Acknowledgments

We dedicate this work to Douglas G Altman, whose contributions were of fundamental importance to development of risk-of-bias assessment in systematic reviews.

We thank Henning Keinke Andersen, Nancy Berkman, Mike Campbell, Rachel Churchill, Mike Clarke, Nicky Cullum, Francois Curtin, Amy Drahota, Bruno Giraudeau, Jeremy Grimshaw, Sharea Ijaz, Yoon Loke, Geraldine Macdonald, Richard Morris, Mona Nasser, Nishith Patel, Jani Ruotsalainen, Stephen Senn, Holger Schünemann, Nandi Siegfried, Jayne Tierney, and Sunita Vohra for contributing to discussions; and Andrew Beswick, Julia Bidonde, Angela Busch, staff at Cochrane Argentina, Karen Dawe, Franco De Crescenzo, Kristine Egberts, Clovis Mariano Faggion Jr, Clare French, Lina Gölz, Valerie Hoffman, Joni Jackson, Tim Jones, Kayleigh Kew, Elsa Marques, Silvia Minozzi, Theresa Moore, Rebecca Normansell, Rosanne Freak-Poli, Sarah Lensen, José López-López, Marlies Manders, Luke McGuinness, Spyros Papageorgiou, Melissa Randall, Phil Riley, Claudia Smeets, Meera Viswanathan, and Tanya Walsh for contributing to piloting of earlier drafts of the RoB 2 tool.

Contributors: JACS, JS, and JPTH conceived the project. JACS, JPTH, JS, MJP, and RGE oversaw the project. JACS, JS, AH, IB, BCR, and JJK led working groups. All authors contributed to development of RoB 2 and to writing associated guidance. JACS, JS, and JPTH wrote the first draft of the manuscript. All authors reviewed and commented on drafts of the manuscript. The authors are epidemiologists, statisticians, systematic reviewers, trialists, and health services researchers, many of whom are involved with Cochrane systematic reviews, methods groups, and training events. Development of RoB 2 was informed by relevant methodological literature, previously published tools for assessing methodological quality of randomised trials, systematic reviews of such tools and relevant literature, and by the authors’ experience of developing tools to assess risk of bias in randomised and non-randomised studies, diagnostic test accuracy studies, and systematic reviews. JACS and JPTH will act as guarantors. The corresponding author attests that all listed authors meet authorship criteria and that no others meeting the criteria have been omitted.

Funding: The development of the RoB 2 tool was supported by the UK Medical Research Council (MRC) Network of Hubs for Trials Methodology Research (MR/L004933/2- N61), with the support of the host MRC ConDuCT-II Hub (Collaboration and innovation for Difficult and Complex randomised controlled Trials In Invasive procedures, MR/K025643/1), by MRC research grant MR/M025209/1, and by a grant from the Cochrane Collaboration. JACS, SME, and JPTH are National Institute for Health Research (NIHR) senior investigators. JACS, NSB, H-YC, BCR, and JPTH are supported by NIHR Bristol Biomedical Research Centre at University Hospitals Bristol NHS Foundation Trust and the University of Bristol. JACS, JPTH, and KT are members of the MRC Integrative Epidemiology Unit at the University of Bristol (supported by MRC grant MC_UU_00011/3) and the University of Bristol. JS, PFW, and JPTH are supported by the NIHR Collaboration for Leadership in Applied Health Research and Care West (CLAHRC West) at University Hospitals Bristol NHS Foundation Trust. PJ is a tier 1 Canada Research Chair in Clinical Epidemiology of Chronic Diseases supported by the Canada Research Chairs Programme. MJP was supported by an Early Career Fellowship from the Australian National Health and Medical Research Council (NHMRC 1088535). IRW was supported by the MRC Programme MC_UU_12023/21. The views expressed in this article are those of the authors and do not necessarily represent those of the UK National Health Service, NIHR, MRC, or Department of Health and Social Care, or the Australian NHMRC. Development of the RoB tool, writing the paper and the decision to submit for publication were independent of all research funders.

Competing interests: All authors have completed the ICMJE uniform disclosure form at www.icmje.org/coi_disclosure.pdf and declare: support from the MRC and Wellcome Trust for the submitted work; JACS reports grants from the MRC and NIHR during the conduct of the study; JS reports grants from the MRC, Cochrane Collaboration, and NIHR during the conduct of the study; CJC is one of the coordinating editors of Cochrane Airways and has responsibility for training Cochrane authors in the UK in Cochrane methodology (including the assessment of risks of bias); JRE reports grants from the MRC and Boehringer Ingelheim, outside the submitted work; MAH reports grants from the National Institutes of Health (NIH) during the conduct of the study; PJ serves as an unpaid member of the steering group of cardiovascular trials funded by Astra Zeneca, Biotronik, Biosensors, St Jude Medical, and The Medicines Company, has received research grants to the institution from Astra Zeneca, Biotronik, Biosensors International, Eli Lilly, and The Medicines Company, and honorariums to the institution for participation in advisory boards from Amgen, but has not received personal payments by any pharmaceutical company or device manufacturer; JJK reports personal fees from BMJ outside the submitted work; TLa is an employee of Cochrane; TLi reports grants from the NIH National Eye Institute, and the Patient-Centered Outcomes Research Institute during the conduct of the study; AM reports grants from MRC and Cancer Research UK during the conduct of the study; KT reports grants from MRC during the conduct of the study and personal fees from CHDI outside the submitted work; JPTH reports grants from the MRC and Cochrane Collaboration during the conduct of the study; the authors declare no other relationships or activities that could appear to have influenced the submitted work.

Provenance and peer review: Not commissioned, externally peer reviewed.

Patient and public involvement: Patients and the public were not involved in this methodological research. We plan to disseminate the research widely, including to community participants in Cochrane.

Assessing risk of bias in randomised clinical trials included in Cochrane Reviews: the why is easy, the how is a challenge

  • Asbjørn Hróbjartsson
  • Isabelle Boutron
  • Lucy Turner
  • Douglas G Altman
  • David Moher
  • on behalf of the Cochrane Bias Methods Group

Version published: 30 April 2013

Randomised clinical trials are often inadequately reported and may be inadequately conducted. Any associated biases could impact seriously on the findings and conclusion of a systematic review. Authors of systematic reviews thus need to assess the risk of bias in included randomised clinical trials. In this 20th Anniversary editorial, we look at the evolution of guidance on appraising studies included in Cochrane Reviews.

Assessing the methodological ‘quality’ of included trials was addressed from the earliest days of The Cochrane Collaboration, although the phrase ‘risk of bias’ came into use later. In 1994 one of the first editions of the Cochrane Collaboration Handbook recommended that reviewers should routinely assess the adequacy of allocation concealment, and that they could consider assessing blinding and attrition, based on a seminal empirical study by Schulz and colleagues. [ 1 ] Over the next decade several Cochrane Review Groups developed different recommendations for assessing risk of bias. Of 50 Cochrane Review Groups surveyed in 2007, 41 recommended using specific trial characteristics to assess risk of bias and nine either recommended using a quality scale or made this optional. Most groups suggested assessing the randomisation procedure (including concealment of allocation), blinding, and attrition. [ 2 ]

In 2008 the Cochrane risk of bias tool was released with Review Manager 5.0, following three years of development. It included six characteristics: ‘generation of the allocation sequence’, ‘concealment of the allocation sequence’, ‘blinding’, ‘incomplete outcome data’, ‘selective outcome reporting’, and ‘other bias’. Selective outcome reporting was included based on a landmark paper documenting a tendency for statistically significant trial outcomes to be selected for reporting. [ 3 ] Subsequent research has replicated this finding. In 2011 a revised version of the risk of bias tool split blinding into ‘blinding of participants and personnel’ and ‘blinding of outcome assessment’. [ 4 ]

Now is a good time to reflect on two decades of assessing risk of bias. The risk of bias tool provides a standardised approach, based on items selected on both theoretical and empirical grounds, and following broad consultations with clinical research methodologists. Furthermore, a more appropriate terminology has been developed emphasising ‘risk of bias’ instead of ‘methodological quality’, and the initial approach, based mainly on single trial characteristics, has matured into a structured multidimensional approach. [ 4 ]

The risk of bias tool is a comparatively recent development that still likely needs refinement. The modest inter‐rater agreement rates [ 5 ] will hopefully be improved by modifications to the questions and enhanced training courses. However, authors of Cochrane Reviews tend to be reluctant to designate an overall risk of bias for each trial or outcome, and reluctant to incorporate the risk of bias assessment in analyses and conclusions. The next version of Review Manager, scheduled for release by the end of 2014, will enable authors to see the risk of bias table jointly with the forest plot, thus facilitating a cohesive interpretation of effects and risk of bias for each outcome.

However, risk of bias assessment has more fundamental challenges. Empirical analyses of bias in randomised trials typically rely on meta‐epidemiological studies. [ 1 , 6 ] Such studies involve comparisons, within several meta‐analyses, of the estimated treatment effects in trials with a characteristic present (such as adequate allocation generation) and trials without the characteristic (such as inadequate or unclear allocation generation). The risk of confounding in such comparisons is pronounced, as the compared trials may differ in other respects, such as allocation concealment, type of outcome, blinding, and sample size. Furthermore, reporting inadequacies in trial publications are an additional concern. It thus remains an open question whether inadequate allocation generation is truly causally linked to bias or whether it is an indirect marker of other factors associated with bias.
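
The comparison underlying such meta‐epidemiological studies can be sketched as follows: within each meta-analysis, pool the trials with and without the characteristic, take the difference in pooled log odds ratios (a ratio of odds ratios), and then combine those differences across meta-analyses. The code below is an illustration on invented data, using fixed-effect pooling for brevity rather than the random-effects models typically used in practice.

```python
# Illustrative meta-epidemiological comparison on invented data.
import math

def pool(estimates):
    """Inverse-variance fixed-effect pooling of (log OR, SE) pairs."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    est = sum(w * lor for (lor, _), w in zip(estimates, weights)) / sum(weights)
    return est, math.sqrt(1.0 / sum(weights))

# Each meta-analysis: trials with the characteristic vs trials without it
meta_analyses = [
    {"with": [(-0.30, 0.12), (-0.25, 0.15)], "without": [(-0.45, 0.14), (-0.50, 0.20)]},
    {"with": [(-0.10, 0.10)],                "without": [(-0.35, 0.16), (-0.30, 0.18)]},
]

rors = []
for ma in meta_analyses:
    (e1, s1), (e0, s0) = pool(ma["with"]), pool(ma["without"])
    diff, se_diff = e1 - e0, math.sqrt(s1 ** 2 + s0 ** 2)    # log ratio of odds ratios
    rors.append((diff, se_diff))

pooled_log_ror, pooled_se = pool(rors)                       # combine across meta-analyses
print(f"Ratio of odds ratios: {math.exp(pooled_log_ror):.2f} "
      f"(95% CI {math.exp(pooled_log_ror - 1.96 * pooled_se):.2f} "
      f"to {math.exp(pooled_log_ror + 1.96 * pooled_se):.2f})")
```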

Establishing reliable causal relationships for bias from such observational studies may be even more difficult than establishing reliable causal relations in epidemiology in general. The assessment of risk of bias is therefore to a large extent based on common sense and theoretical considerations, with an empirical basis of observational studies that carry a considerable risk of confounding.

This highlights a peculiar circularity. Meta‐analysis of randomised clinical trials is the core methodology for reliable estimates of treatment effects, and is thus the core methodology for evidence‐based medicine. This is partly based on the reasonable view that randomised trials are more reliable than observational studies in assessing effects of health care interventions. Still, the empirical evidence underlying the assessment of risk of bias in trials – an assessment necessary for ensuring that biased trials do not lead to biased systematic reviews – is based on observational studies.

An increasing number of meta‐epidemiological studies report associations between a trial characteristic and exaggerated treatment effects: funding status, [ 7 ] number of centres participating in a trial, [ 8 ] early stopping of a trial, [ 9 ] and developing country status. [ 10 ] For many of these characteristics it is unclear whether they represent a unique bias, confounding, publication bias, spurious findings, or a combination of these and/or other unknown factors. It is, nonetheless, helpful to be aware of such associations, sometimes called meta‐bias. [ 11 ]

Funding status is a major concern in randomised trials. The exaggerated effects reported for industry trials [ 7 ] may to some extent be explained as a result of publication bias or other characteristics included in the risk of bias tool, for example selective outcome reporting. However, companies that stand to gain financially by a positive result have substantial conflicts of interest when they control the planning, funding, conduct, and reporting of a trial. It is not clear that the risk of bias tool in its present version addresses this problem adequately.

It is important to assess risk of bias in randomised clinical trials included in systematic reviews. In the last 20 years risk of bias assessment in Cochrane Reviews has been refined several times. For the next two decades and beyond the process is likely to continue. The why is easy, the how is a challenge.

Information

  • Cochrane Database of Systematic Reviews
  • 30 April 2013

Nordic Cochrane Centre, Copenhagen, Denmark

Centre d'Epidémiologie Clinique, Hôpital Hôtel‐Dieu and Université Paris Descartes, Paris, France

Ottawa Hospital Research Institute, Ottawa, Canada

Centre for Statistics in Medicine, Oxford, UK

Declarations of interest

The authors have completed the Unified Competing Interest form at www.icmje.org/coi_disclosure.pdf (available upon request) and declare (1) no receipt of payment or support in kind for any aspect of the article; (2) no financial relationships with any entities that have an interest related to the submitted work; and (3) that DM is a member of The Cochrane Library Oversight Committee, but no other relationships or activities that could be perceived as having influenced, or giving the appearance of potentially influencing, what was written in the submitted work.

  • Schulz KF, Chalmers I, Hayes RJ, Altman DG. Empirical evidence of bias. Dimensions of methodological quality associated with estimates of treatment effects in controlled trials . JAMA 1995;273(5):408–12. doi.org/10.1001/jama.1995.03520290060030
  • Lundh A, Gøtzsche PC. Recommendations by Cochrane Review Groups for assessment of the risk of bias in studies . BMC Medical Research Methodology 2008;8:22. doi.org/10.1186/1471‐2288‐8‐22
  • Chan AW, Hróbjartsson A, Haahr MT, Gøtzsche PC, Altman DG. Empirical evidence for selective reporting of outcomes in randomized trials: comparison of protocols to published articles . JAMA 2004;291(20):2457–65. doi.org/10.1001/jama.291.20.2457
  • Higgins JPT, Altman DG, Gøtzsche PC, Jüni P, Moher D, Oxman AD, et al. The Cochrane Collaboration's tool for assessing risk of bias in randomised trials . BMJ 2011;343:d5928. doi.org/10.1136/bmj.d5928
  • Hartling L, Ospina M, Liang Y, Dryden DM, Hooton N, Krebs Seida J, et al. Risk of bias versus quality assessment of randomised controlled trials: cross sectional study . BMJ 2009;339:b4012. doi.org/10.1136/bmj.b4012
  • Savović J, Jones HE, Altman DG, Harris RJ, Jüni P, Pildal J, et al. Influence of reported study design characteristics on intervention effect estimates from randomized, controlled trials . Annals of Internal Medicine 2012;157(6):429–38. annals.org/article.aspx?articleid=1359238
  • Lundh A, Sismondo S, Lexchin J, Busuioc OA, Bero L. Industry sponsorship and research outcome . Cochrane Database of Systematic Reviews 2012;12:MR000033. doi.org/10.1002/14651858.MR000033.pub2
  • Dechartres A, Boutron I, Trinquart L, Charles P, Ravaud P. Single‐center trials show larger treatment effects than multicenter trials: evidence from a meta‐epidemiologic study . Annals of Internal Medicine 2011;155(1):39–51. annals.org/article.aspx?articleid=747012
  • Panagiotou OA, Contopoulos‐Ioannidis DG, Ioannidis JP. Comparative effect sizes in randomised trials from less developed and more developed countries: meta‐epidemiological assessment . BMJ 2013;346:f707. doi.org/10.1136/bmj.f707
  • Bassler D, Briel M, Montori VM, Lane M, Glasziou P, Zhou Q, et al. Stopping randomized trials early for benefit and estimation of treatment effects: systematic review and meta‐regression analysis . JAMA 2010;303(12):1180–7. doi.org/10.1001/jama.2010.310
  • Goodman S, Dickersin K. Metabias: a challenge for comparative effectiveness research . Annals of Internal Medicine 2011;155(1):61–2. annals.org/article.aspx?articleid=747016

The Cochrane Collaboration’s tool for assessing risk of bias in randomised trials

Julian P T Higgins

1 MRC Biostatistics Unit, Institute of Public Health, Cambridge CB2 0SR, UK

Douglas G Altman

2 Centre for Statistics in Medicine, University of Oxford, Oxford, UK

Peter C Gøtzsche

3 The Nordic Cochrane Centre, Rigshospitalet and University of Copenhagen, Denmark

Peter Jüni

4 Institute of Social and Preventive Medicine, University of Bern, Switzerland

David Moher

5 Clinical Epidemiology Program, Ottawa Hospital Research Institute, Ottawa, Ontario, Canada

6 Department of Epidemiology and Community Medicine, Faculty of Medicine, University of Ottawa, Canada

Andrew D Oxman

7 Preventive and International Health Care Unit, Norwegian Knowledge Centre for the Health Services, Oslo, Norway

Jelena Savović

8 Department of Social Medicine, University of Bristol, Bristol, UK

Kenneth F Schulz

9 FHI, Research Triangle Park, North Carolina, USA

Laura Weeks

Jonathan A C Sterne

Flaws in the design, conduct, analysis, and reporting of randomised trials can cause the effect of an intervention to be underestimated or overestimated. The Cochrane Collaboration’s tool for assessing risk of bias aims to make the process clearer and more accurate.

Randomised trials, and systematic reviews of such trials, provide the most reliable evidence about the effects of healthcare interventions. Provided that there are enough participants, randomisation should ensure that participants in the intervention and comparison groups are similar with respect to both known and unknown prognostic factors. Differences in outcomes of interest between the different groups can then in principle be ascribed to the causal effect of the intervention. 1

Causal inferences from randomised trials can, however, be undermined by flaws in design, conduct, analyses, and reporting, leading to underestimation or overestimation of the true intervention effect (bias). 2 However, it is usually impossible to know the extent to which biases have affected the results of a particular trial.

Systematic reviews aim to collate and synthesise all studies that meet prespecified eligibility criteria 3 using methods that attempt to minimise bias. To obtain reliable conclusions, review authors must carefully consider the potential limitations of the included studies. The notion of study “quality” is not well defined but relates to the extent to which its design, conduct, analysis, and presentation were appropriate to answer its research question. Many tools for assessing the quality of randomised trials are available, including scales (which score the trials) and checklists (which assess trials without producing a score). 4 5 6 7 Until recently, Cochrane reviews used a variety of these tools, mainly checklists. 8 In 2005 the Cochrane Collaboration’s methods groups embarked on a new strategy for assessing the quality of randomised trials. In this paper we describe the collaboration’s new risk of bias assessment tool, and the process by which it was developed and evaluated.

Development of the risk of bias assessment tool

In May 2005, 16 statisticians, epidemiologists, and review authors attended a three day meeting to develop the new tool. Before the meeting, JPTH and DGA compiled an extensive list of potential sources of bias in clinical trials. The items on the list were divided into seven areas: generation of the allocation sequence; concealment of the allocation sequence; blinding; attrition and exclusions; other generic sources of bias; biases specific to the trial design (such as crossover or cluster randomised trials); and biases that might be specific to a clinical specialty. For each of the seven areas, a nominated meeting participant prepared a review of the empirical evidence, a discussion of specific issues and uncertainties, and a proposed set of criteria for assessing protection from bias as adequate, inadequate, or unclear, supported by examples.

During the meeting decisions were made by informal consensus regarding items that were truly potential biases rather than sources of heterogeneity or imprecision. Potential biases were then divided into domains, and strategies for their assessment were agreed, again by informal consensus, leading to the creation of a new tool for assessing potential for bias. Meeting participants also discussed how to summarise assessments across domains, how to illustrate assessments, and how to incorporate assessments into analyses and conclusions. Minutes of the meeting were transcribed from an audio recording in conjunction with written notes.

After the meeting, pairs of authors developed detailed criteria for each included item in the tool and guidance for assessing the potential for bias. Documents were shared and feedback requested from the whole working group (including six who could not attend the meeting). Several email iterations took place, which also incorporated feedback from presentations of the proposed guidance at various meetings and workshops within the Cochrane Collaboration and from pilot work by selected review teams in collaboration with members of the working group. The materials were integrated by the co-leads into comprehensive guidance on the new risk of bias tool. This was published in February 2008 and adopted as the recommended method throughout the Cochrane Collaboration. 9

Evaluation phase

A three stage project to evaluate the tool was initiated in early 2009. A series of focus groups was held in which review authors who had used the tool were asked to reflect on their experiences. Findings from the focus groups were then fed into the design of questionnaires for use in three online surveys of review authors who had used the tool, review authors who had not used the tool (to explore why not), and editorial teams within the collaboration. We held a meeting to discuss the findings from the focus groups and surveys and to consider revisions to the first version of the risk of bias tool. This was attended by six participants from the 2005 meeting and 17 others, including statisticians, epidemiologists, coordinating editors and other staff of Cochrane review groups, and the editor in chief of the Cochrane Library .

The risk of bias tool

At the 2005 workshop the participants agreed the seven principles on which the new risk of bias assessment tool was based (box).

Principles for assessing risk of bias

1. Do not use quality scales

Quality scales and resulting scores are not an appropriate way to appraise clinical trials. They tend to combine assessments of aspects of the quality of reporting with aspects of trial conduct, and to assign weights to different items in ways that are difficult to justify. Both theoretical considerations 10 and empirical evidence 11 suggest that associations of different scales with intervention effect estimates are inconsistent and unpredictable

2. Focus on internal validity

The internal validity of a study is the extent to which it is free from bias. It is important to separate assessment of internal validity from that of external validity (generalisability or applicability) and precision (the extent to which study results are free from random error). Applicability depends on the purpose for which the study is to be used and is less relevant without internal validity. Precision depends on the number of participants and events in a study. A small trial with low risk of bias may provide very imprecise results, with a wide confidence interval. Conversely, the results of a large trial may be precise (narrow confidence interval) but have a high risk of bias if internal validity is poor

3. Assess the risk of bias in trial results, not the quality of reporting or methodological problems that are not directly related to risk of bias

The quality of reporting, such as whether details were described or not, affects the ability of systematic review authors and users of medical research to assess the risk of bias but is not directly related to the risk of bias. Similarly, some aspects of trial conduct, such as obtaining ethical approval or calculating sample size, are not directly related to the risk of bias. Conversely, results of a trial that used the best possible methods may still be at risk of bias. For example, blinding may not be feasible in many non-drug trials, and it would not be reasonable to consider the trial as low quality because of the absence of blinding. Nonetheless, many types of outcome may be influenced by participants’ knowledge of the intervention received, and so the trial results for such outcomes may be considered to be at risk of bias because of the absence of blinding, despite this being impossible to achieve

4. Assessments of risk of bias require judgment

Assessment of whether a particular aspect of trial conduct renders its results at risk of bias requires both knowledge of the trial methods and a judgment about whether those methods are likely to have led to a risk of bias. We decided that the basis for bias assessments should be made explicit, by recording the aspects of the trial methods on which the judgment was based and then the judgment itself

5. Choose domains to be assessed based on a combination of theoretical and empirical considerations

Empirical studies show that particular aspects of trial conduct are associated with bias. 2 12 However, these studies did not include all potential sources of bias. For example, available evidence does not distinguish between different aspects of blinding (of participants, health professionals, and outcome assessment) and is very limited with regard to how authors dealt with incomplete outcome data. There may also be topic specific and design specific issues that are relevant only to some trials and reviews. For example, in a review containing crossover trials it might be appropriate to assess whether results were at risk of bias because there was an insufficient “washout” period between the two treatment periods

6. Focus on risk of bias in the data as represented in the review rather than as originally reported

Some papers may report trial results that are considered as at high risk of bias, for which it may be possible to derive a result at low risk of bias. For example, a paper that inappropriately excluded certain patients from analyses might report the intervention groups and outcomes for these patients, so that the omitted participants can be reinstated

7. Report outcome specific evaluations of risk of bias

Some aspects of trial conduct (for example, whether the randomised allocation was concealed at the time the participant was recruited) apply to the trial as a whole. For other aspects, however, the risk of bias is inherently specific to different outcomes within the trial. For example, all cause mortality might be ascertained through linkages to death registries (low risk of bias), while recurrence of cancer might have been assessed by a doctor with knowledge of the allocated intervention (high risk of bias)

The risk of bias tool covers six domains of bias: selection bias, performance bias, detection bias, attrition bias, reporting bias, and other bias. Within each domain, assessments are made for one or more items, which may cover different aspects of the domain, or different outcomes. Table 1 shows the recommended list of items. These are discussed in more detail in the appendix on bmj.com.

Table 1 Cochrane Collaboration’s tool for assessing risk of bias (adapted from Higgins and Altman 13 )

Bias domain | Source of bias | Support for judgment | Review authors’ judgment (assess as low, unclear or high risk of bias)
Selection bias | Random sequence generation | Describe the method used to generate the allocation sequence in sufficient detail to allow an assessment of whether it should produce comparable groups | Selection bias (biased allocation to interventions) due to inadequate generation of a randomised sequence
Selection bias | Allocation concealment | Describe the method used to conceal the allocation sequence in sufficient detail to determine whether intervention allocations could have been foreseen before or during enrolment | Selection bias (biased allocation to interventions) due to inadequate concealment of allocations before assignment
Performance bias | Blinding of participants and personnel* | Describe all measures used, if any, to blind trial participants and researchers from knowledge of which intervention a participant received. Provide any information relating to whether the intended blinding was effective | Performance bias due to knowledge of the allocated interventions by participants and personnel during the study
Detection bias | Blinding of outcome assessment* | Describe all measures used, if any, to blind outcome assessment from knowledge of which intervention a participant received. Provide any information relating to whether the intended blinding was effective | Detection bias due to knowledge of the allocated interventions by outcome assessors
Attrition bias | Incomplete outcome data* | Describe the completeness of outcome data for each main outcome, including attrition and exclusions from the analysis. State whether attrition and exclusions were reported, the numbers in each intervention group (compared with total randomised participants), reasons for attrition or exclusions where reported, and any reinclusions in analyses for the review | Attrition bias due to amount, nature, or handling of incomplete outcome data
Reporting bias | Selective reporting | State how selective outcome reporting was examined and what was found | Reporting bias due to selective outcome reporting
Other bias | Anything else, ideally prespecified | State any important concerns about bias not covered in the other domains in the tool | Bias due to problems not covered elsewhere

*Assessments should be made for each main outcome or class of outcomes.

For each item in the tool, the assessment of risk of bias is in two parts. The support for judgment provides a succinct free text description or summary of the relevant trial characteristic on which judgments of risk of bias are based and aims to ensure transparency in how judgments are reached. For example, the item about concealment of the randomised allocation sequence would provide details of what measures were in place, if any, to conceal the sequence. Information for these descriptions will often come from a single published trial report but may be obtained from a mixture of trial reports, protocols, published comments on the trial, and contacts with the investigators. The support for the judgment should provide a summary of known facts, including verbatim quotes where possible. The source of this information should be stated, and when there is no information on which to base a judgment, this should be stated.

The second part of the tool involves assigning a judgment of high, low, or unclear risk of material bias for each item. We define material bias as bias of sufficient magnitude to have a notable effect on the results or conclusions of the trial, recognising the subjectivity of any such judgment. Detailed criteria for making judgments about the risk of bias from each of the items in the tool are available in the Cochrane Handbook. 13 If insufficient detail is reported of what happened in the trial, the judgment will usually be unclear risk of bias. A judgment of unclear risk should also be made if what happened in the trial is known but the associated risk of bias is unknown—for example, if participants take additional drugs of unknown effectiveness as a result of them being aware of their intervention assignment. We recommend that judgments be made independently by at least two people, with any discrepancies resolved by discussion in the first instance.

Some of the items in the tool, such as methods for randomisation, require only a single assessment for each trial included in the review. For other items, such as blinding and incomplete outcome data, two or more assessments may be used because they generally need to be made separately for different outcomes (or for the same outcome at different time points). However, we recommend that review authors limit the number of assessments used by grouping outcomes—for example, as subjective or objective for the purposes of assessing blinding of outcome assessment or as “patient reported at 6 months” or “patient reported at 12 months” for assessing risk of bias due to incomplete outcome data.

Evaluation of initial implementation

The first (2008) version of the tool was slightly different from the one we present here. The 2008 version did not categorise biases by the six domains (selection bias, performance bias, etc); had a single assessment for blinding; and expressed risk of bias in the format “yes,” “no,” or “unclear” (referring to lack of a risk) rather than as low, high, or unclear risk. The 2010 evaluation of the initial version found wide acceptance of the need for the risk of bias tool, with a consensus that it represents an improvement over methods previously recommended by the Collaboration or widely used in systematic reviews.

Participants in the focus groups noted that the tool took longer to complete than previous methods. Of 187 authors surveyed, 88% took longer than 10 minutes to complete the new tool, 44% longer than 20 minutes, and 7% longer than an hour, but 83% considered the time taken acceptable. There was consensus that classifying items in the tool according to categories of bias (selection bias, performance bias, etc) would help users, so we introduced these. There was also consensus that assessment of blinding should be separated into blinding of participants and health professionals (performance bias) and blinding of outcome assessment (detection bias) and that the phrasing of the judgments about risk should be changed to low, high, and unclear risk. The domains reported to be the most difficult to assess were risk of bias due to incomplete outcome data and selective reporting of outcomes. There was agreement that improved training materials and availability of worked examples would increase the quality and reliability of bias assessments.

Presentation of assessments

Results of an assessment of risk of bias can be presented in a table, in which judgments for each item in each trial are presented alongside their descriptive justification. Table 2 presents an example of a risk of bias table for one trial included in a Cochrane review of therapeutic monitoring of antiretrovirals for people with HIV. 14 Risks of bias due to blinding and incomplete outcome data were assessed across all outcomes within each included study, rather than separately for different outcomes, as will be more appropriate in some situations.

Table 2 Example of risk of bias table from a Cochrane review 14

Bias | Authors’ judgment | Support for judgment
Random sequence generation (selection bias) | Low risk | Quote: “Randomization was one to one with a block of size 6. The list of randomization was obtained using the SAS procedure plan at the data statistical analysis centre”
Allocation concealment (selection bias) | Unclear risk | The randomisation list was created at the statistical data centre, but further description of allocation is not included
Blinding of participants and researchers (performance bias) | High risk | Open label
Blinding of outcome assessment (detection bias) | High risk | Open label
Incomplete outcome data (attrition bias) | Low risk | Losses to follow-up were disclosed and the analyses were conducted using, firstly, a modified intention to treat analysis in which missing=failures and, secondly, on an observed basis. Although the authors describe an intention to treat analysis, the 139 participants initially randomised were not all included; five were excluded (four withdrew and one had lung cancer diagnosed). This is a reasonable attrition and not expected to affect results. Adequate sample size of 60 per group was achieved
Selective reporting (reporting bias) | Low risk | All prespecified outcomes were reported
Other bias | Unclear risk | No description of the uptake of the therapeutic drug monitoring recommendations by physicians, which could result in performance bias

Presenting risk of bias tables for every study in a review can be cumbersome, and we suggest that illustrations are used to summarise the judgments in the main systematic review document. The figure provides an example. Here the judgments apply to all meta-analyses in the review. An alternative would be to illustrate the risk of bias for a particular meta-analysis (or for a particular outcome if a statistical synthesis is not undertaken), showing the proportion of information that comes from studies at low, unclear, or high risk of bias for each item in the tool, among studies contributing information to that outcome.
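As a rough illustration of this alternative, the sketch below (Python, with invented study names, standard errors and judgments) computes the proportion of inverse-variance weight, one simple way of quantifying “information”, contributed by studies at each level of risk of bias for a single item.

```python
# Sketch (not from the review): proportion of meta-analysis "information"
# (inverse-variance weight) contributed by studies at each risk-of-bias level
# for a single item. All study names and numbers are illustrative.

studies = [
    # (study id, standard error of effect estimate, judgment for the item)
    ("Trial A", 0.10, "low"),
    ("Trial B", 0.25, "unclear"),
    ("Trial C", 0.15, "high"),
]

weights = {sid: 1 / se**2 for sid, se, _ in studies}   # inverse-variance weights
total = sum(weights.values())

proportions = {}
for level in ("low", "unclear", "high"):
    w = sum(weights[sid] for sid, _, judgment in studies if judgment == level)
    proportions[level] = w / total

print(proportions)  # proportion of total weight at each risk-of-bias level
```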


Fig 1 Example presentation of risk of bias assessments for studies in a Cochrane review of therapeutic monitoring of antiretroviral drugs in people with HIV 14

Summary assessment of risk of bias

To draw conclusions about the overall risk of bias within or across trials it is necessary to summarise assessments across items in the tool for each outcome within each trial. In doing this, review authors must decide which domains are most important in the context of the review, ideally when writing their protocol. For example, for highly subjective outcomes such as pain, blinding of participants is critical. The way that summary judgments of risk of bias are reached should be explicit and should be informed by empirical evidence of bias when it exists, likely direction of bias, and likely magnitude of bias. Table 3 provides a suggested framework for making summary assessments of the risk of bias for important outcomes within and across trials.

Table 3 Approach to formulating summary assessments of risk of bias for each important outcome (across domains) within and across trials (adapted from Higgins and Altman 13 )

Risk of bias | Interpretation | Within a trial | Across trials
Low risk of bias | Bias, if present, is unlikely to alter the results seriously | Low risk of bias for all key domains | Most information is from trials at low risk of bias
Unclear risk of bias | A risk of bias that raises some doubt about the results | Low or unclear risk of bias for all key domains | Most information is from trials at low or unclear risk of bias
High risk of bias | Bias may alter the results seriously | High risk of bias for one or more key domains | The proportion of information from trials at high risk of bias is sufficient to affect the interpretation of results
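The within-trial logic of Table 3 can be expressed compactly. The sketch below assumes the review authors have already selected their key domains and judged each as low, unclear or high risk of bias for the outcome of interest.

```python
# Sketch of the within-trial summary logic in Table 3, assuming the review
# authors have already chosen their "key domains" and judged each as
# "low", "unclear" or "high" risk of bias for a given outcome.

def summarise_within_trial(key_domain_judgments):
    """Return a summary risk-of-bias judgment for one outcome in one trial."""
    if any(j == "high" for j in key_domain_judgments):
        return "high"       # high risk of bias for one or more key domains
    if all(j == "low" for j in key_domain_judgments):
        return "low"        # low risk of bias for all key domains
    return "unclear"        # low or unclear risk of bias for all key domains

print(summarise_within_trial(["low", "low", "unclear"]))  # -> "unclear"
```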

Assessments of risk of bias and synthesis of results

Summary assessments of the risk of bias for an outcome within each trial should inform the meta-analysis. The two preferable analytical strategies are to restrict the primary meta-analysis to studies at low risk of bias or to present meta-analyses stratified according to risk of bias. The choice between these strategies should be based on the context of the particular review and the balance between the potential for bias and the loss of precision when studies at high or unclear risk of bias are excluded. Meta-regression can be used to compare results from studies at high and low risk of bias, but such comparisons lack power, 15 and lack of a significant difference should not be interpreted as implying the absence of bias.
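As a minimal illustration of the two preferred strategies, the sketch below pools invented effect estimates with a simple fixed-effect inverse-variance method (fixed-effect pooling is used here purely for brevity, not as a recommendation), first restricting to studies at low risk of bias and then stratifying by risk of bias.

```python
# Sketch of the two preferred strategies, using a simple fixed-effect
# inverse-variance pooled estimate. Effect estimates (e.g. log odds ratios),
# standard errors and summary risk-of-bias judgments are illustrative.
import math

studies = [
    # (effect estimate, standard error, summary risk of bias)
    (-0.30, 0.12, "low"),
    (-0.45, 0.20, "high"),
    (-0.10, 0.15, "low"),
    (-0.60, 0.25, "unclear"),
]

def pool(subset):
    """Fixed-effect inverse-variance pooled estimate and its standard error."""
    weights = [1 / se**2 for _, se, _ in subset]
    est = sum(w * y for w, (y, _, _) in zip(weights, subset)) / sum(weights)
    return est, math.sqrt(1 / sum(weights))

# Strategy 1: restrict the primary meta-analysis to studies at low risk of bias
print("low risk only:", pool([s for s in studies if s[2] == "low"]))

# Strategy 2: present meta-analyses stratified according to risk of bias
for level in ("low", "unclear", "high"):
    subset = [s for s in studies if s[2] == level]
    if subset:
        print(level, pool(subset))
```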

A third strategy is to present a meta-analysis of all studies while providing a summary of the risk of bias across studies. However, this runs the risk that bias is downplayed in the discussion and conclusions of a review, so that decisions continue to be based, at least in part, on flawed evidence. This risk could be reduced by incorporating summary assessments into broader, but explicit, measures of the quality of evidence for each important outcome, for example using the GRADE system. 16 This can help to ensure that judgments about the risk of bias, as well as other factors affecting the quality of evidence (such as imprecision, heterogeneity, and publication bias), are considered when interpreting the results of systematic reviews. 17 18

Discrepancies between the results of different systematic reviews examining the same question 19 20 and between meta-analyses and subsequent large trials 21 have shown that the results of meta-analyses can be biased, which may be partly caused by biased results in the trials they include. We believe our risk of bias tool is one of the most comprehensive approaches to assessing the potential for bias in randomised trials included in systematic reviews or meta-analyses. Inclusion of details of trial conduct, on which judgments of risk of bias are based, provides greater transparency than previous approaches, allowing readers to decide whether they agree with the judgments made. There is continuing uncertainty, and great variation in practice, over how to assess potential for bias in specific domains within trials, how to summarise bias assessments across such domains, and how to incorporate bias assessments into meta-analyses.

A recent study has found that the tool takes longer to complete than other tools (the investigators took a mean of 8.8 minutes per person for a single predetermined outcome using our tool compared with 1.5 minutes for a previous rating scale for quality of reporting). 22 The reliability of the tool has not been extensively studied, although the same authors noted that larger effect sizes were observed on average in studies rated as at high risk of bias compared with studies at low risk of bias. 22

By explicitly incorporating judgments into the tool, we acknowledge that agreements between assessors may not be as high as for some other tools. However, we also explicitly target the risk of bias rather than reported characteristics of the trial. It would be easier to assess whether a drop-out rate exceeds 20% than whether a drop-out rate of 21% introduces an important risk of bias, but there is no guarantee that results from a study with a drop-out rate lower than 20% are at low risk of bias. Preliminary evidence suggests that incomplete outcome data and selective reporting are the most difficult items to assess; kappa measures of agreement of 0.32 (fair) and 0.13 (slight) respectively have been reported for these. 22 It is important that guidance and training materials continue to be developed for all aspects of the tool, but particularly these two.

We hope that widespread adoption and implementation of the risk of bias tool, both within and outside the Cochrane Collaboration, will facilitate improved appraisal of evidence by healthcare decision makers and patients and ultimately lead to better healthcare. Improved understanding of the ways in which flaws in trial conduct may bias their results should also lead to better trials and more reliable evidence. Risk of bias assessments should continue to evolve, taking into account any new empirical evidence and the practical experience of authors of systematic reviews.

Summary points

  • Systematic reviews should carefully consider the potential limitations of the studies included
  • The Cochrane Collaboration has developed a new tool for assessing risk of bias in randomised trials
  • The tool separates a judgment about risk of bias from a description of the support for that judgment, for a series of items covering different domains of bias

Web Extra. Further details on the items included in the risk of bias tool

Contributors: All authors contributed to the drafting and editing of the manuscript. JPTH, DGA, PCG, PJ, DM, ADO, KFS and JACS contributed to the chapter in the Cochrane Handbook for Systematic Reviews of Interventions on which the paper is based. JPTH will act as guarantor.

Development meeting participants (May 2005) : Doug Altman (co-lead), Gerd Antes, Chris Cates, Jon Deeks, Peter Gøtzsche, Julian Higgins (co-lead), Sally Hopewell, Peter Jüni (organising committee), Steff Lewis, Philippa Middleton, David Moher (organising committee), Andy Oxman, Ken Schulz (organising committee), Nandi Siegfried, Jonathan Sterne, Simon Thompson.

Other contributors to tool development : Hilda Bastian, Rachelle Buchbinder, Iain Chalmers, Miranda Cumpston, Sally Green, Peter Herbison, Victor Montori, Hannah Rothstein, Georgia Salanti, Guido Schwarzer, Ian Shrier, Jayne Tierney, Ian White and Paula Williamson.

Evaluation meeting participants (March 2010) : Doug Altman (organising committee), Elaine Beller, Sally Bell-Syer, Chris Cates, Rachel Churchill, June Cody, Jonathan Cook, Christian Gluud, Julian Higgins (organising committee), Sally Hopewell, Hayley Jones, Peter Jüni, Monica Kjeldstrøm, Toby Lasserson, Allyson Lipp, Lara Maxwell, Joanne McKenzie, Craig Ramsey, Barney Reeves, Jelena Savović (co-lead), Jonathan Sterne (co-lead), David Tovey, Laura Weeks (organising committee).

Other contributors to tool evaluation : Isabelle Boutron, David Moher (organising committee), Lucy Turner.

Funding: The development and evaluation of the risk of bias tool was funded in part by The Cochrane Collaboration. The views expressed in this article are those of the authors and not necessarily those of The Cochrane Collaboration or its registered entities, committees or working groups. JPTH was also funded by MRC grant number U.1052.00.011. DGA was funded by Cancer Research UK grant number C-5592. DM was funded by a University Research Chair (University of Ottawa). The Canadian Institutes of Health Research provides financial support to the Cochrane Bias Methods Group.

Competing interests: All authors have completed the ICMJE unified disclosure form at www.icmje.org/coi_disclosure.pdf (available on request from the corresponding author) and declare support from the Cochrane Collaboration for the development and evaluation of the tool described; they have no financial relationships with any organisations that might have an interest in the submitted work in the previous three years and no other relationships or activities that could appear to have influenced the submitted work.

Provenance and peer review: Not commissioned; externally peer reviewed.

Cite this as: BMJ 2011;343:d5928

Chapter 8: Assessing risk of bias in a randomized trial

Julian PT Higgins, Jelena Savović, Matthew J Page, Roy G Elbers, Jonathan AC Sterne

Key Points:

  • This chapter details version 2 of the Cochrane risk-of-bias tool for randomized trials (RoB 2), the recommended tool for use in Cochrane Reviews.
  • RoB 2 is structured into a fixed set of domains of bias, focusing on different aspects of trial design, conduct and reporting.
  • Each assessment using the RoB 2 tool focuses on a specific result from a randomized trial.
  • Within each domain, a series of questions (‘signalling questions’) aim to elicit information about features of the trial that are relevant to risk of bias.
  • A judgement about the risk of bias arising from each domain is proposed by an algorithm, based on answers to the signalling questions. Judgements can be ‘Low’, or ‘High’ risk of bias, or can express ‘Some concerns’.
  • Answers to signalling questions and judgements about risk of bias should be supported by written justifications.
  • The overall risk of bias for the result is the least favourable assessment across the domains of bias. Both the proposed domain-level and overall risk-of-bias judgements can be overridden by the review authors, with justification.

Cite this chapter as: Higgins JPT, Savović J, Page MJ, Elbers RG, Sterne JAC. Chapter 8: Assessing risk of bias in a randomized trial. In: Higgins JPT, Thomas J, Chandler J, Cumpston M, Li T, Page MJ, Welch VA (editors). Cochrane Handbook for Systematic Reviews of Interventions version 6.4 (updated August 2023). Cochrane, 2023. Available from www.training.cochrane.org/handbook .

8.1 Introduction

Cochrane Reviews include an assessment of the risk of bias in each included study (see Chapter 7 for a general discussion of this topic). When randomized trials are included, the recommended tool is the revised version of the Cochrane tool, known as RoB 2, described in this chapter. The RoB 2 tool provides a framework for assessing the risk of bias in a single result (an estimate of the effect of an experimental intervention compared with a comparator intervention on a particular outcome) from any type of randomized trial.

The RoB 2 tool is structured into domains through which bias might be introduced into the result. These domains were identified based on both empirical evidence and theoretical considerations. This chapter summarizes the main features of RoB 2 applied to individually randomized parallel-group trials. It describes the process of undertaking an assessment using the RoB 2 tool, summarizes the important issues for each domain of bias, and ends with a list of the key differences between RoB 2 and the earlier version of the tool. Variants of the RoB 2 tool specific to cluster-randomized trials and crossover trials are summarized in Chapter 23 .

The full guidance document for the RoB 2 tool is available at www.riskofbias.info : it summarizes the empirical evidence underlying the tool and provides detailed explanations of the concepts covered and guidance on implementation.

8.2 Overview of RoB 2

8.2.1 Selecting which results to assess within the review

Before starting an assessment of risk of bias, authors will need to select which specific results from the included trials to assess. Because trials usually contribute multiple results to a systematic review, several risk-of-bias assessments may be needed for each trial, although it is unlikely to be feasible to assess every result for every trial in the review. It is important not to select results to assess based on the likely judgements arising from the assessment. An approach that focuses on the main outcomes of the review (the results contributing to the review’s ‘Summary of findings’ table) may be the most appropriate approach (see also Chapter 7, Section 7.3.2 ).

8.2.2 Specifying the nature of the effect of interest: ‘intention-to-treat’ effects versus ‘per-protocol’ effects

Assessments for one of the RoB 2 domains, ‘Bias due to deviations from intended interventions’, differ according to whether review authors are interested in quantifying:

  • the effect of assignment to the interventions at baseline, regardless of whether the interventions are received as intended (the ‘intention-to-treat effect’); or
  • the effect of adhering to the interventions as specified in the trial protocol (the ‘per-protocol effect’) (Hernán and Robins 2017).

If some patients do not receive their assigned intervention or deviate from the assigned intervention after baseline, these effects will differ, and will each be of interest. For example, the estimated effect of assignment to intervention would be the most appropriate to inform a health policy question about whether to recommend an intervention in a particular health system (e.g. whether to instigate a screening programme, or whether to prescribe a new cholesterol-lowering drug), whereas the estimated effect of adhering to the intervention as specified in the trial protocol would be the most appropriate to inform a care decision by an individual patient (e.g. whether to be screened, or whether to take the new drug). Review authors should define the intervention effect in which they are interested, and apply the risk-of-bias tool appropriately to this effect.

The effect of principal interest should be specified in the review protocol: most systematic reviews are likely to address the question of assignment rather than adherence to intervention. On occasion, review authors may be interested in both effects of interest.

The effect of assignment to intervention should be estimated by an intention-to-treat (ITT) analysis that includes all randomized participants (Fergusson et al 2002). The principles of ITT analyses are (Piantadosi 2005, Menerit 2012):

  • analyse participants in the intervention groups to which they were randomized, regardless of the interventions they actually received; and
  • include all randomized participants in the analysis, which requires measuring all participants’ outcomes.

An ITT analysis maintains the benefit of randomization: that, on average, the intervention groups do not differ at baseline with respect to measured or unmeasured prognostic factors. Note that the term ‘intention-to-treat’ does not have a consistent definition and is used inconsistently in study reports (Hollis and Campbell 1999, Gravel et al 2007, Bell et al 2014).

Patients and other stakeholders are often interested in the effect of adhering to the intervention as described in the trial protocol (the ‘per-protocol effect’), because it relates most closely to the implications of their choice between the interventions. However, two approaches to estimation of per-protocol effects that are commonly used in randomized trials may be seriously biased. These are:

  • ‘as-treated’ analyses in which participants are analysed according to the intervention they actually received, even if their randomized allocation was to a different treatment group; and
  • naïve ‘per-protocol’ analyses restricted to individuals who adhered to their assigned interventions.

Each of these analyses is problematic because prognostic factors may influence whether individuals adhere to their assigned intervention. If deviations are present, it is still possible to use data from a randomised trial to derive an unbiased estimate of the effect of adhering to intervention (Hernán and Robins 2017). However, appropriate methods require strong assumptions and published applications of such methods are relatively rare to date. When authors wish to assess the risk of bias in the estimated effect of adhering to intervention, use of results based on modern statistical methods may be at lower risk of bias than results based on ‘as-treated’ or naïve per-protocol analyses.
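The sketch below, using a toy dataset with illustrative field names, contrasts how participants are grouped (or excluded) under intention-to-treat, ‘as-treated’ and naive ‘per-protocol’ approaches. It illustrates only who is analysed in which group, not how per-protocol effects should be estimated.

```python
# Sketch contrasting how analysis populations are formed, using a toy dataset.
# Field names are illustrative; this does not estimate per-protocol effects,
# it only shows who is included and how participants are grouped.

participants = [
    # assigned group, group actually received, adhered to assigned intervention?
    {"id": 1, "assigned": "experimental", "received": "experimental", "adhered": True},
    {"id": 2, "assigned": "experimental", "received": "comparator",   "adhered": False},
    {"id": 3, "assigned": "comparator",   "received": "comparator",   "adhered": True},
    {"id": 4, "assigned": "comparator",   "received": "experimental", "adhered": False},
]

# ITT: everyone analysed in the group to which they were randomized
itt = {p["id"]: p["assigned"] for p in participants}

# 'As-treated': grouped by the intervention actually received (breaks randomization)
as_treated = {p["id"]: p["received"] for p in participants}

# Naive 'per-protocol': restricted to participants who adhered to their assignment
naive_pp = {p["id"]: p["assigned"] for p in participants if p["adhered"]}

print(itt, as_treated, naive_pp, sep="\n")
```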

Trial authors often estimate the effect of intervention using more than one approach. They may not explain the reasons for their choice of analysis approach, or whether their aim is to estimate the effect of assignment or adherence to intervention. We recommend that when the effect of interest is that of assignment to intervention, the trial result included in meta-analyses, and assessed for risk of bias, should be chosen according to the following order of preference:

  • the result corresponding to a full ITT analysis, as defined above;
  • the result corresponding to an analysis (sometimes described as a ‘modified intention-to-treat’ (mITT) analysis) that adheres to ITT principles except that participants with missing outcome data are excluded (see Section 8.4.2 ; such an analysis does not prevent bias due to missing outcome data, which is addressed in the corresponding domain of the risk-of-bias assessment);
  • a result corresponding to an ‘as-treated’ or naïve ‘per-protocol’ analysis, or an analysis from which eligible trial participants were excluded.

8.2.3 Domains of bias and how they are addressed

The domains included in RoB 2 cover all types of bias that are currently understood to affect the results of randomized trials. These are:

  • bias arising from the randomization process;
  • bias due to deviations from intended interventions;
  • bias due to missing outcome data;
  • bias in measurement of the outcome; and
  • bias in selection of the reported result.

Each domain is required, and no additional domains should be added. Table 8.2.a summarizes the issues addressed within each bias domain.

For each domain, the tool comprises:

  • a series of ‘signalling questions’;
  • a judgement about risk of bias for the domain, which is facilitated by an algorithm that maps responses to the signalling questions to a proposed judgement;
  • free text boxes to justify responses to the signalling questions and risk-of-bias judgements; and
  • an option to predict (and explain) the likely direction of bias.

The signalling questions aim to provide a structured approach to eliciting information relevant to an assessment of risk of bias. They seek to be reasonably factual in nature, but some may require a degree of judgement. The response options are:

  • Yes;
  • Probably yes;
  • Probably no;
  • No;
  • No information.

To maximize their simplicity and clarity, the signalling questions are phrased such that a response of ‘Yes’ may indicate either a low or high risk of bias, depending on the most natural way to ask the question. Responses of ‘Yes’ and ‘Probably yes’ have the same implications for risk of bias, as do responses of ‘No’ and ‘Probably no’. The definitive responses (‘Yes’ and ‘No’) would typically imply that firm evidence is available in relation to the signalling question; the ‘Probably’ versions would typically imply that a judgement has been made. Although not required, if review authors wish to calculate measures of agreement (e.g. kappa statistics) for the answers to the signalling questions, we recommend treating ‘Yes’ and ‘Probably yes’ as the same response, and ‘No’ and ‘Probably no’ as the same response.
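For review authors who do wish to quantify agreement, the sketch below computes Cohen’s kappa between two assessors’ answers to a single signalling question after collapsing the response categories as recommended above; the answer lists are invented for illustration.

```python
# Sketch: Cohen's kappa between two assessors' answers to one signalling
# question, collapsing 'Yes'/'Probably yes' and 'No'/'Probably no'.
# The answer lists below are illustrative.

def collapse(answer):
    if answer in ("Yes", "Probably yes"):
        return "Y"
    if answer in ("No", "Probably no"):
        return "N"
    return "NI"  # No information

def cohens_kappa(a, b):
    pairs = list(zip(map(collapse, a), map(collapse, b)))
    n = len(pairs)
    categories = {"Y", "N", "NI"}
    p_observed = sum(x == y for x, y in pairs) / n
    p_expected = sum(
        (sum(x == c for x, _ in pairs) / n) * (sum(y == c for _, y in pairs) / n)
        for c in categories
    )
    return (p_observed - p_expected) / (1 - p_expected)

assessor1 = ["Yes", "Probably yes", "No", "No information", "Probably no"]
assessor2 = ["Probably yes", "Yes", "Probably no", "No", "No"]
print(round(cohens_kappa(assessor1, assessor2), 2))
```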

The ‘No information’ response should be used only when both (1) insufficient details are reported to permit a response of ‘Yes’, ‘Probably yes’, ‘No’ or ‘Probably no’, and (2) in the absence of these details it would be unreasonable to respond ‘Probably yes’ or ‘Probably no’ given the circumstances of the trial. For example, in the context of a large trial run by an experienced clinical trials unit for regulatory purposes, if specific information about the randomization methods is absent, it may still be reasonable to respond ‘Probably yes’ rather than ‘No information’ to the signalling question about allocation sequence concealment.

The implications of a ‘No information’ response to a signalling question differ according to the purpose of the question. If the question seeks to identify evidence of a problem, then ‘No information’ corresponds to no evidence of that problem. If the question relates to an item that is expected to be reported (such as whether any participants were lost to follow-up), then the absence of information leads to concerns about there being a problem.

A response option ‘Not applicable’ is available for signalling questions that are answered only if the response to a previous question implies that they are required.

Signalling questions should be answered independently: the answer to one question should not affect answers to other questions in the same or other domains other than through determining which subsequent questions are answered.

Once the signalling questions are answered, the next step is to reach a risk-of-bias judgement , and assign one of three levels to each domain:

  • Low risk of bias;
  • Some concerns; or
  • High risk of bias.

The RoB 2 tool includes algorithms that map responses to signalling questions to a proposed risk-of-bias judgement for each domain (see the full documentation at www.riskofbias.info for details). The algorithms include specific mappings of each possible combination of responses to the signalling questions (including responses of ‘No information’) to judgements of low risk of bias, some concerns or high risk of bias.
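The sketch below illustrates the general idea of such a mapping with a deliberately simplified, hypothetical pair of questions. It is not the published RoB 2 algorithm, which should be taken from the full guidance at www.riskofbias.info.

```python
# Illustrative only: a simplified, hypothetical mapping from signalling-question
# responses to a proposed domain-level judgement. It is NOT the published RoB 2
# algorithm, which maps every combination of responses (see www.riskofbias.info).

YES = {"Yes", "Probably yes"}
NO = {"No", "Probably no"}

def proposed_judgement(q_adequate_method, q_evidence_of_problem):
    """Two hypothetical questions: whether an adequate method was used,
    and whether there is evidence of a resulting problem."""
    if q_evidence_of_problem in YES:
        return "High risk of bias"
    if q_adequate_method in YES and q_evidence_of_problem in NO:
        return "Low risk of bias"
    return "Some concerns"      # includes 'No information' responses

print(proposed_judgement("Probably yes", "No"))     # Low risk of bias
print(proposed_judgement("No information", "No"))   # Some concerns
```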

Use of the word ‘judgement’ is important for the risk-of-bias assessment. The algorithms provide proposed judgements, but review authors should verify these and change them if they feel this is appropriate. In reaching final judgements, review authors should interpret ‘risk of bias’ as ‘risk of material bias’. That is, concerns should be expressed only about issues that are likely to affect the ability to draw reliable conclusions from the study.

A free text box alongside the signalling questions and judgements provides space for review authors to present supporting information for each response. In some instances, when the same information is likely to be used to answer more than one question, one text box covers more than one signalling question. Brief, direct quotations from the text of the study report should be used whenever possible. It is important that reasons are provided for any judgements that do not follow the algorithms. The tool also provides space to indicate all the sources of information about the study obtained to inform the judgements (e.g. published papers, trial registry entries, additional information from the study authors).

RoB 2 includes optional judgements of the direction of the bias for each domain and overall. For some domains, the bias is most easily thought of as being towards or away from the null. For example, high levels of switching of participants from their assigned intervention to the other intervention may have the effect of reducing the observed difference between the groups, leading to the estimated effect of adhering to intervention (see Section 8.2.2 ) being biased towards the null. For other domains, the bias is likely to favour one of the interventions being compared, implying an increase or decrease in the effect estimate depending on which intervention is favoured. Examples include manipulation of the randomization process, awareness of interventions received influencing the outcome assessment and selective reporting of results. If review authors do not have a clear rationale for judging the likely direction of the bias, they should not guess it and can leave this response blank.

Table 8.2.a Bias domains included in version 2 of the Cochrane risk-of-bias tool for randomized trials, with a summary of the issues addressed

[Table body not reproduced here: for each of the five bias domains, the table summarizes the specific issues addressed by that domain’s signalling questions.]
* For the precise wording of signalling questions and guidance for answering each one, see the full risk-of-bias tool at www.riskofbias.info.

8.2.4 Reaching an overall risk-of-bias judgement for a result

The response options for an overall risk-of-bias judgement are the same as for individual domains. Table 8.2.b shows the approach to mapping risk-of-bias judgements within domains to an overall judgement for the outcome.

Judging a result to be at a particular level of risk of bias for an individual domain implies that the result has an overall risk of bias at least this severe. Therefore, a judgement of ‘High’ risk of bias within any domain should have similar implications for the result, irrespective of which domain is being assessed. In practice this means that if the answers to the signalling questions yield a proposed judgement of ‘High’ risk of bias, the assessors should consider whether any identified problems are of sufficient concern to warrant this judgement for that result overall. If this is not the case, the appropriate action would be to override the proposed default judgement and provide justification. ‘Some concerns’ in multiple domains may lead review authors to decide on an overall judgement of ‘High’ risk of bias for that result or group of results.

Once an overall judgement has been reached for an individual trial result, this information will need to be presented in the review and reflected in the analysis and conclusions. For discussion of the presentation of risk-of-bias assessments and how they can be incorporated into analyses, see Chapter 7 . Risk-of-bias assessments also feed into one domain of the GRADE approach for assessing certainty of a body of evidence, as discussed in Chapter 14 .

Table 8.2.b Reaching an overall risk-of-bias judgement for a specific outcome

Overall risk-of-bias judgement | Criteria
Low risk of bias | The trial is judged to be at low risk of bias for all domains for this result.
Some concerns | The trial is judged to raise some concerns in at least one domain for this result, but not to be at high risk of bias for any domain.
High risk of bias | The trial is judged to be at high risk of bias in at least one domain for this result. Or: The trial is judged to have some concerns for multiple domains in a way that substantially lowers confidence in the result.
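The aggregation described above and in Table 8.2.b can be sketched as follows. The threshold for escalating ‘Some concerns’ in multiple domains to ‘High’ is an assumption made for illustration, since RoB 2 leaves that decision to the review authors’ judgement.

```python
# Sketch of the overall aggregation described in Table 8.2.b and the text above:
# the overall judgement is the least favourable domain-level judgement, with an
# option to escalate 'Some concerns' in multiple domains to 'High'. The
# escalation threshold (three or more domains) is an illustrative assumption.

ORDER = {"Low": 0, "Some concerns": 1, "High": 2}

def overall_judgement(domain_judgements, escalate_if_some_concerns_in=3):
    worst = max(domain_judgements, key=lambda j: ORDER[j])
    n_concerns = sum(j == "Some concerns" for j in domain_judgements)
    if worst == "Some concerns" and n_concerns >= escalate_if_some_concerns_in:
        return "High"   # many concerns substantially lower confidence in the result
    return worst

print(overall_judgement(["Low", "Some concerns", "Low", "Low", "Low"]))  # Some concerns
print(overall_judgement(["Some concerns"] * 3 + ["Low", "Low"]))         # High
```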

8.3 Bias arising from the randomization process

If successfully accomplished, randomization avoids the influence of either known or unknown prognostic factors (factors that predict the outcome, such as severity of illness or presence of comorbidities) on the assignment of individual participants to intervention groups. This means that, on average, each intervention group has the same prognosis before the start of intervention. If prognostic factors influence the intervention group to which participants are assigned then the estimated effect of intervention will be biased by ‘confounding’, which occurs when there are common causes of intervention group assignment and outcome. Confounding is an important potential cause of bias in intervention effect estimates from observational studies, because treatment decisions in routine care are often influenced by prognostic factors.

To randomize participants into a study, an allocation sequence that specifies how participants will be assigned to interventions is generated, based on a process that includes an element of chance. We call this allocation sequence generation . Subsequently, steps must be taken to prevent participants or trial personnel from knowing the forthcoming allocations until after recruitment has been confirmed. This process is often termed allocation sequence concealment .

Knowledge of the next assignment (e.g. if the sequence is openly posted on a bulletin board) can enable selective enrolment of participants on the basis of prognostic factors. Participants who would have been assigned to an intervention deemed to be ‘inappropriate’ may be rejected. Other participants may be directed to the ‘appropriate’ intervention, which can be accomplished by delaying their entry into the trial until the desired allocation appears. For this reason, successful allocation sequence concealment is a vital part of randomization.

Some review authors confuse allocation sequence concealment with blinding of assigned interventions during the trial. Allocation sequence concealment seeks to prevent bias in intervention assignment by preventing trial personnel and participants from knowing the allocation sequence before and until assignment. It can always be successfully implemented, regardless of the study design or clinical area (Schulz et al 1995, Jüni et al 2001). In contrast, blinding seeks to prevent bias after assignment (Jüni et al 2001, Schulz et al 2002) and cannot always be implemented. This is often the situation, for example, in trials comparing surgical with non-surgical interventions.

8.3.1 Approaches to sequence generation

Randomization with no constraints is called simple randomization or unrestricted randomization. Sometimes blocked randomization (restricted randomization) is used to ensure that the desired ratio of participants in the experimental and comparator intervention groups (e.g. 1:1) is achieved (Schulz and Grimes 2002, Schulz and Grimes 2006). This is done by ensuring that the number of participants assigned to each intervention group is balanced within blocks of specified size (e.g. for every 10 consecutively entered participants): the specified number of allocations to experimental and comparator intervention groups is assigned in random order within each block. If the block size is known to trial personnel and the intervention group is revealed after assignment, then the last allocation within each block can always be predicted. To avoid this problem, multiple block sizes may be used and randomly varied (random permuted blocks).
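A minimal sketch of generating an allocation sequence with randomly permuted blocks for a 1:1 two-arm trial is shown below; the block sizes and seed are illustrative choices.

```python
# Sketch: generating an allocation sequence using randomly permuted blocks of
# varying size for a 1:1 two-arm trial. Block sizes and seed are illustrative.
import random

def permuted_block_sequence(n_participants, block_sizes=(4, 6), seed=2023):
    rng = random.Random(seed)
    sequence = []
    while len(sequence) < n_participants:
        size = rng.choice(block_sizes)           # randomly varied block size
        block = ["experimental", "comparator"] * (size // 2)
        rng.shuffle(block)                       # random order within the block
        sequence.extend(block)
    return sequence[:n_participants]

print(permuted_block_sequence(12))
```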

Stratified randomization , in which randomization is performed separately within subsets of participants defined by potentially important prognostic factors, such as disease severity and study centres, is also common. In practice, stratified randomization is usually performed together with blocked randomization. The purpose of combining these two procedures is to ensure that experimental and comparator groups are similar with respect to the specified prognostic factors other than intervention. If simple (rather than blocked) randomization is used in each stratum, then stratification offers no benefit, but the randomization is still valid.

Another approach that incorporates both general concepts of stratification and restricted randomization is minimization . Minimization algorithms assign the next intervention in a way that achieves the best balance between intervention groups in relation to a specified set of prognostic factors. Minimization generally includes a random element (at least for participants enrolled when the groups are balanced with respect to the prognostic factors included in the algorithm) and should be implemented along with clear strategies for allocation sequence concealment. Some methodologists are cautious about the acceptability of minimization, while others consider it to be an attractive approach (Brown et al 2005, Clark et al 2016).
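The sketch below illustrates the core idea of minimization for a two-arm trial with two illustrative prognostic factors: each new participant is assigned to the arm that would produce the smaller total imbalance, with a biased coin providing the random element. The factors and the assignment probability are assumptions for illustration.

```python
# Sketch of minimization for a two-arm trial: each new participant is assigned
# to whichever arm gives the smaller total imbalance across the specified
# prognostic factors, with a random element (a biased coin) so that allocation
# is not fully deterministic. Factors and probability are illustrative.
import random

rng = random.Random(7)
arms = ("experimental", "comparator")
# counts[arm][factor][level] = number of participants already assigned
counts = {arm: {"severity": {"mild": 0, "severe": 0},
                "centre": {"A": 0, "B": 0}} for arm in arms}

def imbalance_if_assigned(arm, participant):
    """Total imbalance across factor levels if this participant joined `arm`."""
    total = 0
    for factor, level in participant.items():
        n = {a: counts[a][factor][level] for a in arms}
        n[arm] += 1
        total += abs(n["experimental"] - n["comparator"])
    return total

def assign(participant, p_best=0.8):
    scores = {arm: imbalance_if_assigned(arm, participant) for arm in arms}
    if scores["experimental"] == scores["comparator"]:
        arm = rng.choice(arms)                           # balanced: pure chance
    else:
        best = min(arms, key=scores.get)
        other = arms[1] if best == arms[0] else arms[0]
        arm = best if rng.random() < p_best else other   # biased coin
    for factor, level in participant.items():
        counts[arm][factor][level] += 1
    return arm

print(assign({"severity": "severe", "centre": "A"}))
print(assign({"severity": "severe", "centre": "B"}))
```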

8.3.2 Allocation sequence concealment and failures of randomization

If future assignments can be anticipated, leading to a failure of allocation sequence concealment, then bias can arise through selective enrolment of participants into a study, depending on their prognostic factors. Ways in which this can happen include:

  • knowledge of a deterministic assignment rule, such as by alternation, date of birth or day of admission;
  • knowledge of the sequence of assignments, whether randomized or not (e.g. if a sequence of random assignments is posted on the wall); and
  • ability to predict assignments successfully, based on previous assignments.

The last of these can occur when blocked randomization is used and assignments are known to the recruiter after each participant is enrolled into the trial. It may then be possible to predict future assignments for some participants, particularly when blocks are of a fixed size and are not divided across multiple recruitment centres (Berger 2005).

Attempts to achieve allocation sequence concealment may be undermined in practice. For example, unsealed allocation envelopes may be opened, while translucent envelopes may be held against a bright light to reveal the contents (Schulz et al 1995, Schulz 1995, Jüni et al 2001). Personal accounts suggest that many allocation schemes have been deduced by investigators because the methods of concealment were inadequate (Schulz 1995).

The success of randomization in producing comparable groups is often examined by comparing baseline values of important prognostic factors between intervention groups. Corbett and colleagues have argued that risk-of-bias assessments should consider whether participant characteristics are balanced between intervention groups (Corbett et al 2014). The RoB 2 tool includes consideration of situations in which baseline characteristics indicate that something may have gone wrong with the randomization process. It is important that baseline imbalances that are consistent with chance are not interpreted as evidence of risk of bias. Chance imbalances are not a source of systematic bias, and the RoB 2 tool does not aim to identify imbalances in baseline variables that have arisen due to chance.

8.4 Bias due to deviations from intended interventions

This domain relates to biases that arise when there are deviations from the intended interventions. Such differences could be the administration of additional interventions that are inconsistent with the trial protocol, failure to implement the protocol interventions as intended, or non-adherence by trial participants to their assigned intervention. Biases that arise due to deviations from intended interventions are sometimes referred to as performance biases.

The intended interventions are those specified in the trial protocol. It is often intended that interventions should change or evolve in response to the health of, or events experienced by, trial participants. For example, the investigators may intend that:

  • in a trial of a new drug to control symptoms of rheumatoid arthritis, participants experiencing severe toxicities should receive additional care and/or switch to an alternative drug;
  • in a trial of a specified cancer drug regimen, participants whose cancer progresses should switch to a second-line intervention; or
  • in a trial comparing surgical intervention with conservative management of stable angina, participants who progress to unstable angina receive surgical intervention.

Unfortunately, trial protocols may not fully specify the circumstances in which deviations from the initial intervention should occur, or distinguish changes to intervention that are consistent with the intentions of the investigators from those that should be considered as deviations from the intended intervention. For example, a cancer trial protocol may not define progression, or specify the second-line drug that should be used in patients who progress (Hernán and Scharfstein 2018). It may therefore be necessary for review authors to document changes that are and are not considered to be deviations from intended intervention. Similarly, for trials in which the comparator intervention is ‘usual care’, the protocol may not specify interventions consistent with usual care or whether they are expected to be used alongside the experimental intervention. Review authors may therefore need to document what departures from usual care will be considered as deviations from intended intervention.

8.4.1 Non-protocol interventions

Non-protocol interventions that trial participants might receive during trial follow up and that are likely to affect the outcome of interest can lead to bias in estimated intervention effects. If possible, review authors should specify potential non-protocol interventions in advance (at review protocol writing stage). Non-protocol interventions may be identified through the expert knowledge of members of the review group, via reviews of the literature, and through discussions with health professionals.

8.4.2 The role of the effect of interest

As described in Section 8.2.2 , assessments for this domain depend on the effect of interest. In RoB 2, the only deviations from the intended intervention that are addressed in relation to the effect of assignment to the intervention are those that:

  • are inconsistent with the trial protocol;
  • arise because of the experimental context; and
  • influence the outcome.

For example, in an unblinded study participants may feel unlucky to have been assigned to the comparator group and therefore seek the experimental intervention, or other interventions that improve their prognosis. Similarly, monitoring patients randomized to a novel intervention more frequently than those randomized to standard care would increase the risk of bias, unless such monitoring was an intended part of the novel intervention. Deviations from intervention that do not arise because of the experimental context, such as a patient’s choice to stop taking their assigned medication, are not addressed in the assessment of bias in the effect of assignment to intervention.

To examine the effect of adhering to the interventions as specified in the trial protocol, it is important to specify what types of deviations from the intended intervention will be examined. These will be one or more of:

  • how well the intervention was implemented;
  • how well participants adhered to the intervention (without discontinuing or switching to another intervention);
  • whether non-protocol interventions were received alongside the intended intervention and (if so) whether they were balanced across intervention groups.

If such deviations from the intended intervention are present, review authors should consider whether appropriate statistical methods were used to adjust for their effects.

8.4.3 The role of blinding

Bias due to deviations from intended interventions can sometimes be reduced or avoided by implementing mechanisms that ensure the participants, carers and trial personnel (i.e. people delivering the interventions) are unaware of the interventions received. This is commonly referred to as ‘blinding’, although in some areas (including eye health) the term ‘masking’ is preferred. Blinding, if successful, should prevent knowledge of the intervention assignment from influencing contamination (application of one of the interventions in participants intended to receive the other), switches to non-protocol interventions or non-adherence by trial participants.

Trial reports often describe blinding in broad terms, such as ‘double blind’. This term makes it difficult to know who was blinded (Schulz et al 2002). Such terms are also used inconsistently (Haahr and Hróbjartsson 2006). A review of methods used for blinding highlights the variety of methods used in practice (Boutron et al 2006).

Blinding during a trial can be difficult or impossible in some contexts, for example in a trial comparing a surgical with a non-surgical intervention. Non-blinded (‘open’) trials may take other measures to avoid deviations from intended intervention, such as treating patients according to strict criteria that prevent administration of non-protocol interventions.

Lack of blinding of participants, carers or people delivering the interventions may cause bias if it leads to deviations from intended interventions. For example, low expectations of improvement among participants in the comparator group may lead them to seek and receive the experimental intervention. Such deviations from intended intervention that arise due to the experimental context can lead to bias in the estimated effects of both assignment to intervention and of adhering to intervention.

An attempt to blind participants, carers and people delivering the interventions to intervention group does not ensure successful blinding in practice. For many blinded drug trials, the side effects of the drugs allow the possible detection of the intervention being received for some participants, unless the study compares similar interventions, for example drugs with similar side effects, or uses an active placebo (Boutron et al 2006, Bello et al 2017, Jensen et al 2017).

Deducing the intervention received, for example among participants experiencing side effects that are specific to the experimental intervention, does not in itself lead to a risk of bias. As discussed, cessation of a drug intervention because of toxicity will usually not be considered a deviation from intended intervention. See the elaborations that accompany the signalling questions in the full guidance at www.riskofbias.info for further discussion of this issue.

Risk of bias in this domain may differ between outcomes, even if the same people were aware of intervention assignments during the trial. For example, knowledge of the assigned intervention may affect behaviour (such as number of clinic visits), while not having an important impact on physiology (including risk of mortality).

Blinding of outcome assessors, to avoid bias in measuring the outcome, is considered separately, in the ‘Bias in measurement of outcomes’ domain. Bias due to differential rates of dropout (withdrawal from the study) is considered in the ‘Bias due to missing outcome data’ domain.

8.4.4 Appropriate analyses

For the effect of assignment to intervention, an appropriate analysis should follow the principles of ITT (see Section 8.2.2 ). Some authors may report a ‘modified intention-to-treat’ (mITT) analysis in which participants with missing outcome data are excluded. Such an analysis may be biased because of the missing outcome data: this is addressed in the domain ‘Bias due to missing outcome data’. Note that the phrase ‘modified intention-to-treat’ is used in different ways, and may refer to inclusion of participants who received at least one dose of treatment (Abraha and Montedori 2010); our use of the term refers to missing data rather than to adherence to intervention.

Inappropriate analyses include ‘as-treated’ analyses, naïve ‘per-protocol’ analyses, and other analyses based on post-randomization exclusion of eligible trial participants on whom outcomes were measured (Hernán and Hernandez-Diaz 2012) (see also Section 8.2.2 ).

For the effect of adhering to intervention, appropriate analysis approaches are described by Hernán and Robins (Hernán and Robins 2017). Instrumental variable approaches can be used in some circumstances to estimate the effect of intervention among participants who received the assigned intervention.
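One of the simplest such approaches is the Wald (ratio) estimator, sketched below with invented data: the intention-to-treat effect on the outcome is divided by the effect of assignment on receipt of the intervention. Under additional assumptions, this estimates the effect among participants whose receipt of intervention follows their assignment; it is offered only as an illustration, not a substitute for the methods cited above.

```python
# Sketch of one simple instrumental variable approach (the Wald/ratio estimator),
# using randomized assignment as the instrument: the ITT effect on the outcome is
# divided by the effect of assignment on receipt of the intervention.
# All numbers below are illustrative.

def wald_iv_estimate(assigned_exp, assigned_comp):
    """Each argument is a list of (received_intervention: bool, outcome: float)."""
    def mean(values):
        return sum(values) / len(values)

    itt_effect = mean([y for _, y in assigned_exp]) - mean([y for _, y in assigned_comp])
    uptake_diff = mean([d for d, _ in assigned_exp]) - mean([d for d, _ in assigned_comp])
    return itt_effect / uptake_diff

assigned_exp = [(True, 2.0), (True, 1.5), (False, 1.0), (True, 2.5)]
assigned_comp = [(False, 1.0), (False, 1.2), (True, 1.8), (False, 0.9)]
print(round(wald_iv_estimate(assigned_exp, assigned_comp), 2))  # 1.05
```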

8.5 Bias due to missing outcome data

Missing measurements of the outcome may lead to bias in the intervention effect estimate. Possible reasons for missing outcome data include (National Research Council 2010):

  • participants withdraw from the study or cannot be located (‘loss to follow-up’ or ‘dropout’);
  • participants do not attend a study visit at which outcomes should have been measured;
  • participants attend a study visit but do not provide relevant data;
  • data or records are lost or are unavailable for other reasons; and
  • participants can no longer experience the outcome, for example because they have died.

This domain addresses risk of bias due to missing outcome data, including biases introduced by procedures used to impute, or otherwise account for, the missing outcome data.

Some participants may be excluded from an analysis for reasons other than missing outcome data. In particular, a naïve ‘per-protocol’ analysis is restricted to participants who received the intended intervention. Potential bias introduced by such analyses, or by other exclusions of eligible participants for whom outcome data are available, is addressed in the domain ‘Bias due to deviations from intended interventions’ (see Section 8.4 ).

The ITT principle of measuring outcome data on all participants (see Section 8.2.2 ) is frequently difficult or impossible to achieve in practice. Therefore, it can often only be followed by making assumptions about the missing outcome values. Even when an analysis is described as ITT, it may exclude participants with missing outcome data and be at risk of bias (such analyses may be described as ‘modified intention-to-treat’ (mITT) analyses). Therefore, assessments of risk of bias due to missing outcome data should be based on the issues addressed in the signalling questions for this domain, and not on the way that trial authors described the analysis.

8.5.1 When do missing outcome data lead to bias?

Analyses excluding individuals with missing outcome data are examples of ‘complete-case’ analyses (analyses restricted to individuals in whom there were no missing values of included variables). To understand when missing outcome data lead to bias in such analyses, we need to consider:

  • the true value of the outcome in participants with missing outcome data: this is the value of the outcome that should have been measured but was not; and
  • the missingness mechanism, which is the process that led to outcome data being missing.

Whether missing outcome data lead to bias in complete-case analyses depends on whether the missingness mechanism is related to the true value of the outcome. Equivalently, we can consider whether the measured (non-missing) outcomes differ systematically from the missing outcomes (the true values in participants with missing outcome data). For example, consider a trial of cognitive behavioural therapy compared with usual care for depression. If participants who are more depressed are less likely to return for follow-up, then whether a measurement of depression is missing depends on its true value, which implies that the measured depression outcomes will differ systematically from the true values of the missing depression outcomes.

The specific situations in which a complete-case analysis suffers from bias (when there are missing data) are discussed in detail in the full guidance for the RoB 2 tool at www.riskofbias.info. In brief (see also the illustrative sketch after this list):

  • missing outcome data will not lead to bias if missingness in the outcome is unrelated to its true value, within each intervention group;
  • missing outcome data will lead to bias if missingness in the outcome depends on both the intervention group and the true value of the outcome; and
  • missing outcome data will often lead to bias if missingness is related to its true value and, additionally, the effect of the experimental intervention differs from that of the comparator intervention.
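
These conditions can be made concrete with a small simulation. The sketch below is our illustration (the data-generating assumptions are invented; it assumes numpy is available): a complete-case analysis of a continuous outcome with a true mean difference of -2.0 is unbiased when missingness differs between arms but is unrelated to the true outcome value within each arm, and biased when missingness depends on both the arm and the true value.

    # Illustrative sketch (hypothetical data): when does a complete-case analysis go wrong?
    import numpy as np

    rng = np.random.default_rng(7)
    n = 200_000
    arm = rng.integers(0, 2, n)                     # 0 = comparator, 1 = experimental
    y = 10.0 - 2.0 * arm + rng.normal(0, 3, n)      # true mean difference = -2.0

    def complete_case_effect(p_missing):
        """Mean difference restricted to participants whose outcome was observed."""
        observed = rng.random(n) > p_missing
        return y[observed & (arm == 1)].mean() - y[observed & (arm == 0)].mean()

    # (a) More missingness in the experimental arm, but unrelated to the true value
    #     within each arm: the complete-case estimate remains close to -2.0.
    p_a = np.where(arm == 1, 0.30, 0.10)
    print(f"(a) unrelated to true value within arms: {complete_case_effect(p_a):+.2f}")

    # (b) Missingness depends on both arm and the true value (poor outcomes in the
    #     experimental arm are often missing): the complete-case estimate is biased.
    p_b = np.where((arm == 1) & (y > 10), 0.60, 0.10)
    print(f"(b) depends on arm and true value:       {complete_case_effect(p_b):+.2f}")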

8.5.2 When is the amount of missing outcome data small enough to exclude bias?

It is tempting to classify risk of bias according to the proportion of participants with missing outcome data. Unfortunately, there is no sensible threshold for ‘small enough’ in relation to the proportion of missing outcome data.

In situations where missing outcome data lead to bias, the extent of bias will increase as the amount of missing outcome data increases. There is a tradition of regarding a proportion of less than 5% missing outcome data as ‘small’ (with corresponding implications for risk of bias), and over 20% as ‘large’. However, the potential impact of missing data on estimated intervention effects depends on the proportion of participants with missing data, the type of outcome and (for dichotomous outcomes) the risk of the event. For example, consider an intervention group of 1000 participants in which the observed mortality is 2% for the 900 participants with outcome data (18 deaths). Even though only 10% of the data are missing, if the mortality rate in the 100 participants with missing data is 20% (20 deaths), the true mortality of the whole intervention group (3.8%) would be nearly double that estimated from the observed data (2%).
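
The arithmetic behind this example is set out below, using the figures from the text.

    # Worked arithmetic for the example above (figures from the text).
    n_total, n_observed = 1000, 900
    deaths_observed = round(0.02 * n_observed)              # 18 deaths among those with outcome data
    deaths_missing = round(0.20 * (n_total - n_observed))   # 20 deaths among the 100 without
    observed_rate = deaths_observed / n_observed                 # 2.0%
    true_rate = (deaths_observed + deaths_missing) / n_total     # 3.8%
    print(f"mortality estimated from observed data: {observed_rate:.1%}")
    print(f"true mortality of the whole group:      {true_rate:.1%}")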

8.5.3 Judging risk of bias due to missing outcome data

It is not possible to examine directly whether the chance that the outcome is missing depends on its true value: judgements of risk of bias will depend on the circumstances of the trial. Therefore, we can only be sure that there is no bias due to missing outcome data when: (1) the outcome is measured in all participants; (2) the proportion of missing outcome data is sufficiently low that any bias is too small to be of importance; or (3) sensitivity analyses (conducted by either the trial authors or the review authors) confirm that plausible values of the missing outcome data could make no important difference to the estimated intervention effect.

Indirect evidence that missing outcome data are likely to cause bias can come from examining: (1) differences between the proportion of missing outcome data in the experimental and comparator intervention groups; and (2) reasons that outcome data are missing.

If the effects of the experimental and comparator interventions on the outcome are different, and missingness in the outcome depends on its true value, then the proportion of participants with missing data is likely to differ between the intervention groups. Therefore, differing proportions of missing outcome data in the experimental and comparator intervention groups provide evidence of potential bias.

Trial reports may provide reasons why participants have missing data. For example, trials of haloperidol to treat dementia reported various reasons such as ‘lack of efficacy’, ‘adverse experience’, ‘positive response’, ‘withdrawal of consent’, ‘patient ran away’ and ‘patient sleeping’ (Higgins et al 2008). It is likely that some of these (e.g. ‘lack of efficacy’ and ‘positive response’) are related to the true values of the missing outcome data. Therefore, these reasons increase the risk of bias if the effects of the experimental and comparator interventions differ, or if the reasons are related to intervention group (e.g. ‘adverse experience’).

In practice, our ability to assess risk of bias will be limited by the extent to which trial authors collected and reported reasons that outcome data were missing. The situation most likely to lead to bias is when reasons for missing outcome data differ between the intervention groups: for example if participants who became seriously unwell withdrew from the comparator group while participants who recovered withdrew from the experimental intervention group.

Trial authors may present statistical analyses (in addition to or instead of complete case analyses) that attempt to address the potential for bias caused by missing outcome data. Approaches include single imputation (e.g. assuming the participant had no event; last observation carried forward), multiple imputation and likelihood-based methods (see Chapter 10 , Section 10.12.2). Imputation methods are unlikely to remove or reduce the bias that occurs when missingness in the outcome depends on its true value, unless they use information additional to intervention group assignment to predict the missing values. Review authors may attempt to address missing data using sensitivity analyses, as discussed in Chapter 10, Section 10.12.3 .
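
As a simple illustration of the kind of sensitivity analysis a review author might perform, the sketch below recomputes a risk ratio under two extreme assumptions about participants with missing outcome data. The counts are hypothetical and the extreme-case imputations are deliberately implausible; they merely bound how much the missing data could matter. Chapter 10, Section 10.12.3 describes more refined approaches.

    # Best-/worst-case bounds for a dichotomous outcome (hypothetical counts).
    events_exp, observed_exp, missing_exp = 30, 180, 20     # experimental arm
    events_con, observed_con, missing_con = 45, 175, 25     # comparator arm

    def risk_ratio(e1, n1, e0, n0):
        return (e1 / n1) / (e0 / n0)

    # Complete-case estimate: participants with missing outcomes are simply excluded.
    rr_cc = risk_ratio(events_exp, observed_exp, events_con, observed_con)

    # Extreme scenario favouring the experimental intervention: no missing participant
    # in the experimental arm had the event, every missing comparator participant did.
    rr_best = risk_ratio(events_exp, observed_exp + missing_exp,
                         events_con + missing_con, observed_con + missing_con)

    # Extreme scenario against the experimental intervention.
    rr_worst = risk_ratio(events_exp + missing_exp, observed_exp + missing_exp,
                          events_con, observed_con + missing_con)

    print(f"complete-case risk ratio: {rr_cc:.2f}")
    print(f"range under extreme assumptions: {rr_best:.2f} to {rr_worst:.2f}")

If the extremes straddle clinically important thresholds, as they do with these made-up numbers, the result cannot be considered robust to the missing data; if they do not, the missing data are unlikely to have introduced important bias.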

8.6 Bias in measurement of the outcome

Errors in measurement of outcomes can bias intervention effect estimates. These are often referred to as measurement error (for continuous outcomes), misclassification (for dichotomous or categorical outcomes) or under-ascertainment/over-ascertainment (for events). Measurement errors may be differential or non-differential in relation to intervention assignment:

  • Differential measurement errors are related to intervention assignment. Such errors are systematically different between the experimental and comparator intervention groups and are less likely when outcome assessors are blinded to intervention assignment.
  • Non-differential measurement errors are unrelated to intervention assignment.

This domain relates primarily to differential errors. Non-differential measurement errors are not addressed in detail.
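
The consequence of differential error can be illustrated with a small simulation that echoes the example of extra investigations described under consideration 2 below. The sketch is our own (the event risk and ascertainment sensitivities are invented; it assumes numpy is available): with no true difference in event rates, equally incomplete ascertainment in both arms leaves the comparison fair, whereas more complete ascertainment in the experimental arm produces a spurious apparent harm.

    # Illustrative sketch: non-differential vs differential outcome ascertainment.
    import numpy as np

    rng = np.random.default_rng(11)
    n = 500_000
    arm = rng.integers(0, 2, n)
    true_event = rng.random(n) < 0.10       # 10% true risk in BOTH arms: no true effect

    def risk_difference(detected):
        return detected[arm == 1].mean() - detected[arm == 0].mean()

    # Non-differential: 70% of events ascertained, regardless of arm.
    non_diff = true_event & (rng.random(n) < 0.70)

    # Differential: 95% ascertained in the experimental arm (extra investigations)
    # versus 70% in the comparator arm.
    diff = true_event & (rng.random(n) < np.where(arm == 1, 0.95, 0.70))

    print(f"true risk difference:    {risk_difference(true_event):+.3f}")   # ~0
    print(f"non-differential error:  {risk_difference(non_diff):+.3f}")     # ~0
    print(f"differential error:      {risk_difference(diff):+.3f}")         # spurious excess risk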

Risk of bias in this domain depends on the following five considerations.

1. Whether the method of measuring the outcome is appropriate. Outcomes in randomized trials should be assessed using appropriate outcome measures. For example, portable blood glucose machines used by trial participants may not reliably measure below 3.1 mmol/L, leading to an inability to detect differences in rates of severe hypoglycaemia between an insulin intervention and placebo, and to under-representation of the true incidence of this adverse effect. Such a measurement would be inappropriate for this outcome.

2. Whether measurement or ascertainment of the outcome differs, or could differ, between intervention groups. The methods used to measure or ascertain outcomes should be the same across intervention groups. This is usually the case for pre-specified outcomes, but problems may arise with passive collection of outcome data, as is often the case for unexpected adverse effects. For example, in a placebo-controlled trial, severe headaches may occur more frequently in participants assigned to a new drug than in those assigned to placebo. The headaches lead to more MRI scans being done in the experimental intervention group, and therefore to more diagnoses of symptomless brain tumours, even though the drug does not increase the incidence of brain tumours. Even for a pre-specified outcome measure, the nature of the intervention may lead to methods of measuring the outcome that are not comparable across intervention groups. For example, an intervention involving additional visits to a healthcare provider may lead to additional opportunities for outcome events to be identified, compared with the comparator intervention.

3. Who the outcome assessor is. The outcome assessor can be:

  • the participant, when the outcome is a participant-reported outcome such as pain, quality of life, or responses to a self-completed questionnaire;
  • the intervention provider, when the outcome is the result of a clinical examination, the occurrence of a clinical event or a therapeutic decision such as decision to offer a surgical intervention; or
  • an observer not directly involved in the intervention provided to the participant, such as an adjudication committee, or a health professional recording outcomes for inclusion in disease registries.

4. Whether the outcome assessor is blinded to intervention assignment. Blinding of outcome assessors is often possible even when blinding of participants and personnel during the trial is not feasible. However, it is particularly difficult for participant-reported outcomes: for example, in a trial comparing surgery with medical management when the outcome is pain at 3 months. The potential for bias cannot be ignored even if the outcome assessor cannot be blinded.

5. Whether the assessment of outcome is likely to be influenced by knowledge of intervention received. For trials in which outcome assessors were not blinded, the risk of bias will depend on whether the outcome assessment involves judgement, which depends on the type of outcome. We describe most situations in Table 8.6.a .

Table 8.6.a Considerations of risk of bias in measurement of the outcome for different types of outcomes

Participant-reported outcomes

  Definition: Reports coming directly from participants about how they function or feel in relation to a health condition or intervention, without interpretation by anyone else. They include any evaluation obtained directly from participants through interviews, self-completed questionnaires or hand-held devices.
  Examples: Pain, nausea and health-related quality of life.
  Outcome assessor: The outcome assessor is the participant, even if a blinded interviewer is questioning the participant and completing a questionnaire on their behalf.
  Risk of bias considerations: The outcome assessment is potentially influenced by knowledge of intervention received, leading to a judgement of at least ‘Some concerns’. Review authors will need to judge whether it is likely that participants’ reporting of the outcome was influenced by knowledge of intervention received, in which case risk of bias is considered high.

Observer-reported outcomes not involving judgement

  Definition: Outcomes reported by an external observer (e.g. an intervention provider, independent researcher, or radiologist) that do not involve any judgement from the observer.
  Examples: All-cause mortality or the result of an automated test.
  Outcome assessor: The outcome assessor is the observer.
  Risk of bias considerations: The assessment of outcome is usually not influenced by knowledge of intervention received.

Observer-reported outcomes involving some judgement

  Definition: Outcomes reported by an external observer (e.g. an intervention provider, independent researcher, or radiologist) that involve some judgement.
  Examples: Assessment of an X-ray or other image, clinical examination, and clinical events other than death (e.g. myocardial infarction) that require judgements on clinical definitions or medical records.
  Outcome assessor: The outcome assessor is the observer.
  Risk of bias considerations: The assessment of outcome is potentially influenced by knowledge of intervention received, leading to a judgement of at least ‘Some concerns’. Review authors will need to judge whether it is likely that assessment of the outcome was influenced by knowledge of intervention received, in which case risk of bias is considered high.

Outcomes that reflect decisions made by the intervention provider

  Definition: Outcomes that reflect decisions made by the intervention provider, where recording of the decisions does not involve any judgement, but where the decision itself can be influenced by knowledge of intervention received.
  Examples: Hospitalization, stopping treatment, referral to a different ward, performing a caesarean section, stopping ventilation and discharge of the participant.
  Outcome assessor: The outcome assessor is the intervention provider.
  Risk of bias considerations: Assessment of outcome is usually influenced by knowledge of intervention received when the care provider is aware of the assigned intervention. This is particularly important when preferences or expectations regarding the effect of the experimental intervention are strong.

Composite outcomes

  Definition: Combination of multiple endpoints into a single outcome. Typically, participants who have experienced any of a specified set of endpoints are considered to have experienced the composite outcome. Composite endpoints can also be constructed from continuous outcome measures.
  Examples: Major adverse cardiac and cerebrovascular events.
  Outcome assessor: Any of the above.
  Risk of bias considerations: Assessment of risk of bias for composite outcomes should take into account the frequency or contribution of each component and the risk of bias due to the most influential components.

8.7 Bias in selection of the reported result

This domain addresses bias that arises because the reported result is selected (based on its direction, magnitude or statistical significance) from among multiple intervention effect estimates that were calculated by the trial authors. Consideration of risk of bias requires distinction between:

  • an outcome domain : this is a state or endpoint of interest, irrespective of how it is measured (e.g. presence or severity of depression);
  • a specific outcome measurement (e.g. measurement of depression using the Hamilton rating scale 6 weeks after starting intervention); and
  • an outcome analysis : this is a specific result obtained by analysing one or more outcome measurements (e.g. the difference in mean change in Hamilton rating scale scores from baseline to 6 weeks between experimental and comparator groups).

This domain does not address bias due to selective non-reporting (or incomplete reporting) of outcome domains that were measured and analysed by the trial authors (Kirkham et al 2010). For example, deaths of trial participants may be recorded by the trialists, but the reports of the trial might contain no data for deaths, or state only that the effect estimate for mortality was not statistically significant. Such bias puts the result of a synthesis at risk because results are omitted based on their direction, magnitude or statistical significance. It should therefore be addressed at the review level, as part of an integrated assessment of the risk of reporting bias (Page and Higgins 2016). For further guidance, see Chapter 7 and Chapter 13 .

Bias in selection of the reported result typically arises from a desire for findings to support vested interests or to be sufficiently noteworthy to merit publication. It can arise for both harms and benefits, although the motivations may differ. For example, in trials comparing an experimental intervention with placebo, trialists who have a preconception or vested interest in showing that the experimental intervention is beneficial and safe may be inclined to be selective in reporting efficacy estimates that are statistically significant and favourable to the experimental intervention, along with harm estimates that are not significantly different between groups. In contrast, other trialists may selectively report harm estimates that are statistically significant and unfavourable to the experimental intervention if they believe that publicizing the existence of a harm will increase their chances of publishing in a high impact journal.

This domain considers:

1. Whether the trial was analysed in accordance with a pre-specified plan that was finalized before unblinded outcome data were available for analysis. We strongly encourage review authors to attempt to retrieve the pre-specified analysis intentions for each trial (see Chapter 7, Section 7.3.1 ). Doing so allows for the identification of any outcome measures or analyses that have been omitted from, or added to, the results report, post hoc. Review authors should ideally ask the study authors to supply the study protocol and full statistical analysis plan if these are not publicly available. In addition, if outcome measures and analyses mentioned in an article, protocol or trial registration record are not reported, study authors could be asked to clarify whether those outcome measures were in fact analysed and, if so, to supply the data.

Trial protocols should describe how unexpected adverse outcomes (that potentially reflect unanticipated harms) will be collected and analysed. However, results based on spontaneously reported adverse outcomes may lead to concerns that these were selected based on the finding being noteworthy.

For some trials, the analysis intentions will not be readily available. It is still possible to assess the risk of bias in selection of the reported result. For example, outcome measures and analyses listed in the methods section of an article can be compared with those reported. Furthermore, outcome measures and analyses should be compared across different papers describing the trial.

2. Selective reporting of a particular outcome measurement (based on the results) from among estimates for multiple measurements assessed within an outcome domain. Examples include:

  • reporting only one or a subset of time points at which the outcome was measured;
  • use of multiple measurement instruments (e.g. pain scales) and only reporting data for the instrument with the most favourable result;
  • having multiple assessors measure an outcome domain (e.g. clinician-rated and patient-rated depression scales) and only reporting data for the measure with the most favourable result; and
  • reporting only the most favourable subscale (or a subset of subscales) for an instrument when measurements for other subscales were available.

3. Selective reporting of a particular analysis (based on the results) from multiple analyses estimating intervention effects for a specific outcome measurement. Examples include:

  • carrying out analyses of both change scores and post-intervention scores adjusted for baseline and reporting only the more favourable analysis;
  • multiple analyses of a particular outcome measurement with and without adjustment for prognostic factors (or with adjustment for different sets of prognostic factors);
  • a continuously scaled outcome converted to categorical data on the basis of multiple cut-points; and
  • effect estimates generated for multiple composite outcomes with full reporting of just one or a subset.

Either type of selective reporting will lead to bias if selection is based on the direction, magnitude or statistical significance of the effect estimate.
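
The bias introduced by result-based selection can be demonstrated with a simple simulation. The construction below is ours (the choice of five analyses and their correlation are arbitrary; it assumes numpy is available): each simulated trial has no true effect, but five correlated effect estimates are available, for example from different scales, time points or adjustment sets. Always reporting a single pre-specified analysis gives an unbiased picture, whereas reporting whichever estimate is most favourable exaggerates the average reported effect.

    # Illustrative sketch: selecting the most favourable of several correlated estimates.
    import numpy as np

    rng = np.random.default_rng(3)
    n_trials, n_analyses = 5_000, 5

    # Rows are trials; columns are correlated effect estimates with true effect = 0.
    shared = rng.normal(0, 1, (n_trials, 1))
    estimates = 0.7 * shared + 0.7 * rng.normal(0, 1, (n_trials, n_analyses))

    honest = estimates[:, 0]            # the pre-specified analysis, reported regardless of result
    selected = estimates.max(axis=1)    # the most favourable of the five is reported

    print(f"mean reported effect, pre-specified analysis:   {honest.mean():+.2f}")   # ~0
    print(f"mean reported effect, most favourable selected: {selected.mean():+.2f}") # well above 0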

Insufficient detail in some documents may preclude full assessment of the risk of bias (e.g. trialists only state in the trial registry record that they will measure ‘pain’, without specifying the measurement scale, time point or metric that will be used). Review authors should indicate insufficient information alongside their responses to signalling questions.

8.8 Differences from the previous version of the tool

Version 2 of the tool replaces the first version, originally published in version 5 of the Handbook in 2008, and updated in 2011 (Higgins et al 2011). Research in the field has progressed, and RoB 2 reflects current understanding of how the causes of bias can influence study results, and the most appropriate ways to assess this risk.

Authors familiar with the previous version of the tool, which is used widely in Cochrane and other systematic reviews, will notice several changes:

  • assessment of bias is at the level of an individual result, rather than at a study or outcome level;
  • the names given to the bias domains describe more clearly the issues targeted and should reduce confusion arising from terms that are used in different ways or may be unfamiliar (such as ‘selection bias’ and ‘performance bias’) (Mansournia et al 2017);
  • signalling questions have been introduced, along with algorithms to assist authors in reaching a judgement about risk of bias for each domain;
  • a distinction is introduced between considering the effect of assignment to intervention and the effect of adhering to intervention, with implications for the assessment of bias due to deviations from intended interventions;
  • the assessment of bias arising from the exclusion of participants from the analysis (for example, as part of a naïve ‘per-protocol’ analysis) is under the domain of bias due to deviations from the intended intervention, rather than bias due to missing outcome data;
  • the concept of selective reporting of a result is distinguished from that of selective non-reporting of a result, with the latter concept removed from the tool so that it can be addressed (more appropriately) at the level of the synthesis (see Chapter 13 );
  • the option to add new domains has been removed;
  • an explicit process for reaching a judgement about the overall risk of bias in the result has been introduced.

Because most Cochrane Reviews published before 2019 used the first version of the tool, authors working on updating these reviews should refer to online Chapter IV for guidance on considering whether to change methodology when updating a review.

8.9 Chapter information

Authors: Julian PT Higgins, Jelena Savović, Matthew J Page, Roy G Elbers, Jonathan AC Sterne

Acknowledgements: Contributors to the development of bias domains were: Natalie Blencowe, Isabelle Boutron, Christopher Cates, Rachel Churchill, Mark Corbett, Nicky Cullum, Jonathan Emberson, Sally Hopewell, Asbjørn Hróbjartsson, Sharea Ijaz, Peter Jüni, Jamie Kirkham, Toby Lasserson, Tianjing Li, Barney Reeves, Sasha Shepperd, Ian Shrier, Lesley Stewart, Kate Tilling, Ian White, Penny Whiting. Other contributors were: Henning Keinke Andersen, Vincent Cheng, Mike Clarke, Jon Deeks, Miguel Hernán, Daniela Junqueira, Yoon Loke, Geraldine MacDonald, Alexandra McAleenan, Richard Morris, Mona Nasser, Nishith Patel, Jani Ruotsalainen, Holger Schünemann, Jayne Tierney, Sunita Vohra, Liliane Zorzela.

Funding: Development of RoB 2 was supported by the Medical Research Council (MRC) Network of Hubs for Trials Methodology Research (MR/L004933/2- N61) hosted by the MRC ConDuCT-II Hub (Collaboration and innovation for Difficult and Complex randomised controlled Trials In Invasive procedures – MR/K025643/1), by a Methods Innovation Fund grant from Cochrane and by MRC grant MR/M025209/1 . JPTH and JACS are members of the National Institute for Health Research (NIHR) Biomedical Research Centre at University Hospitals Bristol NHS Foundation Trust and the University of Bristol, and the MRC Integrative Epidemiology Unit at the University of Bristol. JPTH, JS and JACS are members of the NIHR Collaboration for Leadership in Applied Health Research and Care West (CLAHRC West) at University Hospitals Bristol NHS Foundation Trust. JPTH and JACS received funding from NIHR Senior Investigator awards NF-SI-0617-10145 and NF-SI-0611-10168, respectively. MJP received funding from an Australian National Health and Medical Research Council (NHMRC) Early Career Fellowship (1088535). The views expressed are those of the authors and not necessarily those of the National Health Service, the NIHR, the UK Department of Health and Social Care, the MRC or the Australian NHMRC.

8.10 References

Abraha I, Montedori A. Modified intention to treat reporting in randomised controlled trials: systematic review. BMJ 2010; 340 : c2697.

Bell ML, Fiero M, Horton NJ, Hsu CH. Handling missing data in RCTs; a review of the top medical journals. BMC Medical Research Methodology 2014; 14 : 118.

Bello S, Moustgaard H, Hróbjartsson A. Unreported formal assessment of unblinding occurred in 4 of 10 randomized clinical trials, unreported loss of blinding in 1 of 10 trials. Journal of Clinical Epidemiology 2017; 81 : 42-50.

Berger VW. Quantifying the magnitude of baseline covariate imbalances resulting from selection bias in randomized clinical trials. Biometrical Journal 2005; 47 : 119-127.

Boutron I, Estellat C, Guittet L, Dechartres A, Sackett DL, Hróbjartsson A, Ravaud P. Methods of blinding in reports of randomized controlled trials assessing pharmacologic treatments: a systematic review. PLoS Medicine 2006; 3 : e425.

Brown S, Thorpe H, Hawkins K, Brown J. Minimization--reducing predictability for multi-centre trials whilst retaining balance within centre. Statistics in Medicine 2005; 24 : 3715-3727.

Clark L, Fairhurst C, Torgerson DJ. Allocation concealment in randomised controlled trials: are we getting better? BMJ 2016; 355 : i5663.

Corbett MS, Higgins JPT, Woolacott NF. Assessing baseline imbalance in randomised trials: implications for the Cochrane risk of bias tool. Research Synthesis Methods 2014; 5 : 79-85.

Fergusson D, Aaron SD, Guyatt G, Hebert P. Post-randomisation exclusions: the intention to treat principle and excluding patients from analysis. BMJ 2002; 325 : 652-654.

Gravel J, Opatrny L, Shapiro S. The intention-to-treat approach in randomized controlled trials: are authors saying what they do and doing what they say? Clinical Trials (London, England) 2007; 4 : 350-356.

Haahr MT, Hróbjartsson A. Who is blinded in randomized clinical trials? A study of 200 trials and a survey of authors. Clinical Trials (London, England) 2006; 3 : 360-365.

Hernán MA, Hernandez-Diaz S. Beyond the intention-to-treat in comparative effectiveness research. Clinical Trials (London, England) 2012; 9 : 48-55.

Hernán MA, Robins JM. Per-protocol analyses of pragmatic trials. New England Journal of Medicine 2017; 377 : 1391-1398.

Hernán MA, Scharfstein D. Cautions as Regulators Move to End Exclusive Reliance on Intention to Treat. Annals of Internal Medicine 2018; 168 : 515-516.

Higgins JPT, White IR, Wood AM. Imputation methods for missing outcome data in meta-analysis of clinical trials. Clinical Trials 2008; 5 : 225-239.

Higgins JPT, Altman DG, Gøtzsche PC, Jüni P, Moher D, Oxman AD, Savović J, Schulz KF, Weeks L, Sterne JAC. The Cochrane Collaboration's tool for assessing risk of bias in randomised trials. BMJ 2011; 343 : d5928.

Hollis S, Campbell F. What is meant by intention to treat analysis? Survey of published randomised controlled trials. BMJ 1999; 319 : 670-674.

Jensen JS, Bielefeldt AO, Hróbjartsson A. Active placebo control groups of pharmacological interventions were rarely used but merited serious consideration: a methodological overview. Journal of Clinical Epidemiology 2017; 87 : 35-46.

Jüni P, Altman DG, Egger M. Systematic reviews in health care: Assessing the quality of controlled clinical trials. BMJ 2001; 323 : 42-46.

Kirkham JJ, Dwan KM, Altman DG, Gamble C, Dodd S, Smyth R, Williamson PR. The impact of outcome reporting bias in randomised controlled trials on a cohort of systematic reviews. BMJ 2010; 340 : c365.

Mansournia MA, Higgins JPT, Sterne JAC, Hernán MA. Biases in randomized trials: a conversation between trialists and epidemiologists. Epidemiology 2017; 28 : 54-59.

Meinert CL. Clinical Trials – Design, Conduct, and Analysis. Second Edition. Oxford (UK): Oxford University Press; 2012.

National Research Council. The Prevention and Treatment of Missing Data in Clinical Trials. Panel on Handling Missing Data in Clinical Trials. Committee on National Statistics, Division of Behavioral and Social Sciences and Education . Washington, DC: The National Academies Press; 2010.

Page MJ, Higgins JPT. Rethinking the assessment of risk of bias due to selective reporting: a cross-sectional study. Systematic Reviews 2016; 5 : 108.

Piantadosi S. Clinical Trials: A Methodologic perspective . 2nd ed. Hoboken (NJ): Wiley; 2005.

Schulz KF, Chalmers I, Hayes RJ, Altman DG. Empirical evidence of bias. Dimensions of methodological quality associated with estimates of treatment effects in controlled trials. JAMA 1995; 273 : 408-412.

Schulz KF. Subverting randomization in controlled trials. JAMA 1995; 274 : 1456-1458.

Schulz KF, Grimes DA. Generation of allocation sequences in randomised trials: chance, not choice. Lancet 2002; 359 : 515-519.

Schulz KF, Chalmers I, Altman DG. The landscape and lexicon of blinding in randomized trials. Annals of Internal Medicine 2002; 136 : 254-259.

Schulz KF, Grimes DA. The Lancet Handbook of Essential Concepts in Clinical Research. Edinburgh (UK): Elsevier; 2006.

For permission to re-use material from the Handbook (either academic or commercial), please see here for full details.
