Albatross Plots – video

Sean Harrison, a final year PhD student looking at the associations between individual lifestyle characteristics, prostate cancer and prostate-specific antigen, came to talk to us about his novel method for visualising data when pooling data in meta-analysis is not appropriate (which is actually fairly often – Matthew Page’s recent study found only 63% of systematic reviews present meta-analyses).

Albatross plots are a graphical tool for presenting results of diversely reported studies; in them, P values are plotted against the number of participants in each study, with effect contours added for the specific statistical type the studies used (beta coefficients, odds ratios etc.). This allows you to estimate the overall magnitude of effect and look for heterogeneity even when studies lack the information required for meta-analysis.
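To make the idea of an effect contour concrete, here is a minimal Python sketch. It assumes a standardised mean difference (SMD) from two equal-sized groups and the large-sample approximation Z ≈ SMD × √N / 2. Both the approximation and the function name `contour_p_value` are illustrative assumptions, not Sean's Stata implementation. Each contour is the curve traced by one fixed effect size as the number of participants grows.

```python
import math


def contour_p_value(smd: float, n_total: int) -> float:
    """Two-sided P value implied by a standardised mean difference.

    Assumes two equal-sized groups and a normal approximation,
    Z = SMD * sqrt(N) / 2 (an illustrative simplification).
    """
    z = abs(smd) * math.sqrt(n_total) / 2.0
    # Two-sided P value for a standard normal Z statistic.
    return math.erfc(z / math.sqrt(2.0))


if __name__ == "__main__":
    # One contour per hypothetical effect size: P falls as N grows.
    for smd in (0.1, 0.25, 0.5):
        points = [(n, contour_p_value(smd, n)) for n in (50, 200, 800)]
        print(smd, [(n, round(p, 4)) for n, p in points])
```

A study whose plotted point sits near one of these curves is consistent with roughly that size of effect, which is how the plot lets you read off an approximate magnitude without pooling.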

The video of Sean’s talk, complete with easy-to-follow Stata instructions and a demonstration, is below. Sean talks us through how to create the plots, as well as the different options available to modify them. The Stata help file Sean has written comes with gratifyingly pragmatic detail and practice data.

Put the kettle on and enjoy!

Introducing the ROBINS-I

Assessing the Risk of Bias in non-randomised studies of interventions

Tess Moore

So risk of bias is grippingly exciting and we’ve got the statistics to prove it. The paper describing the ROBINS-I (Risk of Bias in Non-Randomised Studies of Interventions) has been tweeted 363 times and cited 5 times (putting it in the top 5% of all research outputs scored by Altmetric)[1]. And on 17th October 2016 we at the MESS had a great presentation (slides below) from Julian Sterne, Julian Higgins and Barney Reeves, three of its key developers(1).

Why do we need a risk-of-bias tool for non-randomised studies? Surely, when we think about interventions, we should only consider getting our evidence from RCTs?

Well, it is true that RCTs provide us with the best platform for comparing interventions and allow us to remove confounding influences on the treatment effect, and that is why they are considered the gold standard or best evidence for effects of interventions(2). But some interventions cannot be tested with an RCT, and non-randomised studies are often the only studies available to provide evidence on long-term outcomes. They are the place where tweaks and small changes to interventions can be tested, and where the delivery of interventions in real-world settings or to broad-spectrum populations can also be tested out. Often they are the only place where harms and unintended negative effects of interventions can be found. So, despite RCTs being the ‘gold standard’, non-randomised studies are often the working mines where some real evidence “ore” lies.

The work on ROBINS-I was funded by the Cochrane Collaboration Methods Innovation Fund and the Medical Research Council. When starting their work the ROBINS-I team asked Cochrane systematic review teams whether they did, in fact, include non-randomised studies in their systematic reviews. An aside – Cochrane systematic reviews are typically reviews of interventions and usually include only RCTs, so asking them this was a bit like asking traffic police if they ever went above the speed limit, or vegetarians if they occasionally ate a bacon sandwich. They found, unsurprisingly, for all the reasons stated above, that Cochrane review teams did indeed include non-randomised studies. Matt Page’s cross-sectional study of the current state of systematic reviews, as reported a few months ago in the MESS, also found that 9% of Cochrane reviews and 25% of systematic reviews published in 2014 contained non-randomised studies(3). So a means of clearly and responsively assessing bias in non-randomised studies of interventions for use in systematic reviews was needed.

Why would we be excited by the ROBINS-I?  Why not just use one of the existing tools?

Before the ROBINS-I tool was developed, and this paper was published, reviewers had access to a huge number (193) of other scales and checklists for assessing the ‘methodological quality’ of non-randomised studies, as documented by Deeks et al in an evaluation of non-randomised intervention studies(4). Of these, six emerged as useful for systematic reviews(4). While these covered the core domains identified by Deeks et al (such as creation of the intervention group, comparability of groups at analysis, and blinding of participants and investigators), they also included aspects related to reporting of the study and to generalisability (external validity), which are not related to how biased a treatment effect might be – and so are not that useful to the systematic reviewer. And in reality only two scales were used in practice in systematic reviews: the Newcastle Ottawa Scale and the Downs and Black scale. With the ROBINS-I the authors (many of whom were part of the Deeks et al research project) tell us we now have a comprehensive, domain-based tool to assess bias in non-randomised studies of interventions that focuses on domains related solely to bias and not to external validity or reporting. In addition, the tool comes with a manual giving detailed instructions for how to complete it, something often lacking from the pre-existing tools(1).

What makes this tool so different?

The main thing that sets the ROBINS-I apart from the other tools is that it sets up the premise of an ‘imaginary’ RCT. This is the RCT that would replace the non-randomised study you are assessing. This RCT need not be ethical or feasible: for example, you could randomise people to receive care on an intensive care ward, or to live in certain parts of a city, e.g. those that have more cycling infrastructure. In the context of the risk of bias tool this is entirely acceptable, whereas in real life it might not be possible to do this (which is probably why there are no RCTs). It is against this hypothetical ‘target’ RCT that you assess the risk of bias of your non-randomised study. So bias is defined as the differences between the non-randomised study you are assessing and the target RCT.

The other main difference is that the tool asks you to set out what the potential confounders are, and which of those were measured in the study. This is particularly important: while other tools might ask if confounders have been measured and adjusted for, this tool asks you to think about which specifically have been measured and which are actually appropriate to have adjusted for. A study cannot get a low risk of bias for adjusting for inappropriate confounders in a ‘tickbox’ type exercise. It also asks you to think about whether any variables were adjusted for that should not have been and that might interfere with the analysis. Assessment of confounding is particularly important in non-randomised studies because we would expect people to be offered the intervention depending on their prognosis or prognostic variables. For example, people who are more poorly might be less likely to get a certain intervention than those who are less poorly.

What domains of bias are assessed and how are they operationalised?

The authors identified seven domains of bias. Before the intervention starts, or at the time of intervention: bias due to confounding, bias in selection of participants and bias in classification of intervention. Post-intervention: bias due to deviations from intended interventions, bias due to missing data, bias in measurement of outcomes and bias in selection of the reported result. Only the latter four domains, which occur post-intervention, are substantially similar to or overlap with those in the risk of bias assessment for RCTs, as discussed in a previous MESS(5). Each signalling question is answered with ‘Yes’, ‘Probably yes’, ‘No’, ‘Probably no’ or ‘No information’, and from these answers guidance follows on whether the domain is at ‘Low risk’, ‘Moderate risk’, ‘Serious risk’ or ‘Critical risk’ of bias. A study judged to be at ‘Low risk’ of bias for a numerical outcome would be considered comparable, in terms of risk of bias, to a ‘high quality’ RCT. The paper (open access) sets out clearly how to use the ROBINS-I, with specific instructions for reviewers to follow that I won’t try to reproduce here(1). But it is worth saying that the risk of bias tool is applied per numerical outcome result, not per study. This allows a reviewer to make nuanced risk of bias judgements for studies that have both objective outcomes, which may be at low risk of bias, and subjective outcomes, which may be at higher risk of bias. In their presentation (slides available here) the authors provide illustrative examples for each of the domains, which is enormously helpful.
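As a rough illustration of how the seven domains and four judgement levels fit together, here is a minimal Python sketch. The names (`Judgement`, `DOMAINS`, `overall_judgement`) are hypothetical, and the rule that the overall judgement for a result takes the most severe domain judgement is an assumption made for this sketch, not something stated in this post. The paper's own guidance should be followed in practice.

```python
from enum import IntEnum


class Judgement(IntEnum):
    """Domain-level risk of bias judgements, ordered by severity."""
    LOW = 1
    MODERATE = 2
    SERIOUS = 3
    CRITICAL = 4


# The seven domains as listed above, grouped by timing.
PRE_OR_AT_INTERVENTION = (
    "confounding",
    "selection of participants",
    "classification of intervention",
)
POST_INTERVENTION = (
    "deviations from intended interventions",
    "missing data",
    "measurement of outcomes",
    "selection of the reported result",
)
DOMAINS = PRE_OR_AT_INTERVENTION + POST_INTERVENTION


def overall_judgement(domain_judgements: dict[str, Judgement]) -> Judgement:
    """Combine domain judgements for ONE outcome result.

    Assumption for illustration: the overall judgement is the most
    severe judgement across the seven domains.
    """
    missing = [d for d in DOMAINS if d not in domain_judgements]
    if missing:
        raise ValueError(f"No judgement recorded for: {missing}")
    return max(domain_judgements[d] for d in DOMAINS)
```

Because the tool is applied per outcome result, a review would hold one such set of domain judgements for each result it assesses, not one per study.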

How was the tool developed?

The tool was developed over three years by experts discussing and arguing the best way forward until a consensus was reached on which domains of bias to assess. To help reviewers judge bias for each domain, signalling questions were drafted, and the wording of each was also discussed and refined over several rounds of piloting, face-to-face meetings and workshops. This paper is the culmination of a huge research collaboration by 35 authors from six countries (UK, USA, Canada, France, Denmark and Australia), as well as numerous reviewers acting as pilot testers(1). The use of the tool will mean that evidence from non-randomised studies can be assessed with more clarity and transparency, and that a detailed description of bias can be provided.

What’s next?

The authors stated that next they plan to think about how well the tool works for specific study designs such as self-controlled designs, controlled before-and-after studies, interrupted time series, and studies based on regression discontinuity and instrumental variable analyses. They also plan to develop interactive software for using ROBINS-I, and guidance for use of ROBINS-I in specific healthcare areas, for example public health. The team are keen to generate a repository of research data for the ROBINS-I from people who are using it for systematic reviews. In this way meta-research on ROBINS-I can be carried out in the future to improve it, and could even facilitate automated incorporation of RoB assessments alongside the original papers in databases and other repositories.



  1. Sterne JA, Hernan MA, Reeves BC, Savovic J, Berkman ND, Viswanathan M, et al. ROBINS-I: a tool for assessing risk of bias in non-randomised studies of interventions. BMJ (Clinical research ed). 2016;355:i4919.
  2. Howick J, Phillips B, Ball C, Sackett D, Badenoch D, Straus S, et al. Oxford Centre for Evidence-based Medicine – Levels of Evidence (March 2009). 2009.
  3. Page MJ, Shamseer L, Altman DG, Tetzlaff J, Sampson M, Tricco AC, et al. Epidemiology and Reporting Characteristics of Systematic Reviews of Biomedical Research: A Cross-Sectional Study. PLoS Med. 2016;13(5):e1002028.
  4. Deeks JJ, Dinnes J, D’Amico R, Sowden AJ, Sakarovitch C, Song F, et al. Evaluating non-randomised intervention studies. Health technology assessment (Winchester, England). 2003;7(27):iii-x, 1-173.
  5. Savović J, Weeks L, Sterne JA, Turner L, Altman DG, Moher D, et al. Evaluation of the Cochrane Collaboration’s tool for assessing the risk of bias in randomized trials: focus groups, online survey, proposed recommendations and their implementation. Systematic Reviews. 2014;3(1):37.

[1] accessed 05.01.17


Systematic reviews; many are published, but are they all worth reading?

A summary of ‘Epidemiology and Reporting Characteristics of Systematic Reviews of Biomedical Research: A Cross-Sectional Study’, Page M. et al 2016

By Tess Moore

Systematic reviews (SRs) were meant to save us from the overload of medical literature. This overload is considerable and is increasing. In 2015 MEDLINE indexed 806,000 citations to biomedical research – up 4% on the year before (1). Alongside this increase there has been a surge in the number of SRs, from none in 1987 (2) up to 7 per day (2,500 per year) in 2004 (3); 11 a day in 2010 (4); and 22 per day, or more than 8,000 a year, in 2014 (5). Clearly, we have come a long way in the history of evidence synthesis – integrating evidence into understandable and manageable bites (6).

But what are they like, these systematic reviews?  How reliable are they? Have the methods and reporting improved? And what, as review authors and researchers, do we need to think about for the future and for our own reviews when we publish them?

MESS member Dr Matt Page came to talk to us about ‘the mess that is the systematic review literature’.


What did Matt and his colleagues do?

Matt worked with 11 colleagues from Australia, Brazil, Canada, Spain and the UK to update a study that looked at the properties of SRs published in 2004, repeating it for SRs published in 2014 (5).


Why did they do this?

Well, a lot has happened since the first team took a look at SRs in 2004. The new reporting guidelines for SRs, PRISMA, were published (7), as were the MOOSE guidelines for reviews of observational studies (8), and the Institute of Medicine in the US has newly published standards for reporting of SRs (9). Plus many journals are now more familiar with publishing SRs, and journal editors are more aware of their importance and potential use. So it was timely to compare what was found ten years ago in 2004 with what is happening now.


What did they find?

They found that MEDLINE had indexed 682 SRs in one month (February 2014). This equates to more than 8,000 per year (682 × 12 = 8,184) – about three times as many as in 2004. Matt’s team set out explicit methods for the selection and eligibility of reviews for their work, as described in beautiful detail in their paper.

The review has six data rich tables describing the parameters of all the reviews by review type and I urge you to take a look. I can highlight the key things they found:

Using a subset cohort of 300 SRs (the same number examined in 2004) they found that 45/300 (15%) were Cochrane reviews of interventions; 119/300 (40%) were non Cochrane intervention reviews; 74/300 (25%) were epidemiological type studies and 33/300 (11%) were diagnosis or prognosis reviews.  Ten percent [29/300] were classified as other (these were reviews of education or of properties of outcome measure scales etc).
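The breakdown above can be checked with a few lines of arithmetic. This is a small Python sketch using only the counts quoted in this post. The dictionary keys and the helper name `percentage` are made up for illustration.

```python
COHORT_SIZE = 300

# Counts of review types in the 2014 sample, as quoted above.
counts = {
    "Cochrane intervention": 45,
    "non-Cochrane intervention": 119,
    "epidemiological": 74,
    "diagnosis or prognosis": 33,
    "other": 29,
}


def percentage(count: int, total: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * count / total)


if __name__ == "__main__":
    # The five categories should account for the whole cohort.
    print("total:", sum(counts.values()))
    for review_type, count in counts.items():
        print(f"{review_type}: {count}/{COHORT_SIZE} "
              f"({percentage(count, COHORT_SIZE)}%)")
```

The rounded percentages need not add up to exactly 100 even though the counts add up to 300, because each one is rounded separately.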


How had the reviews done?

Clear reporting and use of appropriate methodology allows us readers to more easily assess the validity of review findings.

Nearly all Cochrane SRs of therapeutic interventions used a protocol that was available for everyone to read (98% [44/45]). This happy picture was not reflected in non-Cochrane therapeutic intervention SRs, where only 22% [26/119] mentioned a protocol, with just 4% [5/119] available to read. For DTA (diagnostic test accuracy), epidemiology and other SRs the picture was worse, with only 5% [7/136] reporting a protocol. Across all SRs, 70% [206/296] had assessed risk of bias, but only 16% [31/189] of those actually applied the risk of bias assessment to their analysis. Only 7% [21/300] of reviews looked for unpublished data, and 47% [141/300] described an assessment of publication bias. Page et al (5) go on to say that at least one third of SRs (and often many more) did not describe some basic SR methods:

eligibility criteria,

years of search,

a full Boolean search strategy for at least one electronic bibliographic database,

methods for data extraction or risk of bias assessment,

a primary outcome,

study limitations in the abstract or

funding source.


Apart from the protocols published by the Cochrane reviews, all of this goes to paint a pretty disappointing picture.


Given this lack of reporting of some of the most basic aspects of systematic review methods, we have to ask “Had the review authors used reporting guidelines?”

Matt’s paper reports that fewer than one third (29% [87/300]) referred to reporting guidelines. And worryingly, 52% of these [45/87] misinterpreted the reporting guidelines and thought they were synonymous with guidance on SR conduct such as the Cochrane Handbook.

How did 2014 compare to 2004?

One of the most worrying findings was that the proportion of non-Cochrane SRs that mention using a protocol (12-13%) was about the same in 2014 as it was in 2004. This is dismally low. All SRs need a protocol, for the same reason that trials need a protocol: to avoid bias. This is a sad indictment of the teaching of SR methods, or it might be a case of poor reporting. When Matt et al compared SRs that mentioned the PRISMA reporting guidelines with those that did not, those mentioning PRISMA appeared more likely to mention a protocol, but the evidence was not conclusive (risk ratio = 1.83, 95% CI 0.94 to 3.58): the lower limit of the 95% CI falls just below 1, the line of no effect. So it looks like some work is needed both in teaching SR methods and in how to report SRs.
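To see why a confidence interval that just crosses 1 is read as inconclusive, here is a short Python sketch. It assumes the interval is symmetric on the log scale (the usual construction for a risk ratio) and back-calculates an approximate two-sided P value from the quoted estimate and limits. The function names are hypothetical and the calculation is an approximation, not the analysis reported in the paper.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a 95% interval


def ci_excludes_null(lower: float, upper: float, null: float = 1.0) -> bool:
    """True when the interval lies entirely on one side of the null."""
    return not (lower <= null <= upper)


def approx_p_from_ratio_ci(estimate: float, lower: float, upper: float) -> float:
    """Approximate two-sided P value for a ratio measure.

    Assumes the CI was built as exp(log(estimate) +/- Z_95 * SE).
    """
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    z = math.log(estimate) / se
    return math.erfc(abs(z) / math.sqrt(2))


if __name__ == "__main__":
    rr, lo, hi = 1.83, 0.94, 3.58  # values quoted above
    print("CI excludes 1:", ci_excludes_null(lo, hi))
    print("approximate P:", round(approx_p_from_ratio_ci(rr, lo, hi), 3))
```

With the quoted values the interval includes 1 and the approximate P value is above the conventional 0.05 threshold, which matches the cautious reading in the text.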


Matt and his team found that the mix of review types has changed. Proportionately, there are fewer therapeutic clinical questions being answered and more epidemiological questions, e.g. prevalence of a condition (up from 13% to 25%). The proportion of SRs that were Cochrane reviews also decreased (from 20% to 15%), showing that SRs are being accepted and published more widely (i.e. outside of Cochrane) than they were in 2004. Matt’s team showed that, compared with 2004, SRs from 2014 were more likely to identify themselves as an SR (or meta-analysis) in the title, which makes retrieving SRs in searches more likely. They were also more likely to: report eligibility criteria about language; report the flow of studies through the review process (PRISMA flow chart); provide a complete list of excluded studies and reasons; perform a meta-analysis; and assess publication bias. Some things that hadn’t improved included: assessment of harms; assessment of statistical heterogeneity; specification of a primary outcome; assessment of risk of bias; reporting of a full Boolean search strategy; reporting of both start and end years of search; and the eligibility criteria concerning publication status.

How did the use of PRISMA guidelines by review authors affect the reporting of reviews?


Matt and his team found that SRs that mentioned PRISMA were more likely to have reported a range of key SR methods, including the methods used for screening, data extraction and risk of bias assessment, and were more likely to have used thorough searching methods and meta-analysis.


So what does it all mean?

We have come a long way from the status in 1987, when most medical reviews did not describe any methods for how they had brought their articles together (2). However, sadly, against a backdrop of increasing numbers of SRs and, rather worryingly, a massive increase in narrative reviews (4), Matt’s work highlights that the conduct (i.e. methods used) and the reporting of SRs are not that great overall. If reviews do not adhere to good methodology then results can potentially be misleading. If reviews are not reported in detail and with clarity then it is not possible to assess those methods and judge the validity of the results. Matt and his colleagues conclude that ‘strategies are needed to increase the value of SRs to patients, health care practitioners and policy makers’ (5).

What strategies are there? Well, we could think again about reporting guidelines. Matt’s team showed that PRISMA has improved reporting in their sample of SRs. But there are already 319 reporting guidelines, covering all types of medical research, listed on the EQUATOR website (11). And the PRISMA stable has been developing a string of extensions since its first publication in 2009. Since 2015 there are three extensions: PRISMA-P (12) for protocols, PRISMA-IPD (13) for individual patient data meta-analyses and PRISMA-NMA (14) for network meta-analyses. And in process there are PRISMA-C, for reporting of SRs of children, and PRISMA-DTA, for SRs of diagnostic test accuracy. Reporting guidelines and use of checklists are often requested by journal editors. But is there something more dynamic to help authors?

To increase the visibility of protocols we can register SRs on PROSPERO (an international prospective register of SRs), which allows public viewing, and it is also possible to publish SR protocols in the journal Systematic Reviews, published by BioMed Central.

Simpler, more straightforward assistance is available, as suggested by Matt Page et al and the editors of PLOS (15). They suggest software to assist SR authors when drafting their paper. They give as an example an online tool, COBWEB (16), developed by Barnes et al for trialists, which prompts them, when drafting their RCT report, to comply with CONSORT reporting guidelines and which improved clarity of reporting. There is a new journal, Research Integrity and Peer Review, dedicated to improving publication of research, and it might provide, in time, some evidence of how to improve reporting. And there is a new wizard to help authors and journal editors both find and use reporting guidelines. It’s called PENELOPE research, and several BMC journals are already signed up to a trial of its use, which you can read about on the EQUATOR Blog.


In short, the take-home message to all of us who prepare SRs is: first conduct the review according to stated methodological guidelines; prepare a protocol; and, most importantly, use a checklist for our FIRST draft manuscript to remind us to be clear and write down what we did. It is important to use these reporting guidelines, whether or not our favoured journal asks for them. To editors of journals, we would ask that you please provide us with a sufficient word count to describe our work, in both the paper (we are fine with web appendices) and especially the abstract, and then enforce the use of a checklist at submission stage.


We would like to thank Dr Matt Page for this fascinating presentation, and I urge everyone to go and read the paper as it is so packed full of information and data, AND data that are SUPER useful for research grant writing. Also worth reading is a fascinating interview he did with Cochrane Senior Editor, Toby Lasserson, of the Cochrane Editorial.

Tess Moore

We will be back with more of a MESS (Methods in Evidence Synthesis Salon) in September with a talk on ROBINS-I and risk of bias for non-randomised studies.

  1. US National library of medicine. Key MEDLINE indicators (accessed 07/07/16)
  2. Mulrow CD. The medical review article: state of the science. Annals of internal medicine 1987;106:485-8
  3. Moher D, Tetzlaff J, Tricco AC, Sampson M, Altman DG. Epidemiology and Reporting Characteristics of Systematic Reviews. PLoS Med 2007;4:e78
  4. Bastian H, Glasziou P, Chalmers I. Seventy-Five Trials and Eleven Systematic Reviews a Day: How Will We Ever Keep Up? PLoS Med 2010;7:e1000326 
  5. Page MJ, Shamseer L, Altman DG, Tetzlaff J, Sampson M, Tricco AC, et al. Epidemiology and Reporting Characteristics of Systematic Reviews of Biomedical Research: A Cross-Sectional Study. PLoS Med 2016;13:e1002028 
  6. Clarke M. History of evidence synthesis to assess treatment effects: Personal reflections on something that is very much alive. Journal of the Royal Society of Medicine 2016;109:154-63 
  7. Moher D, Liberati A, Tetzlaff J, Altman DG. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. Journal of clinical epidemiology 2009;62:1006-12 
  8. Stroup DF, Berlin JA, Morton SC, Olkin I, Williamson GD, Rennie D, et al. Meta-analysis of observational studies in epidemiology: a proposal for reporting. Meta-analysis Of Observational Studies in Epidemiology (MOOSE) group. Jama 2000;283:2008-12
  9. Institute of Medicine. Finding what works in health care: standards for systematic reviews. National Academies Press; 2011
  10. Chandler J, Churchill R, Higgins J, Lasserson T, D T. Methodological standards for the reporting of new Cochrane intervention reviews, version 1.1. 2012
  11. The Equator Network. 2016
  12. Moher D, Shamseer L, Clarke M, Ghersi D, Liberati A, Petticrew M, et al. Preferred reporting items for systematic review and meta-analysis protocols (PRISMA-P) 2015 statement. Systematic Reviews 2015;4:1-9
  13. Stewart LA, Clarke M, Rovers M, et al. Preferred reporting items for a systematic review and meta-analysis of individual participant data: The prisma-ipd statement. Jama 2015;313:1657-65
  14. Hutton B, Salanti G, Caldwell DM, Chaimani A, Schmid CH, Cameron C, et al. The PRISMA Extension Statement for Reporting of Systematic Reviews Incorporating Network Meta-analyses of Health Care Interventions: Checklist and Explanations. Annals of internal medicine 2015;162:777-84
  15. The Plos Medicine Editors. From Checklists to Tools: Lowering the Barrier to Better Research Reporting. PLoS Med 2015;12:e1001910
  16. Barnes C, Boutron I, Giraudeau B, Porcher R, Altman DG, Ravaud P. Impact of an online writing aid tool for writing a randomized trial report: the COBWEB (Consort-based WEB tool) randomized controlled trial. BMC Medicine 2015;13:1-10
  17. Shanahan D, Marshall D. It’s a kind of magic: how to improve adherence to reporting guidelines. In: EQUATOR Blog; 2016

NIHR Fellowship Opportunity

A very exciting opportunity has just been listed for a two-year NIHR fellowship to work in the area of systematic reviews of observational studies with Professor Jonathan Sterne and Professor Julian Higgins.

The fellowship will be based within the Centre for Research Synthesis and Decision Analysis in the University of Bristol School of Social and Community Medicine, one of the UK’s leading centres for research in population health sciences.


The post incorporates a tailored training programme, including a fully funded Masters degree in epidemiology or a related discipline during the first year (in a University other than Bristol).

More details here

Risk of Bias 2.0

You mean you actually want a more complex way to assess risk of bias? That may take longer?

A long time ago in a galaxy far far away – we assessed RCT ‘quality’ – it could be done in a moment – 5 simple questions from the Jadad scale, add them up and BOOM you have a numerical score that makes it really easy to ‘judge’ your RCTs. But since then we have moved away from the dark side (Jüni 1999) to a more nuanced assessment, not of quality, but of how likely the outcomes in this RCT are to be biased. This has been facilitated by the development of the Cochrane ‘Risk of bias’ tool that was launched into space in 2008. Since then 100% of Cochrane reviews and approximately 30% of other systematic reviews (Higgins 2011) have used this tool.

So why was Jelena here with talk of a new version? That might very well take longer?

Don’t get me wrong, I am about the most squirmingly thorough systematic reviewer out there. I love doing things in duplicate and reporting discrepancies in detail. But really? A longer, more complex means to assess ROB? Jelena took the floor and explained all…

As it stands, the Cochrane ROB tool assesses bias across six domains (selection, performance, detection, attrition, reporting, and other bias). Full details are available for free here or in the Higgins et al 2011 article. Feedback from focus groups and user experience surveys revealed that, although it took longer than many of the earlier methods, people were generally happy with that because of the additional rigour (better ‘science’ than had gone before), because it was presented with transparency (people could see how you made your decision, and could disagree if they wanted to) and because it didn’t involve any over-simplistic numerical scoring. So far so great. On the down side, Jelena reported that people thought some of it was too hard to fill in (Savovic 2014) and that inter-rater agreement was poor. It came with a lot of supporting advice to help reviewers complete it. But the existing tool did a good job of discriminating between effect estimates, with more conservative effects for studies at low risk (Hartling 2009).

So why change it?

The scientific debate on risk of bias didn’t stop when the first tool was launched. Researchers have monitored the use of the tool and have sought information on users’ experiences (Savovic 2014). The tool took 10 to 60 minutes to complete, and users reported that two of the domains, ‘selective reporting of outcomes’ and ‘incomplete data’, were difficult to assess and asked for clearer guidance. Although people were using the tool to assess ROB, many of the domains were being marked ‘Unclear’, either because there was no information in the RCT reports or because there was information but the assessors couldn’t tell how the methods in the RCT might affect the bias (in other words, the researchers were ‘unclear’). So many Cochrane systematic reviews contain a lot of RCTs labelled simply as ‘unclear’ for ROB and very few labelled as low risk. This hampers our understanding of the evidence. What can we say about a treatment effect if we are ‘unclear’ about how bias may be acting on it? They also found that people were assessing the ROB of the RCTs but struggling to translate the judgements for each domain into a judgement for the whole RCT or for their outcome of interest. In other words, although they were going to the trouble of assessing and reporting ROB, they weren’t using the ROB assessments to inform interpretation of the review findings. Finally, the existing tool had a section called ‘other’ where you could assess other types of bias, and people were using this for all sorts of things, not always related to bias.

So Jelena and the developers of the first tool have been revamping the Cochrane ROB tool. Their work so far has led to five major changes:

Clarification of the domains, reducing them from six to five, along with removal of the ‘other’ risk of bias option, as the new tool is comprehensive and all aspects pertaining to bias are covered within the five domains.

Comprehensive changes to the wording and phrasing of questions in the domains, to help with clarity of understanding and response.

Addition of signalling questions for each domain that will help reviewers to assess whether bias is likely or not.

Removal of ‘unclear’ from the new tool, so sitting on the fence becomes less of an option. For each signalling question, and in judging whether a study has met the criteria for low risk of bias, we are gently guided to a decision by choosing from ‘yes’, ‘probably yes’, ‘probably no’ or ‘no’.

An optional, automatically generated overall judgement on bias for the outcome we are looking at, produced using a simple algorithm.

So will we change to the new tool?

The new tool has been pilot tested, and early feedback and continued scientific discussion are shaping its development over the spring and summer. I participated in the pilot and the changes are not merely cosmetic or superficial: the tool will help generate more nuanced and meaningful assessments of bias in trials.

And is it really longer?

In responding to user feedback and evidence of how it is being used, the new tool will provide focussed domains with signalling questions to help us decide if a trial is biased. It may actually be quicker to complete than the current tool. And it will provide a more accurate assessment of the biases.

The plan is to roll out the new Cochrane ROB tool in Oct 2016 to coincide with the Cochrane colloquium. If you are feeling hesitant about adopting it be reassured that it will exist in parallel to the existing Cochrane ROB tool. So, you get to choose which you use.

This MESS (April 12 2016) was brought to you by Dr Jelena Savovic, and hosted by Karen Dawe and Tess Moore. Our snacks were doughnuts and banana bread. We would like to thank Dr Jelena Savovic, who presented her research into a new version of the Cochrane Risk of Bias Tool. And of course all of you who came along.

Below is a list of useful papers and resources.

We’ll get MESSy again in June.  See you then.


Jelena Savović, Laura Weeks, Jonathan AC Sterne, Lucy Turner, Douglas G Altman, David Moher and Julian PT Higgins. Evaluation of the Cochrane Collaboration’s tool for assessing the risk of bias in randomized trials: focus groups, online survey, proposed recommendations and their implementation. Syst Rev. 2014;3:37. doi: 10.1186/2046-4053-3-37

Lisa Hartling, Maria Ospina, Yuanyuan Liang, Donna M Dryden, Nicola Hooton, Jennifer Krebs Seida, Terry P Klassen. Risk of bias versus quality assessment of randomised controlled trials: cross sectional study. BMJ 2009;339:b4012. doi: 10.1136/bmj.b4012

Julian P T Higgins, Douglas G Altman, Peter C Gøtzsche, Peter Jüni, David Moher, Andrew D Oxman, Jelena Savović, Kenneth F Schulz, Laura Weeks, Jonathan A C Sterne, Cochrane Bias Methods Group, Cochrane Statistical Methods Group. The Cochrane Collaboration’s tool for assessing risk of bias in randomised trials. BMJ 2011;343:d5928. doi: 10.1136/bmj.d5928