Appraising and Scoring Clinical Practice Guidelines for Trust and Quality – The Good, The Bad and The Confusing

Appraisal of Guidelines

Increased pressure to provide high-quality, evidence-based care has driven the rapid proliferation of clinical practice guidelines, and thousands of new and updated medical guidelines from hundreds of medical societies and government organizations are published each year. This dramatic increase in the number of published guidelines gives healthcare providers more evidence to consider when managing their patients; however, it also brings significant variation in guideline development processes and in the evaluation of scientific data. As a result, conflicting guideline recommendations from various groups and different interpretations of the same evidence have become much more common, leaving healthcare professionals scratching their heads when trying to sift through recommendations and the divisions across profession or specialty lines.

Trying to make sense of and sort through the enormous number of guideline recommendations has led many to call for a single reliable tool designed to encourage high-quality, evidence-based care and identify the trustworthiness of individual guidelines. A trust scoring tool could weed out guidelines that may be of poor quality, fail to answer the right questions, or be affected by conflicts of interest. While the intention behind creating a single trust scoring tool is a noble one, inherent difficulties have plagued efforts to introduce a single system or scorecard for identifying how “high quality” or trustworthy a clinical practice guideline is.

Some of the most well-known and widely used methodologies that assess, appraise, or otherwise grade guidelines against a set of criteria or a scorecard include:

  • IOM Standards for Developing Trustworthy Guidelines (2011)
  • IOM Standards for Systematic Reviews (2011)
  • AGREE Instrument (2003)
  • AGREE II Instrument (2010)
  • NEATS Assessment (2017, retired with NGC)
  • ADAPTE Collaboration (2009)
  • G-TRUST Scorecard Tool (2017)

Probably the most well-known attempt at defining a high-quality guideline was the 2011 Institute of Medicine Standards for Trustworthy Clinical Practice Guidelines (2011 IOM Standards). At a high level, the 2011 IOM Standards focus on:

  • Establishing Transparency – The processes by which a clinical practice guideline (CPG) is developed and funded should be detailed explicitly and publicly accessible.
  • Managing Conflict of Interest – The goal should be the reduction of conflicts of interest (COIs) as much as possible, and when they exist, the guideline authors should declare them prior to work and within the body of the publication.
  • Diversity Among Developers – The guidelines should be authored by a multi-disciplinary panel that includes methodologists, clinicians, and patient and public representatives.
  • Systematic Reviews – The guideline should include systematic reviews, and those reviews should meet the separate IOM standards for systematic reviews.
  • Grading the Recommendations – The document should include a transparent system for declaring the strength of each recommendation along with the level of evidence used to make the recommendation. The most common foundation comes from the GRADE working group.
  • Clarity and Implementability – Each recommendation found in a trustworthy guideline should be easy to understand and specific enough to be implementable.
  • External Reviews Prior to Publishing – External reviews should be conducted by a full spectrum of relevant stakeholders, including scientific and clinical experts, other organizations (e.g., healthcare, specialty societies), agencies (e.g., federal government), patients, and representatives of the public. Also, a working draft of the guideline should be made available to the general public for comments prior to final publication.
  • Updating – The date ranges used for the literature reviews should be noted in the publication, along with the next scheduled review date. The organizations should also consistently monitor published literature for new evidence that could change the recommendations.

While all of these qualifications seem fairly reasonable and easy to adhere to, they bring some challenges as well. Organizations that oppose the 2011 IOM Standards believe these standards are too prescriptive, substantially increase guideline development cost, and cause lengthy publishing delays without adding significant value in return. They raise important questions such as:

  • What good is a guideline if, by the time it is published, it’s already 12-24 months out of date?
  • The average guideline costs over $100,000 to develop. If adherence to the IOM standards doubles the development cost of each guideline, where is this money coming from?
  • What effect do fewer guidelines and less frequent updates have on quality?
  • Should every guideline on every topic be required to meet every single IOM criterion, or can there be exceptions?

While many of the criteria within the 2011 IOM Standards are cut and dried, there are also components that are much less black and white. This brings us to yet another major criticism of the IOM standards (and most other guideline scoring and appraisal systems) – there is too much subjectivity.

  • Whose job is it to interpret the subjective parts of the criteria?
  • If the task falls to a single organization, whose job is it to oversee the group doing the scoring and appraisal?
  • How is conflict of interest managed? For example, should organizations that sell guideline development consulting services to medical associations also be allowed to score those same associations’ guidelines for trustworthiness and quality?
  • Since some of these criteria are self-reported, can someone skew results in their own favor?

As with the previous questions, these are all valid concerns. They demonstrate that, while guideline trust scorecards are great ideas rooted in necessity, coming up with an objective solution that can add proven value is much more difficult than it sounds. This post wasn’t created to bash any existing appraisal criteria or trust scorecard instruments, but we hope it has called attention to this important matter and to the fact that the healthcare community still lacks a universally accepted solution for scoring the quality of recommendations. This issue will become increasingly important as measures of quality and reimbursement are more and more frequently tied directly to clinical guidelines.

For now, perhaps we can start by accepting that this isn’t an issue that can be solved by a single organization or a single group’s guidelines quality scale or guidelines trust scorecard. The collective evidence-based medicine community – guideline developing associations, government organizations, public stakeholders, etc. – should come together to find a long-term solution.  

Guideline Central wants to know…

  • Do you have questions or ideas on how to address the appraisal of trust and quality of guidelines?
  • Should guideline developers give you all of the tools needed (transparency) to determine for yourself whether a guideline is credible?
  • Or would you prefer that a third-party organization make that call for you to save time?