* From the Italian National Cancer Institute Regina Elena (Dr. Schünemann), Rome, Italy; McMaster University Faculty of Health Sciences (Drs. Cook and Guyatt), Hamilton, ON, Canada.
Correspondence to: Holger J. Schünemann, MD, PhD, Department of Epidemiology, Italian National Cancer Institute Regina Elena, Via Elio Chianesi 53, 00144 Rome, Italy; e-mail: hjs{at}buffalo.edu
| Abstract |
|---|
Key Words: evidence-based medicine grade guideline development guidelines quality of evidence recommendations
| Introduction |
|---|
To maintain transparency of the guideline development, we followed explicit rules for managing conflicts of interest. Before participating on the panel, all participants submitted conflict-of-interest statements that were reviewed by the ACCP Health Science and Policy (HSP) Committee. Participants' potential conflicts are listed prominently in the front section of the guideline document.3 The panelists updated their conflict-of-interest disclosures again before the final conference and before publication. These disclosures are published with the guidelines and posted on the CHEST journal Web site (www.chestjournal.org).
The development of evidence-based guidelines includes explicitly defining the question that the guideline or recommendation is addressing; formulating eligibility criteria for evidence to be considered; conducting a comprehensive search for evidence; evaluating study quality; summarizing the studies; balancing the benefits and downsides of the alternative management strategies; and, finally, acknowledging values and preferences underlying the recommendations, including considerations on expenditures.4,5,6 This process ends with a recommendation for action and a grading of that recommendation according to the balance of desirable effects (benefits), undesirable effects (harms, burden, and resource expenditures), and the quality of the evidence. We followed the methodology for grading the quality of evidence and strength of recommendations that the ACCP codified during a recent ACCP task force meeting. The grading system adopted was a modification of that developed by the Grading of Recommendations Assessment, Development and Evaluation Working Group.7,8,9 This article describes the methodology for guideline development for the Antithrombotic and Thrombolytic Therapy: American College of Chest Physicians Evidence-Based Clinical Practice Guidelines (8th Edition). Figure 1 summarizes this process.
| Guideline Development for the Eighth ACCP Conference on Antithrombotic and Thrombolytic Therapy |
|---|
Defining the Clinical Question
Developing a clinical practice guideline should begin with specifying a clinical question that defines the relevant population, alternative management strategies (comparison), and outcomes.10 For the current ACCP guidelines, authors defined one question for each recommendation or set of recommendations. Readers can find these questions in the corresponding table of each chapter containing practice recommendations.
Presentation of Evidence and Recommendations
To provide a transparent, explicit link among questions, evidence, and recommendations, the section numbering in each chapter corresponds to the numbering in that chapter's question table, which specifies the patients, interventions, and outcomes; the same numbering is used for the recommendations themselves.
Process of Searching for Evidence
Defining the clinical question provided the framework for formulating eligibility criteria that guided the search for relevant evidence. In specifying eligibility criteria, authors identified not only patients, interventions, and outcomes, but also methodologic criteria. For many recommendations, authors restricted eligibility to randomized controlled trials (RCTs). For example, as in previous editions, Albers et al11 considered whether clinicians should offer thrombolytic therapy in acute stroke. They defined patients as anyone presenting with acute thrombotic stroke (divided into presentation of < 3 h and > 3 h after onset of symptoms), intervention as any thrombolytic regimen compared to no intervention or placebo, and outcome as death or functional status based on assessment with a validated functional status instrument. The methodology was restricted to RCTs. This question yielded several recommendations, including whether patients with acute ischemic stroke presenting within 3 h of symptom onset should receive IV tissue plasminogen activator (tPA).
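The structure of such a question can be sketched as a small data record. The following is only an illustration of the stroke example above, not a tool used by the authors or the EPC; the class name `ClinicalQuestion`, its field names, and the design label "RCT" are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClinicalQuestion:
    """Hypothetical record of one guideline question and its eligibility criteria."""
    population: str
    intervention: str
    comparison: str
    outcomes: tuple
    study_designs: tuple = ("RCT",)  # methodologic eligibility criterion

    def is_eligible(self, design: str) -> bool:
        # A study is eligible only if its design is among those allowed.
        return design in self.study_designs


# The thrombolysis-in-acute-stroke question as described in the text.
stroke_question = ClinicalQuestion(
    population="patients presenting with acute thrombotic stroke",
    intervention="any thrombolytic regimen",
    comparison="no intervention or placebo",
    outcomes=("death", "functional status (validated instrument)"),
    study_designs=("RCT",),
)

print(stroke_question.is_eligible("RCT"))     # an RCT passes the design criterion
print(stroke_question.is_eligible("cohort"))  # an observational design does not
```

Making the methodologic criterion an explicit field mirrors the point that authors specified not only patients, interventions, and outcomes but also study design.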
For many questions, randomized trials did not provide sufficient data, and chapter authors included observational studies when randomized trials were not the most appropriate design to address the research question. In particular, randomized trials are not necessarily the best design to understand risk groups, that is, the baseline or expected risk of a given event for certain subpopulations. Because no interventions are typically examined in questions about prognosis, one replaces interventions by the duration of exposure measured in time. For example, to obtain information about the risk of ischemic stroke in patients with atrial fibrillation in specific risk groups, the sensible question was: In patients with atrial fibrillation differing in age, BP, left ventricular function, or history of previous embolic events, what is the risk of stroke or death over a given time period?
Identifying the Evidence
To identify the relevant evidence, a team of librarians and research associates at the McMaster University EPC conducted comprehensive literature searches. Methodologic experts (including the editors) and the EPC librarians reviewed each question to ensure the development of a comprehensive search strategy. For example, for questions about antiplatelet agents, the EPC consulted chapter authors to ensure that the search included all relevant antiplatelet agents. More specifically, authors then decided whether to include dipyridamole in a search that already included aspirin, clopidogrel, and ticlopidine.
For each question the authors provided, the librarians searched the Cochrane Database of Systematic Reviews, MEDLINE, and Embase for published English-language literature and human studies between 2002 and May 2006. To filter MEDLINE and Embase search results for RCT evidence, the librarians used the search strategy developed by the Cochrane Collaboration. These searches updated our more comprehensive and sensitive searches conducted for the Seventh ACCP Conference on Antithrombotic and Thrombolytic Therapy: Evidence Based Guidelines.36
The EPC team conducted separate searches for systematic reviews; RCTs; and, if applicable, observational studies. For observational studies, searches were not restricted in terms of methodology. Although increasing the probability of identifying all published studies, this sensitive approach resulted in large numbers of citations for many of the defined clinical questions. Therefore, trained research assistants screened the citation list developed from the search using criteria of increased specificity to reduce the number of irrelevant citations that the authors received. These irrelevant citations included press news, editorials, narrative reviews, single-case reports, studies that included fewer participants than specified by authors as an inclusion criterion, animal studies (any nonhuman studies), and letters to the editor. Authors did not include data from abstracts of meetings for the development of recommendations, and we did not explicitly use Internet sources to search for research data. Authors were encouraged, however, to mention abstracts that reported on groundbreaking data that were particularly relevant to a specific question in the chapters in order to alert readers that new, fully published evidence might become available shortly.
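The screening step described above can be sketched as a simple filter over citation records. This is a minimal sketch under assumed data: the record fields (`id`, `type`, `participants`), the type labels, and the function name `screen` are hypothetical, and the actual screening was done by trained research assistants rather than by code.

```python
# Citation types the text lists as irrelevant (labels are illustrative).
EXCLUDED_TYPES = {
    "press news",
    "editorial",
    "narrative review",
    "single-case report",
    "animal study",
    "letter to the editor",
}


def screen(citations, min_participants=None):
    """Return the citations that survive the specificity screen."""
    kept = []
    for citation in citations:
        if citation["type"] in EXCLUDED_TYPES:
            continue
        participants = citation.get("participants")
        # Apply the size criterion only when authors specified one
        # and the citation reports a participant count.
        if (
            min_participants is not None
            and participants is not None
            and participants < min_participants
        ):
            continue
        kept.append(citation)
    return kept


sample = [
    {"id": 1, "type": "RCT", "participants": 400},
    {"id": 2, "type": "editorial"},
    {"id": 3, "type": "cohort study", "participants": 20},
    {"id": 4, "type": "animal study", "participants": 60},
]
print([c["id"] for c in screen(sample, min_participants=50)])
```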
Standard Consideration of Study Quality
High-quality clinical guidelines should pay careful attention to the methodologic quality of the studies that form the basis of their recommendations. Using the example of the prevention of venous thromboembolism during air travel, Table 1 shows the criteria for assessment of study quality (randomization, concealment of treatment allocation, blinding, completeness of follow-up, and whether the analysis was performed according to the intention-to-treat principle), and Table 2 shows the presentation of results that were circulated to the authors. Although all authors attended to these criteria, we have summarized the results of the quality assessment for only a minority of the recommendations. Readers can find these summaries in an online appendix to the recommendations (see online supplemental data).
We labeled studies that met these criteria "cohort studies without internal controls." Studies with internal comparisons received the label "cohort studies with concurrent controls" or "cohort studies with historical controls." These cohort studies may succeed or fail in ensuring similar settings, similar time frames, adjustment for differences in patients' characteristics, and follow-up with patients. These features were captured in descriptive tables that the EPC provided to authors on request.
Summarizing Evidence
The electronic searches also included searches for systematic reviews. If authors were satisfied with a recent high-quality systematic review, evidence from that review provided a foundation for the relevant recommendation. For example, Albers et al11 used a systematic review and metaanalysis as the foundation for their recommendation on IV streptokinase for acute ischemic stroke between 0 and 6 h of symptom onset (chapter on Stroke, Section 1.3). Geerts et al12 used several metaanalyses for their recommendations (chapter on Prevention of Venous Thromboembolism, eg, Section 2).
For the first time, for a small number of recommendations (see the chapters by Ansell et al, Warkentin et al, Geerts et al, Kearon et al, Albers et al, Harrington et al, Becker et al, Sobel and Verhaeghe, and Bates et al), we systematically examined the impact of the quality of design and implementation of individual studies, precision, consistency and directness of results, likelihood of reporting bias, and presence of very large effects on the quality of the evidence. For recommendations in which we did so, we present tables that summarize these features. Table 3 provides an example.
Another chapter in this supplement details the basic grading of methodologic quality.8 In brief, consistent results from RCTs or observational studies with very strong effects result in Grade A recommendations; inconsistent results from RCTs or RCTs with important methodologic limitations receive Grade B, and observational studies without very strong effects result in Grade C quality of evidence.
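The brief rule stated above can be written out as a small function. This is a deliberately simplified sketch of the one-sentence summary only; the full grading approach is described in the separate chapter, and the function name `evidence_grade` and its parameters are hypothetical.

```python
def evidence_grade(design, consistent=True, important_limitations=False,
                   very_strong_effect=False):
    """Map a body of evidence to Grade A, B, or C per the brief summary.

    design: "RCT" or "observational" (illustrative labels).
    """
    if design == "RCT":
        # Consistent RCTs without important limitations: Grade A.
        if consistent and not important_limitations:
            return "A"
        # Inconsistent results or important methodologic limitations: Grade B.
        return "B"
    if design == "observational":
        # Very strong effects raise observational evidence to Grade A.
        return "A" if very_strong_effect else "C"
    raise ValueError(f"unknown design: {design!r}")


print(evidence_grade("RCT"))
print(evidence_grade("RCT", consistent=False))
print(evidence_grade("observational"))
print(evidence_grade("observational", very_strong_effect=True))
```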
Group-Specific Recommendations
The absolute magnitude of treatment effects may be very different in patients with varying levels of risk. For instance, although the relative risk reduction of warfarin vs aspirin in stroke prevention for atrial fibrillation patients is likely close to 50% across risk groups, this translates into absolute risk reductions of < 1% per year in the lowest risk groups, and in the vicinity of 5% per year in the highest risk groups. Clearly, optimal management must differ across risk groups, and this is reflected in the recommendations of our atrial fibrillation panel.
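The arithmetic behind this contrast is that the absolute risk reduction equals the baseline risk multiplied by the relative risk reduction. The sketch below uses the 50% relative risk reduction from the text; the two baseline annual stroke risks (1% and 10%) are illustrative assumptions chosen to reproduce the stated orders of magnitude, not figures from the guideline.

```python
def absolute_risk_reduction(baseline_risk, relative_risk_reduction):
    """Absolute risk reduction = baseline risk x relative risk reduction."""
    return baseline_risk * relative_risk_reduction


RRR = 0.50  # relative risk reduction, warfarin vs aspirin (from the text)

# Hypothetical baseline annual stroke risks on aspirin for two risk groups.
low_risk_baseline = 0.01
high_risk_baseline = 0.10

low_arr = absolute_risk_reduction(low_risk_baseline, RRR)
high_arr = absolute_risk_reduction(high_risk_baseline, RRR)

print(f"Low-risk group:  {low_arr:.1%} per year")
print(f"High-risk group: {high_arr:.1%} per year")
```

With these assumed baselines the same relative effect yields a tenfold difference in absolute benefit, which is why the recommendations differ across risk groups.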
In general, we have endeavored to make our recommendations as specific as possible for patient subgroups differing according to risk. Whenever valid prognostic data were available, we used them to estimate absolute effects and made recommendations accordingly. Unfortunately, reliable prognostic indexes are not usually available, limiting the extent to which such group-specific recommendations are possible.
Acknowledge Values and Preferences and Resource Use Underlying Recommendations
Under ideal circumstances, knowledge of average patient values and preferences would be available for every recommendation, the panel members would summarize these values and preferences, and they would be integrated into the recommendations that guideline developers make. Before beginning the searches for the relevant literature, we asked all chapter chairs to identify recommendations that they believed were particularly sensitive to patients' values and preferences. Moderate-quality evidence regarding values and preferences bearing directly on the recommendations proved available for only the chapter that addresses antithrombotic therapy in patients with atrial fibrillation. Our panelists bore in mind what average patient values and preferences may be; the process, however, is speculative.14
Our main strategy for dealing with this unsatisfactory situation is to make the values and preferences underlying the recommendations explicit whenever the panelists believed that value and preference issues were crucial for a recommendation. For example, Albers et al11 suggest for patients with acute ischemic stroke of > 3 h but < 4.5 h that clinicians do not use IV tPA (Grade 2A). For patients with acute stroke onset of > 4.5 h, we recommend against the use of IV tPA (Grade 1A). The authors noted in the corresponding values and preferences statement, "This recommendation assumes a relatively low value on small increases in long-term functional improvement, a relatively high value on avoiding acute intracranial hemorrhage and death, and a relatively high degree of risk aversion."
In addition, we involved three consultants with expertise in the area of values and preferences to collaborate with the chairs of two chapters and try to ensure that the guidelines adequately represented the views of patients.11,15 This collaboration led to extensive discussions among the chapter authors and the consultants and the reflection of these discussions in the associated values and preference statements.
In previous iterations of these guidelines, we did not have a standard or coherent approach to dealing with resource allocation (cost) issues. For these guidelines, we implemented recommendations of a recent ACCP task force on integrating resource allocation in clinical practice guidelines by restricting consideration of resource expenditure to a small number of recommendations for which it was particularly relevant.1 We relied on two consultants with expertise in economic assessment to help with the process of considering costs in the small number of recommendations for which we considered them very important to the decision. The methods and examples for this process are described in the article by Matchar and Mark in this supplement.2 Recommendations highly sensitive to resource allocation now include value and preference statements regarding how cost issues were integrated.
Grading Strength of Recommendation
A systematic approach to grading the strength of treatment recommendations can minimize bias and aid interpretation of treatment recommendations. Chapter authors have graded their recommendations as strong (Grade 1, desirable effects much greater than undesirable effects or vice versa) and worded the recommendation accordingly as "we recommend" or as weak (Grade 2, desirable effects not clearly greater or less great than undesirable effects) and worded the recommendation as "we suggest." They also have graded the methodologic quality of the underlying evidence. Another chapter in this supplement details our approach to grading recommendations.8
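The pairing of strength and evidence quality into a single label such as "1A" or "2A" can be sketched as follows. This is only an illustration of the labeling and wording convention described above; the function names and the "strong"/"weak" keys are hypothetical.

```python
STRENGTH_NUMBER = {"strong": "1", "weak": "2"}
STRENGTH_WORDING = {"strong": "we recommend", "weak": "we suggest"}


def recommendation_grade(strength, quality):
    """Combine strength (strong/weak) and evidence quality (A/B/C) into a label."""
    if quality not in ("A", "B", "C"):
        raise ValueError(f"unknown evidence quality: {quality!r}")
    return STRENGTH_NUMBER[strength] + quality


def recommendation_wording(strength):
    """Return the phrase used to word a recommendation of the given strength."""
    return STRENGTH_WORDING[strength]


print(recommendation_grade("strong", "A"), recommendation_wording("strong"))
print(recommendation_grade("weak", "A"), recommendation_wording("weak"))
```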
Finalizing and Harmonizing Recommendations
After having completed the steps we have described above, the guideline authors formulated draft recommendations before the conference, which laid the foundation for authors to work together and critique the recommendations. Fig 1 shows the process of guideline development and review. Drafts of chapters that included draft recommendations were usually distributed for peer review to at least two panel members and were always reviewed by at least one panel editor before the conference. Written critiques were prepared and returned to the authors for revision of their work. At the plenary conference, a representative of each chapter presented potentially controversial issues in their recommendations. Chapter authors met to integrate feedback and consider related recommendations in other chapters and to revise their own guidelines accordingly. Authors continued this process after the conference until they reached agreement within their groups and with other author groups who provided critical feedback. The editors of this supplement harmonized the chapters and resolved remaining disagreements between chapters through facilitated discussion. All major correspondence and discussions at the meeting were recorded in written and audio protocols and are publicly available.
Review by ACCP and External Reviewers
The ACCP HSP established a process for the thorough review of all ACCP evidence-based clinical practice guidelines. After final review by the editors, the guidelines underwent review by appropriate NetWorks of the ACCP (for these guidelines, the Cardiovascular and Pulmonary Vascular NetWorks), the HSP, and the Board of Regents. The latter two have the right of approval or disapproval but usually work with the guideline authors and editors to make necessary revisions before final approval. Each group identified primary reviewers who read the full set of chapters as well as individual committee members who were responsible for reviewing one or more chapters. The reviewers considered both content and methodology as well as whether there was balanced, not biased, reporting and adherence to HSP processes. Finally, the CHEST editor-in-chief read and forwarded the manuscripts for nonbiased, independent, external peer review before acceptance for publication.
Limitations of These Guideline Development Methods
These guidelines have several limitations. First, some authors may have followed this methodology more closely than others, although the development process was centralized by an EPC and supervised by the editors. Second, it is possible that we missed relevant studies in spite of the comprehensive searching process. Third, despite our efforts to begin centralizing the methodologic evaluation of all studies to facilitate uniformity in the validity assessments of the research incorporated into these guidelines, resources were sufficient to conduct this evaluation for only a few of the recommendations in each chapter. Fourth, we performed only a few statistical pooling exercises of primary study results. Finally, sparse data on patient preferences and values represent an additional limitation inherent to most guideline development methods.
| Future Directions of ACCP Guidelines |
|---|
| Conclusion |
|---|
| Conflict of Interest Disclosures |
|---|
Dr. Cook discloses that she received grant monies from the Canadian Institutes for Health Research and a dalteparin donation for a peer-review funded trial by the Canadian Institutes for Health Research.
Dr. Guyatt reveals no real or potential conflicts of interest or commitment.
| Acknowledgements |
|---|
| Footnotes |
|---|
Dr. Schünemann is funded by a European Commission: The Human Factor, Mobility and Marie Curie Actions. Scientist Reintegration Grant: IGR 42194—"GRADE."
Accepted for publication December 20, 2007.
| References |
|---|