This is a chapter from "Methods Guide for Effectiveness and Comparative Effectiveness Reviews."
Comparative Effectiveness Reviews are systematic reviews of existing research on the effectiveness, comparative effectiveness, and harms of different health care interventions. They provide syntheses of relevant evidence to inform real-world health care decisions for patients, providers, and policymakers. Strong methodologic approaches to systematic review improve the transparency, consistency, and scientific rigor of these reports. In a collaborative effort within the Effective Health Care (EHC) Program, the Agency for Healthcare Research and Quality (AHRQ), the EHC Program Scientific Resource Center, and the AHRQ Evidence-based Practice Centers have developed a Methods Guide for Comparative Effectiveness Reviews. This Guide presents issues key to the development of Comparative Effectiveness Reviews and describes recommended approaches for addressing difficult, frequently encountered methodological issues.
The Methods Guide for Comparative Effectiveness Reviews is a living document, and will be updated as further empirical evidence develops and our understanding of better methods improves. Comments and suggestions on the Methods Guide for Comparative Effectiveness Reviews and the Effective Health Care Program can be made at http://www.effectivehealthcare.ahrq.gov/.
This document was written with support from the Effective Health Care Program at AHRQ. None of the authors has a financial interest in any of the products discussed in this document.
Suggested citation: White CM, Ip S, McPheeters M, et al. Using existing systematic reviews to replace de novo processes in conducting Comparative Effectiveness Reviews. In: Agency for Healthcare Research and Quality. Methods Guide for Comparative Effectiveness Reviews [posted September 2009]. Rockville, MD. Available at: https://effectivehealthcare.ahrq.gov/topics/cer-methods-guide/overview/.
C. Michael White, Pharm.D., FCP, FCCP (a)
Stanley Ip, M.D. (b)
Melissa McPheeters, Ph.D., M.P.H. (c)
Tim S. Carey, M.D., M.P.H. (d)
Roger Chou, M.D. (e)
Kathleen N. Lohr, Ph.D. (f)
Karen Robinson, Ph.D. (g)
Kathryn McDonald, M.M. (h)
Evelyn Whitlock, M.D., M.P.H. (i)
(a) University of Connecticut/Hartford Hospital Evidence-based Practice Center, Hartford, CT
(b) Tufts Medical Center Evidence-based Practice Center, Boston, MA
(c) Vanderbilt Evidence-based Practice Center, Nashville, TN
(d) RTI/University of North Carolina Evidence-based Practice Center, Chapel Hill, NC
(e) Oregon Evidence-based Practice Center, Portland, OR
(f) RTI International, Research Triangle Park, NC
(g) Johns Hopkins University Evidence-based Practice Center, Baltimore, MD
(h) Stanford-University of California San Francisco Evidence-based Practice Center, Stanford, CA
(i) Oregon Evidence-based Practice Center, Portland, OR
The findings and conclusions in this document are those of the authors, who are responsible for its contents; the findings and conclusions do not necessarily represent the views of AHRQ or the Veterans Health Administration. Therefore, no statement in this report should be construed as an official position of these entities, the U.S. Department of Health and Human Services, or the U.S. Department of Veterans Affairs.
- Using existing systematic reviews (SRs) has potential benefits and risks. Evidence-based Practice Centers (EPCs) and the relevant Task Order Officer should discuss these points.
- This chapter does not focus on the use of existing systematic reviews for obtaining background information, providing background or discussion context, or cross-checking references. Rather, it concerns the use of existing systematic reviews to replace a de novo process. It also does not consider the processes used to create separate products, called “umbrella” reviews, meta-reviews, or reviews of reviews.
- We propose a five-step process to standardize the approach that EPCs can use to decide whether existing systematic reviews might provide value (Figure 1).
- Transparency is a priority; users of a Comparative Effectiveness Review (CER) should be able to determine what was done (Figure 2).
- Two independent reviewers using a modified AMSTAR (Assessment of Multiple Systematic Reviews) instrument should assess the quality of relevant reviews (Table 1).
- EPCs should incorporate existing systematic reviews (i.e., use them to replace all or part of a de novo process) only if they are fully relevant and of high quality. Partly relevant or suboptimal quality reviews should not be incorporated, although they may be useful for cross-checking references and for providing background. In the CER’s discussion section, it is important to discuss how the findings of the CER agree or disagree with particularly well known SRs (highly cited or published in a high-impact journal) that were not included in the CER.
- Once EPCs identify relevant, high-quality systematic reviews, they may opt to use them in the following ways: adapting or adopting the search strategy, using the summarized evidence, or a combination of these.
- EPCs can choose to replace a de novo process to answer a key question by selecting the best review or may choose to summarize all of the relevant and high-quality reviews.
- EPCs should routinely review reference lists of such systematic reviews to identify relevant studies.
- If EPCs do a de novo synthesis, they should routinely compare results with those of relevant, high-quality systematic reviews and formally address consistency or potential reasons for discrepancies in the discussion of the report.
Introduction and Rationale
Over a 4-year period (2005 to mid-September 2009), 11,390 citations for systematic reviews and 11,281 citations for meta-analyses were retrieved in an OvidSP search. In contrast, over the previous 9 years (1996 to 2005) only 7,390 citations for systematic reviews and 9,251 citations for meta-analyses were retrieved. Approximately 2,500 new systematic reviews (SRs) and meta-analyses were published in 2006 alone.1 A systematic review uses an explicit methodology for systematically searching and synthesizing the literature and for grading evidence. Given the extensive body of existing SR and meta-analysis literature, questions have been raised about whether Evidence-based Practice Centers (EPCs) should use existing SRs in a Comparative Effectiveness Review (CER) commissioned by the Agency for Healthcare Research and Quality (AHRQ) and, if so, in what capacity they should be used. Of course, examining existing SRs to provide background information or other useful references for a CER is a common practice in EPC work, and we do not discuss this procedure further in this chapter.
An informal survey of eight non-EPC centers that conduct systematic reviews in the United Kingdom, Australia, and New Zealand confirmed that they are facing these same questions about the use of existing SRs without any commonly accepted approach.2 In summer 2008, the Existing SR Working Group queried EPC directors about their experiences (including experience with both EPC and non-EPC projects) in this area. Overall, EPCs considered the use of an existing SR 50 percent of the time and used existing SRs slightly more than 30 percent of the time. The most commonly stated reason for using an existing SR was for completeness, but existing SRs were also often used when EPCs faced a topic of extensive breadth, a sizable body of literature, or limitations in timeframe or budget. Some EPCs used the existing SR while updating the SR.
When queried about how they were using existing SRs, EPCs indicated that they used existing SRs predominantly (74 percent of the time) for background information or to ensure completeness of the literature search. EPCs sometimes used results of existing SRs to answer key questions in the new SR, but in more than two-thirds of these cases, at least a sample of the original trials or studies included in the existing SR were verified to ensure the quality of original data extraction.
When EPCs considered using existing SRs in a new SR, the most common reason given not to use one was that the identified reviews were not relevant to the specific questions being asked in the new SR. Other frequent reasons not to use existing SRs included: no time savings associated with using the existing SR vs. using de novo methods to answer the key question, poor quality of existing SRs after detailed assessment, outdated existing SRs, and uncertainty about how to include them in a new SR.
As a result of our queries and subsequent discussion within the Working Group, we identified six possible benefits associated with using existing SRs in CERs:
- Allows a cross-check to assure that relevant trials and studies are captured in a new CER.
- Allows EPCs to directly compare and contrast the present CER and previous SRs in terms of findings that may be relevant to health care decisionmakers.
- May save EPCs time, effort, and resources to answer key questions.
- May allow EPCs to anticipate and plan for context-specific methodological issues.
- May help avoid unnecessary redundancy among SRs.
- May provide analyses that are not readily available from other sources (e.g., subgroup analyses from a meta-analysis of individual patient data not available in constituent studies or published reports).
In addition, some existing SRs may contain information from primary studies that was not reported in the published manuscripts; SR authors may have obtained it through author queries or by having a primary study author as an author on the SR.
Conversely, five main risks are associated with using existing SRs in CERs that do not arise in a purely de novo process:
- If EPCs find numerous existing SRs, the time and resources required to evaluate them may be wasted because earlier reports may not be recent enough, not relevant enough to answer the key questions posed, or not of acceptable quality.
- Incorporating the results of existing SRs into a CER could propagate errors made in data abstraction, selection of studies, and qualitative or quantitative synthesis. Propagating errors can reduce credibility for the CER and the EPC program among stakeholders and users.
- Using an existing SR to answer key questions might create a perception that EPCs are not performing due diligence in conducting a CER. This perception might reduce credibility for the CER and the EPC program among stakeholders and users.
- If the existing SR does not provide evidence from primary studies and analyses in sufficient detail, the methodological process of the CER may be perceived to lack transparency.
- Ambiguity about how to compare multiple existing SRs on the same subject remains an important challenge. Lack of clear methodological guidance on selecting the most appropriate SRs could introduce reviewer bias, especially if existing SRs have discordant results.
The use of existing SRs to substitute for purely de novo CER methods may provide benefits and risks. Ultimately, EPCs need to work with those who commission the work (i.e., their Task Order Officers at AHRQ and decisionmakers who nominated the topic) to determine whether the potential benefits associated with the incorporation of existing SRs are worth the risks to a CER’s comprehensiveness and transparency or the risk of introducing bias. If a decision has been made to incorporate the use of existing SRs in answering one or more key questions in lieu of using a purely de novo process, we recommend that EPCs apply the following approaches.
Figure 1 is a flow diagram adapted from a methods article by Whitlock and colleagues.2 It will help guide EPCs as they move through the process of identification, assessment, and use of existing SRs. To ensure transparency, EPCs can include a graphic similar to the example shown in Figure 2 in a CER report so users can identify the number of original citations identified in an SR search, the number of articles that are excluded, and how the existing SRs are being used.
Locating Existing Systematic Reviews
Using search terms that reflect the a priori PICOTS-SD (population, intervention, comparator, outcomes, timing of outcome measurement, setting, and study design) refines the search and decreases noise. Although EPCs can apply many possible approaches to identify existing SRs for a CER, we recommend two procedures. One strategy is to use a targeted search of higher yield databases.2 Because SRs are a secondary literature source and primary studies are likely to be redundant across SRs, identifying relevant, high-quality SRs is probably more important than identifying all SRs. Higher yield databases include the output of the Evidence-based Practice Center program, MEDLINE’s Top 120 Index Medicus Journals, Health Technology Assessments, Cochrane Database of Systematic Reviews, and Database of Abstracts of Reviews of Effects. EPCs can add other databases depending on the topic. Alternatively, EPCs can identify SRs during their title and abstract searches while conducting a broad de novo literature search for trials and studies, as long as the searches are structured not to exclude reviews. The EPC medical librarian is a valuable resource when making these decisions and developing the search strategy.
Assessing the Relevance of Existing Systematic Reviews
EPCs considering the inclusion of prior SRs in a CER should begin with a fundamental presumption—that the intent is to answer one or more key questions or a specific portion of a key question with an existing SR in lieu of a completely de novo process. Relevance requires consideration of the PICOTS-SD. Those SRs not completely relevant to the current review (partially relevant) may still be useful for background material or for cross-checking references. Some existing SRs will not be relevant at all and should be eliminated from any further consideration at this stage.
Initial Screening for Relevance
As depicted in Figure 1, after EPCs conduct a literature search for existing SRs (Step 1), they need to screen identified citations for relevance (Step 2). Citations that are not SRs (primary research, narrative reviews) or duplicate citations can be readily excluded.
Many factors that determine whether an existing SR is relevant or not are addressed in the SR’s methods section. Timeliness of the existing SR is critical. Timeliness refers not to the publication date of the review, but to how recently the literature search was conducted. When considering issues of timeliness, reviewers should be aware that SRs can become outdated quickly.3 Whether an SR is outdated depends primarily on the topic because some areas may not be as intensely researched and newer studies are added only rarely. We generally recommend a bridging search for any SR whose literature search ended a year or more before the present date. Given their clinical expertise, expert team members may be helpful in deciding acceptable date parameters; ideally they should make this decision a priori.
If EPCs regard an earlier SR as outdated, they can still consider using the search results (obtaining data from the evidence tables) and then updating from 1 year before the date of the original literature search to the present time with a de novo process. By going back 1 year before the existing SR’s search date, the lag time between the publication of an article and its inclusion into standardized literature retrieval databases ought not to be a major factor. Using the search results from these existing SRs would require only that the earliest date for which studies could be included (e.g., 1960) is in line with the date the EPCs have set for their CER.
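The date arithmetic behind a bridging search can be sketched in a few lines of Python. This is only an illustration: the function names, the 1-year default, and the example dates are assumptions made for the sketch, and the acceptable date parameters remain a judgment that the EPC and its clinical experts should make a priori.

```python
from datetime import date


def _years_before(d: date, years: int) -> date:
    """Return the calendar date `years` years before `d`."""
    try:
        return d.replace(year=d.year - years)
    except ValueError:
        # `d` is 29 February and the target year is not a leap year.
        return d.replace(year=d.year - years, day=28)


def needs_bridging(search_end: date, today: date) -> bool:
    """True if the SR's literature search ended a year or more before `today`."""
    return search_end <= _years_before(today, 1)


def bridge_search_start(search_end: date, overlap_years: int = 1) -> date:
    """Start date for the de novo update: `overlap_years` before the SR's
    search end date, to allow for database indexing lag."""
    return _years_before(search_end, overlap_years)


# Hypothetical example: an existing SR whose search ended 31 March 2007,
# considered for a CER on 15 September 2009.
existing_sr_search_end = date(2007, 3, 31)
if needs_bridging(existing_sr_search_end, today=date(2009, 9, 15)):
    print(bridge_search_start(existing_sr_search_end))  # 2006-03-31
```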
Focusing on Population, Intervention, Comparator, Outcomes, and the Timing of Their Measurement, Setting, and Study Design To Assess Relevance
For existing SRs that make it to this stage, EPCs should compare the PICOTS-SD in the earlier SRs with these elements in the new CER protocol.4 Determining similarity will depend on how well the existing SR describes these elements. Poor reporting will make it impossible for an EPC to consider inclusion of an existing SR. Poor reporting, however, is an element of quality appraisal as well, so a poorly reported SR would not be eligible for incorporation for both relevance and quality reasons. Appreciating the subtle differences that may exist between an existing SR and the current CER is vital; this generally requires EPCs to give careful consideration to these elements.
Population: The need for the population in an existing SR to “match” completely the intended population in a new CER will depend to some degree on the clinical condition of interest and the questions being addressed. On the one hand, for example, a CER that is attempting to review interventions for hemorrhagic stroke may not be well served by including an existing SR with studies of patients with any kind of stroke unless results clearly separate the subgroup of studies relevant to hemorrhagic stroke patients. On the other hand, a CER that is examining any kind of stroke might be able to incorporate a relevant, high-quality prior SR addressing hemorrhagic stroke only. Similarly, an existing SR restricted to adults will be of limited utility if the new key questions include young children. Other CERs, however, may require less rigidity, and modest differences in age range or geographic range (e.g., United States vs. North America) may be less important.
Intervention: To ensure that existing SRs evaluated the same intervention as intended for the new CER, the team should look carefully at criteria for inclusion used in the older review. It is particularly important to make sure that issues such as dosing and mode of delivery match as closely as possible. When the existing SR was either more or less inclusive than the CER is intended to be, the experts on the team need to determine that this factor will not fundamentally change the conclusions. This may become an issue when dosing regimens change over time, as has been the case with use of higher dose statins in recent years, or for example, in the evolution of cardiac devices such as pacemakers to newer, dual-chamber versions.
Comparator: EPCs should consider whether they are interested in the effect of the intervention of interest as it compares with usual practice or another intervention and ensure that the existing SR matches this criterion. EPCs should note, when comparing treatments with usual care, whether usual practice has changed significantly since the timeframe of the earlier SR; this would make older studies—and perhaps a review of those studies—not applicable to the current concern. Such evolution of usual practice has been a significant issue, for instance, in “medical treatment” after acute coronary syndrome; older versions of medical treatment are no longer comparable with current practice. In surgical reviews, it may be important to know what supportive treatments were used in the past compared to those associated with interventions being reviewed. For example, if patients previously spent longer in postoperative care in bed rather than in active rehabilitation, those older studies may not reflect current practice. For issues of this type, the input of clinical experts can be particularly useful to determine changes in usual care over time.
Outcomes: The outcomes assessed in existing SRs should be the same as or similar to the outcomes envisioned for the CER. The usual caveats regarding use of intermediate or nonpatient-oriented outcomes apply for existing SRs just as they apply to inclusion criteria for constituent studies.
Timing of outcome measurement: Some SRs are restricted to studies with relatively short periods of followup. The period of appropriate followup, of course, depends on the condition, intervention under consideration, and outcome being assessed. The rationale for such restriction may be the lack of availability of longer term followup; when such studies become available, the relevance of the older SR is reduced. Often, short periods of followup involve surrogate outcome measures; both factors (length of followup, surrogate or proxy outcomes) decrease an SR’s relevance. Timing of outcome measurement is not the same as timeliness (how recent the existing SR is), which EPCs should examine early in the relevancy assessment.
Setting: Older SRs can address interventions in a broad or narrow range of settings, such as interventions to reduce falls in inpatient settings, in nursing homes, and in the home and other community settings. Although some of these distinctions will be clear by examining the populations addressed, a previous SR that covers a wider range of settings may not be relevant to a more narrowly scoped CER unless results of the former are stratified by setting.
Study design: SRs can differ appreciably in the types of study designs that they consider acceptable. EPCs may find that surveying inclusion criteria related to study design is a useful early step in an evaluation of relevance. If EPCs plan to include randomized and controlled clinical trials and high-quality comparative cohort studies as evidence in their CERs, but an existing SR covers only randomized controlled trials, then the latter is only partially relevant to the current effort.
The original author of the existing SR could be contacted for additional information if it is not clear whether or not sufficient relevance is present. Once EPCs have established relevance for an existing SR, they should assess and rate quality using the approach described below. Quality assessments (Figure 1, Step 3) are time intensive and should be conducted only on existing SRs found to be relevant.
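One way to record the element-by-element comparison described above is a simple structure with one reviewer judgment per PICOTS-SD element. The sketch below is hypothetical: the three judgment labels ("match", "partial", "none") and the rule for combining them are assumptions chosen for illustration, not a rule stated in this chapter. In practice the judgment for each element rests with the review team and its clinical experts.

```python
from typing import Dict

# The seven PICOTS-SD elements discussed in this section.
PICOTS_SD = (
    "population",
    "intervention",
    "comparator",
    "outcomes",
    "timing",
    "setting",
    "study_design",
)

# Hypothetical judgment labels a reviewer might assign to each element.
JUDGMENTS = {"match", "partial", "none"}


def classify_relevance(judgments: Dict[str, str]) -> str:
    """Combine per-element judgments into an overall relevance category.

    Assumed rule (illustrative only): every element must match for the SR
    to be fully relevant; any element with no overlap makes it not
    relevant; anything else is partially relevant.
    """
    missing = [e for e in PICOTS_SD if e not in judgments]
    if missing:
        # Poor reporting: an element that cannot be judged blocks inclusion.
        raise ValueError(f"no judgment recorded for: {missing}")
    values = [judgments[e] for e in PICOTS_SD]
    invalid = [v for v in values if v not in JUDGMENTS]
    if invalid:
        raise ValueError(f"unknown judgment labels: {invalid}")
    if all(v == "match" for v in values):
        return "fully relevant"
    if any(v == "none" for v in values):
        return "not relevant"
    return "partially relevant"


# Example: the existing SR covers only randomized controlled trials, but
# the CER also plans to include comparative cohort studies.
example = {element: "match" for element in PICOTS_SD}
example["study_design"] = "partial"
print(classify_relevance(example))  # partially relevant
```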
Assessing the Quality of Relevant Systematic Reviews
Whatever aspect of an existing SR an EPC includes in the CER should adhere to a high methodological standard. EPCs should avoid routinely including all existing SRs in an attempt to be comprehensive. Note that this admonition is in contrast to another effort, a review of reviews, in which reviewers are asked to summarize the available evidence at the level of the systematic review.
Several instruments designed to rate quality of SRs are available.5 Regardless of the specific instrument that is chosen for this purpose, the instrument should address all aspects of the review that the EPC plans to incorporate into the CER, including methods used to identify, select, appraise, and synthesize studies; the possibility of publication bias; and potential conflicts of interest.6
Commonly Used SR Quality Instruments
In assessing the quality (i.e., assessing the risk of bias) of existing SRs, EPCs should address both the methods used by the earlier systematic reviewers to minimize bias and the transparency and completeness with which they reported their methods, individual study details, and results. Checklists for improving reporting of SRs (e.g., QUOROM [recently renamed PRISMA], MOOSE) have been used as surrogate tools for quality assessment, although they were designed to improve transparency and consistency of reporting SR methods, not directly to assess methodological quality.7-9 For example, the QUOROM checklist requires detailed descriptions of the literature search strategy terms and sources searched, but it does not provide criteria for distinguishing adequate from inadequate searches.7 In addition, inadequate reporting of SR methods does not necessarily mean that the SR was conducted poorly. Nonetheless, rating the quality of an SR without understanding how it was conducted is difficult. Several items related to quality of reporting have been incorporated into instruments such as the ones from Oxman and Guyatt and AMSTAR.6,10
The Oxman and Guyatt instrument was one of the early widely used standardized quality rating indexes for evaluating the scientific quality of a review article; unlike other quality rating instruments specifically developed for SRs, some empiric evidence supports its use.10 Reviews with lower quality ratings on the Oxman and Guyatt instrument are more likely to show treatment benefit.11,12 However, methods for evaluating SRs have evolved since the Oxman and Guyatt instrument was developed, and it does not address several methodological domains now thought to be important.13
The newer Assessment of Multiple Systematic Reviews (AMSTAR) tool includes additional criteria, such as whether study selection and data extraction were conducted in duplicate, whether publication bias was assessed, and whether conflicts of interest were reported.6 Although more data are needed to determine its reliability and validity, AMSTAR has been proposed as the preferred instrument for assessing the quality of SRs by the World Health Organization and by the Canadian Optimal Medication Prescribing and Utilization Service (COMPUS), among others.14,15 One domain that is not included in AMSTAR pertains to nonbiased application of inclusion and exclusion criteria, although EPCs can adapt the AMSTAR instrument to include such an item. (See recommendation.)
Limitations in Quality Rating Scales
As much as possible, CER investigators should apply objective and reproducible criteria when using quality assessment instruments such as Oxman and Guyatt or AMSTAR.6,10 For example, a “comprehensive” literature search could be defined as requiring searches on at least two electronic databases, reference list searching, and expert queries. Although EPCs could use this definition in most instances, they may need to tailor criteria for specific topics. For example, for assessing the quality of SRs that evaluate acupuncture, fully meeting the literature search criteria could require searching Asian-language databases.
For some criteria included in quality rating instruments, delineating objective definitions is difficult; EPCs then must apply subjective judgments. For example, AMSTAR includes the items “Was the scientific quality of the included studies used appropriately in formulating conclusions?” and “Were the methods used to combine the findings of studies appropriate?”6 Assessing and rating quality using discrete categorical choices can make quality judgments appear more clear cut and objective than they really are. Operationalizing subjective qualifiers such as “appropriate” at the outset of each assessment, taking into consideration factors relevant to the specific topic at hand, could help. Having at least two independent reviewers from an EPC assess quality and reporting methods for resolving discrepancies is desirable.
Another limitation in applying quality rating instruments is that they are not designed to detect inconsistencies in application of inclusion criteria or errors in data abstraction. For example, an SR16 of antidepressants for low back pain specified randomization as an inclusion criterion but included a nonrandomized clinical trial.17 Among the included studies, this trial reported the highest estimate of benefit and may have affected the SR’s conclusions.16 Checking data from SRs against primary studies can reveal important discrepancies.18,19
Numerical summary scores (e.g., adding up the number of criteria that are adequately met) have been used to summarize the overall quality of SRs. Such scores can be misleading because reviews with different flaws may receive the same summary score. A summary score cannot reveal the nature of the bias in an individual review. For example, an SR could meet nearly all methodological criteria and receive a near-perfect summary score, but one serious methodological shortcoming could invalidate its results; a summary score may well not reflect that important shortcoming.
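A small illustration of this point follows. The criterion names and the two example reviews are hypothetical and are not taken from any instrument's official wording; the sketch only shows that two reviews with different unmet criteria receive the same count.

```python
from typing import Dict, List


def summary_score(ratings: Dict[str, bool]) -> int:
    """Count the criteria rated as adequately met."""
    return sum(1 for met in ratings.values() if met)


def unmet_criteria(ratings: Dict[str, bool]) -> List[str]:
    """List the criteria that were not met, which the count alone hides."""
    return sorted(name for name, met in ratings.items() if not met)


# Hypothetical shorthand criteria, loosely based on domains named in the text.
criteria = [
    "duplicate_selection_and_extraction",
    "comprehensive_search",
    "publication_bias_assessed",
    "appropriate_synthesis_methods",
    "conflicts_of_interest_reported",
]

# Review A fails only on reporting conflicts of interest.
review_a = {name: True for name in criteria}
review_a["conflicts_of_interest_reported"] = False

# Review B fails only on the synthesis methods, a flaw that could
# invalidate its pooled results.
review_b = {name: True for name in criteria}
review_b["appropriate_synthesis_methods"] = False

print(summary_score(review_a), unmet_criteria(review_a))
print(summary_score(review_b), unmet_criteria(review_b))
```

Both reviews score 4 of 5, yet the unmet criteria differ, and so may the consequences for the validity of the results.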
We suggest that CER authors describe the implications of individual methodological flaws rather than rely on numerical summary scores. For example, exclusion of “grey literature” or non-English-language citations may or may not have important effects on estimates of benefits or harms.20,21 If EPCs find no clear indication of publication bias in an SR and if stable and precise estimates are available for the outcome(s) of interest, excluding these types of literature is not likely to be a serious shortcoming. However, excluding “grey literature” or non-English language trials would be a serious shortcoming in an SR if large numbers of trials or important trials are known or suspected to exist in these literature types. As cases in point, medical device evaluations may rely on “grey literature,”22 and alternative and complementary medicine evaluations may rely on foreign-language literature.23
Assigning categorical quality scores (such as “good,” “fair,” or “poor”) may be appropriate after taking into account the number and seriousness of methodological shortcomings.24 In general, good-quality SRs should be defined as those that have few or no methodological shortcomings and a low risk of bias. Fair-quality SRs have some methodological flaws, but the EPC conducting the CER has determined that the flaws will not seriously bias or invalidate the results. Poor-quality SRs contain a serious flaw or flaws that, in the judgment of the EPC conducting the CER, are highly likely to bias or invalidate the results.
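The categories above could be applied along the following lines once reviewers have judged each identified flaw. This is a minimal sketch under stated assumptions: the severity labels ("minor", "serious") and the threshold for "few" flaws are hypothetical, and deciding whether a given flaw is serious remains a topic-specific judgment by the EPC.

```python
from typing import List


def categorical_rating(flaw_severities: List[str], few: int = 2) -> str:
    """Map reviewer-judged flaws to 'good', 'fair', or 'poor'.

    `flaw_severities` holds one label per identified flaw, either 'minor'
    or 'serious' (hypothetical labels). `few` is an assumed cutoff for
    what counts as "few" shortcomings.
    """
    unknown = set(flaw_severities) - {"minor", "serious"}
    if unknown:
        raise ValueError(f"unknown severity labels: {sorted(unknown)}")
    if "serious" in flaw_severities:
        # Any flaw judged highly likely to bias or invalidate the results.
        return "poor"
    if len(flaw_severities) <= few:
        # Few or no shortcomings, none of them serious.
        return "good"
    # Several flaws, none judged likely to seriously bias the results.
    return "fair"


print(categorical_rating([]))                           # good
print(categorical_rating(["minor", "minor", "minor"]))  # fair
print(categorical_rating(["minor", "serious"]))         # poor
```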
CER Quality Assessment Recommendations
When EPCs assess the quality of an existing SR for a CER project, we recommend:
- At least two independent reviewers should assess SRs for quality.
- EPCs should report methods for resolving discrepancies between reviewers.
- EPCs should confirm the reproducibility of application for inclusion criteria and the accuracy of data abstraction in at least a sample of the studies. They should confirm that a nonbiased application of inclusion criteria was used.
- To have a common starting point, EPCs should use AMSTAR for quality evaluation for two reasons: (1) it was developed based on an SR of quality rating instruments and has undergone some construct and validity testing; and (2) it is becoming more widely used internationally.
AMSTAR assesses 11 criteria for quality; the response choices for each are Yes, No, Can’t Answer, and Not Applicable.6 We suggest supplementing the AMSTAR questions as deemed appropriate for the particular project or topic at hand. Table 1 summarizes the criteria with some additional considerations that EPCs may have for their CERs.
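Dual independent assessment can be supported by a simple comparison of the two reviewers' item-level responses. The sketch below assumes each reviewer records one response per AMSTAR item, keyed by item number 1 to 11; the function name and the data layout are illustrative assumptions, and the method for resolving any discrepancies it flags is left to the EPC to define and report.

```python
from typing import Dict, List

N_ITEMS = 11
CHOICES = {"Yes", "No", "Can't Answer", "Not Applicable"}


def find_discrepancies(
    reviewer_a: Dict[int, str], reviewer_b: Dict[int, str]
) -> List[int]:
    """Return the item numbers on which two independent reviewers disagree."""
    expected_items = set(range(1, N_ITEMS + 1))
    for name, ratings in (("reviewer_a", reviewer_a), ("reviewer_b", reviewer_b)):
        if set(ratings) != expected_items:
            raise ValueError(f"{name} must rate items 1 through {N_ITEMS}")
        invalid = {r for r in ratings.values() if r not in CHOICES}
        if invalid:
            raise ValueError(f"{name} used unknown responses: {sorted(invalid)}")
    return [i for i in range(1, N_ITEMS + 1) if reviewer_a[i] != reviewer_b[i]]


# Hypothetical ratings of one existing SR by two reviewers.
ratings_a = {i: "Yes" for i in range(1, N_ITEMS + 1)}
ratings_b = dict(ratings_a)
ratings_b[3] = "No"
ratings_b[10] = "Can't Answer"

print(find_discrepancies(ratings_a, ratings_b))  # [3, 10]
```

Listing the disputed items, rather than comparing totals, keeps the focus on the individual methodological judgments that need to be reconciled.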
Checklists have been developed to improve the quality of reporting of meta-analyses evaluating therapeutic interventions (e.g., see previously mentioned PRISMA: http://www.prisma-statement.org/index.htm). These reporting checklists may not be directly applicable to individual patient data meta-analyses. Although these types of meta-analyses may not be comprehensive or systematic in construct, they may provide useful insight when answering certain types of key questions, such as questions regarding subpopulations.
Determining How To Use Existing Systematic Reviews
At this point in the process, we assume that EPCs have identified one or more existing SRs that are relevant to the CER and are of adequate quality. Now EPCs must determine the appropriate way to incorporate them into the CER (Figure 1, Step 4). Several possibilities are available (Figures 1 and 2), and they are not mutually exclusive.
- Incorporate already-summarized evidence from existing SRs into the CER.
- Incorporate summarized evidence from existing SRs into the CER but conduct de novo sensitivity analyses. In essence, use an existing SR to answer a key question but then conduct additional analyses using data from the original studies. For example, use an SR to answer a key question in a CER about whether or not to use coenzyme Q10 in heart failure, but then conduct de novo sensitivity analyses to determine the impact of publication date on the results.
- Utilize an SR’s search strategy in lieu of a de novo process but then use de novo methods for analysis and synthesis. This would be possible if the search strategy was consistent with the chapter on finding evidence of the Methods Guide, but the quality of other processes was inadequate or could not be determined.
- Build on existing SRs by updating meta-analyses or qualitative syntheses.
- Address conflicting results of existing SRs with a de novo analysis.
- Use at least part of the comprehensive literature search strategy to identify trials or other studies for the CER.
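The sensitivity analysis on publication date mentioned in the second option above can be sketched as follows. This is only an illustration under stated assumptions: it uses a simple inverse-variance fixed-effect pooled estimate (this chapter does not prescribe a pooling model), the function names are hypothetical, and the trial data are made up rather than drawn from any real study.

```python
import math


def pooled_fixed_effect(studies):
    """Inverse-variance fixed-effect pooled estimate and its standard error.

    Each study is a dict with 'effect' (e.g., a mean difference or a log
    odds ratio) and 'se' (the standard error of that effect).
    """
    weights = [1.0 / s["se"] ** 2 for s in studies]
    total = sum(weights)
    estimate = sum(w * s["effect"] for w, s in zip(weights, studies)) / total
    return estimate, math.sqrt(1.0 / total)


def sensitivity_by_year(studies, cutoff_year):
    """Compare the pooled estimate for all studies with the estimate
    restricted to studies published in or after `cutoff_year`."""
    recent = [s for s in studies if s["year"] >= cutoff_year]
    if not recent:
        raise ValueError("no studies at or after the cutoff year")
    return {
        "all": pooled_fixed_effect(studies),
        "recent_only": pooled_fixed_effect(recent),
    }


# Made-up data for illustration; NOT taken from any real trial.
studies = [
    {"id": "Trial A", "year": 1995, "effect": 0.50, "se": 0.20},
    {"id": "Trial B", "year": 2001, "effect": 0.30, "se": 0.10},
    {"id": "Trial C", "year": 2006, "effect": 0.10, "se": 0.10},
]
for label, (est, se) in sensitivity_by_year(studies, 2000).items():
    print(f"{label}: {est:.3f} (SE {se:.3f})")
```

A sensitivity analysis of this kind requires effect estimates from the individual studies, so it is feasible only when the existing SR reports results at the study level or the EPC abstracts them from the primary articles.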
The quality of each step of the existing review is likely to be a major factor in how the EPCs decide to incorporate existing SRs into a CER. The EPC may incorporate an existing SR in its entirety if its research questions are very similar to the CER’s key question(s) and the SR is of good quality at all steps of the review. The EPC can also include an SR in part if only a portion is of interest or relevant to one or more key questions within the CER. This may include incorporating summarized evidence within a specific population or for a specific intervention. In these cases, the methods used in the SR would have to be consistent with the chapters on finding evidence, assessing quality, and grading the strength of a body of evidence, and with the principles in the Methods Guide, including issues of scientific independence and avoiding conflicts of interest.
Previous SRs are unlikely to be wholly sufficient to substitute for a CER because CER questions are identified by a process that assesses the redundancy of a topic with previously published SRs.25 Moreover, other factors reduce the possibility that existing SRs will be able to answer all the key questions in a CER: the comprehensive and broad nature of many CERs; the need to evaluate efficacy, effectiveness, and harms; the inclusion of high-quality observational studies (often excluded in other SRs) in many CERs; and evaluations based on factors such as sex/gender, race, and/or ethnicity.
In cases where an EPC cannot determine the accuracy or validity of the result of an earlier SR, an EPC may decide to incorporate part of the existing SR, such as the search strategy, the list of included articles, or the data extraction tables, if these sections are felt to be of adequate quality. However, in cases of reporting deficiencies where SRs may not present results of individual trials, using summary findings without complete reporting may compromise transparency in the CER. Little is gained from incorporating full results of such an SR into a CER because EPCs could not update the meta-analyses or conclusions in the existing SR with more recent trials or studies without obtaining the primary articles and repeating the data abstraction.
If EPCs find that several recent, relevant, and high-quality SRs are appropriate for a given CER, they then need to determine how best to proceed. One approach is to incorporate the single “best” existing SR (most relevant and least biased) into their own reports.2 However, selecting a single review may pose the risk of introducing selection bias; EPCs must ensure transparency in their criteria for eligibility. Another approach is to conduct a meta-review (also known as an “umbrella review”), whereby they select all relevant, high-quality SRs that meet an a priori publication date threshold and then assess the consistency among them.26,27 When using this approach, EPCs should provide summary tables with information about all the included SRs so as to maximize transparency. If the selected relevant, high-quality SRs have discordant findings, EPCs should explore the reasons for these disagreements. If EPCs cannot readily give reasons for the discordant findings, then they can regard this as an indication that they need to adopt a de novo approach to answer that key question.
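The meta-review logic described above can be sketched as a short selection-and-check routine. The field names (`relevant`, `high_quality`, `direction`), the coding of findings as a single direction of effect, and the returned step labels are all assumptions made for illustration. In practice, judging consistency among SRs involves more than comparing one label.

```python
def select_for_meta_review(reviews, min_year):
    """Keep SRs judged relevant and high quality whose publication year
    meets the a priori threshold."""
    return [
        r for r in reviews
        if r["relevant"] and r["high_quality"] and r["year"] >= min_year
    ]


def findings_are_concordant(selected):
    """True when every selected SR reports the same direction of effect."""
    return len({r["direction"] for r in selected}) <= 1


def next_step(selected, discordance_explained=False):
    """Suggest a next step; the labels are illustrative, not prescribed."""
    if not selected:
        return "no eligible SRs: consider a de novo approach"
    if findings_are_concordant(selected):
        return "summarize the selected SRs"
    if discordance_explained:
        return "summarize the selected SRs and report reasons for discordance"
    return "adopt a de novo approach for this key question"


# Hypothetical SRs; names, years, and findings are invented.
reviews = [
    {"name": "SR A", "year": 2005, "relevant": True,
     "high_quality": True, "direction": "benefit"},
    {"name": "SR B", "year": 2005, "relevant": True,
     "high_quality": True, "direction": "benefit"},
    {"name": "SR C", "year": 2004, "relevant": True,
     "high_quality": True, "direction": "no difference"},
]
selected = select_for_meta_review(reviews, min_year=2005)
print([r["name"] for r in selected])
print(next_step(selected))
```

Fixing `min_year` before looking at the findings is what makes the date threshold a priori. Changing it after seeing which SRs agree would reintroduce the selection bias the approach is meant to avoid.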
Reporting Methods and Results
This chapter of the Methods Guide for Comparative Effectiveness Reviews provides the recommended approach to use when locating existing SRs and assessing their relevance and quality, and it offers a strategy for dealing with multiple existing SRs that EPCs can use to replace a de novo process. We emphasize the need for both reproducibility and transparency when using an existing SR (Figure 1, Step 5). By specifying the targeted search databases and terms used to locate existing SRs and employing a flow diagram to demonstrate the disposition of the citations identified (Figure 2), EPCs can ensure that readers of the CER will be able to assess the process and, if desired, reproduce it. If EPCs decide to search for previous SRs within only a specific date range or to exclude citations based solely on the dates of the existing SR’s literature search, then they should specify the rationale for using this cutoff date.
Providing a summary table that specifies the details of included existing SRs used to replace a de novo process is important.28,29 Summary tables of existing SRs should document the volume, type, and quality of the primary research included. In comparing these previous SRs, ideally the table should address the overlap (or lack of overlap) in primary research in these SRs: e.g., what studies or types of studies were included in one review vs. another. (Table 2 is an example.) Documenting these points will help readers in assessing such factors and the magnitude of net benefits; it will also clarify how EPCs have graded the strength of a body of evidence.2 Excluded existing SRs should also be cataloged in a table with the reason for their exclusion.
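One way to produce the overlap counts for such a summary table is to compare the sets of primary studies included in each SR against a referent SR. The sketch below is illustrative only: the function name and the study identifiers are hypothetical, and the example data were constructed to reproduce the counts shown in the "Overlapping studies" column of Table 2.

```python
def overlap_with_referent(included, referent):
    """For each SR, report its number of included studies and how many of
    them also appear in the referent SR.

    `included` maps an SR name to the set of primary-study identifiers it
    includes; `referent` is the name of the SR used as the comparison base.
    """
    ref_studies = included[referent]
    rows = []
    for name, studies in included.items():
        if name == referent:
            rows.append((name, len(studies), "Referent"))
        else:
            shared = len(studies & ref_studies)  # set intersection
            rows.append((name, len(studies), f"{shared} of {len(ref_studies)}"))
    return rows


# Hypothetical study identifiers chosen to match the counts in Table 2.
included = {
    "Reading 2005": {"rct1", "rct2", "rct3", "rct4", "rct5", "os1", "os2"},
    "Preakness 2005": {"rct1", "rct2", "rct3", "rct4", "rct5", "rct6"},
    "Hung 2004": {"rct1", "rct2", "rct3", "rct4"},
}
for row in overlap_with_referent(included, "Reading 2005"):
    print(row)
```

Matching on study identifiers assumes that each primary study has been assigned a single identifier across SRs. Multiple publications from the same trial would need to be reconciled by hand first.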
Discussion: Reiterate Justification for Using Existing Systematic Reviews
In the discussion section of a CER report, EPCs should restate the initial justification for using one or more earlier SRs instead of following a de novo process. They should discuss clearly any limitations arising from the use of existing SRs. Authors should comment on advantages and disadvantages identified through the process of creating the specific CER to help the conduct of future CERs.
Although not the focus of this paper, comparing findings from the CER with the findings from existing SRs is important because it helps health care decisionmakers understand how the CER in question relates to the existing SR literature. Authors can present similarities and differences and discuss potential reasons for any congruities or discrepancies that they have identified.
Future Research
Many areas require further research to help determine how best to incorporate existing SRs into CERs. These include:
- Determining whether the targeted SR search strategy that has been proposed in this chapter consistently helps to identify the highest quality reviews with less resource allocation than a more broadly conducted search.
- Examining whether applying different relevance or quality criteria markedly changes the SRs that EPCs ultimately include in their CERs or the results derived from these SRs.
- In a situation involving several existing SRs with sufficient relevance and quality, investigating whether the conduct of a meta-review or selecting the best SR approach is the better strategy.
- Documenting savings or increases in time or resources (if any) that come from using an existing SR approach in place of a de novo process.
- Documenting the additional time or resources used in searching for and evaluating existing SRs when they are ultimately not used to replace a de novo process.
- Determining whether it is more efficient to search for an SR as part of the overall search strategy for a topic, or as a first step before searching for primary literature.
- Determining specific criteria to assess the quality of individual patient data meta-analyses.
- Determining if SRs evaluating diagnostic tests or harms require a different emphasis on certain quality criteria or if additional criteria might be warranted.
- Developing and validating criteria for categorizing quality of reviews into good/fair/poor metrics.
References
1. Moher D, Tetzlaff J, Tricco AC, et al. Epidemiology and reporting characteristics of systematic reviews. PLoS Med 2007;4:e78.
2. Whitlock EP, Lin JS, Shekelle P, et al. Using existing systematic reviews in complex systematic reviews. Ann Intern Med 2008;148:776-82.
3. Shojania KG, Sampson M, Ansari MT, et al. How quickly do systematic reviews go out of date? A survival analysis. Ann Intern Med 2007;147:273-4.
4. Rothwell PM. External validity of randomized controlled trials: “to whom do the results of this trial apply?” Lancet 2005;365:13-4.
5. West S, King V, Carey TS, et al. Systems to rate the strength of scientific evidence. Evidence Report/Technology Assessment No. 47. Research Triangle Institute- University of North Carolina Evidence-based Practice Center. AHRQ Publication No. 02-E016. 2002. Rockville, MD. Agency for Healthcare Research and Quality.
6. Shea BJ, Grimshaw JM, Wells GA, et al. Development of AMSTAR: a measurement tool to assess the methodological quality of systematic reviews. BMC Med Res Methodol 2007;7:10.
7. Moher D, Cook DJ, Eastwood S, et al. Improving the quality of reports of meta-analyses of randomized controlled trials: the QUOROM statement. Quality of Reporting of Meta-analyses. Lancet 1999;354:1896-900.
8. Stroup DF, Berlin JA, Morton SC, et al. Meta-analysis of observational studies in epidemiology: a proposal for reporting. Meta-analysis Of Observational Studies in Epidemiology (MOOSE) group. JAMA 2000;283:2008-12.
9. Shea BJ, Dube C, Moher D. Assessing the quality of reports of systematic reviews: the QUOROM statement compared to other tools. In: Egger M, Smith GD, Altman DG, eds. Systematic Reviews in Health Care: Meta-Analysis in Context. 2nd Edition. London: BMJ Publishing Group; 2001.
10. Oxman AD, Guyatt GH. Validation of an index of the quality of review articles. J Clin Epidemiol 1991;44:1271-8.
11. Jadad AR, McQuay HJ. Meta-analyses to evaluate analgesic interventions: a systematic qualitative review of their methodology. J Clin Epidemiol 1996;49:235-43.
12. Assendelft WJ, Koes BW, Knipschild PG, et al. The relationship between methodological quality and conclusions in reviews of spinal manipulation. JAMA 1995;274:1942-8.
13. Shea B, Dube C, Moher D. Assessing the quality of reports of systematic reviews: the QUOROM statement compared to other tools. In: Egger M, Smith GD, Altman DG, eds. Systematic reviews in health care: meta-analysis in context. London, UK: BMJ Publishing Group, 2001:122-39.
14. Oxman AD, Schunemann HJ, Fretheim A. Improving the use of research evidence in guideline development: 8. Synthesis and presentation of evidence. Health Research Policy and Systems 2006;4:20.
15. COMPUS Procedure. Evidence-based best practice recommendations. Available at: http://www.cadth.ca/media/compus/pdf/COMPUS_%20procedure_e.pdf. Accessed October 29, 2008.
16. Salerno SM, Browning R, Jackson JL. The effect of antidepressant treatment on chronic back pain. Arch Intern Med 2002;162:19-24.
17. Ward NG. Tricyclic antidepressants for chronic low-back pain. Spine 1986;11:661-5.
18. Gotzsche PC, Hrobjartsson A, Marie K, et al. Data extraction errors in meta-analyses that use standardized mean differences. JAMA 2007;298:430-7.
19. Jones AP, Remmington T, Williamson PR, et al. High prevalence but low impact of data extraction and reporting errors were found in Cochrane systematic reviews. J Clin Epidemiol 2005;58:741-2.
20. Egger M, Juni P, Bartlett C, et al. How important are comprehensive literature searches and the assessment of trial quality in systematic reviews? Empirical study. Health Technol Assess 2003;7:1-76.
21. Moher D, Pham B, Klassen TP, et al. What contributions do languages other than English make on the results of meta-analyses? J Clin Epidemiol 2000;53:964-72.
22. Hartling L, McAlister FA, Rowe BH, et al. Challenges in systematic reviews of therapeutic devices and procedures. Ann Intern Med 2005 Jun;142:1100-11.
23. Shekelle PG, Morton SC, Suttorp MJ, et al. Challenges in systematic reviews of complementary and alternative medicine topics. Ann Intern Med 2005 Jun;142:1042-7.
24. Drug Effectiveness Review Project. Quality assessment methods for drug class reviews for the Drug Effectiveness Review Project. Available at: http://www.ohsu.edu/ohsuedu/research/policycenter/DERP/about/upload/Qual... mentDERP-2.pdf. Accessed October 23, 2008.
25. Whitlock EP, Lopez SA, Chang S, et al. Identifying, selecting, and refining topics for research reviews: AHRQ and the Effective Health Care Program. JCE, Submitted.
26. Ruddy R, House A. Meta-review of high-quality systematic reviews of interventions in key areas of liaison psychiatry. Br J Psych 2005;187:109-20.
27. Moe RH, Haavardsholm EA, Christie A, et al. Effectiveness of nonpharmacological and nonsurgical interventions for hip osteoarthritis: an umbrella review of high-quality systematic reviews. Phys Ther 2007;87:1716-27.
28. Chou R, Huffman LH. American Pain Society. American College of Physicians. Nonpharmacologic therapies for acute and chronic low back pain: a review of the evidence for an American Pain Society/American College of Physicians clinical practice guideline. Ann Intern Med 2007;147:492-504.
29. Lorenz KA, Lynn J, Dy SM, et al. Evidence for improving palliative care at the end of life: a systematic review. Ann Intern Med 2008;148:147-59.
Figure 1. Systematic process for identifying, assessing, and using existing systematic reviews
Note: Adapted from Whitlock et al., 2008.2
a Denotes that a de novo process is preferred if several relevant, high-quality SRs come to discordant findings; in that case, the existing SRs should be used solely for hand-searching and background context.
PICOTS-SD = population, intervention, comparator, outcomes, timing, setting, and study design; SR = systematic review.
Figure 2. Illustrative existing systematic review (SR) diagram
Table 1. AMSTAR quality criteria with considerations for Comparative Effectiveness Reviews
|Number||Criterion||Considerations for Comparative Effectiveness Reviews|
|1||Was an a priori design provided?||---|
|2||Was there duplicate study selection and data extraction?||
|3||Was a comprehensive literature search performed?||
|4||Was the status of publication (e.g., grey literature) used as an inclusion criterion?||
|5||Was a list of studies (included and excluded) provided?||---|
|6||Were the characteristics of the included studies provided?||---|
|7||Was the scientific quality of the included studies rated and documented?||
|8||Was the scientific quality of the included studies used appropriately in formulating conclusions?||
|9||Were the methods used to combine the findings of studies appropriate?||---|
|10||Was the likelihood of publication bias assessed?||
|11||Was the conflict of interest stated?||
AMSTAR= Assessment of Multiple Systematic Reviews; EPC=Evidence-based Practice Center.
Table 2. Table template for included SRs
|Systematic review||Included studies (n)||Study types (n)||Total participants (n)||EPC assessment of the quality of primary literature||Overlapping studies (n)a||Comments|
|Reading 2005||7||RCTs, 5; OS, 2||RCTs, 1,175; OS, 2,756||Moderate||Referent||Inclusion criteria not restricted to RCTs.|
|Preakness 2005||6||RCTs, 6; OS, 0||RCTs, 1,464; OS, 0||High||5 of 7||One additional RCT included in this SR vs. Reading 2005. RCT included after contacting author for additional information.|
|Hung 2004||4||RCTs, 4; OS, 0||RCTs, 893; OS, 0||Moderate||4 of 7||All of the RCTs in this SR were included in Reading 2005 and Preakness 2005.|
EPC=Evidence-based Practice Center; OS=observational study; RCT=randomized controlled trial; SR=systematic review.