Quality Assessment Methods

Roger Chou; Marian S McDonagh; Erika Nakamoto; Jessica Griffin

Appendix F. Quality Assessment Methods

Individual studies were rated as “good,” “fair,” or “poor” as defined below*:

Studies rated “good” have the least risk of bias and results are considered valid. Good-quality studies include clear descriptions of the population, setting, interventions, and comparison groups; a valid method for allocation of patients to treatment; low dropout rates and clear reporting of dropouts; appropriate means for preventing bias; and appropriate measurement of outcomes and reporting of results.

Studies rated “fair” are susceptible to some bias, but it is not sufficient to invalidate the results. These studies do not meet all the criteria for a rating of good quality because they have some deficiencies, but no flaw is likely to cause major bias. The study may be missing information, making it difficult to assess limitations and potential problems. The “fair” quality category is broad, and studies with this rating vary in their strengths and weaknesses: the results of some fair-quality studies are likely to be valid, while others are only probably valid.

Studies rated “poor” have significant flaws that imply biases of various types that may invalidate the results. They have a serious or “fatal” flaw in design, analysis, or reporting; large amounts of missing information; or discrepancies in reporting. The results of these studies are at least as likely to reflect flaws in the study design as the true difference between the compared drugs.

For Controlled Trials

Each criterion was given an assessment of yes, no, or unclear.

Was the assignment to the treatment groups really random?
- Adequate approaches to sequence generation:
  - Computer-generated random numbers
  - Random numbers tables
- Inferior approaches to sequence generation:
  - Use of alternation, case record numbers, birth dates or week days
- Randomization reported, but method not stated
- Not clear or not reported
- Not randomized
Was the treatment allocation concealed?
- Adequate approaches to concealment of randomization:
  - Centralized or pharmacy-controlled randomization (randomization performed without knowledge of patient characteristics)
  - Serially numbered identical containers
  - On-site computer-based system with a randomization sequence that is not readable until allocation
  - Sealed opaque envelopes
- Inferior approaches to concealment of randomization:
  - Use of alternation, case record numbers, birth dates or week days
  - Open random numbers lists
  - Serially numbered non-opaque envelopes
- Not clear or not reported
Were the groups similar at baseline in terms of prognostic factors?
Were the eligibility criteria specified?
Were outcome assessors and/or data analysts blinded to the treatment allocation?
Was the care provider blinded?
Was the patient kept unaware of the treatment received?
Did the article include an intention-to-treat analysis, or provide the data needed to calculate it (i.e., number assigned to each group, number of subjects who finished in each group, and their results)?
Did the study maintain comparable groups?
Did the article report attrition, crossovers, adherence, and contamination?
Is there important differential loss to followup or overall high loss to followup?
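The intention-to-treat and loss-to-followup criteria above rest on two counts per group: the number assigned and the number who finished. The following is a minimal sketch of that arithmetic, not part of the review's method. The function name `loss_to_followup` and the example counts are hypothetical, and the source sets no numeric threshold for "important" or "high" loss, so none is applied here.

```python
def loss_to_followup(assigned, finished):
    """Return (overall_loss, differential_loss, per_group_loss).

    assigned, finished: dicts mapping a group label to a count.
    All values returned are proportions between 0 and 1.
    Hypothetical helper for illustration only.
    """
    if set(assigned) != set(finished):
        raise ValueError("assigned and finished must cover the same groups")

    # Proportion lost in each group: (assigned - finished) / assigned
    per_group = {}
    for group, n_assigned in assigned.items():
        if n_assigned <= 0 or not 0 <= finished[group] <= n_assigned:
            raise ValueError(f"invalid counts for group {group!r}")
        per_group[group] = (n_assigned - finished[group]) / n_assigned

    # Overall loss pools all groups together
    total_assigned = sum(assigned.values())
    total_finished = sum(finished.values())
    overall = (total_assigned - total_finished) / total_assigned

    # Differential loss: gap between the highest and lowest group loss
    differential = max(per_group.values()) - min(per_group.values())
    return overall, differential, per_group


# Made-up counts, not taken from any study in the review
overall, differential, per_group = loss_to_followup(
    assigned={"drug": 100, "placebo": 100},
    finished={"drug": 90, "placebo": 70},
)
print(f"overall loss: {overall:.0%}, differential loss: {differential:.0%}")
```

Whether a given overall or differential loss is large enough to threaten validity remains a judgment for the reviewer.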

For Cohort Studies

Each criterion was given an assessment of yes, no, or unclear.

Did the study attempt to enroll all (or a random sample of) patients meeting inclusion criteria (inception cohort)?
Were the groups comparable at baseline on key prognostic factors (e.g., by restriction or matching)?
Did the study use accurate methods for ascertaining exposures, potential confounders, and outcomes?
Were outcome assessors and/or data analysts blinded to treatment?
Did the article report attrition?
Did the study perform appropriate statistical analyses on potential confounders?
Is there important differential loss to followup or overall high loss to followup?
Were outcomes prespecified and defined, and ascertained using accurate methods?

For Case-Control Studies

Each criterion was given an assessment of yes, no, or unclear.

Did the study attempt to enroll all (or a random sample of) cases using predefined criteria?
Were the controls derived from the same population as the cases, and would they have been selected as cases if the outcome was present?
Were the groups comparable at baseline on key prognostic factors (e.g., by restriction or matching)?
Did the study report the proportion of cases and controls who met inclusion criteria that were analyzed?
Did the study use accurate methods for identifying outcomes?
Did the study use accurate methods for ascertaining exposures and potential confounders?
Did the study perform appropriate statistical analyses on potential confounders?

For Systematic Reviews

Each criterion was given an assessment of yes, no, unclear, or not applicable.

Was an “a priori” design provided?
The research question and inclusion criteria should be established before the conduct of the review.
Was there duplicate study selection and data extraction?
There should be at least two independent data extractors and a consensus procedure for disagreements should be in place.
Was a comprehensive literature search performed?
At least two electronic sources should be searched. The report must include years and databases used (e.g., Central, Embase, and MEDLINE). Key words and/or MeSH terms must be stated and, where feasible, the search strategy should be provided. All searches should be supplemented by consulting current contents, reviews, textbooks, specialized registers, or experts in the particular field of study, and by reviewing the references in the studies found.
Was the status of publication (i.e., gray literature) used as an inclusion criterion?
The authors should state that they searched for reports regardless of their publication type. The authors should state whether or not they excluded any reports (from the systematic review), based on their publication status, language, etc.
Was a list of studies (included and excluded) provided?
A list of included and excluded studies should be provided.
Were the characteristics of the included studies provided?
In an aggregated form such as a table, data from the original studies should be provided on the participants, interventions, and outcomes. The ranges of characteristics in the studies analyzed, e.g., age, race, sex, relevant socioeconomic data, disease status, duration, severity, or other diseases, should be reported.
Was the scientific quality of the included studies assessed and documented?
‘A priori’ methods of assessment should be provided (e.g., for effectiveness studies if the author(s) chose to include only randomized, double-blind, placebo controlled studies, or allocation concealment as inclusion criteria); for other types of studies alternative items will be relevant.
Was the scientific quality of the included studies used appropriately in formulating conclusions?
The results of the methodological rigor and scientific quality should be considered in the analysis and the conclusions of the review, and explicitly stated in formulating the recommendations.
Were the methods used to combine the findings of studies appropriate?
Reviews should not combine or pool dissimilar studies. If studies are pooled using a fixed effects model, there should be a clear rationale for doing so. A test should be done to assess for statistical heterogeneity (i.e., Chi-squared test for homogeneity, I²).
Was the likelihood of publication bias assessed?
An assessment of publication bias should include a combination of graphical aids (e.g., funnel plot, other available tests) and/or statistical tests (e.g., Egger regression test). If assessment of publication bias is not possible, the review should provide justification (e.g., small numbers of studies, too much heterogeneity, poor quality, etc.).
Was the conflict of interest stated?
Potential sources of support should be clearly acknowledged in both the systematic review and the included studies.
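The criterion on combining findings names a Chi-squared test for homogeneity and I². As a minimal sketch of how those two quantities relate, assuming inverse-variance weights and the usual definitions of Cochran's Q and I² = max(0, (Q - df) / Q), the calculation can be written as below. The function name `cochran_q_and_i2` and the example numbers are hypothetical and are not taken from the review. Q is compared against a Chi-squared distribution with df = k - 1 degrees of freedom; the p-value lookup is omitted to keep the example free of external libraries.

```python
def cochran_q_and_i2(effects, variances):
    """Return (Q, df, I2_percent) for k study effect estimates.

    effects:   per-study effect estimates (e.g., log odds ratios)
    variances: per-study sampling variances of those estimates
    Hypothetical helper for illustration only.
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")

    # Inverse-variance weights and the fixed-effect pooled estimate
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)

    # Cochran's Q: weighted squared deviations from the pooled estimate
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1

    # I2: share of total variation attributed to heterogeneity, floored at 0
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, df, i2


# Made-up effect estimates and variances for three studies
q, df, i2 = cochran_q_and_i2([0.10, 0.35, 0.60], [0.04, 0.05, 0.03])
print(f"Q = {q:.2f} on {df} df, I2 = {i2:.1f}%")
```

A large Q relative to df, or a high I², suggests the studies may be too dissimilar for a fixed effects model, which is the concern the criterion raises.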

Footnotes

*: Harris RP, Helfand M, Woolf SH, et al. Current methods of the US Preventive Services Task Force: a review of the process. Am J Prev Med 2001;20:21–35.

Publication Details

Publisher

Agency for Healthcare Research and Quality (US), Rockville (MD)

NLM Citation

Chou R, McDonagh MS, Nakamoto E, et al. Analgesics for Osteoarthritis: An Update of the 2006 Comparative Effectiveness Review [Internet]. Rockville (MD): Agency for Healthcare Research and Quality (US); 2011 Oct. (Comparative Effectiveness Reviews, No. 38.) Appendix F, Quality Assessment Methods.