Appendix G. Clinical and Self-Reported Scales and Instruments Commonly Used in Studies of Drug Therapy for Rheumatoid Arthritis and Psoriatic Arthritis


Introduction

This appendix provides a brief overview of the scales and self-reported measures that investigators used to assess outcomes in the studies included in this systematic review. The main outcome categories involve radiologic assessments of joint damage (erosion or narrowing) and instruments that patients or subjects used to report on functional capacity or quality of life; the latter fall into two groups, one of general health measures and one of condition- or disease-specific instruments. Radiographic measures are described first, followed by the general health measures used in rheumatoid and psoriatic arthritis studies; the disease-specific measures for rheumatoid arthritis and for psoriatic arthritis are then described separately. The 2010 American College of Rheumatology (ACR) criteria are presented at the end of the document.

Radiographic Measures

Radiographic assessment of joint damage in the hands (including wrists), or in both hands and feet, is critical to clinical trials in rheumatoid arthritis. The damage can involve both joint space narrowing and erosions, and the underlying construct is sometimes referred to as radiographic progression (i.e., change, whether positive or negative, as detected by radiography and its interpretation). Several approaches exist, but the two most commonly used are the Sharp Score (and its variants) and the Larsen Score. These and other scoring methods have been reviewed by Boini and Guillemin;1 additional citations or sources are given in the brief descriptions below.

Sharp Score and Sharp/van der Heijde Score

The Sharp Score is a means of evaluating joint damage in joints of the hands, including both erosion and joint space narrowing.2 Although it has undergone modifications since its introduction, the version proposed in 1985 has become the standard approach. In this method, 17 joint areas in each hand are scored for erosions; 18 joint areas in each hand are scored for joint space narrowing. The score per single joint for erosions ranges from 0 to 5 and for joint space narrowing from 0 to 4. In both cases, a higher score is worse. Erosion scores range from 0 to 170 and joint space narrowing scores range from 0 to 144. Thus, the “total Sharp Score” is the sum of the erosion and joint space narrowing scores, or 0 to 314.

The Sharp/van der Heijde (SHS) method, introduced in 1989, overcame one drawback of the Sharp Score, namely its focus on hands only, given that feet can also be involved early in rheumatoid arthritis. The SHS method was therefore developed to take account of erosions and joint space narrowing in both hands and feet.3,4 As with the Sharp Score, higher scores reflect worse damage. Erosion is assessed in 16 joints in each hand and 6 joints in each foot. Each hand joint is scored from 0 to 5 and each foot joint from 0 to 10 (both sides of the foot joint are scored), giving a maximal erosion score of 160 in the hands and 120 in the feet. Joint space narrowing and subluxation are assessed in 15 joints in each hand and 6 joints in each foot. Each joint is scored from 0 to 4, giving a maximal score of 120 in the hands and 48 in the feet. The erosion and joint space narrowing scores are combined to give a total SHS score with a maximum of 448 (weighted toward hands because more joints are scored).
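The maxima quoted for the two methods can be checked with simple arithmetic. The sketch below is only an illustration: the helper name `max_score` is hypothetical, and the 0 to 10 range per foot joint for SHS erosions is an assumption inferred from the stated foot maximum of 120 across 12 joints.

```python
def max_score(joints_per_side: int, max_per_joint: int, sides: int = 2) -> int:
    """Maximum attainable score when every joint on both sides scores the maximum."""
    return joints_per_side * sides * max_per_joint

# Sharp Score (hands only)
sharp_erosion = max_score(17, 5)          # 17 areas per hand, 0-5 each
sharp_narrowing = max_score(18, 4)        # 18 areas per hand, 0-4 each
sharp_total = sharp_erosion + sharp_narrowing

# Sharp/van der Heijde (hands and feet)
shs_erosion_hands = max_score(16, 5)      # 16 joints per hand, 0-5 each
shs_erosion_feet = max_score(6, 10)       # assumption: 0-10 per foot joint
shs_narrowing_hands = max_score(15, 4)    # 15 joints per hand, 0-4 each
shs_narrowing_feet = max_score(6, 4)      # 6 joints per foot, 0-4 each
shs_total = (shs_erosion_hands + shs_erosion_feet
             + shs_narrowing_hands + shs_narrowing_feet)

print(sharp_erosion, sharp_narrowing, sharp_total)
print(shs_erosion_hands, shs_erosion_feet,
      shs_narrowing_hands, shs_narrowing_feet, shs_total)
```

The totals reproduce the ranges given in the text: 0 to 314 for the total Sharp Score and 0 to 448 for the total SHS score.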

Numerous variants of the Sharp or SHS scores have been developed, differing subtly in the numbers of joints measured and other details.5 Generally, all the Sharp methods are very detailed assessments; although reliable and sensitive to change, they are considered time-consuming and tedious. Larsen and colleagues developed a simpler, faster alternative.

Larsen Scale for Grading Radiographs

The Larsen Scale is an overall measure of joint damage, originally devised in the 1970s and updated most recently in the late 1990s.6–10 It produces both a score for each joint (hands and feet) and an overall score that reflects measurement and extent of joint damage. Scores range from 0 (“normal conditions,” i.e., intact bony outlines and normal joint space) to 5 (“mutilating abnormality,” i.e., original bony outlines have been destroyed), so higher scores reflect greater damage. Scores can range from 0 to 250.

General Health Measures

Health Assessment Questionnaire

The Health Assessment Questionnaire (HAQ) is a widely used self-report measure of functional capacity; it is a dominant instrument in studies of patients with arthritis (particularly trials of drugs in patients with rheumatoid arthritis), but it is considered a generic (not disease-specific) instrument. Detailed information on its variations, scoring, etc., can be found at www.chcr.brown.edu/pcoc/EHAQDESCRSCORINGHAQ372.PDF (accessed for this purpose 1/18/2007) or www.hqlo.com/content/1/1/20 (accessed for this purpose 1/18/2007) and in the seminal reports by Fries et al.11 and Ramey et al.12

The full HAQ covers five dimensions: four domains (disability, discomfort and pain, toxicity, and dollar costs) plus death (obtained through other sources). More commonly, “the HAQ” as used in the literature refers to the shorter version encompassing the HAQ Disability Index (HAQ-DI), the HAQ pain measure, and a global patient outcome measure. The HAQ-DI is sometimes used alone.

The HAQ-DI, with the past week as the time frame, focuses on whether the respondent “is able to…” do the activity and covers eight categories in 20 items: dressing and grooming, arising, eating, walking, hygiene, reach, grip, and common daily activities. The four responses for the HAQ-DI questions are graded as follows: without any difficulty = 0; with some difficulty = 1; with much difficulty = 2; and unable to do = 3. The highest score for any component question in a category determines the category score. The HAQ-DI also asks about the use of aids and devices to help with various usual activities. Two composite scores can be calculated, one with and one without the aids/devices element; both range from 0 to 3.
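The category rule above can be sketched in a few lines. This is a minimal illustration, not the official scoring program: the function name `haq_di` and the example responses are hypothetical, and two details are assumptions drawn from the commonly described HAQ-DI convention rather than from this appendix, namely that the composite is the mean of the eight category scores and that use of an aid or device raises a category score below 2 to 2.

```python
HAQ_CATEGORIES = (
    "dressing and grooming", "arising", "eating", "walking",
    "hygiene", "reach", "grip", "common daily activities",
)

def haq_di(item_scores: dict, aids_used: frozenset = frozenset()) -> float:
    """Composite disability index in the range 0 to 3.

    item_scores maps each category to its item responses (0, 1, 2, or 3).
    aids_used lists categories for which an aid or device is reported.
    """
    category_scores = []
    for category in HAQ_CATEGORIES:
        items = item_scores[category]
        if any(score not in (0, 1, 2, 3) for score in items):
            raise ValueError(f"invalid response in {category!r}")
        score = max(items)            # highest item determines the category score
        if category in aids_used:     # assumption: aid use raises the score to at least 2
            score = max(score, 2)
        category_scores.append(score)
    # assumption: the composite is the mean of the category scores
    return sum(category_scores) / len(category_scores)

# Hypothetical responses for the 20 items
responses = {
    "dressing and grooming": [0, 1],
    "arising": [1, 1],
    "eating": [0, 0, 0],
    "walking": [2, 1],
    "hygiene": [0, 0, 1],
    "reach": [1, 3],
    "grip": [0, 0, 0],
    "common daily activities": [1, 1, 2],
}

score_without_aids = haq_di(responses)
score_with_aids = haq_di(responses, aids_used=frozenset({"eating"}))
print(score_without_aids, score_with_aids)
```

Here the category maxima are 1, 1, 0, 2, 1, 3, 0, and 2, so the composite without aids is 10/8 = 1.25; counting an aid for eating raises that category to 2 and the composite to 1.5.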

The HAQ pain domain is measured on a doubly-anchored horizontal visual analog scale (VAS) of 15 cm in length; one end is labeled “no pain” (score of 0) and the other is labeled “very severe pain” (score of 100). Patients mark a spot on the VAS, and scores are calculated as the length from “no pain” in centimeters (cm) multiplied by 0.2 to yield a value that can range between 0 and 3.

With respect to interpretation, HAQ-DI scores of 0 to 1 are generally considered to represent mild to moderate disability, 1 to 2 moderate to severe disability, and 2 to 3 severe to very severe disability.

The HAQ global health status scale measures quality of life (essentially, how the patient is feeling) with a 15 cm doubly-anchored horizontal VAS scored from 0 (very well) to 100 (very poor).

Medical Outcomes Study Short Form 36 Health Survey

The Medical Outcomes Study Short Form 36 Health Survey (SF-36) is an internationally known generic health survey instrument. Information can be found at www.sf-36.org/tools/sf36.shtml (accessed for this purpose 2/18/2007) and in a large number of articles documenting its psychometric properties.13–19 It comprises 36 items in eight independent domains tapping functioning and well-being: physical functioning, role-physical, bodily pain, and general health in one grouping (physical health) and vitality, role-emotional, social functioning, and mental health in another grouping (mental health). The SF-36 provides a separate scale score for each domain (yielding a profile of health) and two summary scores, one for physical health and one for mental health. Each scale is scored from 0 to 100 where higher scores indicate better health and well-being.

A “version 2” of the SF-36 was introduced in the late 1990s to correct some drawbacks in formatting, wording, and other issues and to update the norm-based scoring with 1998 data. It can be fielded in two versions varying by recall period: 4-week recall (the usual approach) and 1-week recall (acute). More recently, it has been tested and used for computer adaptive testing according to item response theory principles.

EuroQol EQ-5D Quality of Life Questionnaire

A third generic quality-of-life instrument is the EuroQol EQ-5D Quality of Life Questionnaire, typically known just as the EQ-5D. More information can be found at http://www.euroqol.org/ (accessed for this purpose 1/18/2007) and in key descriptive articles,20 one of which is about patients with rheumatoid arthritis.21

The EQ-5D covers health status in five domains (one question each): mobility, self-care, usual activities, pain or discomfort, and anxiety or depression. It is intended for self-response but can be used in other administration modes. Each item can take one of three response levels – no problems, some or moderate problems, extreme problems – identified as level 1, 2, or 3, respectively. This yields a profile of one level for each of the five domains; this is essentially a five-digit number, and no arithmetic properties attach to these values. Users can convert health states in the five-dimensional descriptive system into a weighted health state index by applying scores from EQ-5D “value sets” elicited from general population samples to the profile pattern (e.g., 1, 2, 3, 3, 1).
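The profile can be thought of as a label rather than a number. The sketch below, with the hypothetical helper `eq5d_profile`, builds the five-digit label from the five levels; it does not compute an index, because that step needs a published value set that this appendix does not reproduce.

```python
EQ5D_DIMENSIONS = (
    "mobility", "self-care", "usual activities",
    "pain or discomfort", "anxiety or depression",
)

def eq5d_profile(levels: dict) -> str:
    """Return the five-digit health state label, one digit per domain."""
    digits = []
    for dimension in EQ5D_DIMENSIONS:
        level = levels[dimension]
        if level not in (1, 2, 3):
            raise ValueError(f"{dimension!r} must be level 1, 2, or 3")
        digits.append(str(level))
    # The result is a label, not a quantity: "12331" is not "greater" than "11111"
    return "".join(digits)

# The example pattern from the text: 1, 2, 3, 3, 1
example_profile = eq5d_profile({
    "mobility": 1,
    "self-care": 2,
    "usual activities": 3,
    "pain or discomfort": 3,
    "anxiety or depression": 1,
})
possible_states = 3 ** len(EQ5D_DIMENSIONS)
print(example_profile, possible_states)
```

With three levels in each of five domains, the descriptive system distinguishes 243 possible profiles.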

The EQ-5D also has a global health VAS (20 cm) scored from 0 to 100.

Rheumatoid Arthritis Measures

American College of Rheumatology 20/50/70

The American College of Rheumatology (ACR) criteria are concerned with improvement in counts of tender and swollen joints and in several domains of health.22 A principal aim of these criteria is use in studies (particularly trials) of drugs for rheumatoid arthritis. More information can be found at www.rheumatology.org/publications/response/205070.asp and www.hopkins-arthritis.som.jhmi.edu/edu/acr/acr.html#remis_rheum (both accessed for this purpose 1/18/2007). Originally, these health domains involved patient assessment, physician assessment, erythrocyte sedimentation rate, pain scale, and functional questionnaire.

Today, based on work done in the mid-1990s,23 response for clinical trial patients is defined as improvement in both tender and swollen joint counts and in three of the following five measures: the patient’s assessment of pain; the patient’s global assessment of disease activity; the patient’s assessment of physical function (sometimes referred to as physical disability); the physician’s global assessment of disease activity; and an acute phase reactant (C-reactive protein, or CRP). The 20, 50, or 70 designations (sometimes called the ACR Success Criteria) refer to improvements of 20 percent, 50 percent, or 70 percent in the relevant dimensions. A physician’s global assessment of 70 percent improvement is considered remission.

Thus, patients are said to meet ACR 20 criteria when they have at least 20 percent reductions in tender and swollen joint counts and in at least three of the domains. ACR 50 and ACR 70 criteria are defined in a manner similar to that for ACR 20, but with improvement of at least 50 percent and 70 percent in the individual measures, respectively. Table G-1 illustrates, in a study context, how a patient might be said to have an ACR 50 response.
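The rule just described can be expressed as a short check. The sketch below is illustrative only: the function and key names are hypothetical, the patient values are invented for the example, and every measure is assumed to be scored so that lower values are better.

```python
CORE_MEASURES = (
    "patient_pain", "patient_global", "physical_function",
    "physician_global", "acute_phase_reactant",
)
JOINT_COUNTS = ("tender_joints", "swollen_joints")

def percent_improvement(baseline: float, followup: float) -> float:
    """Percentage reduction from baseline (assumes lower values are better)."""
    if baseline <= 0:
        return 0.0
    return 100.0 * (baseline - followup) / baseline

def meets_acr(baseline: dict, followup: dict, threshold: float) -> bool:
    """True if both joint counts and at least three core measures improve by threshold percent."""
    joints_ok = all(
        percent_improvement(baseline[k], followup[k]) >= threshold
        for k in JOINT_COUNTS
    )
    improved_core = sum(
        percent_improvement(baseline[k], followup[k]) >= threshold
        for k in CORE_MEASURES
    )
    return joints_ok and improved_core >= 3

def acr_level(baseline: dict, followup: dict) -> int:
    """Highest ACR level met (70, 50, or 20), or 0 if none is met."""
    for threshold in (70, 50, 20):
        if meets_acr(baseline, followup, threshold):
            return threshold
    return 0

# Invented values for a hypothetical patient
baseline = {"tender_joints": 20, "swollen_joints": 16, "patient_pain": 70,
            "patient_global": 60, "physical_function": 1.8,
            "physician_global": 65, "acute_phase_reactant": 3.0}
followup = {"tender_joints": 8, "swollen_joints": 7, "patient_pain": 30,
            "patient_global": 35, "physical_function": 0.8,
            "physician_global": 30, "acute_phase_reactant": 2.0}

level = acr_level(baseline, followup)
print(level)
```

In this invented case both joint counts improve by more than 50 percent, and three of the five other measures (pain, physical function, and physician global assessment) do as well, so the patient meets ACR 50 but not ACR 70.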

Table G-1. Example of a patient with an ACR 50 response to treatment.


Ritchie Articular Index

This is a long-standing method for graded assessment of the tenderness of 26 joint regions, based on summation of joint responses after firm digital pressure is applied.24 Four grades are used: 0, patient reported no tenderness; +1, patient complained of pain; +2, patient complained of pain and winced; and +3, patient complained of pain, winced, and withdrew. Thus, the index ranges from 0 to 3 for individual regions and 0 to 78 overall, with higher scores indicating worse tenderness.

Certain joints are treated as a single unit, such as the metacarpal-phalangeal and proximal interphalangeal joints of each hand and the metatarsal-phalangeal joints of each foot. For example, the maximum score for the five metacarpal-phalangeal joints of the right hand would be 3, not 15. No weights are used for different types of joints (e.g., by size), because the issue is one of measuring changes (improvements) in tenderness; this is especially relevant for rheumatoid arthritis.
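The summation and the single-unit rule can be sketched as follows. The function names are hypothetical, and taking the grade of the most tender joint as the grade for a grouped unit is an assumption made for this illustration; the text states only that the unit contributes at most 3.

```python
def unit_grade(joint_grades: list) -> int:
    """Grade for joints treated as one unit (assumption: the most tender joint counts)."""
    return max(joint_grades)

def ritchie_index(region_grades: list) -> int:
    """Sum of the grades (0 to 3) over the 26 joint regions."""
    if len(region_grades) != 26:
        raise ValueError("expected grades for 26 joint regions")
    if any(grade not in (0, 1, 2, 3) for grade in region_grades):
        raise ValueError("each grade must be 0, 1, 2, or 3")
    return sum(region_grades)

# Five metacarpal-phalangeal joints of one hand, all maximally tender,
# still contribute 3 (not 15) because they form a single unit.
mcp_right = unit_grade([3, 3, 3, 3, 3])

worst_case = ritchie_index([3] * 26)
print(mcp_right, worst_case)
```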

Disease Activity Score

The Disease Activity Score (DAS) is an index of disease activity first developed in the mid-1980s. The history of its development and current definitions, scoring systems, and other details can be found at http://www.das-score.nl/ (accessed for this purpose 1/19/2007) and in recent articles.4,25 The DAS originally included the Ritchie Articular Index (see above), the 44 swollen joint count, the erythrocyte sedimentation rate, and a general health assessment on a VAS. A DAS cut-off level of 1.6 is considered equivalent to being in remission.

More recently, an index of RA disease activity using only 28 joints – the DAS 28 – has been developed, focusing on joint counts for both tenderness (TJC) and swelling (SJC). It also uses either the patient’s or a physician’s global assessment (PGA) of disease activity (on a 100 mm VAS) and the erythrocyte sedimentation rate (ESR) or C-reactive protein. The formula for calculating a DAS 28 score is as follows: $\mathrm{DAS28} = 0.56\sqrt{\mathrm{TJC}} + 0.28\sqrt{\mathrm{SJC}} + 0.70\,\ln(\mathrm{ESR}) + 0.014 \times \mathrm{PGA}$, with PGA in mm. Numerous formulas to calculate a variety of DAS and DAS 28 scores exist (see the website above), such as when a global patient assessment of health is unavailable.

The DAS 28 yields a score on a scale ranging from 0 to 10. A DAS 28 below 2.6 is considered to correspond to remission; a DAS 28 of 3.2 or lower indicates low disease activity; and a DAS 28 of more than 5.1 is considered high disease activity.
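As a worked illustration of the ESR-based formula and the cut-offs above, the following sketch computes a DAS 28 for an invented patient. The function names are hypothetical; reading the joint-count terms as square roots follows the usual form of the formula, and the "moderate" label for scores between 3.2 and 5.1, as well as the handling of values exactly at a cut-off, are assumptions.

```python
import math

def das28_esr(tjc: int, sjc: int, esr: float, pga_mm: float) -> float:
    """DAS 28 from tender/swollen 28-joint counts, ESR (mm/h), and global assessment (0-100 mm VAS)."""
    if not (0 <= tjc <= 28 and 0 <= sjc <= 28):
        raise ValueError("joint counts must be between 0 and 28")
    if esr <= 0:
        raise ValueError("ESR must be positive to take its logarithm")
    if not 0 <= pga_mm <= 100:
        raise ValueError("global assessment must be between 0 and 100 mm")
    return (0.56 * math.sqrt(tjc) + 0.28 * math.sqrt(sjc)
            + 0.70 * math.log(esr) + 0.014 * pga_mm)

def das28_category(score: float) -> str:
    """Map a DAS 28 score to an activity category (boundary handling is an assumption)."""
    if score < 2.6:
        return "remission"
    if score <= 3.2:
        return "low"
    if score <= 5.1:
        return "moderate"   # label assumed for the range between the stated cut-offs
    return "high"

# Invented patient: 4 tender joints, 4 swollen joints, ESR 20 mm/h, PGA 50 mm
score = das28_esr(tjc=4, sjc=4, esr=20, pga_mm=50)
category = das28_category(score)
print(round(score, 2), category)
```

For this patient the four terms are 1.12, 0.56, about 2.10, and 0.70, giving a DAS 28 of about 4.48.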

EULAR Response Criteria

The European League Against Rheumatism (EULAR) response criteria classify patients as good, moderate, or nonresponders based on both change in disease activity and current disease activity, using either the DAS or the DAS28 (see description above).26 For example, to be classified as a good responder a patient must have relevant change in DAS (≥1.2) and low current disease activity (≤2.4), while a nonresponder must have ≤0.6 change in DAS and high disease activity (>3.7).27

The EULAR criteria have been validated in multiple clinical trials and confirmed in an analysis of nine clinical trials, which found a high level of agreement and equal validity between ACR and EULAR improvement classifications.28 Good and moderate responders showed significantly more improvement in functional capacity and significantly less progression of joint damage than patients classified as nonresponders.28

Psoriatic Arthritis Measures

Psoriatic Arthritis Response Criteria

The psoriatic arthritis response criteria (PsARC) were initially designed for use in a clinical trial that compared sulphasalazine to placebo in the setting of the Veterans Administration.29 They have since been used as the primary or secondary outcome in all the studies that examined biologics versus placebo in the treatment of psoriatic arthritis (PsA). A PsARC response requires improvement in at least two of the following measures, one of which must be a joint count, and no worsening of any measure: tender joint count or swollen joint count (improvement of at least 30 percent), patient global assessment (improvement by one point on a five-point Likert scale), or physician global assessment (improvement on the same scale).29
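The logic of the PsARC can be sketched as below. All names and patient values are hypothetical. The appendix defines improvement but not worsening, so the worsening thresholds used here (a joint count increase of at least 30 percent, or a global score worse by at least one point) are assumptions, as is the convention that lower Likert scores are better.

```python
from dataclasses import dataclass

@dataclass
class Visit:
    tender_joints: int
    swollen_joints: int
    patient_global: int      # five-point Likert; lower assumed better
    physician_global: int    # five-point Likert; lower assumed better

def joint_status(baseline: int, followup: int) -> str:
    """Classify a joint count change using a 30 percent threshold."""
    if baseline == 0:
        return "worsened" if followup > 0 else "unchanged"
    change = (baseline - followup) / baseline
    if change >= 0.30:
        return "improved"
    if change <= -0.30:      # assumption: symmetric worsening threshold
        return "worsened"
    return "unchanged"

def global_status(baseline: int, followup: int) -> str:
    """Classify a global assessment change using a one-point threshold."""
    if followup <= baseline - 1:
        return "improved"
    if followup >= baseline + 1:   # assumption: one point worse counts as worsening
        return "worsened"
    return "unchanged"

def psarc_response(baseline: Visit, followup: Visit) -> bool:
    joints = [
        joint_status(baseline.tender_joints, followup.tender_joints),
        joint_status(baseline.swollen_joints, followup.swollen_joints),
    ]
    globals_ = [
        global_status(baseline.patient_global, followup.patient_global),
        global_status(baseline.physician_global, followup.physician_global),
    ]
    statuses = joints + globals_
    return (statuses.count("improved") >= 2   # at least two measures improve
            and "improved" in joints          # one of them is a joint count
            and "worsened" not in statuses)   # nothing worsens

start = Visit(tender_joints=20, swollen_joints=12, patient_global=4, physician_global=4)
responder = psarc_response(start, Visit(12, 11, 3, 4))
worsened_global = psarc_response(start, Visit(12, 11, 3, 5))
no_joint_improvement = psarc_response(start, Visit(19, 12, 3, 3))
print(responder, worsened_global, no_joint_improvement)
```

Only the first follow-up visit counts as a response: the second fails because the physician global assessment worsens, and the third because neither joint count improves by 30 percent.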

American College of Rheumatology 20

The ACR 20 (American College of Rheumatology 20 percent response) is the other measure used as the primary outcome in clinical trials of biologics. The measurement is similar to that of the ACR 20 used for rheumatoid arthritis, with modifications that increased the number of joints tested from 68 tender and 66 swollen to 78 and 76, respectively, through the addition of distal interphalangeal joints of the feet and carpometacarpal joints of the hands.29 Response rates on the ACR 20 are generally lower than those on the PsARC because of differences in the items measured; this is due in part to the ACR 20 requiring an improvement in both tender and swollen joint counts, whereas the PsARC requires an improvement in tender or swollen joint counts. An adaptation of the ACR 20 criteria as of 2010 is presented in Table G-2.

Table G-2. 2010 rheumatoid arthritis criteria.


The Psoriasis Area and Severity Index

The Psoriasis Area and Severity Index (PASI) was developed to measure the effect of treatments in clinical trials of psoriasis and is used to capture the psoriasis component of psoriatic arthritis. The scale was originally published in 1978 in a trial of 27 patients with severe chronic generalized psoriasis who were treated with Ro 10-9359, a retinoic acid derivative.30 The PASI is a composite index of disease severity incorporating measures of scaling, erythema, and induration, and it is weighted by severity and affected body surface area. A PASI >12 defines severe, PASI 7–12 moderate, and PASI <7 mild psoriasis.
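To show how such a weighted composite works, the sketch below uses the commonly published PASI weighting, which this appendix does not spell out and which is therefore an assumption here: four body regions weighted 0.1 (head), 0.2 (upper limbs), 0.3 (trunk), and 0.4 (lower limbs), three signs each graded 0 to 4, and an area score of 0 to 6 per region. The function names and patient values are hypothetical.

```python
# Assumption: standard regional weights, not taken from this appendix
REGION_WEIGHTS = {"head": 0.1, "upper limbs": 0.2, "trunk": 0.3, "lower limbs": 0.4}

def pasi(regions: dict) -> float:
    """Composite score from (erythema, induration, scaling, area) per region."""
    total = 0.0
    for region, weight in REGION_WEIGHTS.items():
        erythema, induration, scaling, area = regions[region]
        if any(not 0 <= sign <= 4 for sign in (erythema, induration, scaling)):
            raise ValueError(f"severity grades for {region!r} must be 0 to 4")
        if not 0 <= area <= 6:
            raise ValueError(f"area score for {region!r} must be 0 to 6")
        total += weight * (erythema + induration + scaling) * area
    return total

def pasi_severity(score: float) -> str:
    """Severity bands as given in the text: >12 severe, 7-12 moderate, <7 mild."""
    if score > 12:
        return "severe"
    if score >= 7:
        return "moderate"
    return "mild"

# Invented patient
patient = {
    "head": (1, 1, 1, 1),
    "upper limbs": (2, 1, 2, 2),
    "trunk": (2, 2, 2, 3),
    "lower limbs": (3, 2, 2, 2),
}
patient_score = pasi(patient)
patient_severity = pasi_severity(patient_score)
print(round(patient_score, 1), patient_severity)
```

The regional contributions are 0.3, 2.0, 5.4, and 5.6, for a total of 13.3, which falls in the severe band. Under these assumed ranges the maximum possible score is 72.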

References

1.
Boini S, Guillemin F. Radiographic scoring methods as outcome measures in rheumatoid arthritis: properties and advantages. Ann Rheum Dis. 2001 Sep;60(9):817–27. [PMC free article: PMC1753828] [PubMed: 11502606]
2.
Sharp JT, Young DY, Bluhm GB, Brook A, Brower AC, Corbett M, et al. How many joints in the hands and wrists should be included in a score of radiologic abnormalities used to assess rheumatoid arthritis? Arthritis Rheum. 1985 Dec;28(12):1326–35. [PubMed: 4084327]
3.
van der Heijde D, Dankert T, Nieman F, Rau R, Boers M. Reliability and sensitivity to change of a simplification of the Sharp/van der Heijde radiological assessment in rheumatoid arthritis. Rheumatology (Oxford). 1999 Oct;38(10):941–7. [PubMed: 10534543]
4.
van der Heijde D. How to read radiographs according to the Sharp/van der Heijde method. J Rheumatol. 1999 Mar;26(3):743–5. [PubMed: 10090194]
5.
Ory PA. Interpreting radiographic data in rheumatoid arthritis. Ann Rheum Dis. 2003 Jul;62(7):597–604. [PMC free article: PMC1754596] [PubMed: 12810418]
6.
Larsen A. Radiological grading of rheumatoid arthritis. An interobserver study. Scand J Rheumatol. 1973;2(3):136–8. [PubMed: 4769066]
7.
Larsen A, Dale K, Eek M. Radiographic evaluation of rheumatoid arthritis and related conditions by standard reference films. Acta Radiol Diagn (Stockh). 1977 Jul;18(4):481–91. [PubMed: 920239]
8.
Scott DL, Coulton BL, Bacon PA, Popert AJ. Methods of X-ray assessment in rheumatoid arthritis: a re-evaluation. Br J Rheumatol. 1985 Feb;24(1):31–9. [PubMed: 3978364]
9.
Larsen A. How to apply Larsen score in evaluating radiographs of rheumatoid arthritis in long-term studies. J Rheumatol. 1995;22:1974–5. [PubMed: 8992003]
10.
Edmonds J, Saudan A, Lassere M, Scott DL. Introduction to reading radiographs by the Scott modification of the Larsen method. J Rheumatol. 1999;26:740–2. [PubMed: 10090193]
11.
Fries JF, Spitz P, Kraines RG, Holman HR. Measurement of patient outcome in arthritis. Arthritis Rheum. 1980 Feb;23(2):137–45. [PubMed: 7362664]
12.
Ramey DR, Fries JF, Singh G. The Health Assessment Questionnaire 1995 -- Status and Review. In: Spilker B, editor. Quality of Life and Pharmacoeconomics in Clinical Trials. 2nd ed. Philadelphia: Lippincott-Raven Publishers; 1996. pp. 227–37.
13.
Stewart AL, Hays RD, Ware JE Jr. The MOS short-form general health survey. Reliability and validity in a patient population. Med Care. 1988 Jul;26(7):724–35. [PubMed: 3393032]
14.
Stewart AL, Ware JE. Measuring Functioning and Well-Being: The Medical Outcomes Study Approach. Durham, NC: Duke University Press; 1992.
15.
Ware JE Jr, Sherbourne CD. The MOS 36-item short-form health survey (SF-36). I. Conceptual framework and item selection. Med Care. 1992 Jun;30(6):473–83. [PubMed: 1593914]
16.
McHorney CA, Ware JE Jr, Rogers W, Raczek AE, Lu JF. The validity and relative precision of MOS short- and long-form health status scales and Dartmouth COOP charts. Results from the Medical Outcomes Study. Med Care. 1992 May;30(5 Suppl):MS253–65. [PubMed: 1583937]
17.
McHorney CA, Ware JE Jr, Raczek AE. The MOS 36-Item Short-Form Health Survey (SF-36): II. Psychometric and clinical tests of validity in measuring physical and mental health constructs. Med Care. 1993 Mar;31(3):247–63. [PubMed: 8450681]
18.
McHorney CA, Ware JE Jr, Lu JF, Sherbourne CD. The MOS 36-item Short-Form Health Survey (SF-36): III. Tests of data quality, scaling assumptions, and reliability across diverse patient groups. Med Care. 1994 Jan;32(1):40–66. [PubMed: 8277801]
19.
Ware JE Jr, Kosinski M, Bayliss MS, McHorney CA, Rogers WH, Raczek A. Comparison of methods for the scoring and statistical analysis of SF-36 health profile and summary measures: summary of results from the Medical Outcomes Study. Med Care. 1995 Apr;33(4 Suppl):AS264–79. [PubMed: 7723455]
20.
Kind P. The EuroQol instrument: an index of health-related quality of life. In: Spilker B, editor. Quality of Life and Pharmacoeconomics in Clinical Trials. 2nd ed. Philadelphia: Lippincott-Raven Publishers; 1996.
21.
Hurst NP, Kind P, Ruta D, Hunter M, Stubbings A. Measuring health-related quality of life in rheumatoid arthritis: validity, responsiveness and reliability of EuroQol (EQ-5D). Br J Rheumatol. 1997 May;36(5):551–9. [PubMed: 9189057]
22.
Felson DT, Anderson JJ, Boers M, Bombardier C, Chernoff M, Fried B, et al. The American College of Rheumatology preliminary core set of disease activity measures for rheumatoid arthritis clinical trials. The Committee on Outcome Measures in Rheumatoid Arthritis Clinical Trials. Arthritis Rheum. 1993 Jun;36(6):729–40. [PubMed: 8507213]
23.
Felson DT, Anderson JJ, Boers M, Bombardier C, Furst D, Goldsmith C, et al. ACR preliminary definition of improvement in rheumatoid arthritis. Arthritis Rheum. 1995;38(6):727–35. [PubMed: 7779114]
24.
Ritchie DM, Boyle JA, McInnes JM, Jasani MK, Dalakos TG, Grieveson P, et al. Clinical studies with an articular index for the assessment of joint tenderness in patients with rheumatoid arthritis. Q J Med. 1968 Jul;37(147):393–406. [PubMed: 4877784]
25.
Aletaha D, Nell VP, Stamm T, Uffmann M, Pflugbeil S, Machold K, et al. Acute phase reactants add little to composite disease activity indices for rheumatoid arthritis: validation of a clinical activity score. Arthritis Res Ther. 2005;7(4):R796–806. [PMC free article: PMC1175030] [PubMed: 15987481]
26.
van Gestel AM, Prevoo ML, van ’t Hof MA, van Rijswijk MH, van de Putte LB, van Riel PL. Development and validation of the European League Against Rheumatism response criteria for rheumatoid arthritis. Comparison with the preliminary American College of Rheumatology and the World Health Organization/International League Against Rheumatism Criteria. Arthritis Rheum. 1996 Jan;39(1):34–40. [PubMed: 8546736]
27.
Fransen J, van Riel PL. The Disease Activity Score and the EULAR response criteria. Rheum Dis Clin North Am. 2009;35(4):745–57, vii–viii. [PubMed: 19962619]
28.
van Gestel AM, Anderson JJ, van Riel PL, Boers M, Haagsma CJ, Rich B, et al. ACR and EULAR improvement criteria have comparable validity in rheumatoid arthritis trials. American College of Rheumatology European League of Associations for Rheumatology. J Rheumatol. 1999 Mar;26(3):705–11. [PubMed: 10090187]
29.
Mease PJ, Goffe BS, Metz J, VanderStoep A, Finck B, Burge DJ. Etanercept in the treatment of psoriatic arthritis and psoriasis: a randomised trial. Lancet. 2000 Jul 29;356(9227):385–90. [PubMed: 10972371]
30.
Fredriksson T, Pettersson U. Severe psoriasis--oral therapy with a new retinoid. Dermatologica. 1978;157(4):238–44. [PubMed: 357213]