Results

Andrea C. Skelly; Roger Chou; Joseph R. Dettori; Judith A. Turner; Janna L. Friedly; Sean D. Rundell; Rongwei Fu; Erika D. Brodt; Ngoc Wasson; Shelby Kantner; Aaron J.R. Ferguson

NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Skelly AC, Chou R, Dettori JR, et al. Noninvasive Nonpharmacological Treatment for Chronic Pain: A Systematic Review Update [Internet]. Rockville (MD): Agency for Healthcare Research and Quality (US); 2020 Apr. (Comparative Effectiveness Review, No. 227.)

Introduction

Results are organized by Key Question (i.e., by condition) and intervention and then organized by comparators for each subquestion. We categorized postintervention followup as short term (1 to <6 months), intermediate term (≥6 to <12 months) and long term (≥12 months). We prioritized function and pain outcomes based on validated measures. For some conditions (e.g., osteoarthritis [OA]), results are organized by affected region.
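The followup categories defined above can be expressed as a small classifier. This is a minimal sketch: the function name and the label for followup shorter than 1 month are assumptions made for illustration and are not part of the report.

```python
def followup_category(months: float) -> str:
    """Classify postintervention followup using the cutoffs stated in the text.

    short term: 1 to <6 months; intermediate term: >=6 to <12 months;
    long term: >=12 months.
    """
    if months < 1:
        # Label is an assumption; the text only defines categories from 1 month on.
        return "below minimum followup"
    if months < 6:
        return "short term"
    if months < 12:
        return "intermediate term"
    return "long term"


for m in (1, 5.9, 6, 11, 12, 34):
    print(m, followup_category(m))
```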

We synthesized data qualitatively and quantitatively, using meta-analysis where appropriate. Two continuous primary outcomes (pain, function) provided adequate data for meta-analysis. For meta-analyses providing pooled estimates, we report results from heterogeneity testing. I-squared and corresponding p-values describe the degree and statistical significance of heterogeneity across studies; pooled (subtotal) estimates are statistically significant if the confidence interval does not include the value of 0 for mean differences (MDs) or the value of 1 for risk ratios (RR). (See the Methods section of this report and the protocol for additional details on data analysis and synthesis.) In general, if effect estimates tended to favor one treatment but failed to reach statistical significance with confidence interval crossing the null value of zero or one (perhaps due to sample size), the results are interpreted as showing no clear difference between treatments. If effect estimates are close to zero and not statistically significant, results are interpreted as no difference between groups.
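The interpretation rules in the preceding paragraph can be sketched as follows. The report gives no numeric cutoff for "close to zero", so the `near_null` tolerance is a hypothetical parameter introduced here for illustration and would need to be chosen per outcome scale.

```python
def interpret(estimate, ci_low, ci_high, null_value=0.0, near_null=0.1):
    """Apply the report's interpretation rules to one effect estimate.

    null_value: 0 for mean differences (MD/SMD), 1 for risk ratios (RR).
    near_null: hypothetical tolerance for "close to the null"; the report
               states no numeric threshold, and a sensible value depends on
               the outcome scale.
    """
    # Statistically significant if the CI does not include the null value.
    if ci_low > null_value or ci_high < null_value:
        return "statistically significant"
    # Not significant and close to the null: no difference between groups.
    if abs(estimate - null_value) <= near_null:
        return "no difference"
    # Not significant but leaning toward one treatment: no clear difference.
    return "no clear difference"


print(interpret(-0.31, -0.50, -0.13))                  # SMD with CI excluding 0
print(interpret(0.01, -0.15, 0.21))                    # SMD near 0
print(interpret(1.09, 0.88, 1.35, null_value=1.0))     # RR with CI including 1
```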

A list of acronyms and abbreviations appears at the end of the report.

Results of Literature Searches

The search and selection of articles are summarized in the literature flow diagram (Figure 2). The original database searches resulted in 4,996 potentially relevant articles; an additional 3,520 were identified for this update. After dual review of abstracts and titles, 1,574 articles across searches (381 new to this update) were selected for full-text dual review, and 252 publications (34 added for this update) were determined to meet inclusion criteria and were included in this review. Nearly one-fourth of the trials excluded at full text did not meet our criteria for followup duration (i.e., a minimum of 1 month of followup after termination of the intervention, or postintervention if the intervention duration was at least 6 months). Other common reasons for exclusion of primary trials included ineligible population and ineligible intervention or comparator (i.e., combinations of treatments or treatments that were additive in nature). Data abstraction and quality assessment tables for all included studies are available in Appendixes D and E.

Description of Included Studies

A total of 233 trials (in 252 publications) were included. For each intervention category, the comparisons evaluated and their respective studies are listed in Table 4. The number of studies and related publications included for each condition (and the number of new studies and publications in this update review) are:

Chronic low back pain: 77 studies in 83 publications (9 new trials)
Chronic neck pain: 27 studies in 28 publications (2 new trials, 1 new publication)
Osteoarthritis: 62 studies in 66 publications (9 new trials in 10 publications)
Fibromyalgia: 58 studies in 66 publications (11 new trials in 12 publications)
Chronic tension headache: 9 studies

Thirty-six percent of the included trials were small (<70 participants). Across trials, most patients were female (>57%), with mean ages ranging from 31 to 78 years; patients with OA tended to be older in general than those in the other conditions (range, 52 to 76 years). Mean pain durations for patients with chronic low back pain, chronic neck pain, and OA were similar and varied widely, from 6 months to 15 years. Mean symptom duration in trials of fibromyalgia and chronic tension headache tended to be at least 4 years (up to 22 years). Exercise interventions were the most commonly studied for OA and fibromyalgia. Psychological therapies were most commonly studied for fibromyalgia, and manual therapies were most commonly studied for chronic low back pain. We identified trials of acupuncture for all included conditions. Multidisciplinary rehabilitation was studied primarily for chronic low back pain and fibromyalgia. Most trials of multidisciplinary rehabilitation used a functional restoration approach either explicitly or implicitly. Limited evidence was available for hip or hand OA or chronic tension headache. The majority of trials compared nonpharmacological interventions with usual care, waitlist, no treatment, attention control, or placebo/sham, with very few trials employing pharmacological treatments or exercise as comparators. Little long-term evidence was available across conditions and interventions.

The majority of trials (61%) were rated fair quality with only 6 percent considered good quality (Figure 3). For chronic tension headache, no study was considered good quality. In the majority of trials (72%), attrition was under 20 percent and therefore rated as acceptable. Across trials where attrition was not acceptable, the range was 20 to 63 percent. A primary methodological limitation in many trials was the inability to effectively blind participants and in many cases providers. Poor reporting of randomization and allocation concealment methods was a common shortcoming. Acceptable adherence, defined as completion of a minimum of 80 percent of planned treatment, was reported in 44 percent of trials. Adherence was either unclear (40%) or unacceptable (16%) in the majority of trials.
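The two thresholds named above (attrition under 20 percent, adherence of at least 80 percent of planned treatment) can be written as simple checks. This is an illustrative sketch only; the function names are hypothetical, and how adherence was measured within each trial is not specified here.

```python
def attrition_acceptable(attrition_pct: float) -> bool:
    """Attrition under 20 percent is rated acceptable."""
    return attrition_pct < 20


def adherence_acceptable(completed_pct: float) -> bool:
    """Completion of at least 80 percent of planned treatment is acceptable."""
    return completed_pct >= 80


# The reported adherence ratings account for all trials: 44% + 40% + 16%.
adherence_ratings = {"acceptable": 44, "unclear": 40, "unacceptable": 16}
print(sum(adherence_ratings.values()))
print(attrition_acceptable(63), adherence_acceptable(80))
```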

Key Question 1. Chronic Low Back Pain

For chronic low back pain, 68 randomized controlled trials (RCTs) (in 74 publications) were included in the prior Agency for Healthcare Research and Quality (AHRQ) report (N=13,163). Two studies were rated good quality, 49 studies fair quality, and 17 studies poor quality. The prior AHRQ report found massage, yoga, psychological therapies, exercise, acupuncture, low-level laser therapy, spinal manipulation, and multidisciplinary rehabilitation associated with greater effects than usual care, attention control, sham, or placebo on improved pain or function. The strength of evidence was low or moderate, generally stronger for pain than for function, and observed at short- or intermediate-term followup, with the exception of psychological therapies, which were associated with small effects at long-term followup.

For this update, we identified nine new RCTs (N=1,026). Three of the new studies were rated good quality, four were rated fair quality, and two were rated poor quality. The new trials evaluated exercise (5 trials), massage (2 trials), yoga (2 trials), and interferential therapy (1 trial); one trial evaluated both exercise and yoga interventions. The Key Points summarize the main findings based on the evidence included in the prior report and new trials; the Key Points note where new trials contributed to findings.

Exercise for Chronic Low Back Pain

Key Points

Exercise was associated with a small improvement in short-term function compared with usual care, an attention control, or a placebo intervention (10 trials [4 new], pooled standardized mean difference [SMD] −0.31, 95% confidence interval [CI] −0.50 to −0.13, I²=32%) after excluding an outlier trial; there were no effects on intermediate-term function (5 trials [2 new], pooled SMD −0.17, 95% CI −0.39 to 0.02, I²=0%) or long-term function (1 trial, difference 0.00 on the 0 to 100 Oswestry Disability Index [ODI], 95% CI −11.4 to 11.4) (strength of evidence [SOE]: moderate for short term, low for intermediate and long term).
Exercise was associated with moderate effects on pain versus usual care, an attention control, or a placebo intervention at short-term (11 trials [5 new], pooled difference −1.21 on a 0 to 10 scale, 95% CI −1.77 to −0.65, I²=64%) and long-term (1 trial, difference −1.55, 95% CI −2.76 to −0.34), and a small effect at intermediate-term (5 trials [2 new], pooled MD −0.85, 95% CI −1.67 to −0.07, I²=50%) followup (SOE: low for all timepoints).
No trial evaluated exercise versus pharmacological therapy.
Comparisons involving exercise versus other nonpharmacological therapies are addressed in the sections for the other therapies.
Harms were not reported in most trials; one trial did not find an association between exercise and increased pain versus placebo and one trial reported no adverse events (SOE: low).
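The pooled estimates and I² values quoted in these Key Points come from meta-analysis. As a rough illustration of how a pooled estimate, its 95% CI, and I² relate to trial-level data, the sketch below uses a fixed-effect inverse-variance model, which is a simplification chosen for brevity and may differ from the model specified in the Methods section of this report. The input effects and standard errors are hypothetical, not data from the included trials.

```python
import math


def pool_inverse_variance(effects, std_errors):
    """Fixed-effect inverse-variance pooling with Cochran's Q and I-squared.

    effects: per-trial effect estimates (e.g., SMDs).
    std_errors: per-trial standard errors of those estimates.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total_w = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total_w
    se_pooled = math.sqrt(1.0 / total_w)
    ci = (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I-squared: share of variability beyond chance, floored at 0.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {"pooled": pooled, "ci": ci, "q": q, "i2": i2}


# Hypothetical trials with consistent effects (low heterogeneity).
consistent = pool_inverse_variance([-0.2, -0.4, -0.3], [0.1, 0.1, 0.1])
# Hypothetical trials with discordant effects (high heterogeneity).
discordant = pool_inverse_variance([-0.1, -0.9], [0.1, 0.1])
print(consistent)
print(discordant)
```

Adding or removing a trial whose effect lies far from the others changes Q, and therefore I², much more than it changes the pooled estimate, which is why sensitivity analyses that exclude an outlier can lower I² substantially.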

Detailed Synthesis

Eleven trials of exercise therapy for low back pain met inclusion criteria (Table 5 and Appendix D).³¹^–⁴⁰^,²¹² Six trials³¹^–³⁶ were included in the prior AHRQ report and five³⁷^–⁴⁰^,²¹² were added for this update. Three trials (1 new) evaluated neuromuscular re-education exercise (motor control exercises),³¹^,³²^,³⁸ four trials (2 new) muscle performance exercises (Pilates or modified Pilates),³⁵^,³⁶^,⁴⁰^,²¹² three trials (1 new) combined exercise techniques,³³^,³⁴^,³⁹ and one trial evaluated strength training.³⁷ Sample sizes ranged from 42 to 295 (total sample=1,204). Five trials compared exercise versus an attention control,³²^,³³^,³⁵^,³⁷^,³⁸ four trials compared exercise versus usual care,³⁴^,³⁶^,⁴⁰^,²¹² and two trials compared exercise versus a placebo intervention (detuned diathermy and ultrasound).³¹^,³⁹ Five trials (1 new)³¹^–³⁴^,³⁷ were conducted in the United States, Europe, or Australia, four trials (2 new)³⁵^,³⁶^,³⁹^,²¹² in Brazil, one new trial³⁸ in Asia, and one new trial⁴⁰ in Iran. The duration of exercise therapy ranged from 6 to 12 weeks and the number of exercise sessions ranged from 6 to 24. Three trials reported outcomes through long-term followup,³²^,³⁹^,²¹² four trials reported outcomes through intermediate-term followup³¹^,³³^,³⁹^,²¹² and the remainder only evaluated short-term outcomes.

Two trials (both new)³⁹^,²¹² were rated good quality, seven trials (2 new)³¹^–³³^,³⁵^–³⁸ were rated fair quality, and two trials (1 new)³⁴^,⁴⁰ were rated poor quality (Appendix E). In two fair-quality trials,³¹^,³⁶ the main methodological limitation was the inability to blind interventions. Limitations in the other trials included unclear randomization and allocation concealment methods, high loss to followup, and baseline differences between intervention groups.

Exercise Compared With Usual Care, an Attention Control, or a Placebo Intervention

Exercise was associated with small effects on short-term function versus controls (11 trials, pooled SMD −0.51, 95% CI −0.98 to −0.08, I²=88%) (Figure 4).³¹^–⁴⁰^,²¹² Excluding one trial³⁸ that reported a much higher SMD (−3.1) and smaller standard deviation (~1.0) compared to the other trials (SMD range −0.81 to 0.17 and standard deviation range 5 to 17) also resulted in a pooled estimate that favored exercise, though the difference was attenuated (10 trials, pooled SMD −0.31, 95% CI −0.50 to −0.13, I²=32%). Seven trials that evaluated function using the Roland-Morris Disability Questionnaire (RDQ) (0 to 24 scale) reported a pooled difference of −2.86 points (95% CI −3.36 to −1.05),³¹^,³⁴^–³⁶^,³⁸^,³⁹^,²¹² and two trials that used the ODI (0 to 100 scale) reported differences that ranged from 3.7 points favoring exercise⁴⁰ to 2.9 points favoring an attention control.³² There were no clear differences in estimates when analyses were stratified according to the type of exercise (pooled SMD estimates ranged from −0.08 to −0.54) or the type of control, or when poor-quality trials were excluded. There were no differences between exercise versus controls in intermediate-term function (5 trials, pooled SMD −0.17, 95% CI −0.39 to 0.02, I²=0%)³¹^–³³^,³⁹^,²¹² or long-term function (1 trial, difference 0.00, 95% CI −11.4 to 11.4 on the ODI).³²
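The outlier described above illustrates how a small standard deviation inflates a standardized mean difference. The sketch below uses the basic definition (mean difference divided by a pooled standard deviation, with no small-sample correction); the raw difference of −3.1 points is a hypothetical value chosen so that a standard deviation of 1.0 reproduces an SMD of −3.1, and it is not taken from the trial.

```python
def smd(mean_difference: float, pooled_sd: float) -> float:
    """Standardized mean difference: raw difference divided by pooled SD."""
    return mean_difference / pooled_sd


raw_difference = -3.1  # hypothetical raw between-group difference, in scale points

# Same raw difference, standardized with SDs spanning those noted in the text.
for sd in (1.0, 5.0, 17.0):
    print(f"SD={sd:>4}: SMD={smd(raw_difference, sd):.2f}")
```

With a standard deviation of about 1.0 the same raw difference yields an SMD several times larger than it would with standard deviations of 5 to 17, which is consistent with the decision to examine the pooled estimate with and without that trial.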

Exercise was associated with moderate effects on short-term pain versus usual care, an attention control, or a placebo intervention (11 trials, pooled difference −1.21 on a 0 to 10 scale, 95% CI −1.77 to −0.65, I²=64%) (Figure 5).³¹^–³⁶^,³⁸^–⁴⁰^,²¹² There were no clear differences in estimates when analyses were stratified according to the type of exercise (pooled differences ranged from −0.59 to −0.98 points on a 0 to 10 scale) or the type of control (usual care, attention control, or placebo intervention), or when poor-quality trials were excluded. Exercise was associated with small effects on intermediate-term pain versus controls (5 trials, pooled difference −0.85, 95% CI −1.67 to −0.07, I²=50%).³¹^–³³^,³⁹^,²¹² For long-term pain, effects of exercise were moderate compared with an attention control, but findings were based on one trial (difference −1.55, 95% CI −2.76 to −0.34).³²

Evidence on effects of exercise on quality of life was limited. One trial³² found no differences between exercise versus an attention control on the Nottingham Health Profile at short-term, intermediate-term, or long-term followup, and one trial³⁶ found exercise associated with higher scores on the Short-Form 36 (SF-36) physical functioning (difference 5.8 points on 0 to 100 scale, p=0.026), bodily pain (difference 8.3 points, p=0.03), and vitality subscales (difference 5.3 points, p=0.029) at short-term followup; there were no differences on other SF-36 subscales (Table 5). Another trial found exercise associated with greater improvement in the SF-36 Physical Component Summary versus an attention control (difference 8.26 on a 0 to 100 scale, 95% CI 5.27 to 11.25) but no difference on the SF-36 Mental Component Summary (difference 1.27, 95% CI −3.38 to 5.92).³⁸

No trial evaluated effects of exercise on use of opioid therapies or healthcare utilization. There was insufficient evidence to determine effects of duration of exercise therapy or number of sessions on outcomes.

Exercise Compared With Pharmacological Therapy

No trial of exercise versus pharmacological therapy met inclusion criteria.

Exercise Compared With Other Nonpharmacological Therapies

Findings for exercise versus other nonpharmacological therapies are addressed in the sections on other nonpharmacological therapies.

Harms

Harms were not reported in most trials. One trial³¹ found no difference between exercise and a placebo intervention (detuned diathermy) in likelihood of increased pain, and another trial³⁵ reported no adverse events (Appendix D).

Psychological Therapies for Chronic Low Back Pain

Key Points

Psychological therapy was associated with small improvements in function compared with usual care or an attention control at short-term (3 trials, pooled SMD −0.24, 95% CI −0.38 to −0.04, I²=0%), intermediate-term (3 trials, pooled SMD −0.24, 95% CI −0.38 to −0.10, I²=0%), and long-term followup (3 trials, pooled SMD −0.28, 95% CI −0.43 to −0.13, I²=0%) (SOE: moderate).
Psychological therapy was associated with small improvements in pain compared with usual care or an attention control at short-term (3 trials, pooled difference −0.75 on a 0 to 10 scale, 95% CI −1.01 to −0.41, I²=0%), intermediate-term (3 trials, pooled difference −0.71, 95% CI −0.97 to −0.46, I²=0%), and long-term followup (3 trials, pooled difference −0.55, 95% CI −0.92 to −0.23, I²=0%) (SOE: moderate).
Evidence from one poor-quality trial was too unreliable to determine effects of psychological therapy versus exercise (SOE: insufficient).
One trial of cognitive behavioral therapy versus an attention control reported no serious adverse events and one withdrawal due to adverse events in 468 patients (SOE: low).

Detailed Synthesis

Five trials (reported in 6 publications) of psychological therapies for low back pain met inclusion criteria (Table 6 and Appendix D).¹⁰⁴^–¹⁰⁸^,¹³³^,¹⁹⁵ All of the trials were included in the prior AHRQ report. Three trials evaluated group cognitive-behavioral therapy (CBT),¹⁰⁴^–¹⁰⁷ one trial evaluated respondent therapy (progressive muscle relaxation),¹⁰⁸ and one trial evaluated operant therapy.¹³³ Sample sizes ranged from 49 to 701 (total sample=1,308). The number of psychological therapy sessions ranged from six to eight, and the duration of therapy ranged from 6 to 8 weeks. In one trial¹⁰⁶^,¹⁰⁷ the duration of therapy was unclear. Three trials compared psychological therapies versus usual care,¹⁰⁴^,¹⁰⁵^,¹⁰⁸ one trial compared psychological therapy versus an attention control (advice),¹⁰⁶^,¹⁰⁷ and one trial compared psychological therapy versus exercise therapy.¹³³ All trials were conducted in the United States or the United Kingdom. Three trials reported outcomes through long-term (12 to 34 months) followup,¹⁰⁵^–¹⁰⁷^,¹³³^,¹⁹⁵ one trial evaluated outcomes through intermediate-term followup,¹⁰⁴ and one trial only evaluated short-term outcomes.¹⁰⁸

Three trials¹⁰⁴^–¹⁰⁷ were rated fair quality and two trials poor quality (Appendix E).¹⁰⁸^,¹³³ The major methodological limitation in the fair-quality trials was the inability to effectively blind patients and caregivers to the psychological intervention. Other methodological shortcomings in the poor-quality trials included unclear randomization and allocation concealment methods and high attrition.

Psychological Therapy Compared With Usual Care or an Attention Control

Psychological therapy was associated with small improvements in function compared with usual care or an attention control at short-term (3 trials, pooled SMD −0.24, 95% CI −0.38 to −0.04, I²=0%),¹⁰⁴^,¹⁰⁶^,¹⁰⁸ intermediate-term (3 trials, pooled SMD −0.24, 95% CI −0.38 to −0.10, I²=0%)¹⁰⁴^–¹⁰⁶ and long-term followup (3 trials, pooled SMD −0.28, 95% CI −0.43 to −0.13, I²=0%) (Figure 6).¹⁰⁵^,¹⁰⁶^,¹⁹⁵ Pooled differences on the RDQ or modified RDQ were −1.2 to −1.5 points at all time points. For short-term function, two fair-quality trials¹⁰⁴^,¹⁰⁶^,¹⁰⁷ evaluated CBT and one poor-quality trial¹⁰⁸ evaluated respondent therapy (progressive relaxation). Excluding the poor-quality trial of progressive relaxation,¹⁰⁸ which found no effect on short-term function (SMD −0.08, 95% CI −0.48 to 0.31), had no effect on the pooled estimate (2 trials, pooled SMD −0.26, 95% CI −0.44 to −0.05).

Psychological therapy was associated with small improvements in pain compared with usual care or an attention control at short-term (3 trials, pooled difference −0.75 on a 0 to 10 scale, 95% CI −1.01 to −0.41, I²=0%),¹⁰⁴^,¹⁰⁶^,¹⁰⁸ intermediate-term (3 trials, pooled difference −0.71, 95% CI −0.97 to −0.46, I²=0%),¹⁰⁴^–¹⁰⁶ and long-term followup (3 trials, pooled difference −0.55, 95% CI −0.92 to −0.23, I²=0%) (Figure 7).¹⁰⁵^,¹⁰⁷^,¹⁹⁵ Excluding a poor-quality trial of progressive relaxation, which found no effect on short-term pain (difference −0.14, 95% CI −1.27 to 0.99), did not change the pooled estimate (2 trials, pooled difference −0.78, 95% CI −1.08 to −0.47). For intermediate-term and long-term pain, all trials were fair quality and evaluated CBT.

Effects of psychological therapy on short-term or intermediate-term SF-36 Physical Component (PCS) or Mental Component (MCS) scores were small (differences 0 to 2 points on a 0 to 100 scale) and not statistically significant, except for short-term MCS (2 trials, pooled difference 2.18, 95% CI 0.37 to 4.05).¹⁰⁴^,¹⁰⁶ One trial found no effect of psychological therapy on work status or healthcare visits¹⁰⁷ and one trial found no effect of psychological therapy on markers of healthcare utilization.¹⁹⁶

Psychological Therapy Compared With Pharmacological Therapy

No trial of psychological versus pharmacological therapy met inclusion criteria.

Psychological Therapy Compared With Exercise

One poor-quality trial found no differences between psychological versus exercise therapy in intermediate-term or long-term function.¹³³ Differences on the McGill Pain Questionnaire were less than 0.5 points on a 0 to 78 scale, and differences on the Sickness Impact Profile were 0.60 to 1.30 points on a 0 to 100 scale.

Harms

Data on harms were sparse. One trial of cognitive-behavioral therapy versus an attention control reported no serious adverse events and one withdrawal due to adverse events among 468 patients randomized to CBT.¹⁰⁶^,¹⁰⁷

Physical Modalities for Chronic Low Back Pain

Key Points

Ultrasound

Two trials found inconsistent effects of ultrasound versus sham ultrasound on short-term function (SOE: insufficient). Two trials found no differences between ultrasound versus sham ultrasound in short-term pain (SOE: low).
One trial found no differences between ultrasound versus sham ultrasound in risk of any adverse events or risk of serious adverse events (SOE: low).

Interferential Therapy

One new trial found interferential therapy associated with effects on short-term function and pain that were below the threshold for small (statistical significance uncertain) when compared with a placebo therapy (SOE: low).

Low-Level Laser Therapy

One trial found low-level laser therapy associated with a small improvement compared with sham laser for short-term function (difference −8.2 on the 0 to 100 ODI, 95% CI −13.6 to −2.8) and a moderate improvement for short-term pain (difference −16.0 on a 0 to 100 scale, 95% CI −28.3 to −3.7) (SOE: low).
One trial found no differences between low-level laser therapy versus exercise therapy in intermediate-term function or pain (SOE: low).
One trial of low-level laser therapy reported no adverse events (SOE: low).

Traction

Two trials found no differences between traction versus sham traction in short-term function or pain (SOE: low).
Harms were not reported in either trial.

Short-Wave Diathermy

Data from a small, poor-quality trial were insufficient to determine effects of short-wave diathermy versus sham (detuned) diathermy (SOE: insufficient).

Detailed Synthesis

Ultrasound

Two trials (n=50 and n=455) of ultrasound versus sham ultrasound for low back pain met inclusion criteria (Table 7 and Appendix D).¹³⁹^,¹⁴⁰ Both of the trials were included in the prior AHRQ report. The duration of ultrasound therapy was 4 and 8 weeks and the number of sessions was 6 and 10. Both trials evaluated outcomes at short-term (1 month) followup. One good-quality trial¹⁴⁰ was conducted in the United States and one fair-quality trial¹³⁹ in Iran (Appendix E). Methodological limitations in the fair-quality trial included failure to blind care providers and unclear blinding of outcome assessors.

Ultrasound Compared With Sham Ultrasound

Limited evidence indicated no clear differences between ultrasound versus sham ultrasound at short-term followup. One good-quality trial (n=455) found no difference between ultrasound versus sham ultrasound in the RDQ (median 3 vs. 3, p=0.93), likelihood for ≥50 percent improvement in pain (RR 1.09, 95% CI 0.88 to 1.35), SF-36 general health (median 72 vs. 74), likelihood of prescription drug use for low back pain (16% vs. 18%, p=0.54), or risk of serious adverse events (1.3% vs. 2.7%, RR 0.48, 95% CI 0.12 to 1.88) or any adverse event (6.0% vs. 5.9%, RR 1.03, 95% CI 0.49 to 2.13).¹⁴⁰ In the smaller (n=50) fair-quality trial, there was no difference between ultrasound versus sham ultrasound in pain (mean 27.7 vs. 25.5 on a 0 to 100 scale, p=0.48), although ultrasound was associated with better function (mean 22.8 vs. 30.5 on the 0 to 40 Functional Rating Index, p=0.004).¹³⁹ No trial evaluated longer-term outcomes.
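The risk ratios above compare the proportion of patients with an event in each group. As a rough check, the ratio of the rounded percentages approximates the reported values; the trial's own estimates were presumably computed from event counts, which are not given here, so small rounding differences are expected. The function name below is hypothetical.

```python
def risk_ratio(risk_treatment: float, risk_control: float) -> float:
    """Ratio of the event risk in the treatment group to that in the control group."""
    return risk_treatment / risk_control


# Serious adverse events: 1.3% with ultrasound vs. 2.7% with sham.
print(round(risk_ratio(1.3, 2.7), 2))

# Prescription drug use for low back pain: 16% vs. 18%.
print(round(risk_ratio(16, 18), 2))
```

A ratio below 1 indicates a lower event risk with ultrasound than with sham, but the reported confidence interval (0.12 to 1.88) includes 1, so the difference is not statistically significant.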

Ultrasound Compared With Pharmacological Therapy or With Exercise

No trial of ultrasound versus pharmacological therapy or versus exercise met inclusion criteria.

Harms

One trial found no differences between ultrasound versus sham ultrasound in risk of any adverse event (RR 1.03, 95% CI 0.49 to 2.13) or serious adverse event (RR 0.48, 95% CI 0.12 to 1.88).¹⁴⁰

Interferential Therapy

One new trial (n=150)¹⁴⁴ of interferential therapy met inclusion criteria (Table 8 and Appendix D). It found small differences between 1 kHz or 4 kHz interferential therapy versus placebo therapy in the RDQ (differences 0.2 or 0.3 points) and pain (differences 0.2 or 0.4 points) at short-term followup; the statistical significance of findings was unclear due to errors in reporting of the confidence intervals (confidence intervals did not incorporate the point estimates). The trial was rated fair-quality due to the data discrepancies.

Interferential Therapy Compared With Pharmacological Therapy or With Exercise

No trial of interferential therapy versus pharmacological therapy or versus exercise met inclusion criteria.

Harms

One trial found no differences between 1 kHz or 4 kHz interferential therapy versus placebo interferential current in withdrawals due to adverse event (4% vs. 4% vs. 4%, RR 1.0, 95% CI 0.14 to 6.8).¹⁴⁴

Low-Level Laser Therapy

Three trials of low-level laser therapy (n=34, 56, and 71) met inclusion criteria (Table 9 and Appendix D).¹⁴¹^,¹⁴²^,¹⁷⁰ All of the trials were included in the prior AHRQ report. One trial¹⁴² evaluated neodymium:yttrium-aluminum-garnet (Nd:YAG) laser and two trials¹⁴¹^,¹⁷⁰ evaluated gallium-arsenide (GaAs) laser. Two trials compared low-level laser therapy versus sham laser therapy¹⁴¹^,¹⁴² and one trial low-level laser therapy versus exercise plus sham laser.¹⁷⁰ One trial was conducted in the United States,¹⁴² one in Iran,¹⁷⁰ and one in Argentina.¹⁴¹ The duration of laser therapy ranged from 2 to 6 weeks and the number of sessions ranged from 10 to 12. One trial¹⁴¹ reported intermediate-term outcomes and the other two trials reported short-term outcomes.

Two trials¹⁴²^,¹⁷⁰ were rated fair quality and one trial¹⁴¹ poor quality (Appendix E). The major methodological limitation in the fair-quality trials was unclear allocation concealment methods.¹⁴²^,¹⁷⁰ The poor-quality trial also did not report randomization methods, did not conduct intention-to-treat analysis at intermediate-term followup, and reported high attrition; it was also unclear if timing of followup was the same in all patients.¹⁴¹

Low-Level Laser Therapy Compared With Sham Laser

One fair-quality trial found Nd:YAG laser therapy associated with moderate improvement in pain (difference −16.0 on a 0 to 100 scale, 95% CI −28.3 to −3.7) and a small improvement in function (difference −8.2 points on the 0 to 100 ODI, 95% CI −13.6 to −2.8) at short-term followup.¹⁴² A poor-quality trial found GaAs laser therapy associated with increased likelihood of having no pain at intermediate-term followup (44.7% vs. 15%, p<0.01), but the analysis was restricted to patients who reported that laser therapy was effective at the end of a 2-week course of treatment.¹⁴¹

Low-Level Laser Therapy Compared With Pharmacological Therapy

No trial of low-level laser therapy compared with pharmacological therapy met inclusion criteria.

Low-Level Laser Therapy Compared With Exercise Therapy

One fair-quality trial found no clear differences between GaAs laser therapy versus exercise plus sham laser in function (difference in change from baseline −4.4 on the 0 to 100 ODI, 95% CI −11.4 to 2.5) or pain (difference in change from baseline −0.9 on a 0 to 10 scale, 95% CI −2.5 to 0.7) at intermediate-term followup.¹⁷⁰ For pain, the difference in change from baseline was similar in magnitude to the baseline difference (mean 7.3 vs. 6.3), and final scores were very similar (4.4 vs. 4.3).
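The pain figures in this paragraph can be reconciled with simple arithmetic. The sketch below assumes that the first value in each pair (7.3 at baseline, 4.4 at followup) belongs to the laser group and the second to the exercise plus sham laser group; that ordering is an assumption based on how the comparison is stated.

```python
# Mean pain scores on a 0 to 10 scale; group assignment of each value is assumed.
baseline = {"laser": 7.3, "exercise_plus_sham": 6.3}
final = {"laser": 4.4, "exercise_plus_sham": 4.3}

# Change from baseline within each group (negative means pain decreased).
change = {group: final[group] - baseline[group] for group in baseline}

diff_at_baseline = baseline["laser"] - baseline["exercise_plus_sham"]
diff_in_change = change["laser"] - change["exercise_plus_sham"]
diff_at_followup = final["laser"] - final["exercise_plus_sham"]

print(round(diff_at_baseline, 1), round(diff_in_change, 1), round(diff_at_followup, 1))
```

Under that assumption, the between-group difference in change from baseline is about as large as the baseline imbalance and opposite in sign, which is why the groups end up with nearly identical final scores.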

Harms

No adverse events were reported in any of the three trials of low-level laser therapy.¹⁴¹^,¹⁴²^,¹⁷⁰

Traction

Two trials of traction (n=151 and 60) met inclusion criteria (Table 10 and Appendix D).¹³⁷^,¹³⁸ Both of the trials were included in the prior AHRQ report. One trial¹³⁷ evaluated continuous traction (12 sessions in 5 weeks) and the other¹³⁸ evaluated intermittent traction (20 sessions in 6 weeks). The comparator in both trials was sham traction (traction at <10% or 20% of body weight, compared with 35% to 50% for active traction). Both trials were conducted in the Netherlands and reported only short-term outcomes. The trials were rated fair quality due to failure to blind care providers (Appendix E).

Traction Compared With Sham Traction

There were no differences between traction versus sham traction at short-term followup in function (25 vs. 23 on the 0 to 100 ODI in one trial and 4.7 vs. 4.0 on the 0 to 24 RDQ, difference 0.7, 95% CI −1.1 to 2.6) or pain (32 vs. 36 on a 0 to 100 scale, p=0.70 and 24 vs. 20, difference 3.7, 95% CI −8.4 to 15.8).¹³⁷^,¹³⁸ One trial¹³⁸ also found no difference between intermittent traction versus sham on the total SF-36 (66 vs. 65 on a 0 to 100 scale) and one trial¹³⁷ found no difference between continuous traction versus sham in global perceived effect, work absence, or medical consumption.

Traction Compared With Pharmacological Therapy or With Exercise

No trial of traction compared with pharmacological therapy or with exercise met inclusion criteria.

Harms

Neither trial reported harms.

Short-Wave Diathermy

Data were insufficient from one poor-quality trial (n=68) to evaluate effects of short-wave diathermy (3 times weekly for 4 weeks) versus sham (detuned) diathermy for low back pain (Table 11 and Appendix D).¹⁴³ The trial was included in the prior AHRQ report. Methodological limitations included unclear randomization and allocation concealment methods, differential attrition, and baseline differences between groups (Appendix E). Although diathermy was associated with worse pain than sham treatment at short-term (8 weeks after completion of therapy) followup (25 vs. 13), statistical significance was not reported. There were no statistically significant differences in likelihood of using analgesics (7% vs. 22%, RR 0.34, 95% CI 0.08 to 1.50) or being unable to work or having limited activities (7% vs. 19%, RR 0.40, 95% CI 0.09 to 1.80), but estimates were imprecise.

Harms

Adverse events were not evaluated in the trial.

Manual Therapies for Chronic Low Back Pain

Key Points

Spinal Manipulation

Spinal manipulation was associated with small improvements compared with sham manipulation, usual care, an attention control, or a placebo intervention in short-term function (3 trials, pooled SMD −0.34, 95% CI −0.75 to −0.02, I²=45%) and intermediate-term function (3 trials, pooled SMD −0.40, 95% CI −0.85 to −0.05, I²=65%) (SOE: low).
There was no difference between spinal manipulation versus sham manipulation, usual care, an attention control, or a placebo intervention in short-term pain (3 trials, pooled difference −0.36 on a 0 to 10 scale, 95% CI −0.62 to 0.25, I²=0%), but manipulation was associated with a small improvement compared with controls on intermediate-term pain (3 trials, pooled difference −0.64, 95% CI −0.93 to −0.35, I²=0%) (SOE: low for short term, moderate for intermediate term).
There were no differences between spinal manipulation versus exercise in short-term function (3 trials, pooled SMD 0.02, 95% CI −0.28 to 0.30; I²=37%) or intermediate-term function (4 trials, pooled SMD 0.01, 95% CI −0.15 to 0.21; I²=19%) (SOE: low).
There were no differences between spinal manipulation versus exercise in short-term pain (3 trials, pooled difference 0.31 on a 0 to 10 scale, 95% CI −0.42 to 1.06; I²=34%) or intermediate-term pain (4 trials, pooled difference 0.23, 95% CI −0.14 to 0.59, I²=0%) (SOE: low).
No serious adverse events or withdrawals due to adverse events were reported in seven trials; nonserious adverse events with manipulation (primarily increased pain) were reported in three trials (SOE: low).

Massage

Massage was associated with small improvements in short-term function compared with sham massage or usual care (6 trials [2 new], SMD −0.38, 95% CI −0.63 to −0.20, I²=0%). There were no differences between massage versus controls in intermediate-term function (3 trials, SMD −0.09, 95% CI −0.26 to 0.12, I²=0%) (SOE: moderate for short term, low for intermediate term).
Massage was associated with a small improvement in short-term pain compared with sham massage or usual care (5 trials [1 new], pooled difference −0.55 on a 0 to 10 scale, 95% CI −0.88 to −0.23, I²=0%). There was no difference between massage versus controls in intermediate-term pain (3 trials, pooled difference −0.02, 95% CI −0.56 to 0.44, I²=0%) (SOE: moderate for short term, low for intermediate term).
One trial found no differences between massage versus exercise in intermediate-term function or pain (SOE: low).
Four trials of massage reported no serious adverse events; in four trials, the proportion of massage patients who reported increased pain ranged from <1 to 26 percent (SOE: low).

Detailed Synthesis

Spinal Manipulation

Eight trials of spinal manipulation for low back pain met inclusion criteria (Table 12 and Appendix D).¹⁴³^,¹⁷¹^–¹⁷⁴^,¹⁹⁰^–¹⁹² All of the trials were included in the prior AHRQ report. Six trials evaluated standard (high-velocity low-amplitude) manipulation techniques; one trial¹⁹² evaluated flexion-distraction manipulation and one trial¹⁷² evaluated both high-velocity low-amplitude and flexion-distraction manipulation. Sample sizes ranged from 75 to 1,001 (total sample=2,580). The number of manipulation therapy sessions ranged from 4 to 24 and the duration of therapy ranged from 4 to 12 weeks. In one trial, patients were randomized to 12 manipulation sessions over 1 month or to 12 sessions over 1 month plus biweekly maintenance sessions for an additional 10 months.¹⁷³ Two trials compared spinal manipulation versus usual care,¹⁷²^,¹⁷⁴ one trial spinal manipulation versus an attention control (minimal massage),¹⁷¹ one trial spinal manipulation versus sham manipulation,¹⁷³ one trial spinal manipulation versus a placebo treatment (sham short-wave diathermy),¹⁴³ and four trials spinal manipulation versus exercise.¹⁷⁴^,¹⁹⁰^–¹⁹² One trial was conducted in Egypt¹⁷³ and the rest in the United States, United Kingdom, or Australia. Six trials reported outcomes through intermediate-term followup¹⁷¹^,¹⁷³^,¹⁷⁴^,¹⁹⁰^–¹⁹² and two trials only evaluated short-term outcomes.¹⁴³^,¹⁷²

Two trials¹⁴³^,¹⁷³ were rated poor quality and the remainder fair quality (Appendix E). The major methodological limitation in the fair-quality trials was use of an unblinded design. Methodological shortcomings in the poor-quality trials included unclear randomization and allocation concealment methods, failure to report intention-to-treat analysis, and high attrition.

Spinal Manipulation Compared With Sham Manipulation, Usual Care, an Attention Control, or a Placebo Intervention

Spinal manipulation was associated with small improvements in function compared with controls at short-term followup (3 trials, SMD −0.34, 95% CI −0.75 to −0.02, I²=45%)¹⁷¹^–¹⁷³ and intermediate-term followup (3 trials, SMD −0.40, 95% CI −0.85 to −0.05, I²=65%)¹⁷¹^,¹⁷³^,¹⁷⁴ (Figure 8). Based on the original 0 to 100 scales (ODI and Von Korff functional disability [VF]) used in two trials, the pooled difference was −5.12 (95% CI −10.53 to 0.77) for short-term function and −9.27 (95% CI −13.42 to −5.12) for intermediate-term function. Estimates were similar when a poor-quality trial¹⁷³ was excluded. For short-term function, one trial reported similar effects for standard manipulation (difference −1.3 on the RDQ, 95% CI −2.9 to 0.6) and flexion-distraction manipulation (difference −1.9, 95% CI −3.6 to −0.2); therefore, results for both arms were combined for the pooled analysis.¹⁷²

There was no clear difference between spinal manipulation versus sham manipulation, an attention control, or a placebo intervention in short-term pain (3 trials, pooled difference −0.36 on a 0 to 10 scale, 95% CI −0.62 to 0.25, I²=0%) (Figure 9).¹⁴³^,¹⁷¹^,¹⁷³ Two of the trials were rated poor quality; the results of the fair-quality trial¹⁷¹ were consistent with the overall estimate (difference −0.21, 95% CI −0.69 to 0.26). Manipulation was associated with a small improvement in intermediate-term pain compared with sham manipulation, usual care, or an attention control (3 trials, pooled difference −0.64 on a 0 to 10 scale, 95% CI −0.93 to −0.35, I²=0%).¹⁷¹^,¹⁷³^,¹⁷⁴ The estimate was similar when a poor-quality trial¹⁷³ was excluded (2 trials, difference −0.60, 95% CI −0.98 to −0.21).¹⁷¹^,¹⁷⁴
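The significance rule used throughout these results (a pooled mean difference is statistically significant when its confidence interval excludes 0, and a risk ratio when its interval excludes 1) can be illustrated with a minimal Python sketch. The function name `excludes_null` is hypothetical and is not part of the report's analysis code; the intervals are the short-term and intermediate-term pain intervals reported above.

```python
def excludes_null(ci_low, ci_high, null_value=0.0):
    """Return True when the interval lies entirely on one side of the null value.

    Use null_value=0.0 for mean differences and SMDs, 1.0 for risk ratios.
    """
    return ci_high < null_value or ci_low > null_value


# Short-term pain, manipulation vs. controls: 95% CI -0.62 to 0.25
short_term_pain = excludes_null(-0.62, 0.25)

# Intermediate-term pain, manipulation vs. controls: 95% CI -0.93 to -0.35
intermediate_term_pain = excludes_null(-0.93, -0.35)

print(short_term_pain, intermediate_term_pain)
```

Under this rule the short-term interval crosses zero and the intermediate-term interval does not, which matches the interpretation given in the text.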

Two trials found no differences between spinal manipulation versus controls on the SF-36 MCS and PCS.¹⁷¹^,¹⁷⁴ One trial¹⁷¹ found no differences in short-term PCS (mean difference 0.94 on a 0 to 100 scale, 95% CI −1.55 to 3.42) or MCS scores (mean difference −0.17 on a 0 to 100 scale, 95% CI −2.70 to 2.36). At intermediate-term followup, pooled differences were also very small and not statistically significant for the PCS (2 trials, mean difference 1.54, 95% CI −0.03 to 3.10, I²=0%) or the MCS (2 trials, mean difference 0.52, 95% CI −1.94 to 2.97, I²=44%).¹⁷¹^,¹⁷⁴

Spinal Manipulation Compared With Pharmacological Therapy

No trial of spinal manipulation versus pharmacological therapy met inclusion criteria.

Spinal Manipulation Compared With Exercise

There were no differences between spinal manipulation versus exercise in function at short-term (3 trials, SMD 0.02, 95% CI −0.28 to 0.30, I²=37%)¹⁹⁰^–¹⁹² or intermediate-term followup (4 trials, SMD 0.01, 95% CI −0.15 to 0.21, I²=19%)¹⁷⁴^,¹⁹⁰^–¹⁹² (Figure 10). Excluding one trial¹⁹² of flexion-distraction manipulation resulted in similar findings.

There were no differences between spinal manipulation versus exercise in short-term pain (3 trials, pooled difference 0.31, 95% CI −0.42 to 1.06, I²=34%)¹⁹⁰^–¹⁹² or intermediate-term pain (4 trials, pooled difference 0.23, 95% CI −0.14 to 0.59, I²=0%) (Figure 11).¹⁷⁴^,¹⁹⁰^–¹⁹² Excluding one trial¹⁹² of flexion-distraction manipulation resulted in similar findings.

Two trials found no differences between spinal manipulation versus exercise on the SF-36 MCS and PCS.¹⁷⁴^,¹⁹⁰ One trial found no differences in short-term PCS (mean difference −1.25 on a 0 to 100 scale, 95% CI −3.32 to 0.83) or MCS scores (mean difference 0.95, 95% CI −0.96 to 2.86).¹⁹⁰ At intermediate-term followup, pooled differences were also very small (<1 point) and not statistically significant for the PCS (2 trials, mean difference −0.89, 95% CI −2.33 to 0.55, I²=0%) or the MCS (2 trials, mean difference 0.64, 95% CI −0.96 to 2.24).¹⁷⁴^,¹⁹⁰

Harms

Seven trials of spinal manipulation reported no serious adverse events or withdrawals due to adverse events.¹⁷¹^–¹⁷⁴^,¹⁹⁰^–¹⁹² Nonserious adverse events (primarily increased pain) were reported in three trials.¹⁷¹^,¹⁷³^,¹⁹⁰

Massage

Eight trials of massage for low back pain met inclusion criteria (Table 13 and Appendix D).¹⁰⁸^,¹⁷⁵^–¹⁸⁰^,¹⁸⁹ Six trials¹⁰⁸^,¹⁷⁵^–¹⁷⁸^,¹⁸⁹ were included in the prior AHRQ report and two new trials¹⁷⁹^,¹⁸⁰ were identified for this update. Massage techniques varied across trials. Two trials evaluated reflexology,¹⁰⁸^,¹⁷⁸ two trials (one new) myofascial release,¹⁷⁵^,¹⁷⁹ one trial relaxation or structural massage,¹⁷⁷ one trial (new) acupressure,¹⁸⁰ and two trials mixed massage techniques that included Swedish massage.¹⁷⁶^,¹⁸⁹ Sample sizes ranged from 15 to 401 (total sample=1,133). Two trials compared massage versus sham massage,¹⁷⁵^,¹⁷⁸ three trials massage versus usual care,¹⁰⁸^,¹⁷⁷^,¹⁸⁹ and one trial compared massage versus an attention control (self-care education).¹⁷⁶ Both new trials compared the intervention with a sham: one compared acupressure with sham acupressure,¹⁸⁰ and one compared myofascial release with sham myofascial release.¹⁷⁹ One trial was conducted in India,¹⁷⁵ one trial in Iran,¹⁸⁰ and the rest in the United States or Europe. The duration of massage therapy ranged from 2 to 10 weeks and the number of massage sessions ranged from 4 to 24. Three trials reported outcomes through intermediate-term followup,¹⁷⁶^,¹⁷⁷^,¹⁸⁹ and five only reported short-term outcomes.¹⁰⁸^,¹⁷⁵^,¹⁷⁸^–¹⁸⁰ No trial reported long-term outcomes.

Seven of the massage trials were rated fair-quality¹⁰⁸^,¹⁷⁵^–¹⁷⁹^,¹⁸⁹ and one trial was rated poor-quality¹⁸⁰ (Appendix E). Methodological limitations included unclear allocation concealment methods and unblinded design. One trial reported high loss to followup¹⁰⁸; the poor quality trial¹⁸⁰ also was unclear regarding blinding of outcome assessors and did not provide information on treatment compliance.

Massage Compared With Sham Massage, Usual Care, or an Attention Control

Massage was associated with small effects on short-term function versus sham massage or usual care (6 trials, SMD −0.38, 95% CI −0.63 to −0.20, I²=0%) (Figure 12).¹⁰⁸^,¹⁷⁵^,¹⁷⁷^–¹⁸⁰ The massage technique was myofascial release in two trials (pooled SMD −0.45, 95% CI −0.88 to −0.04),¹⁷⁵^,¹⁷⁹ structural or relaxation massage in one trial (difference −1.72 on the 0 to 23 modified RDQ, 95% CI −2.78 to −0.67),¹⁷⁷ foot reflexology in two trials (pooled SMD −0.15, 95% CI −0.60 to 0.50),¹⁰⁸^,¹⁷⁸ and acupressure in one trial (mean difference −12.2, 95% CI −18.6 to −5.8 on the 9 to 63 Fatigue Severity Scale).¹⁸⁰ Estimates were similar when trials were stratified according to whether the comparator was sham massage or usual care. There was no effect on intermediate-term function (3 trials, SMD −0.09, 95% CI −0.26 to 0.12, I²=0%) (Figure 12).¹⁷⁶^,¹⁷⁷^,¹⁸⁹

Massage was associated with small effects on short-term pain versus sham massage or usual care (5 trials, pooled difference −0.55 on a 0 to 10 scale, 95% CI −0.88 to −0.23, I²=0%) (Figure 13).¹⁰⁸^,¹⁷⁵^,¹⁷⁷^–¹⁷⁹ On a 0 to 10 scale, effects were −0.60 points (95% CI −1.72 to 0.46) in two trials of foot reflexology,¹⁰⁸^,¹⁷⁸ −0.68 points (95% CI −1.35 to −0.10) in two trials of myofascial release,¹⁷⁵^,¹⁷⁹ and −0.35 points (95% CI −0.82 to 0.12) in a trial of relaxation or structural massage.¹⁷⁷ Estimates were similar when trials were stratified according to whether the comparator was sham massage or usual care. There was no difference between massage (structural or relaxation massage or mixed massage techniques, including Swedish massage) versus an attention control or usual care in intermediate-term pain (3 trials, pooled difference −0.02, 95% CI −0.56 to 0.44, I²=0%).¹⁷⁶^,¹⁷⁷^,¹⁸⁹

One trial found no difference between massage versus usual care in use of opioids at intermediate-term followup or healthcare costs.¹⁷⁷ There was insufficient evidence to determine effects of duration of massage or number of massage sessions on findings. Two trials¹⁷⁷^,¹⁸⁹ found no differences between massage versus usual care on the SF-36 MCS (mean difference 0.87 on a 0 to 100 scale, 95% CI −1.01 to 2.75, I²=0%) or PCS scores (mean difference 3.91 on a 0 to 100 scale, 95% CI −4.50 to 12.31, I²=77%) at intermediate-term followup, and one trial¹⁰⁸ found no effects on various SF-36 subscales or the Beck Depression Inventory at short-term followup. One trial found massage associated with greater likelihood of experiencing ≥3 point improvement in the RDQ or ≥20 point improvement on a 0 to 100 VAS pain scale, but did not report statistical significance, which could not be calculated because the denominators were unclear.¹⁷⁹

Massage Compared With Pharmacological Therapies

No trial of massage versus pharmacological therapy met inclusion criteria.

Massage Compared With Exercise

One trial found no differences between massage versus exercise in intermediate-term function (difference 1.2 on the 0 to 24 RDQ, 95% CI −1.47 to 3.87), pain (difference 0.60 on the 0 to 10 Von Korff pain scale, 95% CI −0.67 to 1.87), or the SF-36 MCS or PCS scores (differences 0 to 3 points on 0 to 100 scales, p>0.05).¹⁸⁹
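As a rough illustration of how a reported 95% confidence interval relates to a p-value, the sketch below recovers an approximate standard error from the interval and computes a two-sided p-value under a normal approximation. This is an assumption made for illustration only; the trial may have used a different method (for example, a t distribution or an adjusted model), and the function names are hypothetical.

```python
import math


def se_from_ci(ci_low, ci_high, z_crit=1.96):
    """Approximate the standard error from a symmetric 95% CI (normal approximation)."""
    return (ci_high - ci_low) / (2 * z_crit)


def two_sided_p(estimate, se):
    """Two-sided p-value for a z statistic: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))."""
    z = abs(estimate) / se
    return math.erfc(z / math.sqrt(2))


# Intermediate-term function, massage vs. exercise:
# difference 1.2 on the 0 to 24 RDQ, 95% CI -1.47 to 3.87
se = se_from_ci(-1.47, 3.87)
p = two_sided_p(1.2, se)
print(f"SE ~ {se:.2f}, two-sided p ~ {p:.2f}")
```

Under these assumptions the p-value is well above 0.05, consistent with the finding of no difference between massage and exercise.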

Harms

Four trials¹⁷⁵^,¹⁷⁶^,¹⁷⁹^,¹⁸⁰ of massage reported no serious adverse events, and one trial¹⁷⁸ reported no adverse events. In four trials, the proportion of massage patients who reported increased pain ranged from <1 to 26 percent.¹⁷⁵^–¹⁷⁷^,¹⁸⁹

Mindfulness-Based Stress Reduction for Chronic Low Back Pain

Key Points

There were no differences between mindfulness-based stress reduction (MBSR) versus usual care or attention control in short-term function (4 trials, pooled SMD −0.14, 95% CI −0.51 to 0.02, I²=0%), intermediate-term function (1 trial, SMD −0.20, 95% CI −0.46 to 0.06), or long-term function (1 trial, SMD −0.09, 95% CI −0.35 to 0.16) (SOE: low).
MBSR was associated with a small improvement compared with usual care or an attention control in short-term pain (3 trials, pooled difference −0.68 on a 0 to 10 scale, 95% CI −1.29 to −0.28, I²=45%) after excluding two poor-quality trials; MBSR was also associated with a small improvement in intermediate-term pain (1 trial, difference −0.75, 95% CI −1.16 to −0.34), with no statistically significant effects on long-term pain (1 trial, difference −0.22, 95% CI −0.63 to 0.19) (SOE: moderate for short term, low for intermediate and long term).
One trial reported temporarily increased pain in 29 percent of patients undergoing MBSR, and three trials reported no harms (SOE: low).

Detailed Synthesis

Five trials (7 publications) of MBSR for low back pain met inclusion criteria (Table 14 and Appendix D).¹⁰⁴^,¹⁹⁴^–¹⁹⁹ All of the trials were included in the prior AHRQ report. In three trials,¹⁰⁴^,¹⁹⁵^–¹⁹⁸ the MBSR intervention was closely modeled on the program developed by Kabat-Zinn;²⁸² in the other two trials, the MBSR intervention appeared to have undergone some adaptations from the original Kabat-Zinn program.¹⁹⁴^,¹⁹⁹ In all trials, the main intervention consisted of 1.5 to 2 hour weekly group sessions for 8 weeks. Sample sizes ranged from 35 to 282 (total sample=629). Three trials compared MBSR versus usual care¹⁰⁴^,¹⁹⁴^–¹⁹⁶^,¹⁹⁹ and two trials compared MBSR versus an attention control (education).¹⁹⁷^,¹⁹⁸ Four trials¹⁰⁴^,¹⁹⁵^–¹⁹⁹ were conducted in the United States and one trial¹⁹⁴ in Iran. One trial focused on patients on opioid therapy for low back pain.¹⁹⁹ One trial reported outcomes through long-term (22 months after 8-week MBSR course) followup,¹⁰⁴^,¹⁹⁵^,¹⁹⁶ and the others only evaluated short-term outcomes.

Three trials¹⁰⁴^,¹⁹⁵^–¹⁹⁸ were rated fair quality and two trials poor quality (Appendix E).¹⁹⁴^,¹⁹⁹ The major methodological limitation in the fair-quality trials was the inability to effectively blind patients and caregivers to the MBSR intervention. One poor-quality trial reported unclear randomization and allocation concealment methods and had high attrition,¹⁹⁴ and another poor-quality trial reported a large difference between groups in baseline pain scores (Brief Pain Inventory score 6.3 on a 0 to 10 scale with MBSR versus 4.9 with usual care).¹⁹⁹

MBSR Compared With Usual Care or an Attention Control

MBSR was associated with no statistically significant differences in short-term function compared with usual care or an attention control (4 trials, pooled SMD −0.14, 95% CI −0.51 to 0.02, I²=0%) (Figure 14).¹⁰⁴^,¹⁹⁷^,¹⁹⁸ Three trials¹⁰⁴^,¹⁹⁷^,¹⁹⁸ evaluated function using the RDQ (pooled difference −0.89 points on a 0 to 24 scale, 95% CI −2.37 to 0.30), and one trial¹⁹⁹ used the ODI (difference −3.00 points on a 0 to 100 scale, 95% CI −11.39 to 5.39). One trial found no difference between MBSR versus usual care in intermediate-term (SMD −0.20, 95% CI −0.46 to 0.06) or long-term function (SMD −0.09, 95% CI −0.35 to 0.16).¹⁰⁴^,¹⁹⁵ There was no clear difference between MBSR versus controls in likelihood of a clinically meaningful effect on function (≥30% improvement in RDQ or RDQ improved by ≥2.5 points) at short term (2 trials, RR 1.17, 95% CI 0.88 to 1.57).¹⁰⁴^,¹⁹⁷ Data were restricted to one trial for intermediate-term (RR 1.41, 95% CI 1.13 to 1.77)¹⁰⁴ and long-term followup (RR 1.32, 95% CI 1.00 to 1.74).¹⁹⁵

MBSR was associated with no statistically significant effects on short-term pain compared with usual care or an attention control, when all trials were included in the analysis (5 trials, pooled difference −0.88 on a 0 to 10 scale, 95% CI −1.82 to 0.08, I²=89%) (Figure 15).¹⁰⁴^,¹⁹⁴^,¹⁹⁷^–¹⁹⁹ However, the estimate favored MBSR and statistical heterogeneity was substantial. Excluding two poor-quality trials,¹⁹⁴^,¹⁹⁹ one of which reported the largest effect in favor of MBSR (−2.23 points) and the other of which was the only trial with results that favored usual care (mean difference 0.40 points), resulted in a small, statistically significant effect on short-term pain (3 trials, pooled difference −0.68, 95% CI −1.29 to −0.28, I²=45%) and reduced statistical heterogeneity.¹⁰⁴^,¹⁹⁷^,¹⁹⁸ Estimates were similar when analyses were stratified according to whether the trial evaluated usual care or an attention control comparator. One trial found MBSR associated with a small improvement compared with an attention control on intermediate-term pain (difference −0.75 on a 0 to 10 scale, 95% CI −1.16 to −0.34); there was no statistically significant effect on long-term pain (difference −0.22, 95% CI −0.63 to 0.19).¹⁹⁵ MBSR was associated with greater likelihood of a clinically meaningful effect on pain (defined as ≥30% improvement) at short-term (2 trials, RR 1.49, 95% CI 1.14 to 1.95, I²=0%)¹⁰⁴^,¹⁹⁷ and intermediate-term followup (1 trial, RR 1.56, 95% CI 1.14 to 2.14),¹⁰⁴ but not at long-term followup (41% vs. 31%, RR 1.32, 95% CI 0.95 to 1.85).¹⁹⁵
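The relationship between the reported response proportions and the risk ratio can be sketched in Python. This is only an approximation from the rounded percentages given in the text (41% vs. 31% at long-term followup); the trial's actual event counts are not reproduced here, and `risk_ratio` is a hypothetical helper name.

```python
def risk_ratio(p_treatment, p_control):
    """Risk ratio as the ratio of two response proportions."""
    if p_control <= 0:
        raise ValueError("control proportion must be positive")
    return p_treatment / p_control


# Long-term clinically meaningful pain improvement: 41% with MBSR vs. 31% with control
rr = risk_ratio(0.41, 0.31)
print(round(rr, 2))
```

The result agrees with the reported RR of 1.32. Because the reported 95% CI (0.95 to 1.85) includes 1, the difference is not statistically significant.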

Three trials found no clear differences between MBSR versus usual care or an attention control on quality of life measured by the 12-Item Short Form Health Survey (SF-12) or 36-Item Short Form Health Survey (SF-36).¹⁰⁴^,¹⁹⁴^,¹⁹⁷ Two trials reported conflicting effects on short-term PCS (mean difference 2.89, 95% CI −5.13 to 10.92, I²=97%) and MCS scores (mean difference 4.27, 95% CI −0.07 to 9.51, I²=88%), though statistical heterogeneity was high.¹⁰⁴^,¹⁹⁴ One trial found no difference in intermediate-term PCS (mean difference −0.56, 95% CI −2.52 to 1.40) or MCS scores (mean difference 2.06, 95% CI 0.05 to 4.07).¹⁰⁴ One trial found MBSR associated with less medication use for low back pain at short term (43% vs. 54%) but not at intermediate term (47% vs. 53%); MBSR was associated with a small decrease in severity of depression (difference 0.63 points on the Patient Health Questionnaire [PHQ-8] at intermediate term), with no clear differences in measures of healthcare utilization.¹⁰⁴^,¹⁹⁶

MBSR Compared With Pharmacological Therapy or With Exercise

No trial of MBSR versus pharmacological or versus exercise therapy met inclusion criteria.

Harms

In one trial, 29 percent of MBSR patients reported temporarily increased pain.¹⁰⁴ Three trials¹⁹⁷^–¹⁹⁹ reported no adverse events and one trial¹⁹⁴ did not report adverse events.

Mind-Body Practices for Chronic Low Back Pain

Key Points

Yoga

Yoga was associated with small effects on function versus an attention or waitlist control at short-term (8 trials [2 new], pooled SMD −0.45, 95% CI −0.69 to −0.28, I²=31%) and intermediate-term followup (3 trials, pooled SMD −0.29, 95% CI −0.47 to −0.11, I²=0%) (SOE: moderate for short term, low for intermediate term).
Yoga was associated with small effects on pain versus an attention or waitlist control at short-term (7 trials [2 new], pooled difference −0.87 on a 0 to 10 scale, 95% CI −1.49 to −0.24, I²=64%) and moderate effects at intermediate-term (2 trials, pooled difference −1.16, 95% CI −2.16 to −0.27, I²=0%) (SOE: low for short term, moderate for intermediate term).
Yoga was associated with no statistically significant differences versus exercise in short-term or intermediate-term pain or function (SOE: low).
Yoga was not associated with increased risk of harms versus controls (SOE: low).

Qigong

One trial found no differences between qigong versus exercise in short-term function (difference 0.9 on the RDQ, 95% CI −0.1 to 2.0), although intermediate-term results showed a small improvement favoring exercise (difference 1.2, 95% CI 0.1 to 2.3) (SOE: low).
One trial found qigong associated with slightly worse pain versus exercise at short-term followup (difference 7.7 on a 0 to 100 scale, 95% CI 0.7 to 14.7), but the difference at intermediate-term was not statistically significant (difference 7.1, 95% CI −1.0 to 15.2) (SOE: low).
One trial found no difference between qigong versus exercise in risk of adverse events (SOE: low).

Detailed Synthesis

Yoga

Ten trials of yoga for low back pain met inclusion criteria (Table 15, Appendix D).³⁷^,²⁰⁴^–²¹¹^,²²⁰ Eight trials²⁰⁴^–²¹⁰^,²²⁰ were included in the prior AHRQ report and two trials³⁷^,²¹¹ were added for this update. In the prior AHRQ report, four trials evaluated Iyengar yoga,²⁰⁸^–²¹⁰^,²²⁰ two trials Viniyoga,²⁰⁶^,²⁰⁷ and two trials Hatha yoga;²⁰⁴^,²⁰⁵ one new trial evaluated Kundalini yoga³⁷ and the other new trial evaluated Restorative Exercise and Strength Training for Operational Resilience and Excellence (RESTORE) yoga.²¹¹ Across all trials, sample sizes ranged from 60 to 320 (total sample=1,520). Six trials compared yoga versus an attention control (education),³⁷^,²⁰⁵^–²⁰⁸^,²¹⁰ two trials yoga versus wait list control,²⁰⁴^,²⁰⁹ one trial yoga versus usual care,²¹¹ and five trials yoga versus exercise.³⁷^,²⁰⁵^–²⁰⁷^,²²⁰ One trial was conducted in India²²⁰ and the rest in the United States or Europe. The duration of yoga therapy ranged from 4 to 24 weeks and the number of sessions ranged from 4 to 48. In one trial, patients who received 12 weeks of yoga therapy were randomized to ongoing once-weekly maintenance sessions or to no maintenance.²⁰⁵ Three trials reported outcomes through intermediate-term followup,²⁰⁵^,²⁰⁸^,²⁰⁹ and seven only reported short-term outcomes.³⁷^,²⁰⁴^,²⁰⁶^,²⁰⁷^,²¹⁰^,²¹¹^,²²⁰

All of the trials were rated fair quality (Appendix E). Trials could not effectively blind patients; other methodological limitations included unclear allocation or randomization methods and high attrition.

Yoga Compared With an Attention Control or Waitlist

Yoga was associated with small effects on short-term function versus an attention control or waitlist (8 trials, pooled SMD −0.45, 95% CI −0.69 to −0.28, I²=31%) (Figure 16).³⁷^,²⁰⁴^–²⁰⁸^,²¹⁰^,²¹¹ Results were similar for Viniyoga (2 trials, pooled SMD −0.54, 95% CI −1.36 to 0.18),²⁰⁶^,²⁰⁷ Hatha yoga (2 trials, SMD −0.45, 95% CI −0.82 to −0.09),²⁰⁴^,²⁰⁵ Iyengar yoga (2 trials, SMD −0.38, 95% CI −1.38 to 0.14),²⁰⁸^,²¹⁰ Kundalini yoga (1 trial, SMD −0.13, 95% CI −0.57 to 0.31),³⁷ or RESTORE yoga (1 trial, SMD −0.74, 95% CI −1.23 to −0.25).²¹¹ Six trials evaluated function using the RDQ or modified RDQ, with a difference on a 0 to 24 or 0 to 23 scale of −2.32 (95% CI −3.48 to −1.40, I²=46%).²⁰⁴^–²⁰⁸^,²¹¹ Yoga was also associated with small effects on intermediate-term function versus controls (3 trials, pooled SMD −0.29, 95% CI −0.47 to −0.11, I²=0%).²⁰⁵^,²⁰⁸^,²⁰⁹ In two trials that evaluated intermediate-term function with the RDQ or modified RDQ, the difference was −1.65 points (95% CI −3.17 to −0.32, I²=0%).²⁰⁵^,²⁰⁸ No trials were rated poor quality.

Yoga was associated with small effects on short-term pain versus controls (7 trials, pooled difference −0.87, 95% CI −1.49 to −0.24 on a 0 to 10 scale, I²=64%) (Figure 17).³⁷^,²⁰⁴^–²⁰⁷^,²¹⁰^,²¹¹ Estimates were similar from two trials of Viniyoga (pooled difference −1.25, 95% CI −3.78 to 1.27),²⁰⁶^,²⁰⁷ two trials of Hatha yoga (difference −0.80, 95% CI −1.46 to −0.20),²⁰⁴^,²⁰⁵ and one trial of Iyengar yoga (difference −1.40, 95% CI −2.43 to −0.37);²¹⁰ one trial of Kundalini yoga³⁷ and one trial of RESTORE yoga²¹¹ showed no clear effects on pain, but estimates were imprecise. Yoga was also associated with moderate effects on intermediate-term pain versus controls, based on two trials (pooled difference −1.16, 95% CI −2.16 to −0.27, I²=0%).²⁰⁵^,²⁰⁹

Data on effects of yoga on quality of life were limited. One trial found no difference between yoga versus an attention control on the SF-36 Physical and Mental Component Summaries at short-term or intermediate-term followup (differences 0.42 to 2.02 points on a 0 to 100 scale).²⁰⁸ One other trial found no differences between yoga versus an attention control on the SF-36, but did not provide data.²⁰⁶

One trial found yoga associated with lower (better) scores on the Beck Depression Inventory than waitlist at intermediate-term followup (mean 4.6 vs. 7.8 on a 0 to 63 scale, p=0.004)²⁰⁹ and one trial found no difference between yoga versus waitlist in opioid use (9% vs. 7%, p=0.40) or other medical treatments for pain (39% vs. 37%, p=0.42) at short-term followup.²⁰⁴ One trial found yoga associated with fewer work absence days compared with an attention control at 5 to 8 months followup (mean difference −8.0 days, 95% CI −15.8 to −0.2), but differences were not statistically significant at 1 to 4 months or at 9 to 12 months.³⁷

Yoga Compared With Pharmacological Therapy

No trial of yoga versus pharmacological therapy met inclusion criteria.

Yoga Compared With Exercise

There were no differences between yoga versus exercise in short-term function (4 trials, pooled SMD −0.04, 95% CI −0.27 to 0.16, I²=0%)³⁷^,²⁰⁵^–²⁰⁷ or intermediate-term function (1 trial, SMD −0.01, 95% CI −0.26 to 0.24)²⁰⁵ (Figure 18). One trial found no difference between yoga versus exercise on the SF-36 at short-term followup (data not provided).²⁰⁶ No trials were rated poor quality.

Effects of yoga versus exercise on short-term pain were not statistically significant and there was marked heterogeneity (5 trials, pooled difference −0.63 on a 0 to 10 scale, 95% CI −1.68 to 0.45, I²=88%) (Figure 19).³⁷^,²⁰⁵^–²⁰⁷^,²²⁰ Effects favored yoga in one trial of Iyengar yoga (difference −2.00, 95% CI −2.50 to −1.50) and in one trial of Viniyoga (difference −1.50, 95% CI −2.36 to −0.64). The other three trials (one trial each of Viniyoga, Kundalini yoga, and Hatha yoga) each found no differences between yoga versus exercise. One trial found no difference between yoga versus exercise in intermediate-term pain (difference 0.30, 95% CI −0.39 to 0.99).²⁰⁵
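For readers less familiar with the heterogeneity statistic reported above, I² is conventionally derived from Cochran's Q and its degrees of freedom (number of trials minus 1). The sketch below shows that standard formula in Python. The Q value used in the example is hypothetical, chosen only to show what an I² near 88% with five trials would imply; it is not taken from the report.

```python
def i_squared(q, df):
    """I-squared (%) from Cochran's Q: max(0, (Q - df) / Q) * 100."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical illustration: 5 trials (df = 4) and an assumed Q of 33.3
print(round(i_squared(33.3, 4)))

# When Q does not exceed its degrees of freedom, I-squared is truncated at 0%
print(i_squared(3.0, 4))
```

An I² of this size indicates that most of the observed variation across trials reflects between-trial differences rather than chance, which is why the pooled yoga versus exercise pain estimate is interpreted cautiously.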

Harms

Data on harms were limited, but trials reported no clear difference between yoga versus control interventions in risk of any adverse event (primarily mild, self-limiting back or joint pain).²⁰⁵^,²⁰⁷^,²⁰⁸ Three serious adverse events were reported across three trials (≤1% of patients), all in patients randomized to yoga: worsening back pain due to yoga,²⁰⁵^,²⁰⁷^,²⁰⁸ herniated disc²⁰⁵^,²⁰⁷^,²⁰⁸ and cellulitis²⁰⁵ (whether the latter two complications were related to yoga is unclear).

Qigong

One German trial (n=125) compared qigong (weekly sessions for 3 months) versus exercise therapy (including stretching and strengthening) (Table 16 and Appendix D).²¹⁹ The trial was included in the prior AHRQ report. It was rated fair quality due to baseline differences between groups, unblinded design, and suboptimal compliance (Appendix E). There was no difference between qigong versus exercise in short-term function (difference 0.9 on the 0 to 24 RDQ, 95% CI −0.1 to 2.0), although intermediate-term results slightly favored exercise (difference 1.2, 95% CI 0.1 to 2.3). Qigong was associated with slightly worse pain versus exercise at short-term followup (difference 7.7 on a 0 to 100 scale, 95% CI 0.7 to 14.7), but the difference at intermediate-term was not statistically significant (difference 7.1, 95% CI −1.0 to 15.2). There were no differences in sleep, measures of the SF-36 PCS or MCS scores, or in risk of adverse events.

Acupuncture for Chronic Low Back Pain

Key Points

Acupuncture was associated with a small improvement in short-term function compared with sham acupuncture or usual care (4 trials, pooled SMD −0.23, 95% CI −0.35 to −0.04, I²=25%). There were no differences between acupuncture versus controls in intermediate-term function (3 trials, pooled SMD −0.08, 95% CI −0.42 to 0.28, I²=64%) or long-term function (1 trial, adjusted difference −3.4 on the 0 to 100 ODI, 95% CI −7.8 to 1.0) (SOE: low).
Acupuncture was associated with small improvements in short-term pain compared with sham acupuncture, usual care, an attention control, or a placebo intervention (5 trials, pooled difference −0.54 on a 0 to 10 scale, 95% CI −0.91 to −0.16, I²=25%). There was no difference in intermediate-term pain (5 trials, pooled difference −0.22, 95% CI −0.67 to 0.21, I²=0%); one trial found acupuncture associated with greater effects on long-term pain (difference −0.83, 95% CI −1.53 to −0.13) (SOE: moderate for short term, low for intermediate term and long term).
There was no clear difference between acupuncture versus control interventions in risk of study discontinuation due to adverse events. Serious adverse events were rare with acupuncture and control interventions (SOE: low).

Detailed Synthesis

Eight trials of acupuncture for low back pain met inclusion criteria (Table 17 and Appendix D).¹⁷⁶^,²²⁴^–²³⁰ All of the trials were included in the prior AHRQ report. All trials evaluated needle acupuncture to body acupoints; one trial also evaluated electroacupuncture.²²⁵ Sample sizes ranged from 46 to 1,162 (total sample=2,645). Four trials compared acupuncture versus sham acupuncture,²²⁴^,²²⁶^–²²⁸ three trials acupuncture versus usual care,²²⁶^,²²⁸^,²³⁰ two trials acupuncture versus a placebo intervention (sham transcutaneous electrical nerve stimulation [TENS]),²²⁵^,²²⁹ and one trial acupuncture versus an attention control (self-care education).¹⁷⁶ One trial was conducted in Asia²²⁷ and the rest in the United States or Europe. The duration of acupuncture therapy ranged from 6 to 12 weeks and the number of acupuncture sessions ranged from 6 to 15. One trial reported outcomes through long-term followup,²³⁰ four trials through intermediate-term followup,¹⁷⁶^,²²⁴^–²²⁶ and the remainder only evaluated short-term outcomes.

One trial was rated good quality,²²⁴ five trials fair quality,¹⁷⁶^,²²⁶^–²²⁸^,²³⁰ and two trials²²⁵^,²²⁹ poor quality (Appendix E). Limitations in the fair-quality and poor-quality trials included unblinded design, unclear randomization or allocation concealment methods, and high attrition.

Acupuncture Compared With Sham Acupuncture, Usual Care, an Attention Control, or a Placebo Intervention

Acupuncture was associated with small improvements in short-term function compared with sham acupuncture or usual care (4 trials, pooled SMD −0.23, 95% CI −0.35 to −0.04, I²=25%) (Figure 20).²²⁴^,²²⁶^–²²⁸ Each trial measured function using a different scale; across trials the SMD ranged from −0.34 to 0.00. Differences were slightly greater in trials that compared acupuncture against usual care (2 trials, SMD −0.43, 95% CI −0.60 to −0.22)²²⁶^,²²⁸ than against sham acupuncture (4 trials, SMD −0.13, 95% CI −0.24 to 0.01).²²⁴^,²²⁶^–²²⁸ None of the trials were rated poor quality. There were no differences between acupuncture versus controls in intermediate-term function (3 trials, pooled SMD −0.08, 95% CI −0.42 to 0.28, I²=64%)¹⁷⁶^,²²⁴^,²²⁶ or long-term function (1 trial, adjusted difference −3.4 on the 0 to 100 ODI, 95% CI −7.8 to 1.0).²³⁰

Acupuncture was associated with small improvements in short-term pain compared with sham acupuncture, usual care, an attention control, or a placebo intervention (5 trials, pooled difference −0.54 on a 0 to 10 scale, 95% CI −0.91 to −0.16, I²=25%) (Figure 21).²²⁴^–²²⁸ The pooled estimate was similar when poor-quality trials were excluded. When stratified according to the type of control intervention, acupuncture was associated with greater effects when compared with usual care (2 trials, pooled difference −1.01, 95% CI −1.60 to −0.28)²²⁶^,²²⁸ than when compared with sham acupuncture (4 trials, pooled difference −0.21, 95% CI −0.66 to 0.18).²²⁴^,²²⁶^–²²⁸ There was no difference between acupuncture versus controls in intermediate-term pain (5 trials, pooled difference −0.22, 95% CI −0.67 to 0.21, I²=0%).¹⁷⁶^,²²⁴^–²²⁶^,²³⁰ One trial found acupuncture associated with greater effects on long-term pain than usual care (difference −0.83, 95% CI −1.53 to −0.13).²³⁰

Data on effects of acupuncture on quality of life were limited. In two trials, differences between acupuncture versus sham acupuncture or usual care on short-term or intermediate-term SF-36 PCS and MCS scores were small (range 0.64 to 3.92 points on a 0 to 100 scale), and most differences were not statistically significant.²²⁴^,²²⁸ Two trials found no clear effects of acupuncture versus controls on measures of depression.²²⁴^,²²⁷

Two trials found no clear differences between acupuncture versus an attention control in measures of healthcare utilization (provider visits, medication fills, imaging studies, costs of services),¹⁷⁶^,²²⁶ and one trial found no clear differences at intermediate-term followup between acupuncture versus placebo TENS in likelihood of working full time.²²⁵

One trial found acupuncture associated with a higher likelihood of short-term (4.5 months) treatment response (defined as ≥33% pain improvement and ≥12% functional improvement) versus usual care (48% vs. 27%, RR 1.74, 95% CI 1.43 to 2.11), but there was no difference versus sham acupuncture (RR 1.08, 95% CI 0.92 to 1.25).²²⁸
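The interpretation rule used throughout this report (a risk ratio is statistically significant when its 95% confidence interval excludes 1) can be checked mechanically against the two comparisons above. The following is a minimal Python sketch; the function name `excludes_null` is illustrative and not part of the review's methods, and the numbers are those reported for the trial.

```python
def excludes_null(ci_low, ci_high, null_value):
    """Return True when the confidence interval does not contain the null value."""
    return not (ci_low <= null_value <= ci_high)


# Reported risk ratios (RR) for short-term treatment response:
# label -> (point estimate, lower 95% limit, upper 95% limit).
comparisons = {
    "acupuncture vs. usual care": (1.74, 1.43, 2.11),
    "acupuncture vs. sham acupuncture": (1.08, 0.92, 1.25),
}

for label, (rr, low, high) in comparisons.items():
    # The null value for a ratio measure such as RR is 1.
    if excludes_null(low, high, 1.0):
        verdict = "statistically significant"
    else:
        verdict = "not statistically significant"
    print(f"{label}: RR {rr} (95% CI {low} to {high}) -> {verdict}")
```

For mean differences or standardized mean differences, the same check would be run with a null value of 0 rather than 1.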

No trial evaluated effects of acupuncture on use of opioid therapies. There was insufficient evidence to determine effects of duration of acupuncture or number of acupuncture sessions on findings.

Acupuncture Compared With Pharmacological Therapy or With Exercise

No trial of acupuncture versus pharmacological therapy or versus exercise met inclusion criteria.

Harms

Data on harms were limited but indicated no clear difference between acupuncture versus control interventions in risk of withdrawal due to adverse events.²²⁶^,²³⁰ Serious adverse events were rare with acupuncture and control interventions.¹⁷⁶^,²²⁴^,²²⁶^–²²⁸

Multidisciplinary Rehabilitation for Chronic Low Back Pain

Key Points

Multidisciplinary rehabilitation was associated with small improvements in function compared with usual care at short-term (4 trials, pooled SMD −0.30, 95% CI −0.63 to 0.00, I²=58%) and intermediate-term followup (4 trials, pooled SMD −0.37, 95% CI −0.69 to −0.08, I²=34%); there was no difference in long-term function (2 trials, pooled SMD −0.04, 95% CI −0.36 to 0.35, I²=0%) (SOE: low).
Multidisciplinary rehabilitation was associated with small improvements in pain compared with usual care at short-term followup (4 trials, pooled difference −0.53 on a 0 to 10 scale, 95% CI −0.86 to −0.11, I²=0%) and intermediate-term followup (4 trials, pooled difference −0.62, 95% CI −1.06 to −0.18, I²=0%); the long-term difference was smaller and not statistically significant (2 trials, pooled difference −0.35, 95% CI −1.10 to 0.34, I²=0%) (SOE: moderate for short term and intermediate term, low for long term).
Multidisciplinary rehabilitation was associated with a small improvement compared with exercise in short-term function (6 trials, pooled SMD −0.20, 95% CI −0.54 to 0.00, I²=0%) and intermediate-term function (5 trials [excluding outlier trial], pooled SMD −0.20, 95% CI −0.40 to −0.00, I²=0%); there was no effect on long-term function (2 trials [excluding outlier trial], pooled SMD −0.07, 95% CI −0.50 to 0.39, I²=0%) (SOE: moderate for short term and intermediate term, low for long term).
Multidisciplinary rehabilitation was associated with a small improvement compared with exercise in short-term pain (6 trials, pooled difference −0.69 on a 0 to 10 scale, 95% CI −1.16 to −0.22, I²=0%) and intermediate-term pain (5 trials [excluding outlier trial], pooled difference −0.55, 95% CI −1.00 to −0.11, I²=0%); there was no effect on long-term pain (2 trials [excluding outlier trial], pooled difference 0.00, 95% CI −1.31 to 1.17) (SOE: moderate for short term and intermediate term, low for long term).
Data on harms were sparse; no serious harms were reported (SOE: insufficient).

Detailed Synthesis

Sixteen trials (reported in 21 publications) of multidisciplinary rehabilitation for low back pain met inclusion criteria (Table 18 and Appendix D).³⁵^,¹³³^,¹⁴⁰^,¹⁸⁹^,²⁵⁵^–²⁶⁰^,²⁶⁹^–²⁸¹ All of the trials were included in the prior AHRQ report. In accordance with our definition for multidisciplinary rehabilitation, the intervention in all trials included a psychological therapy and an exercise therapy component, with therapy developed by clinicians from at least two disciplines. Most multidisciplinary rehabilitation interventions incorporated techniques and approaches consistent with principles of functional restoration.²⁸³ The intensity of multidisciplinary rehabilitation varied substantially, with treatment ranging from 4 to 150 hours. Five trials evaluated a multidisciplinary rehabilitation intervention that met our criteria for high intensity (≥20 hours/week or >80 hours total).²⁵⁵^,²⁶⁰^,²⁷⁰^,²⁷¹^,²⁷⁸ The duration of therapy ranged from 4 days to up to 13 weeks. Sample sizes ranged from 20 to 459 (total sample=1,964). Six trials compared multidisciplinary rehabilitation versus usual care,²⁵⁵^–²⁶⁰ nine trials compared multidisciplinary rehabilitation versus exercise therapy,¹³³^,²⁵⁷^,²⁷⁰^,²⁷¹^,²⁷³^–²⁷⁸ and one trial compared multidisciplinary rehabilitation versus oral medications.²⁶⁹ One trial²⁶⁹ was conducted in Iran and the remainder were conducted in the United States, the United Kingdom, or Australia. Five trials reported outcomes through long-term (12 to 60 months) followup,¹³³^,²⁵⁵^,²⁶⁹^,²⁷⁰^,²⁷⁶ eight trials evaluated outcomes through intermediate-term followup,¹³³^,²⁵⁸^–²⁶⁰^,²⁷¹^,²⁷³^,²⁷⁵^,²⁷⁸^,²⁷⁹ and three trials only evaluated short-term outcomes.²⁵⁶^,²⁷⁴^,²⁷⁷

Ten trials²⁵⁵^,²⁵⁷^,²⁵⁸^,²⁷⁰^,²⁷¹^,²⁷⁴^–²⁷⁸ were rated fair quality and six trials poor quality (Appendix E).¹³³^,²⁵⁶^,²⁵⁹^,²⁶⁰^,²⁶⁹^,²⁷³ The major methodological limitation in the fair-quality trials was the inability to effectively blind patients and caregivers to the multidisciplinary rehabilitation. Other methodological shortcomings included unclear randomization and allocation concealment methods and high attrition.

Multidisciplinary Rehabilitation Compared With Usual Care

Multidisciplinary rehabilitation was associated with small improvements in function compared with controls at short-term (4 trials, pooled SMD −0.30, 95% CI −0.63 to 0.00, I²=58%)²⁵⁵^–²⁵⁸ and intermediate-term followup (4 trials, pooled SMD −0.37, 95% CI −0.69 to −0.08, I²=34%) (Figure 22).²⁵⁷^–²⁶⁰ There was no difference in long-term function (2 trials, pooled SMD −0.04, 95% CI −0.36 to 0.35, I²=0%).²⁵⁵^,²⁵⁷ In trials that measured function using the RDQ, the difference was −0.67 points (95% CI −2.15 to 0.81, 2 trials) at short term and −1.9 points (95% CI −3.70 to −0.18, 2 trials) at intermediate term. Restriction to high-intensity multidisciplinary rehabilitation interventions or exclusion of poor-quality trials had little effect on estimates. At short-term followup, effects on function were somewhat larger with high intensity multidisciplinary rehabilitation interventions (2 trials, pooled SMD −0.50, 95% CI −0.94 to −0.22)²⁵⁵^,²⁵⁶ than with nonhigh intensity interventions (3 trials, pooled difference −0.20, 95% CI −0.38 to 0.04),²⁵⁶^–²⁵⁸ but the interaction was not statistically significant (p=0.19). At intermediate term, there were no clear differences between high intensity (1 trial, SMD −0.59, 95% CI −0.99 to −0.19)²⁶⁰ and nonhigh intensity (3 trials, pooled difference −0.30, 95% CI −0.69 to 0.06)²⁵⁷^–²⁵⁹ interventions (p=0.48 for interaction).

Multidisciplinary rehabilitation was associated with small improvements compared with usual care in pain at short-term (4 trials, pooled difference −0.53 on a 0 to 10 scale, 95% CI −0.86 to −0.11, I²=0%)²⁵⁵^–²⁵⁸ and intermediate-term followup (4 trials, pooled difference −0.62, 95% CI −1.06 to −0.18, I²=0%)²⁵⁷^–²⁶⁰ (Figure 23). The long-term difference was smaller and not statistically significant (2 trials, pooled difference −0.35, 95% CI −1.10 to 0.34, I²=0%).²⁵⁵^,²⁵⁷ Excluding poor-quality trials²⁵⁶^,²⁵⁹^,²⁶⁰ had little effect on estimates. At short-term followup, effects on pain were somewhat larger with high intensity multidisciplinary rehabilitation interventions (2 trials, pooled difference −0.86, 95% CI −1.57 to −0.31)²⁵⁵^,²⁵⁶ than with nonhigh intensity interventions (3 trials, pooled difference −0.35, 95% CI −0.71 to 0.15),²⁵⁶^–²⁵⁸ but the interaction between intensity and effects of multidisciplinary rehabilitation was not statistically significant (p=0.48). At intermediate term, estimates were similar for high intensity (1 trial, difference −0.53, 95% CI −1.35 to 0.29)²⁶⁰ and nonhigh intensity (3 trials, pooled difference −0.66, 95% CI −1.22 to −0.09) interventions (p=0.82 for interaction).²⁵⁷^–²⁵⁹
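The interaction p-values reported above compare estimates between intensity subgroups. One rough way to see how such a comparison works is a z-test on the difference between two subgroup estimates, with standard errors back-calculated from the reported 95% CIs. The Python sketch below is illustrative only and is not the method used in this report. It assumes symmetric, normal-based intervals and independent subgroups; the reported intervals are not symmetric and one trial contributes to both subgroups, so the result should not be expected to match the reported p-values. The function names are hypothetical.

```python
import math

Z_95 = 1.96  # normal quantile assumed to underlie a 95% CI


def se_from_ci(ci_low, ci_high):
    """Approximate a standard error from a symmetric 95% confidence interval."""
    return (ci_high - ci_low) / (2 * Z_95)


def subgroup_difference_test(est_a, ci_a, est_b, ci_b):
    """z-test for the difference between two independent subgroup estimates."""
    se_a = se_from_ci(*ci_a)
    se_b = se_from_ci(*ci_b)
    z = (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Short-term function, high intensity vs. nonhigh intensity (values from the text).
z, p = subgroup_difference_test(-0.50, (-0.94, -0.22), -0.20, (-0.38, 0.04))
print(f"z = {z:.2f}, approximate two-sided p = {p:.2f}")
```

Under these simplifying assumptions the approximation for short-term function is in the same general range as, but not identical to, the reported interaction p-value, which comes from the report's own models.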

Data on other outcomes were limited. One trial found no differences between multidisciplinary rehabilitation versus usual care on the SF-36 Social Functioning or Mental Functioning subscales.²⁵⁷ Three trials reported inconsistent effects on work or disability/sick leave status.²⁵⁵^,²⁵⁷^,²⁶⁰ Two trials found multidisciplinary rehabilitation associated with fewer health system contacts versus usual care.²⁵⁵^,²⁵⁸

Multidisciplinary Rehabilitation Compared With Pharmacological Therapy

One poor-quality trial (n=74) found multidisciplinary rehabilitation (intensity unclear) associated with greater effects on short-term quality of life than oral medications (acetaminophen, nonsteroidal anti-inflammatory drugs [NSAIDs], and chlordiazepoxide).²⁶⁹ The difference on the SF-36 PCS was 25.5 points (95% CI 14.7 to 36.3) and on the SF-36 MCS was 23.0 points (95% CI 10.8 to 35.2). Effects were smaller at intermediate term and statistically significant for the SF-36 PCS (difference 15.4, 95% CI 2.35 to 28.45) but not for the SF-36 MCS (difference 9.0, 95% CI −3.88 to 21.9). Effects were not statistically significant at long-term (12-month) followup (differences 13.6 and 4.9 points, respectively).

Multidisciplinary Rehabilitation Compared With Exercise

Multidisciplinary rehabilitation was associated with a small improvement in short-term function compared with exercise (6 trials, pooled SMD −0.20, 95% CI −0.54 to 0.001, I²=32%) (Figure 24).²⁷⁰^,²⁷²^–²⁷⁵^,²⁷⁷ Estimates were similar when a poor-quality trial²⁷³ was excluded and when analyses were restricted to trials of high-intensity multidisciplinary rehabilitation (2 trials, pooled difference −0.14, 95% CI −0.50 to 0.22).²⁷⁰^,²⁷² For intermediate-term function, the pooled estimate favored multidisciplinary rehabilitation over exercise by a large margin but was not statistically significant (6 trials, pooled SMD −1.04, 95% CI −2.82 to 0.71, I²=96%), and statistical heterogeneity was very large.¹³³^,²⁷¹^,²⁷³^,²⁷⁵^,²⁷⁶^,²⁷⁸^,²⁷⁹ Excluding an outlier trial (SMD −5.31, 95% CI −6.20 to −4.42)²⁷⁶ eliminated statistical heterogeneity and resulted in a markedly attenuated (small) effect (5 trials, pooled SMD −0.20, 95% CI −0.40 to −0.00, I²=0%). There was no difference between multidisciplinary rehabilitation versus exercise in long-term function (3 trials, pooled SMD −1.82, 95% CI −5.90 to 2.24, I²=98%).¹³³^,²⁷⁰^,²⁷⁶ Excluding the outlier trial²⁷⁶ described above resulted in a pooled SMD close to 0 (−0.07, 95% CI −0.50 to 0.39, I²=0%).
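The effect of a single outlier trial on the pooled estimate and on I² can be illustrated with a small random-effects calculation. The Python sketch below uses the DerSimonian-Laird estimator, chosen here for simplicity; it is not necessarily the model used in this report (see the Methods section). Apart from the outlier's SMD and a standard error back-calculated from its reported CI, all trial-level values are hypothetical and chosen only to show the pattern, so the output does not reproduce the pooled estimates above.

```python
import math


def dersimonian_laird(effects, ses):
    """Random-effects pooling (DerSimonian-Laird); returns (pooled, ci_low, ci_high, i2)."""
    w = [1.0 / se ** 2 for se in ses]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q measures dispersion of trial effects around the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)  # between-trial variance
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    return pooled, pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled, i2


# HYPOTHETICAL trial-level SMDs and standard errors (not taken from the report).
smds = [-0.10, -0.25, -0.20, -0.30, -0.15]
ses = [0.20, 0.22, 0.18, 0.25, 0.21]

# Outlier SMD as reported; SE approximated from its 95% CI (-6.20 to -4.42).
outlier_smd = -5.31
outlier_se = (-4.42 - (-6.20)) / (2 * 1.96)

with_outlier = dersimonian_laird(smds + [outlier_smd], ses + [outlier_se])
without_outlier = dersimonian_laird(smds, ses)

for label, (est, low, high, i2) in [("all trials", with_outlier),
                                    ("outlier excluded", without_outlier)]:
    print(f"{label}: SMD {est:.2f} (95% CI {low:.2f} to {high:.2f}), I2 = {i2:.0f}%")
```

With these illustrative inputs, including the outlier produces a wide confidence interval and very high I², whereas excluding it yields a small pooled effect with no detectable heterogeneity, which is the qualitative pattern described above.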

Multidisciplinary rehabilitation was associated with small improvements in short-term pain versus exercise (6 trials, pooled difference −0.69 on a 0 to 10 scale, 95% CI −1.16 to −0.22, I²=0%) (Figure 25). Estimates were similar when one poor-quality trial²⁷³ was excluded (5 trials, pooled difference −0.53, 95% CI −1.12 to 0.11), and estimates were similar when analyses were stratified according to intensity of multidisciplinary rehabilitation. In two trials that evaluated high intensity multidisciplinary rehabilitation, the pooled difference was −0.62 (95% CI −1.61 to 0.37).²⁷⁰^,²⁷² Estimates at intermediate term (6 trials, pooled difference −1.20 points, 95% CI −2.43 to 0.09, I²=95%)²⁷¹^,²⁷³^,²⁷⁵^,²⁷⁷^–²⁷⁹ and long term (3 trials, pooled difference −1.68, 95% CI −5.25 to 1.97, I²=98%)¹³³^,²⁷⁰^,²⁷⁶ favored multidisciplinary rehabilitation, but differences were not statistically significant. Substantial statistical heterogeneity was present in analyses of intermediate-term and long-term pain, with an outlier trial²⁷⁶ that reported substantially larger effects than the other trials. For intermediate term, the outlier trial reported a difference of −3.90 points, versus −0.31 to −0.73 points in the other trials. Excluding the outlier trial eliminated statistical heterogeneity and resulted in a small, statistically significant difference in intermediate-term pain that favored multidisciplinary rehabilitation (5 trials, pooled difference −0.55, 95% CI −1.00 to −0.11, I²=0%); there was no difference in long-term pain (2 trials, pooled difference 0.00, 95% CI −1.31 to 1.17, I²=0%). For intermediate-term pain, exclusion of a poor-quality trial²⁷³ (5 trials, pooled difference −1.52, 95% CI −3.35 to 0.39) or restriction of analyses to high intensity multidisciplinary rehabilitation interventions (2 trials, pooled difference −0.60, 95% CI −1.44 to 0.24)²⁷¹^,²⁷⁸^,²⁷⁹ did not reduce heterogeneity and differences remained not statistically significant.

Data on other outcomes were limited. One trial found multidisciplinary rehabilitation associated with better scores versus exercise on SF-36 subscales at short-term followup (differences 10 to 21 points).²⁷⁷ Four trials found no clear differences between multidisciplinary rehabilitation versus exercise on severity of depression.¹³³^,²⁷²^–²⁷⁴ Two trials found no clear effects on work status²⁷⁰^,²⁷⁸^,²⁷⁹ and one trial found high intensity multidisciplinary rehabilitation associated with fewer days of sick leave than exercise, but nonhigh intensity rehabilitation associated with more days of sick leave.²⁷⁰ Two trials found inconsistent effects on number of health system contacts.²⁷⁰^,²⁷¹

Harms

Data on harms were sparse and reported in only two trials. One study reported no clear difference between multidisciplinary rehabilitation versus exercise in risk of transient worsening of pain,²⁷⁷ and one trial reported no harms with either multidisciplinary rehabilitation or medications alone.²⁶⁹

Key Question 2. Chronic Neck Pain

For chronic neck pain, 25 RCTs were included in the prior AHRQ report (N=3294). One study was rated good quality, sixteen studies fair quality, and eight studies poor quality. The prior AHRQ report found combination exercise, low-level laser therapy, Alexander Technique, and acupuncture associated with greater effects than usual care, no treatment, advice alone, or sham on improved function; only combination exercise and low-level laser therapy were also associated with greater improvement in pain. The strength of evidence was low or moderate, and effects were observed at short-, intermediate-, or long-term followup.

For this update, we identified two new RCTs (N=156) and a new publication (subanalysis) of a previously included trial; all were rated fair quality. One trial evaluated exercise and the other evaluated manual therapy (massage); the subsequent publication provided data for mind-body practices (Alexander Technique) and acupuncture. The Key Points summarize the main findings based on the evidence included in the prior report and new trials; the Key Points note where new trials contributed to findings.

Exercise for Chronic Neck Pain

Key Points

Across types of exercise, there was no clear improvement in function (3 trials [excluding outlier trial], pooled SMD −0.22, 95% CI −0.66 to 0.17, I²=73%) or pain (3 trials [excluding outlier trial], pooled difference −0.70, 95% CI −1.62 to 0.15, I²=64%) versus no treatment, waitlist or attention control in the short term (SOE: low).
A subgroup of two trials of combination exercises (including 3 of the following 4 exercise categories: muscle performance, mobility, muscle re-education, aerobic) suggests a small benefit for function and pain versus waitlist or attention control over the short term; and function versus attention control in the long term (1 trial) (SOE: low).
There was no clear improvement in function for exercise versus no intervention at intermediate term (1 trial) and a small improvement versus attention control in the long term (1 trial) (SOE: low for both).
There was no improvement in pain for exercise versus no intervention or attention control at intermediate term (2 trials) and versus attention control at long-term (3 trials) (SOE: low for both).
The effect of exercise versus NSAIDs and muscle relaxants on function and pain was indeterminate at short or intermediate term due to insufficient evidence from a single poor-quality trial (SOE: insufficient).
Muscle performance exercise (Pilates) was associated with a small improvement in function and a substantial improvement in pain compared with oral medication (acetaminophen) in the short-term in one new fair quality trial (SOE: low).
Harms were poorly reported in trials of exercise with only two trials describing adverse events. No serious harms were reported in either trial. Minor complaints included muscle pain with exercise, knee pain and lumbar spine pain (SOE: low).

Detailed Synthesis

Eight trials of exercise therapy for neck pain met inclusion criteria (Table 19 and Appendix D).⁴¹^–⁴⁶^,¹⁰⁰^,¹⁰¹ Seven trials⁴¹^–⁴⁶^,¹⁰⁰ were included in the prior AHRQ report and one¹⁰¹ was added for this update. Four trials evaluated participants with chronic neck pain associated with office work,⁴¹^,⁴³^,⁴⁵^,⁴⁶ and one trial each included patients with chronic neck pain following whiplash,⁴⁴ nonspecific neck pain,⁴² cervical arthritis,¹⁰⁰ and mechanical neck pain (new trial).¹⁰¹ Across trials, participants were predominantly female (>80%); only the new trial enrolled predominantly men (78%).¹⁰¹ Mean ages ranged from 38 to 52 years.

Five trials (1 new) evaluated muscle performance exercises (resistive training),⁴¹^,⁴³^,⁴⁵^,⁴⁶^,¹⁰¹ three combined exercise techniques,⁴²^,⁴⁴^,¹⁰⁰ and one neuromuscular rehabilitation.⁴⁶ Sample sizes ranged from 40 to 265 (total sample=973). Four trials compared exercise versus an attention control,⁴¹^,⁴³^,⁴⁴^,⁴⁶ one versus no treatment,⁴⁵ one versus waitlist,⁴² and two (1 new) versus pharmacological care.¹⁰⁰^,¹⁰¹ Four trials were conducted in Europe,⁴¹^,⁴²^,⁴⁵^,⁴⁶ one in Australia,⁴⁴ one in China,⁴³ one in Turkey,¹⁰⁰ and one in Brazil (new trial).¹⁰¹ The duration of exercise therapy ranged from 6 weeks to 12 months, and the number of supervised exercise sessions ranged from 3 to 52. Three trials reported outcomes through long-term followup,⁴¹^,⁴⁴^,⁴⁶ two through intermediate-term followup,⁴⁵^,¹⁰⁰ and three (1 new) evaluated only short-term outcomes.⁴²^,⁴³^,¹⁰¹

Four trials, including the new trial, were rated fair quality⁴³^–⁴⁵^,¹⁰¹ and four poor quality⁴¹^,⁴²^,⁴⁶^,¹⁰⁰ (Appendix E). In the four fair-quality trials, the main methodological limitation was the inability to blind interventions. Limitations in the other trials included inability to blind interventions, unclear randomization and allocation concealment methods, unclear or high loss to followup, and baseline differences between intervention groups.

Exercise Compared With No Treatment, Waitlist, or an Attention Control

Across types of exercise, there was no clear improvement in function versus no treatment, waitlist or an attention control in the short term (4 trials, pooled SMD −0.73, 95% CI −1.84 to 0.36, I²=95.1%), but statistical heterogeneity was very large⁴²^–⁴⁵ (Figure 26). Excluding an outlier trial (SMD −2.22, 95% CI −2.74 to −1.70)⁴³ reduced the statistical heterogeneity and resulted in an attenuated effect (SMD −0.22, 95% CI −0.66 to 0.17, I²=72.6%). However, two studies that included combination exercises (3 of the following 4 exercise categories: muscle performance, mobility, muscle re-education, aerobic) found small improvement in function compared with controls short term (2 trials, pooled SMD −0.44, 95% CI −0.76 to −0.09, data not shown in figure).⁴²^,⁴⁴ A fair-quality study reported a continued small benefit with combination exercise in the long term (SMD −0.39, 95% CI −0.74 to −0.03).⁴⁴

Exercise tended toward moderately greater effects on short-term pain compared with no treatment, waitlist or an attention control (4 trials, pooled difference −1.33, 95% CI −2.68 to 0.07, I²=89.4%), but statistical heterogeneity was very large⁴²^–⁴⁵ (Figure 27). Excluding an outlier trial (difference −2.92, 95% CI −3.38 to −2.46)⁴³ reduced the statistical heterogeneity and resulted in an attenuated effect (difference −0.70, 95% CI −1.62 to 0.15, I²=63.7%). The effect of exercise on reducing pain was substantially greater in trials assessing combination exercises (2 trials, pooled difference −1.12, 95% CI −1.82 to −0.43; data not shown in figure).⁴²^,⁴⁴ There were no differences in pain comparing exercise versus controls in the intermediate term (2 trials, pooled difference −0.25, 95% CI −0.81 to 0.31, I²=0%)⁴¹^,⁴⁵ or the long term (3 trials, pooled difference 0.07, 95% CI −0.51 to 0.88, I²=0%).⁴¹^,⁴⁴^,⁴⁶

Data on effects of exercise on quality of life were limited. One fair-quality trial⁴⁴ found significant improvement in SF-36 PCS and MCS in the short term (difference in change score 3.60 on a 0-100 scale, 95% CI 1.23 to 5.97 and 4.00, 95% CI 1.24 to 6.77, respectively) and PCS in the long term (difference in change score 3.80, 95% CI 1.30 to 6.30). A poor-quality trial found no difference in SF-36 PCS or MCS in the short term.⁴² No trial evaluated effects of exercise therapies on use of opioid therapies or healthcare utilization.

There was insufficient evidence to determine effects of duration of exercise therapy or number of sessions on outcomes.

Exercise Compared With Pharmacological Therapy

Two trials (1 new) compared exercise with pharmacological therapy. Differences in the pharmacological therapies and study quality precluded pooling of the trials.

One poor-quality trial (N=40)¹⁰⁰ comparing 1.5 months of home combination exercises (posture, stretching, strengthening and endurance exercises) versus ibuprofen plus thiocolchicoside for 15 days found no between-group difference in function (Neck Disability Index [NDI]) at 3-month (difference −2.2 on 0-50 scale, 95% CI −5.8 to 1.5) or 6-month followup (difference of −1.8, 95% CI −5.7 to 2.1). The study reported similar results for pain intensity (difference −1.0 on a 0-10 scale, 95% CI −2.3 to 0.3 at 3-month and difference −0.8, 95% CI −2.3 to 0.7 at 6-month followup). The exercise group reported a better quality of life compared with the medication group at 3-month and 6-month followup using the Turkish version of the Nottingham Health Profile (difference −141, scale not stated though usual scale 0-100, 95% CI −214 to −68; difference −135, 95% CI −209 to −62, respectively).¹⁰⁰ The groups scored comparably on the Beck Depression Inventory at both followup periods (Table 19).

The new fair-quality trial (N=64)¹⁰¹ found Pilates exercise to be associated with a small improvement in function according to the NDI (difference −5.6 on 0-50 scale, 95% CI −8.4 to −2.8) and a substantial improvement in pain (difference −3.1 on 0-10 scale, 95% CI −4.2 to −2.1) compared with oral medication (acetaminophen) in the short term. SF-36 scores were reported for individual domains; physical functioning, bodily pain, general health, vitality, and mental health showed a small improvement with exercise compared with acetaminophen.

Exercise Compared With Other Nonpharmacological Therapies

Findings for exercise versus other nonpharmacological therapies are addressed in the sections for other nonpharmacological therapies.

Harms

Only two exercise trials reported harms. One reported only mild complaints that included muscle pain with exercise (5%), knee pain (3%), and lumbar spine pain (3%).⁴⁴ None required referral to a medical practitioner. In the other, investigators reported no serious harms related to the intervention.⁴² One occurrence of minor knee pain was reported in the exercise group.

Psychological Therapies for Chronic Neck Pain

Key Points

No difference was found in function (NDI, 0-80 scale) or pain (visual analog scale [VAS], 0-10 scale) in the short term (adjusted difference 0.1, 95% CI −2.9 to 3.2 and 0.2, 95% CI −0.4 to 0.8, respectively) or intermediate term (adjusted difference 0.2, 95% CI −2.8 to 3.1 and 0.2, 95% CI −0.3 to 0.8, respectively) in one fair-quality study comparing relaxation training with no intervention or exercise (SOE: low for all). We found no trials with outcomes assessed in the long term.
We found no evidence comparing relaxation training with pharmacological therapy.
The only trial of relaxation training did not report harms.

Detailed Synthesis

We found one trial comparing the effects of relaxation training versus no intervention (N=258) or exercise therapy (N=263) in female office workers with chronic neck pain⁴⁵ (Table 20 and Appendix D). This trial was included in the previous AHRQ report. Relaxation training and muscle performance exercise therapy were done in 30-minute sessions three times per week for 12 weeks, with 1 week of reinforcement training 6 months after randomization. Patients in the no-treatment group were instructed not to change their usual activities. Adherence to the relaxation schedule during the intervention period was 42 percent of the scheduled sessions. The nature of the intervention and control precluded blinding of participants and people administering the interventions; therefore, this trial was rated as fair quality.

Relaxation Training Compared With No Treatment

The one fair-quality trial found no between-group differences in the short term (3 months) or intermediate term (9 months) as measured by a neck disability scale (difference 0.1 on a 0-80 scale, 95% CI −2.9 to 3.2, and difference 0.2, 95% CI −2.8 to 3.1, respectively)⁴⁵ (Table 20). The neck disability scale, a nonvalidated instrument, asked whether the participant had pain or difficulty on eight functional activities, with each activity scored from 0 (no pain or hindrance) to 10 (unbearable pain or maximum hindrance), for a total of 80 points. Likewise, there were no differences in pain intensity between groups at the same time frames (difference 0.2 on a 10-point scale, 95% CI −0.4 to 0.8, and difference 0.2, 95% CI −0.3 to 0.8, respectively). There were no trials evaluating relaxation in the long term.
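The scoring of the neck disability scale described above (eight activities, each rated 0 to 10, summed to a 0 to 80 total) can be expressed as a short calculation. The Python sketch below is illustrative only; the function name and the input validation are assumptions and are not taken from the trial.

```python
def total_score(item_scores, n_items=8, item_min=0, item_max=10):
    """Sum item ratings into a total score after checking count and range."""
    if len(item_scores) != n_items:
        raise ValueError(f"expected {n_items} item scores, got {len(item_scores)}")
    for score in item_scores:
        if not item_min <= score <= item_max:
            raise ValueError(f"item score {score} outside {item_min} to {item_max}")
    return sum(item_scores)


# HYPOTHETICAL participant: ratings for the eight functional activities.
ratings = [2, 0, 5, 3, 1, 0, 4, 2]
print(total_score(ratings), "out of", 8 * 10)
```

Because the total spans 80 points, a between-group difference of 0.1 or 0.2 points, as reported in this trial, is very small relative to the range of the scale.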

Relaxation Training Compared With Pharmacological Therapy

We did not find any trials meeting our criteria that compared relaxation training with pharmacological therapy.

Relaxation Training Compared With Exercise Therapy

The one fair-quality trial found no differences between relaxation training and exercise therapy in the short term (3 months) or intermediate term (9 months) as measured by the neck disability scale described above (difference 0.2 on a 0-80 scale, 95% CI −2.8 to 3.2, and difference 0.2, 95% CI −2.7 to 3.2, respectively)⁴⁵ (Table 20). Similarly, there were no differences in pain intensity between groups at the same time frames (difference −0.2 on a 10-point scale, 95% CI −0.8 to 0.4, and difference −0.2, 95% CI −0.8 to 0.3, respectively). There were no trials comparing relaxation with exercise therapy in the long term.

Harms

The trial on relaxation therapy did not report harms.⁴⁵

Physical Modalities for Chronic Neck Pain

Key Points

Low-level laser therapy was associated with a moderate improvement in short-term function (2 trials, pooled difference −13.60, 95% CI −26.30 to −6.30, I²=0%, 0-100 scale) and pain (3 trials, pooled difference −1.89 on a 0-10 scale, 95% CI −3.34 to −0.06, I²=61%) compared with sham (SOE: moderate for function and pain).
Data from two small, poor-quality trials, one evaluating cervical traction versus attention control (infrared irradiation) and the other electromagnetic fields versus sham, were insufficient to determine effects on function or pain over the short term (SOE: insufficient).
No trials assessed outcomes in the intermediate term or long term, or compared a physical modality to pharmacological therapy or exercise.
Harms were poorly reported in trials of low-level laser. Adverse effects occurred with similar frequency in the laser and sham groups in the one trial reporting such effects. The most frequently reported adverse effects included mildly (78%) or moderately (60%) increased neck pain, increased pain elsewhere (78%), mild headache (60%), and tiredness (24%) (SOE: low).
The trials of cervical traction and electromagnetic fields did not report harms.

Detailed Synthesis

A total of five trials (N range, 53 to 90; total sample=363)¹⁴⁵^–¹⁴⁹ evaluating physical modalities for the treatment of chronic neck pain met inclusion criteria (Table 21 and Appendixes D and E). All of the trials were included in the prior AHRQ report. Interventions included traction, laser therapy, and electromagnetic field therapy.

One trial (N=79) conducted in Hong Kong compared intermittent cervical traction versus attention control (infrared irradiation).¹⁴⁶ Each treatment was administered for 20 minutes twice weekly for 6 weeks. This trial was considered poor quality due to lack of patient and caregiver blinding, high and unequal attrition (41% in traction group, 58% in control), and dissimilar baseline characteristics between groups.

Three trials (N range, 53 to 90; total sample=203)¹⁴⁵^,¹⁴⁷^,¹⁴⁸ compared low-level laser therapy with sham. The mean duration of pain varied from 4 years in two trials¹⁴⁵^,¹⁴⁸ to 15 years in a third.¹⁴⁷ Treatment consisted of laser application (wavelength range, 830 to 904 nm) over several myofascial tender points; across the trials, duration ranged from 30 seconds to 3 minutes per tender point and frequency varied from daily to twice weekly over periods of 2 or 7 weeks. One trial was rated good quality¹⁴⁷ and two fair quality.¹⁴⁵^,¹⁴⁸ Common methodological limitations in the two fair-quality trials included inadequate reporting of treatment allocation and no or unclear blinding of the care provider. In addition, baseline characteristics were not similar in one trial, in which the intervention group tended to have more pain and tenderness and longer duration of symptoms.¹⁴⁵

One trial (N=81) compared the effects of eighteen 30-minute sessions (3-5 times per week) of low frequency pulsed electromagnetic fields versus sham.¹⁴⁹ The treatment consisted of an electromagnetic coil against the back of the neck while the participants were lying on a pillow. The investigators covered the set of light emitting diodes that pulse to signal the coil being energized in order to blind the participants to the treatment or sham. This trial was rated as poor quality due to several factors: failure to describe the number randomized in each group; inadequate reporting of treatment compliance and information to calculate participant attrition and intent to treat analysis; care provider not blinded to treatment; and baseline characteristics dissimilar between groups.

Physical Modalities Compared With Attention Control or Sham

Traction. One poor-quality trial found no short-term differences in function comparing intermittent cervical traction versus attention control (infrared irradiation) using the Northwick Park Questionnaire (NPQ) (difference −1.8, 95% CI −10.8 to 7.2, 0-100% scale).¹⁴⁶ Likewise, there was no difference in pain intensity between groups (difference −0.7, 95% CI −2.2 to 0.8, 10-point scale). There were no trials evaluating cervical traction in the intermediate term or long term.

Low-Level Laser Therapy. Laser was associated with moderately greater effects compared with sham on short-term function (2 trials, pooled difference −13.60, 95% CI −26.30 to −6.30, I²=0%, 0-100 scale) (Figure 28)¹⁴⁷^,¹⁴⁸ and short-term pain (3 trials, pooled difference −1.89, 95% CI −3.34 to −0.06, I²=61%, 0-10 scale) (Figure 29).¹⁴⁵^,¹⁴⁷^,¹⁴⁸ Pain improvement of more than 3.0 points on a 10-point VAS was substantially more common with laser therapy in the good-quality trial (RR 6.0, 95% CI 1.9 to 19.0).¹⁴⁷ Quality of life improvement also favored low-level laser as measured by the SF-36 PCS (difference 4.5, 95% CI 0.7 to 8.2)¹⁴⁷ and the Nottingham Health Profile (difference −16.1 on a 0-100 scale, 95% CI −30.9 to −1.3).¹⁴⁸ Measures demonstrating no difference between groups included the SF-36 MCS and the McGill Pain Questionnaire component scores¹⁴⁷ (Table 21). There were no trials evaluating laser therapy in the intermediate term or long term.

Electromagnetic Fields. One poor-quality trial found no between-group differences in short-term difficulty with activities of daily living (ADLs) (difference 1.6, 95% CI −1.5 to 4.8, scale 0-24, nonvalidated measure).¹⁴⁹ The ADL instrument asked whether the participant had pain or difficulty on eight activities scored from 0 (never) to 3 (always), for a total of 24 points.

Likewise, there was no difference in pain intensity between groups (difference 1.1, 95% CI −0.3 to 2.6, 0-10 scale) or in patients’ assessment of improvement (difference 1.2, 95% CI −15.2 to 17.6, 0-100 scale).¹⁴⁹ There were no trials evaluating electromagnetic fields in the intermediate term or long term.

Physical Modalities Compared With Pharmacological Therapy or With Exercise Therapy

We did not find any trials meeting our criteria comparing a physical modality with pharmacological therapy or with exercise.

Harms

Only one laser trial reported harms.¹⁴⁷ The trial reported a large number of adverse effects with similar frequency in both groups. However, the sham group reported nausea significantly more frequently (42% vs. 20%) while the laser group reported stiffness more frequently (20% vs. 4%). The most frequently reported adverse effects included mild (78%) or moderate (60%) increased neck pain, increased pain elsewhere (78%), mild headache (60%), and tiredness (24%). Harms were not reported by either trial evaluating cervical traction or electromagnetic fields.

Manual Therapies for Chronic Neck Pain

Key Points

Massage

The effects of Swedish massage on function (≥5 point improvement on the NDI) versus self-management attention control were small and not statistically significant in one trial in the short term (39% versus 14%, RR 2.7, 95% CI 0.99 to 7.5) and intermediate term (57% versus 31%, RR 1.8, 95% CI 0.97 to 3.5) (SOE: low for both time periods).
Massage was associated with a small improvement in short-term function compared with attention or waitlist control (2 trials [1 new], pooled difference –3.66 on a 0-50 NDI scale, 95% CI –6.58 to –0.56, I²=10%) (SOE: low).
Massage was associated with a moderate improvement compared with waitlist control in short-term pain intensity experienced during the previous 7 days (1 new trial, difference –1.8 on a 0-10 scale, 95% CI –2.7 to –0.9) (SOE: low).
A third fair-quality trial found no clear evidence that massage improved pain in the intermediate term versus exercise (p>0.05, data not reported) (SOE: low).
Three fair-quality trials (1 new) reported no serious adverse effects; transient nonserious pain or soreness was reported during or following massage in two trials (1 new) and during or after exercise, but not massage, in a third trial (SOE: low).

Detailed Synthesis

Massage

Three trials of massage therapy met inclusion criteria (Table 22 and Appendix D).¹⁸¹^–¹⁸³ Two trials¹⁸¹^,¹⁸² were included in the prior AHRQ report and one¹⁸³ was added for this update. Sample sizes ranged from 64 to 108 (total sample=264). One trial compared Swedish massage versus attention control (self-care education),¹⁸² the new trial compared Tuina massage versus waitlist¹⁸³ and one trial compared classical massage versus two types of exercise (muscle re-education and strength training targeting the neck and shoulder muscles).¹⁸¹ Swedish and classical massage (nonforceful) were performed on the neck and back, and in some cases the pectoral muscles and rotator cuff or arms. Tuina massage included soft tissue massage, local muscle stretching, mobilization and traction of the cervical spine, and manipulation of local pain (trigger) points; no high-velocity/low-amplitude thrusts were applied. Muscle re-education exercise was performed with a newly developed training device strapped to the head and consisted of a plate with 5 exchangeable surfaces that allow for progression of task difficulty; strength training included both isometric and dynamic exercises targeting the neck and shoulders. One trial was conducted in the United States,¹⁸² one in Sweden¹⁸¹ and the new trial in Germany.¹⁸³ One trial administered 6 massage treatments over 3 weeks,¹⁸³ a second trial 10 massage treatments over 10 weeks,¹⁸² and the third trial 22 massage treatments over 11 weeks.¹⁸¹ The new trial evaluated outcomes in the short term only¹⁸³; of the trials included in the original report, one reported on the intermediate term only,¹⁸¹ and one reported on the short and intermediate term.¹⁸²

All trials were rated fair quality (Appendix E). Methodological limitations included the inability to blind interventions in all trials, and 21 percent attrition in the trial comparing massage with exercise.¹⁸¹

Massage Therapy Compared With an Attention Control or Waitlist

One trial of Swedish massage versus attention control found that a greater proportion of participants in the massage group achieved ≥5 point improvement on the NDI in the short term (39% versus 14%, RR 2.7, 95% CI 0.99 to 7.5) and intermediate term (57% versus 31%, RR 1.8, 95% CI 0.97 to 3.5).¹⁸² Massage was associated with a small improvement in short-term function compared with attention or waitlist controls (2 trials [1 new], pooled difference −3.66 on a 0 to 50 NDI scale, 95% CI −6.58 to −0.56, I²=10.2%) (Figure 30).¹⁸²^,¹⁸³ The massage technique in one trial was soft tissue massage and mobilization of upper extremity joints and the cervical spine (i.e., Tuina massage) (difference −4.8, 95% CI −7.0 to −2.6 on the 0 to 50 NDI scale)¹⁸³ and structural or relaxation massage (i.e., Swedish massage) in one trial (difference −2.3, 95% CI −4.7 to 0.1 on the 0-50 NDI scale).¹⁸²

One new, small fair quality study reported that Tuina massage was associated with moderate improvement in pain intensity experienced during the previous 7 days compared with waitlist controls (difference −1.8 on a 0-10 scale, 95% CI −2.7 to −0.9).¹⁸³

A greater proportion of participants in the Swedish massage group reported improvement in a symptom bothersomeness scale (≥30%) in the short term (55% versus 25%; RR 2.2, 95% CI 1.04 to 4.2) but not the intermediate term (43% vs. 39%; RR 1.1, 95% CI 0.6 to 2.0) compared with attention controls in one trial.¹⁸² One new trial found no differences between groups in SF-36 PCS and MCS while one reported a better quality of life as measured by the SF-12 PCS (difference 5.6 on a 0-100 scale, 95% CI 2.4 to 8.9), but not on the SF-12 MCS (difference 2.6 on a 0-100 scale, 95% CI −1.4 to 6.6).¹⁸³

Massage Therapy Compared With Pharmacological Therapy

No trial of manual therapy versus pharmacological therapy met inclusion criteria.

Massage Therapy Compared With Exercise

One fair-quality study reported no difference in intermediate-term pain comparing classical massage with neck coordination exercises (difference 0.2, 95% CI −0.82 to 1.22, 0-10 scale) or muscle performance exercises (no data given, p>0.05).¹⁸¹ The use of opioid therapies and healthcare utilization were not evaluated.

Harms

None of the trials reported serious adverse effects. Nonserious mild adverse effects included discomfort or pain during (n=5) or after Swedish massage (n=3) in one trial.¹⁸² In the new trial of Tuina massage, the proportion of patients reporting mild adverse events was 41.3% (19/46); most included increased pain (aching muscles, n=11; headache, n=3; and point tenderness, n=1).¹⁸³ Other mild adverse events included dizziness, sleepiness, mood swings, nausea, difficulty staying asleep, and difficulty moving the head and neck. In the third trial, transient neck or headache pain was reported in the neuromuscular training exercise group (n=10); there was no mention of complications for the strength training or massage groups.¹⁸¹

Mind-Body Practices for Chronic Neck Pain

Key Points

Alexander Technique resulted in a small improvement in function in the short term (difference −5.56 on a 0-100% scale, 95% CI −8.33 to −2.78) and intermediate term (difference −3.92, 95% CI −6.87 to −0.97) compared with usual care alone, based on one fair-quality trial (SOE: low).
There was no clear evidence that basic body awareness therapy improved function in the short term versus exercise in one fair-quality trial (SOE: low).
There is insufficient evidence from one poor-quality trial to determine the effects of qigong on intermediate-term or long-term function or pain versus exercise; no data were available for short term outcomes (SOE: insufficient).
Both fair-quality trials reported no serious treatment-related adverse events. The trial evaluating Alexander Technique versus usual care found no clear between-group difference for nonserious adverse events, such as pain and incapacity, knee injury, or muscle spasm (RR 2.25, 95% CI 1.00 to 5.04). The other trial reported no differences between basic body awareness and exercise in any nonserious adverse effect (RR 0.65, 95% CI 0.37 to 1.14) (SOE: low).

Detailed Synthesis

Three trials (reported in 4 publications) of mind-body practices met inclusion criteria (Table 23 and Appendix D).²¹³^,²¹⁴^,²²¹^,²²² All three trials were included in the prior AHRQ report; only a newly identified publication (subanalysis)²¹⁴ of a previously included trial²¹³ was added for this update. One trial evaluated the Alexander Technique (a method of self-care developed to help people enhance their control of reaction and improve their way of going about everyday activities) plus usual care (N=344),²¹³ one trial basic body awareness therapy (N=113),²²² and one trial qigong (N=139).²²¹ One trial compared mind-body techniques versus usual care²¹³ and two trials versus individually adjusted cervical and shoulder strengthening and stretching exercises,²²¹ or group-led exercises for whole body strengthening, aerobic, and coordination exercises.²²² Two trials were conducted in Sweden²²¹^,²²² and one in England.²¹³ The duration of mind-body treatment ranged from 10 to 20 weeks and the number of treatment sessions ranged from 12 to 20. One trial reported outcomes during the intermediate term and long term,²²¹ one short-term and intermediate-term outcomes,²¹³ and one short-term outcomes only.²²²

Two of the trials were rated fair quality²¹³^,²²² and one trial poor quality²²¹ (Appendix E). In the two fair-quality trials, the main methodological limitation was the inability to blind interventions. Limitations in the other trial included the inability to blind interventions, high attrition, and unequal loss to followup between groups.

Mind-Body Practices Compared With Usual Care

One fair-quality trial found a small improvement in function as measured by the NPQ in favor of the Alexander Technique plus usual care versus usual care alone in the short term (difference −5.56 on a 100% scale, 95% CI −8.33 to −2.78) and intermediate term (difference −3.92, 95% CI −6.87 to −0.97).²¹³ There were no significant differences between the intervention group and usual care for the physical component score of the SF-12 (version 2) at 1-month or 7-month followup. However, significantly larger improvements in the MCS occurred in the Alexander group versus the usual care group 7 months following treatment (difference, 2.12 on a 0-100 scale, 95% CI 0.42 to 3.82).²¹³

In a new secondary economic analysis of a subset (57%) of patients from a previously included trial there were no significant differences between Alexander Technique and usual care in terms of UK National Health Service (NHS) healthcare utilization (appointments or prescription items).²¹⁴ While more people paid for extra Alexander lessons in the private healthcare setting, this represented people who attended all trial sessions and paid for extra. There were no differences in terms of utilizing other private healthcare services.

Mind-Body Practices Compared With Pharmacological Therapy

No trial of mind-body practice versus pharmacological therapy met inclusion criteria.

Mind-Body Practices Compared With Exercise

There were no differences in function as measured by the NDI between basic body awareness therapy (1 fair-quality study, n=113)²²² in the short term (mean change from baseline −2 versus −1, p>0.05) or qigong (poor-quality study, n=139)²²¹ in the intermediate term or long term (median 22 versus 18, p>0.05, at each time period) versus exercise therapy. The trial assessing qigong found no difference in pain at 6 or 12 months following treatment (median 2.6 versus 2.3 and 2.8 versus 2.3, p>0.05, respectively).²²¹ Two of the eight sections of the SF-36v2 favored basic body awareness therapy versus exercise in the short term (bodily pain and social functioning) in the fair-quality trial.²²² No other section of the SF-36v2 demonstrated a difference between groups.

No trial evaluated effects of mind-body practices on use of opioid therapies.

Harms

Two trials, one of basic body awareness therapy²²² and the other of Alexander Technique,²¹³ reported no serious adverse effects. One patient in the basic body awareness group and four patients in the exercise group reported that they discontinued treatment due to increased neck symptoms or pain in other joints (p=0.363). The event risk for all nonserious adverse events was 0.27 in the body awareness therapy group and 0.40 in the exercise group (RR 0.65, 95% CI 0.37 to 1.14). In the trial comparing Alexander Technique versus usual care, no clear difference was seen in the risk of any nonserious adverse event (e.g., pain and incapacity, knee injury, muscle spasm, and complications after surgery): RR 2.25 (95% CI 1.00 to 5.04).

Acupuncture for Chronic Neck Pain

Key Points

Acupuncture was associated with small improvements in short-term and intermediate-term function versus sham acupuncture, a placebo (sham laser), or usual care (short term, 5 trials, pooled SMD −0.40, 95% CI −0.67 to −0.14, I²=61%; intermediate term, 3 trials, pooled SMD −0.19, 95% CI −0.37 to 0.05, I²=0%). One trial reported no difference in function in the long term (SMD −0.23, 95% CI −0.61 to 0.16) (SOE: low for all time periods).
There were no differences in pain in trials comparing acupuncture with sham acupuncture or placebo interventions in the short term (4 trials [excluding outlier trial], pooled difference −0.27 on a 0-10 scale, 95% CI −0.59 to 0.05, I²=2%), intermediate term (3 trials, pooled difference 0.40, 95% CI −0.45 to 1.44, I²=19%), or long term (1 trial, difference −0.35, 95% CI −1.34 to 0.64) (SOE: low for all time periods).
There was insufficient evidence from two small poor-quality trials to draw conclusions regarding short-term function or pain for acupuncture versus NSAIDs (SOE: insufficient).
No serious adverse events were reported in six trials reporting harms. The most commonly reported nonserious adverse events in people receiving acupuncture included numbness/discomfort, fainting, and bruising (SOE: moderate).

Detailed Synthesis

We identified nine trials (reported in 10 publications) of acupuncture that met our inclusion criteria (Table 24 and Appendix D).²¹³^,²¹⁴^,²³¹^–²³⁷^,²⁵⁴ All trials were included in the prior AHRQ report; only a newly identified publication (subanalysis)²¹⁴ of a previously included trial²¹³ was added for this update. All trials evaluated needle acupuncture to body acupoints; two also evaluated electroacupuncture.²³⁴^,²³⁷ Control groups included sham acupuncture in five trials,²³¹^–²³⁴^,²³⁶ placebo intervention (sham TENS²³⁵ and sham laser acupuncture²³⁷) in two trials, usual care in one trial,²¹³ and pharmacological therapy (Zaltoprofen²⁵⁴ and Trilisate²³¹) in two trials. The duration of acupuncture therapy ranged from 2 weeks to 5 months, and the number of sessions from 5 to 14. Sample sizes ranged from 30 to 345 (total sample=1,260). Across trials, participants were predominately female (from 60% to 90%) with mean ages ranging from 37 to 53 years. One trial was conducted in the United States,²³¹ one in Turkey,²³⁴ and the rest in Asia²³²^,²³³^,²³⁷^,²⁵⁴ or Europe.²¹³^,²³⁵^,²³⁶ One trial reported outcomes through long-term followup,²³⁶ four trials through intermediate-term followup,²¹³^,²³⁵^–²³⁷ and the remainder only evaluated short-term outcomes.²³¹^–²³⁴^,²⁵⁴

Seven trials were rated fair quality²¹³^,²³²^–²³⁷ and two trials poor quality²³¹^,²⁵⁴ (Appendix E). Common limitations in the fair-quality trials included unclear allocation concealment methods and unclear care provider blinding; additionally, the poor-quality trials had baseline group dissimilarity (not controlled for) and high attrition.

Acupuncture Compared With Sham Acupuncture, Usual Care, or a Placebo Intervention

Acupuncture was associated with small improvements in short-term and intermediate-term function versus sham acupuncture, placebo (sham laser), or usual care (short term, 5 trials,²¹³^,²³²^,²³³^,²³⁶^,²³⁷ pooled SMD −0.40, 95% CI −0.67 to −0.14, I²=61%; intermediate term, 3 trials,²¹³^,²³⁶^,²³⁷ pooled SMD −0.19, 95% CI −0.37 to 0.05, I²=0.0%) (Figure 31). Trials measured function using the NDI or the NPQ; across trials the SMD ranged from −0.78 to −0.03 in the short term and −0.29 to −0.05 in the intermediate term. None of the trials were rated poor quality. One trial reported no difference in function in the long term (SMD −0.23, 95% CI −0.61 to 0.16).²³⁶

Acupuncture was associated with small improvements in short-term pain versus controls (5 trials, pooled difference −0.66, 95% CI −1.46 to 0.11, I²=78.4%), but statistical heterogeneity was large (Figure 32).²³²^–²³⁴^,²³⁶^,²³⁷ Excluding an outlier trial (pooled difference −1.80, 95% CI −2.36 to −1.24)²³² eliminated statistical heterogeneity and resulted in a markedly attenuated effect (difference −0.27, 95% CI −0.59 to 0.05, I²=2%). Stratified analyses according to the type of control (sham or placebo laser) resulted in similar estimates. Trials reported no differences in pain between acupuncture versus controls in the intermediate term (3 trials, pooled difference 0.40, 95% CI −0.45 to 1.44, I²=18.7%)²³⁵^–²³⁷ or long term (1 trial, difference −0.35, 95% CI −1.34 to 0.64).²³⁶

In a secondary economic analysis of a subset (57%) of patients, 1 trial reported that there were no significant differences between acupuncture and usual care in terms of UK NHS healthcare utilization (appointments or prescription items).²¹⁴ While more people paid for extra acupuncture in the private healthcare setting, this represented people who attended all trial sessions and paid for extra. There were no differences in terms of utilizing other private healthcare services.

In general, acupuncture did not improve quality of life compared with sham intervention in the short term or intermediate term as reported in four trials²³³^,²³⁵^–²³⁷ (Table 24).

No trial evaluated effects of acupuncture on use of opioid therapies.

Acupuncture Compared With Pharmacological Therapy

Two small poor-quality trials evaluated acupuncture versus NSAIDs. One trial (n=27) compared acupuncture three times per week for 3 weeks versus 80 mg of Zaltoprofen alone three times per day for 3 weeks.²⁵⁴ The other trial (n=30) compared 14 sessions of acupuncture versus 500 mg of Trilisate per day for 8 weeks.²³¹ In the short term, one trial reported no difference in NDI (difference −0.4, 95% CI −4.6 to 3.8).²⁵⁴ Both trials reported no difference between groups in pain as measured by the McGill Pain Questionnaire²³¹ or VAS.²⁵⁴ One trial found no differences between groups in the Beck Depression Index, the SF-36, or the EQ-5D in the short term²⁵⁴ (Table 24).

Acupuncture Compared With Exercise Therapy

No trial of acupuncture versus exercise met inclusion criteria.

Harms

Six of the nine trials assessing acupuncture reported harms.²¹³^,²³³^,²³⁵^–²³⁷^,²⁵⁴ No serious adverse events (defined as involving death, hospitalization, persistent disability, or a life-threatening risk in one trial²¹³ and undefined in the other five studies) were reported in any trial. The most commonly reported nonserious adverse effects in people receiving acupuncture included numbness/discomfort (2.7%), fainting (1.1%), and bruising (1.1%).

Key Question 3. Osteoarthritis Pain

For OA, 53 RCTs (in 56 publications) were included in the prior AHRQ report (N=6,101). Four studies were rated good quality, 31 studies fair quality, and 18 studies poor quality. The prior AHRQ report found exercise and ultrasound (US) associated with greater effects than usual care, an attention control or a sham procedure on improved function (exercise, US) or pain (exercise) for the treatment of knee OA. The strength of evidence was low or moderate, generally stronger for function than for pain, and observed at short, intermediate, and long term (with the exception of pain) for exercise but only short term for ultrasound. For hip OA, exercise and manual therapy were associated with small improvements compared with usual care and exercise for function (short and intermediate term) and pain (intermediate term). The strength of evidence was low. For hand OA, there was either no difference between treatment groups for function or pain or the evidence was insufficient to draw conclusions.

For this update, we identified nine new RCTs (in 10 publications) of knee OA (N=1,235); no new trials evaluating hip or hand OA were identified. One of the new studies was rated good quality, seven were rated fair quality, and one was rated poor quality. The new trials evaluated exercise (5 trials), psychological therapies (2 trials), and physical modalities (ultrasound) (2 trials). The Key Points summarize the main findings based on the evidence included in the prior report and new trials; the Key Points note where new trials contributed to findings.

Exercise for Osteoarthritis Knee Pain

Key Points

Exercise was associated with a small improvement in function compared with usual care, no treatment, or sham intervention short term (8 trials [1 new trial], pooled SMD −0.29, 95% CI −0.46 to −0.11, I²=10%), a moderate improvement intermediate term (11 trials [2 new trials and excluding outlier trial], pooled SMD −0.63, 95% CI −1.17 to −0.10, I²=91%), and a small improvement long term (4 trials [2 new trials], pooled SMD −0.22, 95% CI −0.34 to −0.08, I²=0%) (SOE: moderate for short term; low for intermediate and long term).
One trial found no statistical difference between exercise and a sham procedure in the proportion of patients who reported clinically relevant reductions (≥1.75 points) in VAS pain on movement (prior week) [58% (34/59) vs. 42% (27/65); RR 1.4, 95% CI 1.0 to 2.0] or VAS global improvement in pain [59% (35/59) vs. 50% (33/65); RR 1.2, 95% CI 0.8 to 1.6] in the short term.
Exercise was associated with a small improvement in pain short term (8 trials [1 new trial], pooled difference on a 0-10 scale −0.47, 95% CI −0.86 to −0.10, I²=42%) versus usual care, no treatment, waitlist, or sham intervention (SOE: moderate), a moderate improvement intermediate term (11 trials [2 new trials], pooled difference −1.34, 95% CI −2.12 to −0.54, I²=90% on a 0-10 scale) compared with usual care, an attention control, waitlist, or no treatment (SOE: low), and a small improvement long term (4 trials [2 new trials], pooled difference −0.30 on a 0 to 10 scale, 95% CI −0.49 to 0.00, I²=0%) compared to usual care, attention control, or waitlist (SOE: low).
One new trial found that more patients who received exercise versus pharmacological therapy (analgesics and anti-inflammatory drugs) achieved a clinically important improvement in function in the intermediate term (>10 point improvement on the Knee Injury and Osteoarthritis Outcome Score [KOOS] ADL), 47% (22/47) versus 28% (13/46); RR 1.7, 95% CI 1.0 to 2.9, although the difference did not reach statistical significance. There were no differences between the groups across all other function and pain outcomes measured (SOE: low).
Harms were not well reported. Across seven trials, one reported minor temporary increase in pain with exercise, four others found no difference in worsening pain versus controls, and one reported no difference in falls or death (SOE: moderate).

Detailed Synthesis

Twenty-three trials (in 27 publications) of exercise therapy for knee osteoarthritis (OA) met inclusion criteria (Table 25 and Appendix D).⁴⁷^–⁷¹^,¹⁰²^,¹⁰³ Eighteen trials (in 21 publications)⁴⁷^–⁶⁷ were included in the prior AHRQ report and five (in six publications)⁶⁸^–⁷¹^,¹⁰²^,¹⁰³ were added for this update. Eight trials evaluated muscle performance exercise versus attention control,⁵¹^,⁵²^,⁵⁴^,⁵⁷^,⁵⁸^,⁶⁶ no treatment⁴⁹^,⁵³^,⁶⁵ or usual care (1 new trial).⁷¹ In nine trials (3 new trials), the interventions consisted of combined exercise approaches compared with usual care,⁴⁷^,⁵⁵^,⁵⁶^,⁶⁰^,⁶³^,⁶⁸^–⁷⁰ an attention control⁶⁴ or no treatment.⁵⁰ Muscle performance exercises were a component of nine of these trials (3 new trials).⁴⁷^,⁵⁰^,⁵⁵^,⁵⁶^,⁶⁰^,⁶³^,⁶⁴^,⁶⁸^–⁷⁰ One trial had an aerobic exercise arm that consisted of a facility-based, 1-hour walking program three times per week over 3 months, and it used an attention control.⁵¹^,⁵⁷^,⁵⁸ A single trial evaluated a mobility exercise program based on Mechanical Diagnosis and Therapy (MDT) versus a waitlist comparator, where patients were allowed to continue receiving usual care.⁶¹ One trial evaluated gait training (guided strategies to optimize knee movements during treadmill walking with computerized motion analysis with visual feedback) versus usual care.⁶² Five trials (2 new trials) tested exercise programs as a part of physiotherapy care compared to usual care or sham.⁴⁸^,⁵⁹^,⁶⁷^–⁶⁹ The duration of exercise programs ranged from 2 to 26 weeks; the number of exercise sessions ranged from 4 to 36. One new trial compared neuromuscular reeducation exercise with pharmacological intervention.¹⁰²^,¹⁰³

Sample sizes ranged from 50 to 786 (total sample=3,633). Across the trials, the majority of patients were female (51% to 100%) with mean ages ranging from 56 to 75 years. Seven trials (2 new trials) specifically included patients with bilateral knee OA.⁴⁹^,⁵²^–⁵⁴^,⁶⁶^,⁶⁸^,⁶⁹ Six trials (1 new trial) were conducted in the United States or Canada,⁵¹^,⁵⁶^–⁵⁸^,⁶⁰^–⁶³^,⁶⁸ eight (3 new trials) in Europe,⁵⁵^,⁵⁹^,⁶⁴^,⁶⁵^,⁶⁷^,⁶⁹^,⁷¹^,¹⁰²^,¹⁰³ five in Taiwan,⁴⁹^,⁵²^–⁵⁴^,⁶⁶ two in Australia or New Zealand,⁴⁷^,⁴⁸ one in Brazil⁵⁰ and one new trial in Malaysia.⁷⁰ Most trials had short (7 trials [1 new trial])⁴⁷^,⁵⁵^,⁶¹^,⁶²^,⁶⁵^,⁶⁷^,⁶⁹ or intermediate followup (13 trials [3 new trials]).⁴⁹^,⁵⁰^,⁵²^–⁵⁴^,⁵⁶^,⁶²^–⁶⁴^,⁶⁶^,⁶⁸^,⁷⁰^,¹⁰²^,¹⁰³ Four trials (1 new trial) reported long-term outcomes.⁵⁶^–⁵⁸^,⁶⁰^,⁶⁴^,⁷¹

Sixteen trials (4 new trials) were rated fair quality (one at short-term followup⁶²),⁴⁷^,⁴⁸^,⁵¹^,⁵²^,⁵⁴^–⁶¹^,⁶⁵^,⁶⁸^–⁷⁰^,¹⁰²^,¹⁰³ and nine trials (1 new trial) poor quality,⁴⁹^,⁵⁰^,⁵³^,⁶³^,⁶⁴^,⁶⁶^,⁶⁷^,⁷¹ including one at intermediate-term followup⁶² (Appendix E). In the fair-quality trials, the main methodological limitation was a lack of blinding for the patients or care providers. Additional limitations in the poor-quality trials included unclear randomization and allocation concealment methods, unclear use of intention to treat, unclear baseline differences between intervention groups, and attrition not reported or unacceptable.

Exercise Compared With Usual Care, No Treatment, Sham, or an Attention Control

Functional Outcomes. Exercise was associated with a small improvement short-term in function (assessed across various measures) compared with usual care, no treatment, or sham intervention (8 trials [1 new trial], pooled SMD −0.29, 95% CI −0.46 to −0.11, I²=9.9%),⁴⁸^,⁵⁵^,⁵⁹^,⁶¹^,⁶²^,⁶⁵^,⁶⁷^,⁶⁹ (Figure 33). Estimates were similar following exclusion of poor-quality trials and when analyses were stratified by exercise and control type. In the short term, across three fair-quality trials,⁵⁵^,⁶¹^,⁶⁵ a small improvement in the KOOS Sport and Recreation scale was seen with exercise compared with usual care or no treatment (pooled difference 5.88 on a 0-100 scale, 95% CI 0.28 to 11.27, I²=0%, plot not shown) but there was no clear difference between groups in the KOOS ADL (pooled difference 5.06 on a 0-100 scale, 95% CI −1.99 to 10.65, I²=44.6%, plot not shown).

Exercise was also associated with moderate improvement in function (assessed across various measures) versus usual care, no treatment, or attention control at intermediate term (12 trials [2 new trials], pooled SMD −0.98, 95% CI −1.86 to −0.13, I²=96.5%),⁴⁹^,⁵⁰^,⁵²^–⁵⁴^,⁵⁶^,⁵⁹^,⁶²^,⁶³^,⁶⁶^,⁶⁸^,⁷⁰ (Figure 33). Substantial heterogeneity was present with one outlier trial⁵⁰ of combination exercise versus no treatment in elderly patients (median age 75 years) which had higher (worse) baseline Lequesne Index scores compared with other studies and a larger change from baseline score in the intervention group. Removal of this poor quality trial did not improve heterogeneity but did attenuate the pooled estimate (11 trials [2 new trials], pooled SMD −0.63, 95% CI −1.17 to −0.10, I²=90.8%). Stratification by exercise type and control type may partially explain the heterogeneity. Muscle performance exercise, but not combination exercise (5 trials), was associated with a moderate improvement in function compared with attention control or no treatment (5 trials, pooled SMD −1.44, 95% CI −2.08 to −0.79)⁴⁹^,⁵²^–⁵⁴^,⁶⁶ and when compared with attention control only (3 trials, pooled SMD −1.12, 95% CI −1.83 to −0.47)⁵²^,⁵⁴^,⁶⁶ and no treatment only (2 poor quality trials, pooled SMD −1.88, 95% CI −3.16 to −0.55).⁴⁹^,⁵³ No difference was seen across studies of exercise versus usual care (5 trials [1 new trial], pooled SMD 0.05, 95% CI −0.16 to 0.26).⁵⁶^,⁵⁹^,⁶²^,⁶³^,⁷⁰

Analyses confined to trials that evaluated function on the 0-24 point Lequesne Index also suggest a moderate improvement in intermediate-term function with exercise compared with attention control or no treatment (6 trials, pooled difference −3.42, 95% CI −5.77 to −1.07, I²=97%, plot not shown).⁴⁹^,⁵⁰^,⁵²^–⁵⁴^,⁶⁶ Again, removal of the poor-quality outlier trial⁵⁰ did not impact the heterogeneity, but yielded a slightly lower effect estimate (5 trials, pooled difference −2.40, 95% CI −3.32 to −1.44), still consistent with a moderate effect for exercise. Results were similar when analyses were stratified according to muscle performance exercise, use of attention control, and study quality (when only the two fair-quality trials were retained).

One fair-quality trial (n=101 with knee OA)⁴⁷ compared combined exercise programs to usual care for intermediate-term function using the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC). The exercise group had improvement in function from baseline, which was not statistically significant (mean change from baseline −12.7, 95% CI −27.1 to 1.7), while the usual care group had no change in function (mean change from baseline 1.6, 95% CI −10.5 to 13.7). Data were insufficient to determine effect size or include in the meta-analysis.

One new fair-quality trial showed no significant difference between combined exercise and usual care at intermediate term for the KOOS Sport and Recreation (difference −18.2 on a 0-100 scale, 95% CI −41.5 to 5.1) or KOOS ADL (difference −5.4 on a 0-100 scale, 95% CI −18.3 to 7.4).⁷⁰

One trial separately analyzed participants free of disability for ADLs at baseline (n=250) and followed them to compare cumulative incidence of disability over 15 months. The aerobic exercise group had decreased risk of disability compared to the attention control group, RR 0.53 (95% CI 0.33, 0.85), as did the muscle performance exercise group compared to the attention control group, RR 0.60 (95% CI 0.38, 0.97).⁵⁷
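As a rough illustration of how the reported intervals relate to statistical significance, the sketch below back-calculates an approximate standard error and two-sided p-value from a risk ratio and its 95% CI. It assumes the interval is symmetric on the log scale (a normal approximation); the trial's own analysis method is not stated here, so the function name `p_from_ratio_ci` and the resulting values are illustrative only.

```python
import math

def p_from_ratio_ci(ratio, lower, upper, z_crit=1.96):
    """Approximate SE, z, and two-sided p from a ratio and its 95% CI.

    Assumes the CI is symmetric around log(ratio), which is an
    approximation and not necessarily the method used in the trial.
    """
    se_log = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(ratio) / se_log
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return se_log, z, p

# Risk ratios for incident ADL disability as reported in the text.
_, z_aerobic, p_aerobic = p_from_ratio_ci(0.53, 0.33, 0.85)
_, z_muscle, p_muscle = p_from_ratio_ci(0.60, 0.38, 0.97)

print(f"aerobic exercise vs. attention control: z={z_aerobic:.2f}, p={p_aerobic:.3f}")
print(f"muscle performance vs. attention control: z={z_muscle:.2f}, p={p_muscle:.3f}")
```

Both approximate p-values fall below 0.05, consistent with intervals that exclude the null value of 1.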

A small improvement in function long-term was seen across four trials (2 new trials) of exercise compared with usual care, attention control, or waitlist (pooled SMD −0.22, 95% CI −0.34 to −0.08, I²=0%), two fair⁵⁶^,⁶⁸ and two poor quality⁶⁴^,⁷¹ (Figure 33). Following exclusion of the two poor quality trials the difference was slightly attenuated and no longer statistically significant (pooled SMD −0.18, 95% CI −0.38 to 0.03, I²=0%). No difference between groups was seen when exercise was compared with a waitlist control only (2 trials, pooled difference −0.17, 95% CI −0.45 to 0.15). A single, new poor-quality trial found no long-term difference in KOOS Sport and Recreation (difference 2.3 on a 0-100 scale, 95% CI −7.9 to 12.5) or KOOS ADL (difference 0.90 on a 0-100 scale, 95% CI −4.1 to 5.9) for muscle performance exercise compared with waitlist.⁷¹
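The significance rule used throughout this report (a pooled estimate is statistically significant when its confidence interval excludes the null value, 0 for differences and 1 for risk ratios) can be sketched as a simple check. The helper name `ci_excludes_null` is hypothetical, and the values are the long-term function estimates quoted above.

```python
def ci_excludes_null(lower, upper, null_value=0.0):
    """True when the confidence interval lies entirely on one side of the null."""
    return lower > null_value or upper < null_value

# Long-term function estimates as reported in the text (SMD scale, null value 0).
long_term_function = {
    "all four trials": (-0.22, -0.34, -0.08),
    "fair-quality trials only": (-0.18, -0.38, 0.03),
    "waitlist control only": (-0.17, -0.45, 0.15),
}

results = {
    label: ci_excludes_null(lower, upper)
    for label, (_estimate, lower, upper) in long_term_function.items()
}

for label, significant in results.items():
    status = "statistically significant" if significant else "not statistically significant"
    print(f"{label}: {status}")
```

For risk ratios the same check applies with `null_value=1.0`.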

Pain Outcomes. One fair-quality trial found no statistically significant difference between exercise and a sham procedure in the proportion of patients who reported clinically relevant reductions (≥1.75 points) in VAS pain on movement (prior week) [58% (34/59) vs. 42% (27/65); RR 1.4, 95% CI 1.0 to 2.0] or VAS global improvement in pain [59% (35/59) vs. 50% (33/65); RR 1.2, 95% CI 0.8 to 1.6] in the short term.⁴⁸ Exercise was associated with a small improvement in short-term pain compared with usual care, no treatment, waitlist or sham in eight (1 new) trials (pooled difference on a 0-10 scale −0.47, 95% CI −0.86 to −0.10, I²=42%) (Figure 34). Seven trials (1 new trial) were fair quality⁴⁸^,⁵⁵^,⁵⁹^,⁶¹^,⁶²^,⁶⁵^,⁶⁹ and one was poor quality.⁶⁷ The estimate was similar following exclusion of the poor-quality trial (pooled difference −0.45, 95% CI −0.86 to −0.04). Across studies comparing exercise with usual care, results were also similar (5 trials, pooled difference −0.53, 95% CI −1.07 to −0.02).⁵⁵^,⁵⁹^,⁶¹^,⁶²^,⁶⁷
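
The risk ratios quoted above for the exercise versus sham trial can be approximately reproduced from the reported counts. The following is a minimal Python sketch, assuming a log-scale Wald interval; the trial authors may have used a different interval method, and the function name `risk_ratio_ci` is hypothetical.

```python
import math


def risk_ratio_ci(events_a, n_a, events_b, n_b, z=1.96):
    """Risk ratio of group A vs. group B with a log-scale Wald interval.

    Assumption: normal approximation on the log scale (not stated in the report).
    """
    rr = (events_a / n_a) / (events_b / n_b)
    se_log_rr = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    low = math.exp(math.log(rr) - z * se_log_rr)
    high = math.exp(math.log(rr) + z * se_log_rr)
    return rr, low, high


# Counts quoted above: (exercise events, exercise n, sham events, sham n).
comparisons = {
    "VAS pain on movement": (34, 59, 27, 65),
    "VAS global improvement": (35, 59, 33, 65),
}

for label, counts in comparisons.items():
    rr, low, high = risk_ratio_ci(*counts)
    print(f"{label}: RR {rr:.1f}, 95% CI {low:.1f} to {high:.1f}")
```

Under this assumption the lower bound for pain on movement falls just below 1 before rounding, which is consistent with the finding being reported as not statistically significant despite the rounded interval of 1.0 to 2.0.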

Exercise was associated with moderately greater improvement in intermediate-term pain compared with usual care, attention control, waitlist or no treatment across pain measures (11 trials [2 new trials], pooled difference −1.34, 95% CI −2.12 to −0.54, I²=90% on a 0-10 scale) across six fair-quality trials (2 new trials)⁵²^,⁵⁴^,⁵⁶^,⁵⁹^,⁶⁸^,⁷⁰ and five poor-quality trials⁴⁹^,⁵³^,⁶²^,⁶³^,⁶⁶ (Figure 34). Following exclusion of the poor quality trials the difference between groups was attenuated and no longer statistically significant (pooled SMD −0.98, 95% CI −2.09 to 0.12). Results differed somewhat by type of exercise and type of control. Five trials (2 new trials) showed no difference between combination exercise and usual care or waitlist⁵⁶^,⁵⁹^,⁶³; however, a substantial improvement in pain was seen for muscle performance exercise compared with attention control or no treatment (5 trials, pooled difference on 0-10 scale −2.53, 95% CI −3.23 to −1.80)⁴⁹^,⁵²^–⁵⁴^,⁶⁶ and when compared with attention control only (3 trials, pooled difference −2.18, 95% CI −3.15 to −1.24)⁵²^,⁵⁴^,⁶⁶ and with no treatment only (2 poor quality trials, pooled difference −3.01, 95% CI −4.00 to −1.90).⁴⁹^,⁵³ No difference was seen across studies of exercise versus usual care (5 trials [1 new trial], pooled SMD −0.29, 95% CI −0.80 to 0.13).⁵⁶^,⁵⁹^,⁶²^,⁶³^,⁷⁰
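
The significance rule stated in the Introduction (a pooled estimate is statistically significant when its confidence interval excludes the null value) can be illustrated with the intermediate-term pain estimates above. This is a minimal Python sketch; the helper name `excludes_null` is hypothetical and is not part of the review's methods.

```python
def excludes_null(ci_low, ci_high, null_value=0.0):
    """Return True when the confidence interval does not contain the null value.

    Use null_value=0.0 for mean differences and null_value=1.0 for risk ratios.
    """
    return not (ci_low <= null_value <= ci_high)


# Intermediate-term pain estimates quoted in the paragraph above (lower, upper).
estimates = {
    "all trials": (-2.12, -0.54),
    "fair-quality trials only": (-2.09, 0.12),
    "muscle performance exercise": (-3.23, -1.80),
    "exercise vs. usual care": (-0.80, 0.13),
}

for label, (low, high) in estimates.items():
    verdict = "significant" if excludes_null(low, high) else "not significant"
    print(f"{label}: 95% CI {low} to {high} -> {verdict}")
```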

Exercise resulted in a small improvement in long-term pain versus usual care, waitlist or attention control (pooled difference −0.30 on a 0 to 10 scale, 95% CI −0.49 to 0.00, I²=0%), in three fair-quality trials (2 new trials)⁵⁶^,⁶⁸^,⁷¹ and one large, poor-quality trial⁶⁴ (Figure 34).

Most trials evaluated pain using a traditional 0 to 10 VAS. A small improvement in short-term pain favoring exercise was observed across four trials (3 fair [one new trial], 1 poor quality, pooled difference −0.83, 95% CI −1.49 to −0.19, I²=33%)⁴⁸^,⁵⁹^,⁶⁷^,⁶⁹; the effect estimate was similar after exclusion of the poor quality trial (pooled difference −0.84, 95% CI −1.73 to 0.02).⁶⁷ Estimates confined to combination exercise showed a slightly greater effect size and remained significant (3 trials, pooled difference −1.14, 95% CI −1.73 to −0.41).⁵⁹^,⁶⁷^,⁶⁹ Findings for intermediate-term pain showed a moderate improvement with exercise (7 trials, pooled difference −2.04, 95% CI −2.86 to −1.13, I²=81%).⁴⁹^,⁵²^–⁵⁴^,⁵⁹^,⁶³^,⁶⁶ The pooled estimate was similar when four poor-quality trials⁴⁹^,⁵³^,⁶³^,⁶⁶ were excluded, leaving three fair-quality trials (pooled difference −1.97, 95% CI −3.45 to −0.44).⁵²^,⁵⁴^,⁵⁹ When results were stratified by exercise type, muscle performance exercise resulted in a large effect size (5 trials, pooled difference −2.53, 95% CI −3.23 to −1.80)⁴⁹^,⁵²^–⁵⁴^,⁶⁶ while results for combination exercise showed no difference versus usual care (2 trials, pooled difference −0.54, 95% CI −1.55 to 0.51).⁵⁹^,⁶³ Stratification by control type among studies reporting VAS pain yielded similar findings to those across multiple measures. No trial employing VAS reported on long-term pain.

Other Outcomes. Health-related quality of life (QoL) outcomes had mixed results (Table 24). Two fair-quality trials found no association between exercise and short-term QoL on the KOOS 0 to 100 scale (pooled difference 1.8, 95% CI −2.5 to 6.0, I²=0%, plot not shown).⁵⁵^,⁶¹ A fair-quality trial (n=65) reported no differences in mean change for short term SF-36 PCS (mean change of 3.0 [95% CI −5.9 to 16.3] versus −0.7 [95% CI −14.8 to 9.8]) and SF-36 MCS (mean change of 0.7 [95% CI −18.1 to 13.2] vs. −0.7 [95% CI −16.8 to 12.8]).⁶⁵ One fair-quality trial (n=158) reported similar health-related QoL scores between a combined exercise group and usual care using averaged intermediate- and long-term scores. The adjusted mean (standard error [SE]) SF-36 PCS were 37.6 (0.9) vs. 35.3 (0.8), respectively, and adjusted mean (SE) SF-36 MCS were 54.1 (0.8) vs. 53.7 (0.8), respectively.⁶⁰ A poor-quality trial (n=50) reported intermediate-term SF-36 scores for individual domains. Functional capacity, physical role, bodily pain, general health, and vitality showed small improvement with exercise versus attention control.⁵⁰

A fair-quality trial (n=438) reported no difference in depressive symptoms compared with attention control (2.59 vs. 2.80, p=0.27) for muscle performance exercise, while aerobic exercise was associated with fewer depressive symptoms on the Center for Epidemiologic Studies Depression (CES-D) questionnaire compared to attention control (2.12 vs. 2.80, p<0.001).⁵⁸

There was insufficient evidence to determine effects of duration of exercise therapy or number of sessions on outcomes. No trials reported on changes in opioid use as a result of exercise programs.

Exercise Compared With Pharmacological Therapy or With Other Nonpharmacological Therapies

One new trial (in 2 publications) of exercise therapy versus pharmacological therapy met inclusion criteria. This fair-quality trial (N=93)¹⁰²^,¹⁰³ compared combined exercise with standard recommendations for analgesics and anti-inflammatory drugs and had intermediate-term followup only. More patients who received exercise versus pharmacological therapy achieved a clinically important improvement in function (>10 point improvement on KOOS ADL), 47% (22/47) versus 28% (13/46); RR 1.7, 95% CI 1.0 to 2.9; however, the difference did not reach statistical significance. There was no difference between groups for change in function from baseline: KOOS ADL (difference −3.6 on a 0-100 scale, 95% CI −9.2 to 2.1) and KOOS Sport and Recreation (difference −2.9 on a 0-100 scale, 95% CI −11.4 to 5.5). There was also no difference for change in pain from baseline according to the KOOS pain measure (difference −4.2 on a 0-100 scale, 95% CI −10.0 to 1.6), but there was a small difference for change in symptoms favoring exercise, KOOS Symptoms (difference −7.6 on a 0-100 scale, 95% CI −12.7 to −2.6). No difference in change in QoL from baseline was found with the KOOS QoL (difference −1.3 on a 0-100 scale, 95% CI −7.5 to 4.9) and the EQ-5D (difference 2.6, 95% CI −2.9 to 8.1).
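
When only a point estimate and its 95% CI are reported, an approximate standard error and two-sided p-value can be back-calculated. The sketch below applies this to the KOOS Symptoms and KOOS ADL differences quoted above. It assumes a symmetric normal-approximation interval, which the report does not state, and the function name `z_and_p_from_ci` is hypothetical.

```python
import math


def z_and_p_from_ci(estimate, ci_low, ci_high, z_crit=1.96):
    """Back-calculate SE, z statistic, and two-sided p-value from a 95% CI.

    Assumption: the interval is estimate +/- z_crit * SE (normal approximation).
    """
    se = (ci_high - ci_low) / (2 * z_crit)
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value under normality
    return se, z, p


# Between-group differences quoted above (0-100 KOOS subscales).
differences = {
    "KOOS Symptoms": (-7.6, -12.7, -2.6),
    "KOOS ADL": (-3.6, -9.2, 2.1),
}

for label, (est, low, high) in differences.items():
    se, z, p = z_and_p_from_ci(est, low, high)
    print(f"{label}: SE {se:.2f}, z {z:.2f}, p {p:.3f}")
```

The result agrees with the narrative: the interval for KOOS Symptoms excludes zero and yields a p-value below 0.05, while the interval for KOOS ADL does not.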

Findings for exercise versus other nonpharmacological therapies are addressed in the sections for other nonpharmacological therapies.

Harms

Most trials did not report harms. One trial reported greater temporary, minor increases in pain in the exercise group versus a sham group (RR 14.7, 95% CI 2.0 to 107.7); however, the confidence interval is wide.⁴⁸ Four studies found no difference in worsening of pain symptoms with exercise versus comparators.⁴⁹^,⁵³^,⁶⁵^,⁶⁶ One trial found no difference in falls or deaths.⁵¹ No difference in adverse events (to include abdominal and intestinal symptoms, musculoskeletal symptoms, central nervous system, psychiatric symptoms, skin and subcutaneous symptoms and other) was reported for exercise compared to standard analgesics and anti-inflammatory therapy.¹⁰²^,¹⁰³

Psychological Therapy for Osteoarthritis Knee Pain

Key Points

Two new trials of motivational interviewing and CBT versus usual care and no treatment found no differences between treatment groups in function (pooled difference −2.09 on a 0-68 WOMAC function scale, 95% CI −8.70 to 1.61, I²=63.3%) but a small improvement in pain (pooled difference −0.6 on a 0-20 WOMAC pain scale, 95% CI −1.5 to −0.1, I²=0.0%) favoring the psychological treatments compared to controls in the short term (SOE: low for both function and pain).
Two trials of pain coping skills training and CBT versus usual care found no differences in function (WOMAC physical function, 0-100) or pain (WOMAC pain, 0-100); treatment effects were averaged over short term to intermediate term (difference −0.3, 95% CI −8.3 to 7.8 for function and −3.9, 95% CI −11.8 to 4.0 for pain) and intermediate term to long term (mean 35.2, 95% CI 31.8 to 38.6 vs. mean 37.5, 95% CI 33.9 to 41.2, and mean 34.5, 95% CI 30.8 to 38.2 vs. mean 38.0, 95% CI 34.1 to 41.8), respectively (SOE: low).
One trial of pain coping skills training versus strengthening exercises found no differences in WOMAC physical function scores (0-68 scale) at short term (difference 2.0, 95% CI −2.4 to 6.4) or intermediate term (difference 3.2, 95% CI −0.6 to 7.0) or in WOMAC pain scores (0-20 scale) at short term (difference −0.1, 95% CI −1.2 to 1.0) or intermediate term (difference 0.4, 95% CI −0.8 to 1.6) (SOE: low).
No serious harms were reported in either trial (SOE: low).

Detailed Synthesis

Five trials of psychological therapies for knee OA met inclusion criteria (Table 26 and Appendix D).¹⁰⁹^–¹¹²^,¹³⁴ Three trials were included in the prior AHRQ report¹⁰⁹^,¹¹⁰^,¹³⁴ and two were added for this update.¹¹¹^,¹¹² Two trials (1 new trial) were conducted in the United States,¹¹⁰^,¹¹¹ one in Finland,¹⁰⁹ and two (1 new trial) in Australia.¹¹²^,¹³⁴ Sample sizes ranged from 67 to 155 (total sample=593). Across the trials, participants were predominately female (60% to 80%) with mean ages ranging from 58 to 64 years. Three trials (1 new trial)¹⁰⁹^,¹¹⁰^,¹¹² compared CBT or pain coping skills training with usual care. The number and duration of psychological sessions varied between the trials (6, 2-hour sessions, 6 online sessions or 18, 1-hour sessions, respectively), as did the total duration of therapy (6 and 24 weeks). Usual care was defined as routine care provided by the patient’s primary care doctor and was not well-described in any trial. Another new trial (n=155) compared motivational interviewing focused on goal setting and physical activity with no treatment.¹¹¹ Motivational interviewing consisted of a longer initial session followed by 5 brief sessions (10-15 minutes) over 24 months. The fifth trial (n=149)¹³⁴ compared pain coping skills training (PCST) (ten 45-minute sessions) with strengthening exercises (ten 25-minute sessions); all sessions were conducted on an individual basis over a treatment period of 12 weeks. Participants randomized to receive PCST were told to practice skills daily and then as needed during followup; those in the exercise group were instructed to perform exercises four times a week during the 12-week intervention and three times a week during the followup period.

Four trials (2 new trials) were rated fair quality¹⁰⁹^,¹¹¹^,¹¹²^,¹³⁴ and one was rated poor quality¹¹⁰ (see Appendix E for quality ratings). The primary methodological limitation in the fair-quality trials was the inability to effectively blind care providers, outcome assessors, and/or patients. Additional methodological shortcomings in the poor-quality trial included poor treatment compliance and high attrition (32%).

Psychological Therapies Compared With Usual Care

Four trials (2 new trials)¹⁰⁹^–¹¹² compared psychological therapies with usual care or no treatment. Only the short-term results of the two new, fair-quality trials (O’Moore, 2018 and Gilbert, 2018) were amenable to pooling.¹¹¹^,¹¹² There was no statistically significant difference between groups at short term for function according to the WOMAC (pooled difference −2.09 on a 0-68 scale, 95% CI −8.70 to 1.61, I²=63.3%) (Figure 35), but there was a small improvement in pain favoring the psychological treatments compared to usual care or no treatment (pooled difference −0.60 on the 0-20 WOMAC pain scale, 95% CI −1.48 to −0.08, I²=0.0%) (Figure 36).¹¹¹^,¹¹² One of these trials¹¹¹ also reported intermediate- and long-term results, with no statistically significant differences between treatment groups in either the WOMAC pain or function subscales at any timepoint, with the exception of a small difference in function favoring usual care at 12 months (difference 3.2, 95% CI 0.1 to 6.2). Regarding quality of life, there was no statistically significant difference between groups at short term for either the SF-12 PCS (2 trials, pooled difference 1.3 on a 0-100 scale, 95% CI −1.1 to 3.6, I²=0.0%)¹¹¹^,¹¹² or the SF-12 MCS (2 trials, pooled difference 3.7 on a 0-100 scale, 95% CI −7.7 to 16.3, I²=90.8%).¹¹¹^,¹¹²

Two other trials reported outcomes averaged over all post-treatment followup times and therefore were not able to be pooled. The trial of CBT averaged results from 1.5 to 10.5 months post-treatment (spanning short to intermediate term)¹⁰⁹ and the trial of pain coping skills training averaged results from 6 to 12 months post-treatment (spanning intermediate to long term).¹¹⁰ Similar to the pooled results, no significant differences in function or pain were found between the psychological therapy and the usual care groups in either trial. Function was measured using the WOMAC physical function subscale (0-100) in both trials, over the short to intermediate term (difference −0.3, 95% CI −8.3 to 7.8)¹⁰⁹ and intermediate to long term (mean 35.2, 95% CI 31.8 to 38.6 vs. mean 37.5, 95% CI 33.9 to 41.2),¹¹⁰ and using the Arthritis Impact Measurement Scale (AIMS) physical disability subscale in one trial¹¹⁰ (Table 25). Both trials measured pain using the WOMAC pain subscale (0-100), one trial over short- to intermediate-term followup (difference −3.9, 95% CI −11.8 to 4.0)¹⁰⁹ and the other over intermediate- to long-term followup (mean 34.5, 95% CI 30.8 to 38.2 vs. mean 38.0, 95% CI 34.1 to 41.8).¹¹⁰ Results were similar for the AIMS pain subscale and the numeric rating scale (NRS) pain scale, reported by one trial each (Table 25). Neither trial reported any differences between groups in any secondary outcome measure.

No trial evaluated effects of psychological therapies on use of opioid therapies or healthcare utilization.

Psychological Therapies Compared With Pharmacological Therapy

No trial of psychological therapy versus pharmacological therapy met inclusion criteria.

Psychological Therapies Compared With Exercise Therapy

One fair-quality trial¹³⁴ of pain coping skills training versus strengthening exercise found no between-group differences in function or pain in the short term (WOMAC physical function, difference 2.0, 95% CI −2.4 to 6.4 on a 0-68 scale and WOMAC pain, difference −0.1, 95% CI −1.2 to 1.0 on a 0-20 scale) or the intermediate term (WOMAC physical function, difference 3.2, 95% CI −0.6 to 7.0 and WOMAC pain, difference 0.4, 95% CI −0.8 to 1.6) (Table 25). Results were similar for overall pain and pain with walking, both measured on a 0-100 VAS. There were also no differences between groups on any other secondary outcome measure including opioid use at short-term or intermediate-term followup.

Harms

In the four trials of psychological interventions versus usual care,¹⁰⁹^–¹¹² no adverse events were observed. In the fifth trial,¹³⁴ fewer participants in the pain coping skills training group compared with the exercise group experienced pain in the knee (3% vs. 31%, p<0.001) and in other body regions (4% vs. 15%, p=0.02) during treatment; during followup, only the frequency of pain in other body areas differed between groups (0% vs. 11%, respectively, p<0.05; knee pain, 7% vs. 10%, p=0.53). Pain was mostly mild and transient.

Physical Modalities for Osteoarthritis Knee Pain

Key Points

Ultrasound

Three trials (2 new trials), one good-, one fair- and one poor-quality, found no statistically significant differences between either continuous or pulsed ultrasound and sham in short-term function (pooled difference −2.50 on a 0-24 scale, 95% CI −6.37 to 1.22, I²=94.0%) and short-term pain intensity (pooled difference −1.2 on a 0-10 scale, 95% CI −3.7 to 1.3, I²=91.1%) (SOE: low).
One fair-quality trial found no differences between continuous and pulsed ultrasound versus sham in intermediate-term function (difference −2.9, 95% CI −9.19 to 3.39 and 1.6, 95% CI −3.01 to 6.22, on a 0-68 WOMAC function scale) or pain (difference −1.6, 95% CI −3.26 to 0.06 and 0.2, 95% CI −1.34 to 1.74, on a 0-20 WOMAC pain scale). There was also no difference between groups for VAS pain during rest or on movement (SOE: low).
No adverse events were reported during the two trials (SOE: low).

Transcutaneous Electrical Nerve Stimulation

One trial found no differences between TENS and placebo TENS in intermediate-term function (proportion of patients who achieved a minimal clinically important difference (MCID) on the WOMAC function subscale [≥9.1], 38% vs. 39%, RR 1.2, 95% CI 0.6 to 2.2; and difference −1.9, 95% CI −9.7 to 5.9 on the 0-100 WOMAC function subscale) or intermediate-term pain (proportion of patients who achieved MCID [≥20] in VAS pain, 56% vs. 44%, RR 1.3, 95% CI 0.8 to 2.0; and difference −5.6, 95% CI −14.9 to 3.6 on the 0-100 WOMAC pain subscale) (SOE: low for function and pain).
One trial of TENS reported no difference in the risk of minor adverse events (RR 1.06, 95% CI 0.38 to 2.97) (SOE: low).

Low-Level Laser Therapy

Evidence was insufficient from one small fair-quality and two poor-quality trials to determine effects or harms of low-level laser therapy in the short or intermediate term; no data were available for the long term (SOE: insufficient).

Microwave Diathermy

There was insufficient evidence to determine short-term effects or harms from one small, fair-quality trial (SOE: insufficient).

Pulsed Short-Wave Diathermy

There was insufficient evidence to determine effects or harms from one poor-quality trial in the short term or from another poor quality trial in the long term (SOE: insufficient).

Electromagnetic Field

One fair-quality trial found pulsed electromagnetic fields were associated with small improvements in function (difference −3.48, 95% CI −4.44 to −2.51 on a 0-85 WOMAC ADL subscale) and pain (difference −0.84, 95% CI −1.10 to −0.58 on a 0-25 WOMAC pain subscale) versus sham short-term but differences may not be clinically significant (SOE: low).
More patients who received real versus sham electromagnetic field therapy reported throbbing or warming sensations or aggravation of pain (29% versus 7%); however, the difference was not significant (RR 1.95, 95% CI 0.81 to 4.71) (SOE: low).

Superficial Heat

Evidence was insufficient from one small fair-quality trial to determine effects or harms of superficial heat versus placebo in short-term pain (SOE: insufficient).

Braces

There was insufficient evidence from one poor-quality study to determine the effects of bracing versus usual care for intermediate-term and long-term function or pain (SOE: insufficient).
Harms were not reported.

Detailed Synthesis

A total of 15 trials evaluating the use of a physical modality for the treatment of knee OA met inclusion criteria (Table 27 and Appendixes D and E).¹⁵⁰^–¹⁶⁴ Thirteen were included in the prior AHRQ report¹⁵⁰^–¹⁶² and two were added for this update.¹⁶³^,¹⁶⁴ Physical modalities evaluated included ultrasound (both new trials), TENS, low-level laser therapy, microwave diathermy, pulsed short-wave diathermy, electromagnetic fields, superficial heat, and bracing. All but one intervention (bracing vs. usual care)¹⁵² were compared to a sham procedure.

Four RCTs (2 new trials; 1 good-quality, 2 fair-quality, and 1 poor-quality) that evaluated ultrasound for knee OA met the inclusion criteria.¹⁵³^,¹⁶²^–¹⁶⁴ All trials required at least grade 2 radiographic knee OA using the Kellgren–Lawrence criteria for inclusion. One (new) trial evaluated continuous ultrasound,¹⁶⁴ one (new) evaluated pulsed ultrasound¹⁶³ and two trials had both a continuous and a pulsed ultrasound group.¹⁵³^,¹⁶² In three trials, the ultrasound groups received 1 MHz treatments five times per week for 2 weeks at an intensity of either 1 or 1.5 W/cm² and the sham comparators received the same protocol, but the power was switched off.¹⁵³^,¹⁶²^,¹⁶⁴ The fourth trial applied daily pulsed ultrasound for 10 days at 0.6 MHz with an average intensity of 120 mW/cm² and duty cycle of 20% plus participants took diclofenac sodium tablets; the comparator group received sham ultrasound (no power output) plus the diclofenac sodium tablets.¹⁶³ Compliance with the intervention protocols was not reported. Three trials reported short-term outcomes,¹⁶²^–¹⁶⁴ and the other reported intermediate-term outcomes. The methodological shortcomings were unclear blinding of the provider or assessor,¹⁵³^,¹⁶³^,¹⁶⁴ unclear randomization procedures and concealment of treatment allocation¹⁶⁴ and unclear adherence to an intention-to-treat analysis.¹⁶²

We found one good-quality (n=70) trial that compared active TENS with sham TENS for knee OA.¹⁵⁴ Inclusion criteria required a confirmed diagnosis of knee OA using the American College of Rheumatology criteria. The TENS protocol had patients wear a pulsed TENS device 7 hours daily for 26 weeks. The sham TENS groups followed the same protocol as the active treatment, but the device turned off after 3 minutes. Compliance was unacceptable for time the TENS device was worn.

We identified three small trials (n=30, 49, and 60) that investigated low-level laser therapy versus sham laser for knee OA.¹⁵⁰^,¹⁵⁷^,¹⁶⁰ The mean age ranged from 49 to 64 years and most patients were female (62% to 75%). Two studies included patients meeting the American College of Rheumatology criteria for knee OA.¹⁵⁰^,¹⁶⁰ Two trials also required an average pain intensity of greater than 3 or 4 on a 0-10 VAS,¹⁵⁰ while the other trial had an additional inclusion criterion of radiographic knee OA of Kellgren–Lawrence grade 2 or 3.¹⁶⁰ Treatment duration ranged from 2 to 4 weeks and the number of total sessions from 8 to 10. Low-level laser therapy protocols differed across the trials with doses ranging from 1.2 to 6 Joules per point (range, 5 to 6 points) and length of irradiation from 40 seconds to 2 minutes; all trials used a continuous laser beam. The sham laser comparison groups followed the same respective protocols, but the device was inactive. One trial was rated fair quality¹⁵⁰ and two poor quality.¹⁵⁷^,¹⁶⁰ In the fair-quality trial, blinding of the care provider was unclear. The two poor-quality trials suffered from insufficient descriptions of allocation concealment methods, unclear application of intention to treat, lack of clarity regarding patient blinding, and no reporting of or unacceptable attrition.

One small (n=63), fair-quality trial compared microwave diathermy (three 30-minute sessions per week for 4 weeks) to sham.¹⁵⁶ The inclusion criteria required radiographic knee OA of a Kellgren and Lawrence grade 2 or 3. The power was set to 50 watts. Sham diathermy followed the same protocol, but the machine was set to off. Compliance with the treatment regimen for each group was unclear. Methodological limitations of this study included no blinding of the care providers.

Two trials (n=86 and 115) examined pulsed short-wave diathermy compared to sham diathermy.¹⁵⁵^,¹⁵⁸ The mean age ranged from 62 to 75 years, and the proportion of female participants ranged from 67 to 100 percent. Both trials included patients meeting radiographic criteria for knee OA. Each trial compared two doses of short-wave diathermy to a sham diathermy group; dosages varied by intensity in one trial (mean power output of either 1.8 or 18 Watts for 20 minutes)¹⁵⁸ or by length of session (19 or 38 minutes at 14.5 Watts) in the other.¹⁵⁵ Both trials applied diathermy three times per week for 3 weeks (total of 9 sessions). Each sham diathermy group followed the same treatment protocol, but the electrical current was not applied. Compliance with the treatment regimens was acceptable for both trials. Both trials were rated poor quality due to unclear concealment of treatment allocation, a lack of care provider blinding, and unacceptable attrition.

Two trials (n=90 for both) compared the application of electromagnetic fields to sham interventions for knee OA.¹⁵¹^,¹⁶¹ The mean age of participants was 59 and 60 years, and the proportion of female participants ranged from 48 to 70 percent. The mean duration of chronicity ranged from 9 to 11 years. The fair-quality trial enrolled participants meeting the American College of Rheumatology criteria for knee OA.¹⁶¹ The inclusion criteria were not clearly presented in the poor-quality trial.¹⁵¹ The intervention group in the fair-quality study received 2 hours of pulsed electromagnetic fields 5 days a week for 6 weeks.¹⁶¹ The poor-quality trial had a musically modulated electromagnetic field group that received 15 daily 30-minute sessions. Music from a connected speaker modulated the parameters of the electromagnetic field. The study also had an extremely low frequency electromagnetic field group that had 15 daily 30-minute sessions, but the electromagnetic field was set at a frequency of 100 Hz.¹⁵¹ The sham group in each trial followed the same respective treatment protocol, but used a noneffective electromagnetic field during the sessions. Compliance with the treatment sessions was acceptable in both trials. One trial was rated fair quality¹⁶¹ and the other was rated poor quality.¹⁵¹ Methodological limitations in both trials included unclear methods for allocation concealment. Additionally, in the poor-quality trial, there were baseline dissimilarities between groups, no blinding of patients, providers, or outcome assessors, and attrition was not reported.¹⁵¹

A single trial compared superficial heat with placebo (n=52).¹⁵⁹ Participants were included if they had grade 2 or higher using the Kellgren-Lawrence grading for radiographic knee OA. Superficial heat was provided using a knee sleeve with a heat retaining polyester and aluminum substrate. Participants were instructed to wear the sleeve at least 12 hours per day. The placebo sleeves were identical and participants received the same instructions, but the sleeve did not contain the heat retaining substrate; the extent to which patients could be truly blinded is unclear (sleeve may retain body heat and feel warmer). Compliance with wearing the sleeve was acceptable. This trial was rated fair quality due to unclear concealment of treatment allocation, and a lack of clarity regarding whether it was the provider or outcomes assessor that was blinded.

We identified one trial comparing use of a knee brace to usual care (n=118).¹⁵² Inclusion criteria required unicompartmental knee OA, and either a varus or valgus malalignment. Patients in the intervention group were fitted with a commercially available knee brace that allowed medial unloading or lateral unloading. Usual care consisted of patient education and physical therapy and analgesics as needed. Compliance with continued use of the brace was unacceptable. This trial was rated poor quality due to lack of patient, provider, or assessor blinding, and unacceptable attrition.

Physical Modalities Compared With Sham or Usual Care

Ultrasound. Three trials (2 new; one good, one fair, one poor quality) reported function using the Lequesne Index and pain (during activity) using VAS over the short term.¹⁶²^–¹⁶⁴ There were no statistically significant differences between real ultrasound and sham ultrasound in either function (3 trials, pooled difference −2.50 on a 0-24 scale, 95% CI −6.37 to 1.22, I²=94.0%) (Figure 37) or pain intensity (3 trials, pooled difference −1.2 on a 0-10 scale, 95% CI −3.7 to 1.3, I²=91.1%) (Figure 38) using a PL estimate, likely due to heterogeneity between studies. Exclusion of the poor-quality study¹⁶⁴ resulted in slightly larger, but still not statistically significant, effects for function (2 trials, SMD −3.4, 95% CI −9.5 to 2.4, plot not shown) and pain (2 trials, pooled difference −1.9, 95% CI −5.1 to 1.1, plot not shown). Stratification by type of ultrasound (continuous vs. pulsed) resulted in similar conclusions regarding function and pain.
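
To give a sense of how much between-study variation the reported I² values represent, the Cochran's Q statistic implied by each can be back-calculated. The sketch below assumes the commonly used definition I² = (Q − df)/Q with df equal to the number of trials minus one; the report's own estimation procedure is described in its Methods and may differ. The function name `implied_cochran_q` is hypothetical. The inversion is not possible when I² is reported as 0%, because I² is truncated at zero.

```python
def implied_cochran_q(i_squared, n_trials):
    """Cochran's Q implied by a reported I-squared (as a proportion) and trial count.

    Inverts I^2 = (Q - df) / Q with df = n_trials - 1; valid only for 0 < I^2 < 1.
    """
    if not 0.0 < i_squared < 1.0:
        raise ValueError("I-squared must be strictly between 0 and 1")
    df = n_trials - 1
    return df / (1.0 - i_squared)


# Short-term ultrasound meta-analyses quoted above (3 trials each).
for label, i2 in {"function": 0.940, "pain": 0.911}.items():
    q = implied_cochran_q(i2, n_trials=3)
    print(f"{label}: I^2 = {i2:.1%}, implied Q = {q:.1f} on 2 degrees of freedom")
```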

Intermediate-term results at 6 months from one fair-quality trial showed no difference on the WOMAC Physical Function subscale (0 to 100) between either the continuous or pulsed ultrasound group versus sham ultrasound (difference −4.5, 95% CI −10.34 to 1.34, and −2.9, 95% CI −9.19 to 3.39, respectively).¹⁵³ Results for pain intensity were not consistent with regard to ultrasound method. The continuous ultrasound group had a small improvement in pain on the WOMAC pain scale compared with sham (difference −1.8, 95% CI −3.34 to −0.26), but no statistical difference was seen between pulsed ultrasound and sham (difference −1.6, 95% CI −3.26 to 0.06). There was no difference between either ultrasound group versus sham ultrasound for VAS pain during rest or on movement (Table 26).

Regarding quality of life, one new trial reported no differences in the short term between the continuous and sham ultrasound groups for change from baseline on the SF-36 PCS (mean change 7.9 vs. 6.1 on a 0-100 scale, p=0.47) and the SF-36 MCS (mean change −0.3 vs. −0.1 on a 0-100 scale, p=0.95).¹⁶⁴

Transcutaneous Electrical Nerve Stimulation. No effect was seen for TENS versus placebo TENS for function or pain over the intermediate term for any outcome measured in one good-quality trial.¹⁵⁴ Function was measured via the WOMAC function subscale (0 to 100); the proportion of patients who achieved an MCID ≥9.1 was 38 percent versus 39 percent (RR 1.2, 95% CI 0.6 to 2.2) and the difference in mean change scores was −1.9 (95% CI −9.7 to 5.9). Pain was measured using a VAS pain scale (difference 0.9 on a scale of 0 to 10, 95% CI −11.7 to 13.4) and the WOMAC pain subscale (difference −5.6 on a 0 to 100 scale, 95% CI −14.9 to 3.6). The proportion of patients who achieved MCID (≥20) in pain VAS was 56 percent versus 44 percent (RR 1.3, 95% CI 0.8 to 2.0). Health-related quality of life measured with the SF-36 was not different between the two groups for the physical component and mental component score (Table 26).

Low-Level Laser Therapy. One fair-quality trial reported no difference between low-level laser therapy and sham for short-term function based on median Saudi Knee Function Scale scores (range 0-112 with higher scores indicating greater severity), median difference −10 (interquartile range of −23 to −4), p=0.054.¹⁵⁰ There were inconclusive results for intermediate-term function. One fair-quality trial reported the low-level laser therapy group had less functional severity at 6 months compared to sham on the Saudi Knee Function Scale (median difference −21.0, 95% CI −34.0 to −7.0), p=0.006.¹⁵⁰ For the other poor-quality trial, neither the higher dose nor the lower dose low-level laser therapy group differed from sham on the WOMAC physical function (0 to 96) subscale (difference −3.82, 95% CI −9.75 to 2.11 and −0.14, 95% CI −6.59 to 6.31, respectively).¹⁶⁰ However, the evidence was considered insufficient for function.

Low-level laser therapy was associated with moderately less pain over the short term in one fair-quality and one poor-quality trial (pooled difference −2.00, 95% CI −4.15 to 0.04) (Figure 39).¹⁵⁰^,¹⁵⁷ There was no difference between low-level laser therapy versus sham for intermediate-term pain (pooled difference −1.04, 95% CI −3.17 to 1.45).¹⁵⁰^,¹⁶⁰ However, the evidence was considered insufficient for pain.

Microwave Diathermy. Data were insufficient from one small, fair-quality trial evaluating microwave diathermy.¹⁵⁶ The microwave diathermy group showed substantial short-term improvement compared with sham for function (difference −33.2 on a 0-85 scale, 95% CI −42.0 to −24.6, WOMAC ADL subscale) and pain (difference −8.1 on a 0-25 scale, 95% CI −10.7 to −5.3, WOMAC pain subscale). Substantial imprecision was noted.

Pulsed Short-Wave Diathermy. Data were insufficient for pulsed short-wave diathermy compared with sham. There was no difference in short-term function or pain for either the low intensity or high intensity group compared with sham diathermy based on the WOMAC in one poor-quality trial.¹⁵⁸ There was no difference on the WOMAC function subscale (0 to 10) between either the low intensity group versus sham (difference 0.16, 95% CI −1.51 to 1.83), or the high intensity group versus sham (difference −0.02, 95% CI −1.67 to 1.63). There was also no difference on the WOMAC pain subscale (0 to 10) for either the low or high intensity group versus sham (difference 0.15, 95% CI −1.57 to 1.87 and −0.24, 95% CI −2.02 to 1.54, respectively).

The other trial found inconsistent results among the high and low dose groups for long-term function using the KOOS (0 to 100).¹⁵⁵ The low dose group had substantially greater improvement on the KOOS-Daily Activities subscale compared with sham (difference 27.30, 95% CI 13.73 to 40.87), but there was no difference between the high dose group and sham on the KOOS-Daily Activities subscale (difference 10.30, 95% CI −1.24 to 21.84). Neither the low nor the high dose group differed from sham on the KOOS-recreational activities subscale (Table 26). Regarding pain intensity, the low dose group had moderately better pain NRS scores (0 to 10), but the difference was not statistically significant (difference −1.8, 95% CI −3.60 to 0.00). The high dose group experienced substantially greater pain reduction than the sham group (difference −2.3, 95% CI −3.68 to −0.92).

Electromagnetic Fields. The fair-quality trial found use of pulsed electromagnetic fields did not appear to provide clinically meaningful short-term improvements in function or pain compared with sham, although statistical significance was achieved. The pulsed electromagnetic field group had better function on the WOMAC ADL subscale (0 to 85) compared with the sham group (difference −3.48, 95% CI −4.44 to −2.51), and it had lower scores on the WOMAC pain subscale (0 to 25) versus sham (difference −0.84, 95% CI −1.10 to −0.58).¹⁶¹ Based on estimated values from a graph for the poor-quality trial,¹⁵¹ each group using electromagnetic fields had better function and substantially less pain in the short term on the Lequesne Index. The musically modulated electromagnetic field group had moderately better Lequesne Function scores (0-10) versus sham (mean of 6.5 vs. 3.8) and substantially lower Lequesne Pain scores (0 to 10) (mean of 1.4 vs. 6.9). The low frequency electromagnetic field group had similar benefits for function (mean of 7.1 vs. 3.83) and pain (mean of 1.4 vs. 6.85, standard deviation and statistical testing not reported), compared with sham.

Superficial Heat. Evidence from one small fair-quality trial was insufficient to determine the effects of superficial heat on short-term pain. WOMAC pain subscale scores were similar between the heat and placebo group at 1 month post-treatment (13.7 versus 13.9, respectively).¹⁵⁹

Brace. Evidence from one small poor-quality trial was insufficient to determine the effects of brace treatment. There was no difference between bracing and usual care for intermediate-term or long-term function, pain, or quality of life outcomes.¹⁵² Function was measured using the Hospital for Special Surgery (HSS) score (difference 3.2, 95% CI −0.58 to 6.98 for intermediate-term function and difference 3.0, 95% CI −1.05 to 7.05 for long-term function). Pain intensity was assessed using a VAS. The difference was −0.58 (95% CI −1.48 to 0.32) for intermediate-term pain and −0.81 (95% CI −1.76 to 0.14) for long-term pain. Health-related quality of life was measured using the EuroQol 5-Dimensions (EQ-5D) (difference 0.01, 95% CI −0.08 to 0.10 for both intermediate-term and long-term health-related quality of life).

Physical Modalities Compared With Pharmacological Therapy or With Exercise Therapy

No trial of physical modalities versus pharmacological therapy or versus exercise met inclusion criteria.

Harms

In general, harms were poorly reported across the physical modality trials. Six trials (2 of low-level laser therapy,¹⁵⁰^,¹⁶⁰ 2 of ultrasound therapy,¹⁵³^,¹⁶² 1 of pulsed short-wave diathermy,¹⁵⁸ and 1 of superficial heat¹⁵⁹) reported that no adverse events or side effects occurred in either group. The good-quality trial that evaluated TENS found no difference between active and sham TENS in the risk of localized, mild rashes (18% vs. 17%; RR 1.06, 95% CI 0.38 to 2.97).¹⁵⁴ One trial of microwave diathermy reported two cases of symptom aggravation in the intervention group; the events were transient and neither patient withdrew from the trial.¹⁵⁶ In one fair-quality trial, more patients who received real versus sham electromagnetic field therapy reported throbbing or warming sensations or aggravation of pain (29% versus 7%); however, the difference was not statistically significant (RR 1.95, 95% CI 0.81 to 4.71).¹⁶¹
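As a rough illustration of how a risk ratio and its 95% confidence interval, such as those reported for harms above, can be derived from event counts, the following Python sketch uses the standard log-scale (Wald) approximation. The counts are hypothetical and are not taken from any included trial, and the function name is illustrative only; this is not the analysis code used for this report.

```python
import math


def risk_ratio_ci(events_a, n_a, events_b, n_b, z=1.96):
    """Risk ratio (group A vs. group B) with a Wald CI computed on the log scale."""
    rr = (events_a / n_a) / (events_b / n_b)
    # Standard error of log(RR) for two independent binomial proportions
    se_log_rr = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts (NOT from any included trial): 9 of 50 vs. 8 of 50 events
rr, lower, upper = risk_ratio_ci(9, 50, 8, 50)
# An RR is statistically significant only if the CI excludes the null value of 1
significant = not (lower <= 1.0 <= upper)
print(f"RR {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}, significant: {significant}")
```

With small trials and few events, the interval is wide and includes 1, which is the pattern seen for most harms comparisons in this section.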

Manual Therapies for Osteoarthritis Knee Pain

Key Points

There was insufficient evidence from one trial to determine the effects of joint manipulation on intermediate-term function or harms versus usual care or versus exercise due to inadequate data to determine effect sizes or statistical significance (SOE: insufficient).
There was insufficient evidence from one trial to determine the effects of massage versus usual care on short-term function, pain, or harms, or to evaluate the effect of varying dosages of massage on outcomes (SOE: insufficient).

Detailed Synthesis

Two trials were identified that met inclusion criteria and evaluated manual therapies for the treatment of knee OA⁴⁷^,¹⁸⁴ (Table 28 and Appendixes D and E). Both trials were included in the prior AHRQ report. Patients in both trials were required to have radiographically established knee OA meeting the American College of Rheumatology criteria.

One fair-quality trial (N=117 with knee OA) compared manual therapy with usual care (continued routine care from general practitioner and other providers) and with combination exercise.⁴⁷ The manual therapy intervention consisted of nine 50-minute sessions. Seven were delivered in the first 9 weeks and two booster sessions at week 16. All participants were prescribed a home exercise program three times per week. Compliance with the intervention was acceptable in all groups, and the methodological shortcoming of this trial was a lack of blinding for the patients and care providers. Only intermediate-term outcomes were reported.

One fair-quality trial (N=125) compared four different dosages of massage therapy with usual care (continued current treatment).¹⁸⁴ The massage protocol consisted of standard Swedish massage strokes applied in each intervention group over 8 weeks. The dosage varied from 240 to 720 minutes based on the frequency (once or twice per week) and duration of massage (30-60 minutes per session). Compliance was acceptable in all groups, and the methodological shortcoming of this trial was a lack of blinding for the patients and care providers in the usual care arm. Only short-term outcomes were reported.

Manual Therapies Compared With Usual Care

Manual Therapy. Data were insufficient from one fair-quality trial (n=58 with knee OA)⁴⁷ to evaluate effects of joint manipulation versus usual care over the intermediate term. The manual therapy group showed a statistically significant improvement from baseline in function as measured by the WOMAC (mean change −31.5 on a 0-240 scale, 95% CI −52.7 to −10.3), whereas the usual care group showed no improvement (mean change 1.6, 95% CI −10.5 to 13.7); however, insufficient data were provided to calculate an effect estimate (the number of patients with knee OA in each group was not provided). Pain outcomes were not reported.

Massage. Data were insufficient from one fair-quality trial (n=125) to evaluate the short-term effects of massage therapy (4 different dosages) compared with usual care.¹⁸⁴ Function was measured using the WOMAC total and physical function subscale scores (both 0 to 100 scales) and pain was measured using the WOMAC pain subscale and the VAS (both 0 to 10). No significant effects were seen in any outcome measure at 4 months postmassage treatment versus usual care (Table 27). Authors reported a trend for greater magnitude of change in function and pain with higher massage dosages versus lower massage dosages and versus usual care (statistical tests not provided).

Manual Therapies Compared With Pharmacological Therapy

No trial of manual therapy versus pharmacological therapy met inclusion criteria.

Manual Therapies Compared With Exercise Therapy

The trial evaluating manual therapy also included an exercise group that received aerobic warm-up, muscle strengthening, muscle stretching, and neuromuscular control exercises (n=59 with knee OA).⁴⁷ Both groups showed improvement from baseline in function (WOMAC) over the intermediate term, but the change was statistically significant in the manual therapy group only (mean change of −31.5, 95% CI −52.7 to −10.3, versus −12.7, 95% CI −27.1 to 1.7, for exercise). However, insufficient data were provided to calculate an effect estimate (the number of patients with knee OA in each group was not provided). Pain outcomes were not reported.

Harms

No serious treatment-related adverse events occurred in either trial⁴⁷^,¹⁸⁴; one nontrial-related death was reported in the usual care group in the trial evaluating manual therapy.⁴⁷

Mind-Body Therapies for Osteoarthritis Knee Pain

Key Points

Data were insufficient from two small, unblinded trials to determine the effects or harms of tai chi versus attention control in the short or intermediate terms. No data on long-term outcomes were available (SOE: insufficient).

Detailed Synthesis

Two small trials (n=31 and 40) of tai chi versus attention control in older adults met the inclusion criteria²¹⁵^,²¹⁶ (Table 29 and Appendix D). Both trials were included in the prior AHRQ report. Tai chi was practiced 40 to 60 minutes two or three times per week for 24 or 36 sessions. Attention control consisted of group education classes with one trial²¹⁶ including 20 minutes of stretching for sessions 18 to 24. Blinding was not possible in either trial and was the primary methodological limitation in one fair-quality trial.²¹⁶ Additional methodological concerns in the other poor-quality trial included unclear concealment of treatment allocation and high attrition²¹⁵ (Appendix E).

Mind-Body Therapies Compared With Attention Control

There was no clear difference between tai chi and an attention control on functional outcomes across the two trials over the short term on a WOMAC physical function 0- to 85-point scale (difference 1.03, 95% CI −9.87 to 11.93)²¹⁵ or WOMAC physical function 0- to 1700-point scale (difference −183.2, 95% CI −372.6 to 6.2),²¹⁶ or at intermediate term in one of the trials (difference −105.3, 95% CI −294.7 to 84.1, 0 to 1700 scale).²¹⁶ Results for short-term pain improvement were inconsistent with no difference between groups on WOMAC pain scale in one trial (difference 0.39 on a 0-35 point scale, 95% CI −4.21 to 4.99)²¹⁵ and the other marginally favoring tai chi on 0 to 500 point WOMAC pain scale (difference −67.0, 95% CI −131.8 to −2.1),²¹⁶ but demonstrating no difference between the groups in 0 to 10 VAS pain (difference −0.65, 95% CI −2.31 to 1.02).²¹⁶ There were no differences between groups at intermediate term in this latter trial (WOMAC pain 0 to 500 scale, difference −183.2, 95% CI −372.6 to 6.2).²¹⁶ One trial noted improvement in health-related quality of life (SF-36) in the intermediate term only and depression (CES-D) and self-efficacy in the short and intermediate terms.
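Because the two tai chi trials reported WOMAC physical function on different point ranges (0 to 85 and 0 to 1700), the raw differences above are not directly comparable. The following Python sketch shows one simple way to place them on a common 0 to 100 scale. Linear rescaling is an assumption made here for illustration only (it is not the approach used for pooling in this report, which relied on standardized mean differences), and the function name is hypothetical.

```python
def rescale_difference(diff, scale_min, scale_max, new_max=100.0):
    """Linearly rescale a between-group difference to a 0-to-new_max scale."""
    return diff * new_max / (scale_max - scale_min)


# Short-term WOMAC physical function differences as reported in the text
trial_215 = rescale_difference(1.03, 0, 85)      # reported on a 0 to 85 scale
trial_216 = rescale_difference(-183.2, 0, 1700)  # reported on a 0 to 1700 scale

print(f"Trial 215: {trial_215:.1f} points; Trial 216: {trial_216:.1f} points (0 to 100 scale)")
```

The same conversion applies to confidence limits, since a linear rescaling multiplies the estimate and both limits by the same factor.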

Mind-Body Therapies Compared With Pharmacological Therapy or With Exercise Therapy

No trial of mind-body therapy versus pharmacological therapy or versus exercise met inclusion criteria.

Harms

In the two trials of mind-body interventions, harms were poorly reported. One trial reported no serious adverse events²¹⁶ and the other reported sporadic complaints of muscle soreness and foot or knee pain.²¹⁵

Acupuncture for Osteoarthritis Knee Pain

Key Points

There were no differences between acupuncture and control interventions (sham acupuncture, waitlist, or usual care) on function in the short term (4 trials [excluding outlier trial], pooled SMD −0.05, 95% CI −0.32 to 0.38) or the intermediate term (4 trials, pooled SMD −0.15, 95% CI −0.31 to 0.02, I²=0%) (SOE: low for short term; moderate for intermediate term). Stratified analysis showed no differences between acupuncture and sham treatments (4 trials) but moderate improvement in function compared with usual care (2 trials) in the short term.
There were no differences between acupuncture versus control interventions (sham acupuncture, waitlist, or usual care) on pain in the short term (6 trials, pooled SMD −0.27, 95% CI −0.67 to 0.12, I²=79%) or clinically meaningful differences in the intermediate term (4 trials, pooled SMD −0.16, 95% CI −0.32 to −0.01, I²=0%) (SOE: low for short term; moderate for intermediate term). Short-term differences were significant for acupuncture versus usual care but not for acupuncture versus sham acupuncture.
Data from one poor-quality trial were insufficient to determine the effects of acupuncture versus exercise (SOE: insufficient).
There was no difference in the risk of serious adverse events between any form of acupuncture and the control group. Worsening of symptoms (7% to 14%) and mild bruising, swelling, or pain at the acupuncture site (1% to 18%) were most common; one case of infection at an electroacupuncture site was reported (SOE: moderate).

Detailed Synthesis

Nine trials of acupuncture for knee OA were identified that met inclusion criteria⁶⁷^,²³⁸^–²⁴⁵ (Table 30 and Appendix D). All of the trials were included in the prior AHRQ report. Four trials evaluated traditional acupuncture,⁶⁷^,²⁴⁰^,²⁴²^,²⁴⁴ four electroacupuncture,²³⁸^,²³⁹^,²⁴¹^,²⁴³ and two laser acupuncture.²⁴⁰^,²⁴⁵ Three trials compared acupuncture with usual care (provision of educational leaflets, instructions to remain on current oral medications, or no changes to their ongoing treatments)⁶⁷^,²³⁸^,²⁴² and one trial each to no treatment²⁴⁰ or to waitlist control.²⁴³ Six trials compared acupuncture with sham procedures, which consisted of inactive laser treatment (red light on but no power applied),²⁴⁰^,²⁴⁵ superficial needling or acupuncture performed at nonmeridian sites,²³⁹^,²⁴³^,²⁴⁴ or nonpenetrating sham acupuncture.²⁴¹ One trial also compared acupuncture with exercise;⁶⁷ no trials of acupuncture versus pharmacological therapy were identified. Sample sizes ranged from 30 to 527 (total sample 1,811). Duration of acupuncture treatment ranged from 2 to 12 weeks, with the number of sessions ranging from 6 to 16. Four studies were conducted in Europe,⁶⁷^,²⁴¹^,²⁴²^,²⁴⁴ three in the United States,²³⁸^,²³⁹^,²⁴³ and one study each was conducted in Australia²⁴⁰ and Turkey.²⁴⁵ Short-term outcomes were reported by six trials⁶⁷^,²³⁸^,²⁴¹^,²⁴³^–²⁴⁵ and intermediate-term outcomes by four²³⁹^,²⁴⁰^,²⁴²^,²⁴⁴; no trial reported outcomes over the long term.

Two trials were rated good quality (for the comparison of acupuncture versus sham only).²⁴⁰^,²⁴³ Seven trials were rated fair quality (to include the comparison of acupuncture with no treatment/waitlist in the two trials described previously)²³⁸^–²⁴¹^,²⁴³^–²⁴⁵ and two were considered poor quality⁶⁷^,²⁴² (Appendix E). The primary methodological shortcoming in the fair-quality trials was lack of blinding; additionally, the poor-quality trials suffered from unclear allocation concealment methods and high rates of attrition (30% to 35%).

Acupuncture Compared With Usual Care, Waitlist, or Sham

Functional Outcomes. There was no difference between acupuncture and control interventions (sham acupuncture, usual care, waitlist, no treatment) on WOMAC function score in the short term (5 trials, pooled SMD −0.17, 95% CI −0.71 to 0.38, I²=86%)²³⁸^,²⁴¹^,²⁴³^–²⁴⁵ (Figure 40). All trials were considered fair quality. Removal of one outlier trial (Berman 1999)²³⁸ attenuated the effect estimate (4 trials, pooled SMD −0.05, 95% CI −0.32 to 0.38); results remained not statistically significant. No differences were found when the results were analyzed by the type of acupuncture used: electroacupuncture (3 trials, pooled SMD −0.34, 95% CI −1.17 to 0.46),²³⁸^,²⁴¹^,²⁴³ standard needle acupuncture (SMD −0.28, 95% CI −0.55 to 0.00),²⁴⁴ or laser acupuncture (SMD 0.55, 95% CI −0.01 to 1.10)²⁴⁵ compared with control interventions. When stratified by control type, no differences were found between any form of acupuncture and sham treatment (4 trials, pooled SMD −0.02, 95% CI −0.28 to 0.39);²⁴¹^,²⁴³^–²⁴⁵ however, when acupuncture was compared with waitlist and usual care, estimates suggested moderate improvement in function (2 trials, pooled SMD −0.74, 95% CI −1.40 to −0.24, plot not shown).²³⁸^,²⁴³ In one small, fair-quality trial²⁴⁵ of low-level laser acupuncture, the authors reported a difference in WOMAC function score that favored the sham control (Table 29).

Similarly, based on WOMAC total score, there were no differences in short-term function between acupuncture and sham, waitlist, and usual care across trials (4 trials, pooled SMD −0.30, 95% CI −0.81 to 0.21, I²=85%, plot not shown).⁶⁷^,²³⁸^,²⁴⁴^,²⁴⁵ Removal of one outlier trial (Berman 1999)²³⁸ attenuated the effect estimate (3 trials, pooled SMD −0.10, 95% CI −0.54 to 0.49); results remained not statistically significant. Stratification by acupuncture type, control type, and exclusion of one poor-quality trial yielded similar estimates. Results according to other measures of function were mixed. In two small, fair-quality trials, authors reported significant results (Table 29), one favoring electroacupuncture compared with usual care based on the Lequesne Index (0 to 24 scale),²³⁸ and the second favoring the sham control over low-level laser acupuncture based on the WOMAC total score.²⁴⁵ Five additional trials reported no differences between acupuncture and any of the control conditions across other measures of function⁶⁷^,²⁴⁰^–²⁴²^,²⁴⁴ (Table 29).

In the intermediate term, there was no difference between acupuncture and control conditions (sham acupuncture, usual care, waitlist) on the WOMAC function score (4 trials, pooled SMD −0.15, 95% CI −0.31 to 0.02, I²=0%)²³⁹^,²⁴⁰^,²⁴²^,²⁴⁴ (Figure 40). Estimates were similar when stratified by study quality, acupuncture type, and control type; however, sensitivity analyses were limited by the small number of trials. Similarly, no differences in WOMAC total score were found for standard needle acupuncture versus usual care or sham at intermediate-term followup (2 trials, pooled SMD −0.23, 95% CI −0.49 to 0.03, I²=0%, plot not shown).²⁴²^,²⁴⁴ Across other measures of function, no differences were seen at intermediate term between standard needle acupuncture versus sham acupuncture on the Pain Disability Index (difference −3.5 on a 0-70 scale, 95% CI −7.7 to 0.5) in one fair-quality trial²⁴⁴ or versus usual care on the Oxford Knee Score (difference −3.6 on a 12 to 60 scale, 95% CI −9.8 to 2.6) in one small poor-quality trial.²⁴²

No trials reported data on long-term function.
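To make the pooled SMDs and I² values reported above more concrete, the following Python sketch shows how a random-effects pooled estimate and I² can be computed from per-trial effects and standard errors. It uses the DerSimonian-Laird estimator purely as a familiar illustration; this is an assumption and may differ from the model specified in the Methods section of this report. The function name and the example inputs are hypothetical and do not reproduce any trial data.

```python
import math


def pool_random_effects(effects, std_errors, z=1.96):
    """DerSimonian-Laird pooled estimate, 95% CI limits, and I-squared (percent)."""
    k = len(effects)
    if k < 2:
        raise ValueError("At least two trials are needed to pool estimates")
    w = [1 / se ** 2 for se in std_errors]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    df = k - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)  # between-trial variance, truncated at zero
    w_re = [1 / (se ** 2 + tau2) for se in std_errors]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se_pooled = math.sqrt(1 / sum(w_re))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, pooled - z * se_pooled, pooled + z * se_pooled, i2


# Hypothetical SMDs and standard errors (NOT data from the included trials)
example = pool_random_effects([-0.9, -0.1, 0.0, -0.3, 0.5], [0.25, 0.15, 0.20, 0.14, 0.28])
print("pooled SMD %.2f, 95%% CI %.2f to %.2f, I2=%.0f%%" % example)
```

When one trial's estimate sits far from the others, Q and I² rise and the random-effects interval widens, which is why removing an outlier trial can change both the pooled estimate and its precision.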

Pain Outcomes. There was no difference between acupuncture and control interventions (sham acupuncture, usual care, waitlist) on pain in the short term (6 trials, pooled SMD −0.27, 95% CI −0.67 to 0.12, I²=79%)⁶⁷^,²³⁸^,²⁴¹^,²⁴³^–²⁴⁵ (Figure 41). All but one trial used the WOMAC pain score. Removal of one outlier trial (Berman 1999)²³⁸ attenuated the effect estimate (5 trials, pooled SMD −0.15, 95% CI −0.29 to 0.00); results remained not statistically significant. Estimates were similar after exclusion of one poor-quality trial, after stratification by acupuncture type, and when VAS or NRS was analyzed instead of the WOMAC pain score in trials reporting more than one pain measure. When stratified by control type, no differences were seen between acupuncture and sham acupuncture (4 trials, pooled SMD −0.06, 95% CI −0.24 to 0.14);²⁴¹^,²⁴³^–²⁴⁵ however, when acupuncture was compared with waitlist or usual care, the estimate suggested moderate effects on pain (2 trials, pooled SMD −0.68, 95% CI −1.28 to −0.15).²³⁸^,²⁴³

There were no clinically meaningful differences between acupuncture and control interventions for pain in the intermediate term (4 trials, pooled SMD −0.16, 95% CI −0.32 to −0.01, I²=0%)²³⁹^,²⁴⁰^,²⁴²^,²⁴⁴; individually no trial reached statistical significance (Figure 41). Stratification based on acupuncture type, type of control intervention, and study quality yielded similar results.

No trial reported data on long-term pain.
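The pain results above separate statistical significance (whether the confidence interval excludes 0) from clinical meaningfulness (whether the effect is large enough to matter). The following Python sketch illustrates that two-part reading. The magnitude cutoffs (0.2, 0.5, and 0.8) are a commonly used convention assumed here for illustration; the Methods section of this report remains the authoritative source for the thresholds applied, and the function name is hypothetical.

```python
def interpret_smd(smd, ci_low, ci_high, small=0.2, moderate=0.5, large=0.8):
    """Return (statistically_significant, magnitude_label) for a standardized mean difference."""
    # Significant only if the 95% CI excludes the null value of 0
    significant = not (ci_low <= 0.0 <= ci_high)
    size = abs(smd)
    if size < small:
        magnitude = "below small threshold"
    elif size < moderate:
        magnitude = "small"
    elif size < large:
        magnitude = "moderate"
    else:
        magnitude = "large"
    return significant, magnitude


# Pooled pain SMDs for acupuncture versus control as reported in the text
short_term = interpret_smd(-0.27, -0.67, 0.12)
intermediate_term = interpret_smd(-0.16, -0.32, -0.01)
print("short term:", short_term)
print("intermediate term:", intermediate_term)
```

Under these assumed cutoffs, the intermediate-term estimate is statistically significant yet falls below the threshold for a small effect, which matches its description as not clinically meaningful.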

Other Outcomes. Data on the effects of acupuncture on quality of life were limited (plots not shown). A small effect favoring acupuncture versus control conditions (sham acupuncture, usual care, waitlist, no treatment) was seen for the SF-12/SF-36 PCS (0-100 scale) in both the short term (2 trials, pooled difference 1.6, 95% CI 0.08 to 3.11, I²=0%)²⁴³^,²⁴⁴ and the intermediate term (2 trials, pooled difference 1.94, 95% CI 0.03 to 3.86, I²=0%),²⁴⁰^,²⁴⁴ but no difference was seen in the SF-12/SF-36 MCS (0-100 scale) at either timepoint: short term (2 trials, pooled difference 1.14, 95% CI −0.27 to 2.56, I²=0%)²⁴³^,²⁴⁴ and intermediate term (2 trials, pooled difference −0.25, 95% CI −4.05 to 3.54, I²=70.8%).²⁴⁰^,²⁴⁴ For individual trials, the effects were small and not statistically significant for either outcome (SF-12 or SF-36 PCS or MCS). There were no differences between acupuncture and control interventions on other quality of life measures or on measures of anxiety or depression over either the short or intermediate term (Table 29).

In one trial,²⁴⁰ a small (1%) change in opioid use at intermediate term was seen with needle acupuncture (decrease from 1% to 0%), laser acupuncture (decrease from 3% to 2%), and sham acupuncture (decrease from 1% to 0%) while use remained the same in the no treatment group (Table 29).

Acupuncture Compared With Pharmacological Therapy

No trial of acupuncture versus pharmacological therapy met inclusion criteria.

Acupuncture Compared With Exercise Therapy

Data were insufficient from one poor-quality trial (n=120)⁶⁷ to evaluate the effects of weekly acupuncture versus 60 minutes of combination exercise (strengthening, aerobics, stretching, and balance training) for 6 weeks for knee OA (Table 29 and Appendix D). Methodological limitations included lack of patient or care provider blinding, unclear adherence, unacceptable attrition, and differential loss to followup (Appendix E). There were no differences between groups with regard to function on the Oxford Knee Score questionnaire (difference −0.7, 95% CI −3.5 to 2.1 on 12-60 scale) or WOMAC score (difference −1.0, 95% CI −6.7 to 4.7; scale not provided by author). Similarly, there was no difference between treatments for VAS pain on a 0 to 10 scale (difference 0.22, 95% CI −0.67 to 1.11) or for anxiety or depression based on the Hospital Anxiety and Depression Scale.

Harms

All trials reported adverse events. One trial reported similar rates of serious adverse events in patients who received real versus sham acupuncture (2.1% vs. 2.7%, respectively; RR 0.75, 95% CI 0.13 to 4.39), to include hospitalizations and one case of death from myocardial infarction in the control group; none were considered to be related to the study condition or treatment.²⁴⁴ All other events reported were classified as mild and there was no apparent difference in risk of adverse events between any form of acupuncture and the control groups. The most common adverse events reported were worsening of symptoms (7% to 14%) in three trials²⁴⁰^,²⁴²^,²⁴³ and mild bruising, swelling, or pain at the acupuncture site (1% to 18%) in five trials.⁶⁷^,²⁴⁰^,²⁴²^–²⁴⁴ One trial reported one case of an infection at the electroacupuncture site (n=455 for real and sham acupuncture groups).²⁴³ In only one trial did an adverse event (not treatment related) lead to withdrawal: one patient (3%) in the acupuncture group had a flare-up of synovitis (nonseptic).²⁴¹

Exercise for Osteoarthritis Hip Pain

Key Points

Exercise was associated with a small improvement in function versus usual care in the short term (3 trials, pooled SMD −0.33, 95% CI −0.58 to −0.11, I²=0%), intermediate term (2 trials, pooled SMD −0.28, 95% CI −0.55 to 0.02, I²=0%), and long term (1 trial, SMD −0.37, 95% CI −0.74 to −0.01) (SOE: low for short and intermediate term, insufficient for long term).
Exercise tended toward small improvement in short-term pain compared with usual care (3 trials, pooled SMD −0.30, 95% CI −0.70 to −0.02, I²=0%) but the results were no longer significant at intermediate term (2 trials, pooled SMD −0.14, 95% CI −0.40 to 0.12, I²=0%) or long term (1 trial, SMD −0.25, 95% CI −0.62 to 0.11) (SOE: low for short and intermediate term, insufficient for long term).
Evidence for harms was insufficient in trials of exercise with only two trials describing adverse events. However, no serious harms were reported in either trial (SOE: insufficient).

Detailed Synthesis

Four trials of exercise therapy for hip OA met the inclusion criteria (Table 31 and Appendix D).⁴⁷^,⁷²^–⁷⁴ All of the trials were included in the prior AHRQ report. Three trials evaluated participants with chronic hip pain diagnosed as OA using American College of Rheumatology criteria⁴⁷^,⁷²^,⁷⁴ and one assessed participants with hip OA diagnosed clinically who were on a waitlist for hip replacement.⁷³ Sample sizes ranged from 45 to 203 (total sample=455). Across trials, participants were predominantly female (>50%) with mean ages ranging from 64 to 69 years. Three trials were conducted in Europe⁷²^–⁷⁴ and the other in New Zealand.⁴⁷

All trials compared exercise with usual care, defined as care routinely provided by the patient’s primary care physician, which could include physical therapy referral. Two trials also provided education about hip OA to all participants.⁷²^,⁷⁴ The exercise interventions included 8 to 12 supervised sessions of 30 to 60 minutes duration once per week over 8 to 12 weeks; the interventions comprised strengthening and stretching exercises (all studies), as well as neuromuscular control exercises in one trial⁴⁷ and endurance exercise in another.⁷⁴ All trials reported compliance rates with the scheduled exercise sessions between 76 and 88 percent. However, in one trial,⁴⁷ although 88 percent of patients completed more than 80 percent of the scheduled sessions, only 44 percent of participants returned logbooks to demonstrate compliance with the recommended home exercises.

Three trials were rated fair quality⁴⁷^,⁷²^,⁷⁴ and one was rated poor quality⁷³ (Appendix E). In all trials, the nature of the intervention and control precluded blinding of participants and researchers; patient-reported outcomes were therefore not blinded. Additionally, in the poor-quality trial,⁷³ concealed allocation was unclear and outcomes were poorly reported, as were attrition rates, which were substantial for pain (68%) and function (73%) outcomes.

Exercise Compared With Usual Care

Exercise was associated with a small improvement in function versus usual care in the short term (3 trials, pooled SMD −0.33, 95% CI −0.58 to −0.11, I²=0.0%),⁷²^–⁷⁴ intermediate term (2 trials, pooled SMD −0.28, 95% CI −0.55 to 0.02, I²=0.0%)⁷²^,⁷⁴ and long term (1 trial, SMD −0.37, 95% CI −0.74 to −0.01)⁷² (Figure 42). The intermediate-term findings were consistent with the additional trial not included in the meta-analysis (authors did not provide sufficient data),⁴⁷ although the small improvement in function in this trial did not reach statistical significance in those with hip OA. The small number of trials precluded meaningful sensitivity analysis.

Exercise tended toward small improvement in short-term pain compared with usual care (3 trials, pooled SMD −0.30, 95% CI −0.70 to −0.02, I²=0%)⁷²^–⁷⁴ (Figure 43), but not at intermediate term (2 trials, pooled SMD −0.14, 95% CI −0.40 to 0.12, I²=0%).⁷²^,⁷⁴ There was moderate heterogeneity between studies and the short-term improvement in pain was observed in only one poor-quality study,⁷³ whereas the two fair-quality studies did not demonstrate any significant differences in short-term pain relief.⁷²^,⁷⁴ There were no identifiable differences in methodology between the studies to explain these inconsistent findings, although the poor-quality study only reported pain outcomes for 68 percent of participants, which may have biased results. There was no difference between exercise and usual care in the long term based on a single study (SMD −0.25, 95% CI −0.62 to 0.11).⁷² The small number of trials precluded meaningful sensitivity analysis.
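When only an estimate and its 95% confidence interval are reported, as for the single-trial long-term results above, an approximate standard error and two-sided p-value can be back-calculated. The following Python sketch does so under the assumptions of a symmetric interval and a normal approximation, which may not hold exactly for the intervals in this report; the function name is hypothetical.

```python
import math


def se_and_p_from_ci(estimate, ci_low, ci_high, z_crit=1.96):
    """Approximate SE, z statistic, and two-sided p-value from a symmetric 95% CI."""
    se = (ci_high - ci_low) / (2 * z_crit)
    z = estimate / se
    # Two-sided p-value under the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Long-term SMDs for exercise versus usual care as reported in the text
pain_se, pain_z, pain_p = se_and_p_from_ci(-0.25, -0.62, 0.11)
func_se, func_z, func_p = se_and_p_from_ci(-0.37, -0.74, -0.01)
print(f"Long-term pain: SE {pain_se:.3f}, p = {pain_p:.2f}")
print(f"Long-term function: SE {func_se:.3f}, p = {func_p:.3f}")
```

An interval that crosses 0 corresponds to a p-value above 0.05, and an upper limit just below 0 corresponds to a p-value just under 0.05, so the two ways of reporting significance agree.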

Data on effects of exercise on quality of life were limited and were reported in only two trials.⁷³^,⁷⁴ One fair-quality trial⁷⁴ found no differences in health-related quality of life between groups in the short term and intermediate term and one poor-quality study⁷³ found no differences between groups in the short term. One fair-quality study found no differences between groups in terms of opioid use at any time point (proportion of patients using tramadol or codeine daily: 7.0% vs. 3.5% at 3 months, 8.6% vs. 5.2% at 9 months, and 7.0% vs. 7.4% at 21 months, p=0.73), but did report slightly fewer followup physical therapy visits in the exercise group in the intermediate and long terms⁷² (Table 30).

There was insufficient evidence to determine effects of duration of exercise therapy or number of sessions on outcomes.

Exercise Compared With Pharmacological Therapy or With Other Nonpharmacological Therapies

No trial of exercise versus pharmacological therapy met inclusion criteria. Findings for exercise versus other nonpharmacological therapies are addressed in the sections for other nonpharmacological therapies.

Harms

Only two exercise trials reported on harms, and neither identified any adverse events in the exercise or usual care groups.⁴⁷^,⁷³

Manual Therapies for Osteoarthritis Hip Pain

Key Points

Manual therapy was associated with small improvements in short-term (difference 11.1, 95% CI 4.0 to 18.6, 0-100 scale Harris Hip Score) and intermediate-term (difference 9.7, 95% CI 1.5 to 17.9) function versus exercise (SOE: low).
Manual therapy was associated with a small effect on pain in the short term (difference −0.72 [95% CI −1.38 to −0.05] for pain at rest and −1.21 [95% CI −2.29 to −0.25] for pain walking) versus exercise (SOE: low). The impact on pain is not clear at intermediate term; there was no difference in pain at rest (adjusted difference −7.0, 95% CI −20.3 to 5.9, 0-100 scale) but there was small improvement in pain while walking (adjusted difference −12.7, 95% CI −24.0 to −1.9) (SOE: insufficient).
No trials evaluated manual therapies versus pharmacological therapy.
One trial reported that no treatment-related serious adverse events were detected and in the other, no difference in study withdrawal due to symptom aggravation was seen between manual therapy and exercise (RR 1.42, 95% CI 0.25 to 8.16) (SOE: low).
There were insufficient data to determine the effects or harms of manual therapy compared with usual care at intermediate term. No effect size could be calculated (SOE: insufficient).

Detailed Synthesis

We identified two trials (n=69 and 109) of manual therapy for hip OA that met inclusion criteria (Table 32 and Appendix D).⁴⁷^,¹⁹³ Both trials were included in the prior AHRQ report. Mean patient age ranged from 66 to 72 years and females comprised 49 to 72 percent of the populations. Both trials required a diagnosis of hip OA meeting the American College of Rheumatology (ACR) criteria for inclusion. The duration of manual therapy ranged from 5 to 16 weeks with a total of nine sessions in both groups; in one trial this included seven sessions over the first 9 weeks and two booster sessions at week 16.⁴⁷ One trial compared manual therapy to usual care (continued routine care from a general practitioner and other providers)⁴⁷ and both trials compared manual therapy to combination exercise programs.⁴⁷^,¹⁹³ The number of exercise sessions matched the manual therapy group of that respective study. All participants were prescribed a home exercise program three times per week. One trial reported short-term outcomes¹⁹³ and both reported intermediate-term outcomes. One trial was conducted in New Zealand⁴⁷ and the other in the Netherlands.¹⁹³

Both trials were rated fair quality (Appendix E). Compliance with the intervention was acceptable in all groups, and the methodological shortcomings of these trials included a lack of blinding for the patients and care providers.

Manual Therapies Compared With Usual Care

A single fair-quality trial (n=69 with hip OA)⁴⁷ found that manual therapy resulted in an improvement in function at intermediate term using the total WOMAC score (0 to 240) in the manual therapy group (mean change from baseline −22.9, 95% CI −43.3 to −2.6), while the usual care group showed little change from baseline (mean change −7.9, 95% CI −30.9 to 15.3). Lack of information on the number of patients precluded calculation of effect size, and results of statistical testing between groups were not presented.

Manual Therapies Compared With Pharmacological Therapy

No trial of manual therapy versus pharmacological therapy met inclusion criteria.

Manual Therapies Compared With Exercise

One trial found that manual therapy resulted in a small improvement in short-term function compared with exercise (adjusted difference on the 0-100 scale Harris Hip Score [HHS] of 11.1, 95% CI 4.0 to 18.6). Regarding intermediate-term function, manual therapy conferred a small benefit in both trials. The adjusted difference on the HHS was 9.7 (95% CI 1.5 to 17.9) in one trial.¹⁹³ The other trial compared function using the total WOMAC score (0 to 240), and the manual therapy group experienced a statistically significant improvement from baseline (mean change of −22.9, 95% CI −43.3 to −2.6), while the exercise group did not (mean change −12.4, 95% CI −27.1 to 2.3).⁴⁷

Only one of the trials reported pain outcomes. Manual therapy was associated with a small improvement in short-term pain at rest and during walking compared with exercise (adjusted differences on a VAS (0 to 10) of −0.72, 95% CI −1.38 to −0.05, and −1.21, 95% CI −2.29 to −0.25, respectively).¹⁹³ Intermediate-term pain results were inconsistent. A moderate effect on VAS pain during walking was seen following manual therapy compared to exercise (adjusted difference −1.27, 95% CI −2.40 to −0.19), but there was no difference for pain at rest (adjusted difference −0.70, 95% CI −2.03 to 0.59).¹⁹³
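The intermediate-term pain differences appear on a 0-100 scale in the Key Points and on a 0 to 10 VAS in this paragraph. As a minimal sketch, assuming the two sets of values are related by a simple linear rescaling (the helper name `rescale` is hypothetical and is not taken from the report), the conversion can be checked as follows:

```python
def rescale(value, from_max, to_max):
    """Linearly rescale a difference from a 0-to-from_max scale to a 0-to-to_max scale."""
    return value * to_max / from_max

# Intermediate-term adjusted differences with 95% CI bounds on the 0-100 scale,
# as reported in the Key Points: (estimate, lower bound, upper bound).
pain_at_rest_100 = (-7.0, -20.3, 5.9)
pain_walking_100 = (-12.7, -24.0, -1.9)

# Convert each value to the 0-10 VAS and round to two decimals.
pain_at_rest_10 = tuple(round(rescale(v, 100, 10), 2) for v in pain_at_rest_100)
pain_walking_10 = tuple(round(rescale(v, 100, 10), 2) for v in pain_walking_100)

print("Pain at rest (0-10):", pain_at_rest_10)
print("Pain walking (0-10):", pain_walking_10)
```

The rescaled values correspond to the 0 to 10 figures quoted above for pain at rest and pain during walking.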

There was no difference in one trial¹⁹³ between manual therapy and exercise for short-term or intermediate-term quality of life measured with the SF-36 physical function, role physical, or bodily pain subscales (Table 31).

Harms

No trial-related serious adverse events were detected in one trial,⁴⁷ and there was no difference in symptom aggravation leading to withdrawal (5% vs. 4%; RR 1.42, 95% CI 0.25 to 8.16) in the other trial.¹⁹³

Exercise for Osteoarthritis Hand Pain

Key Points

Data from one poor-quality trial were insufficient to determine the effects or harms (though no serious harms were reported) of exercise versus usual care in the short term (SOE: insufficient).

Detailed Synthesis

One Norwegian trial (n=130) that evaluated the effects of strengthening and range of motion exercise (3 times weekly for 3 months plus 4 group sessions) versus usual care (treatment recommended by the patient’s general practitioner) met inclusion criteria (Table 33 and Appendix D).⁷⁵ This trial was included in the prior AHRQ report and was rated poor quality due to lack of patient blinding, baseline differences in mental health conditions, and large differential attrition between groups (exercise 29% vs. usual care 7%) (Appendix E). Only short-term data were reported.

Exercise Compared With Usual Care

Data were insufficient from one poor-quality trial. No differences between exercise and usual care were observed for function according to the Functional Index for Hand OsteoArthritis (adjusted difference −0.5 on a 0-30 scale, 95% CI −1.9 to 0.8), or for pain (adjusted difference −0.2 on a 0 to 10 VAS pain scale, 95% CI −0.8 to 0.3) at 3 months.⁷⁵ Similarly, there were no differences between groups in the proportion of Osteoarthritis Research Society International Outcome Measures in Rheumatology (OARSI OMERACT) responders (30% versus 28%). There were also no differences between groups in any secondary outcome measure, including the patient-specific function scale, hand stiffness, or patient global assessment of disease activity.

The effects of exercise on use of opioid therapies or healthcare utilization were not reported. There was insufficient evidence to determine effects of duration of exercise therapy or number of sessions on outcomes.

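Exercise Compared With Pharmacological Therapy or Other Nonpharmacological Therapies

No trial of exercise versus pharmacological therapy or versus other nonpharmacological therapies met inclusion criteria.
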
Harms

In this trial,⁷⁵ no serious adverse events were reported; 8/130 (6%) patients reported increased pain (3 in hand, 5 in neck/shoulders) but adverse events were not reported by group.

Physical Modalities for Osteoarthritis Hand Pain

Key Points

One good-quality study of low-level laser treatment versus sham found no differences in function (difference 0.2, 95% CI −0.2 to 0.6) or pain (difference 0.1, 95% CI −0.3 to 0.5) in the short term (SOE: low).
Data were insufficient from one fair-quality trial to determine effects or harms of heat therapy using paraffin compared to no treatment on function or pain in the short term (SOE: insufficient).
No serious harms were reported in the trial of low-level laser therapy (SOE: low).

Detailed Synthesis

We identified two trials of physical modality use for hand OA (Table 34 and Appendixes D and E).¹⁶⁵^,¹⁶⁶ Both were included in the prior AHRQ report. One good-quality double-blind Canadian trial (N=88)¹⁶⁵ compared three, 20-minute sessions of low-level laser treatment to a sham laser probe over a 6-week period. Identical treatment procedures were used in each group. All participants attended three sham laser treatment sessions prior to randomization to ensure ability to comply with the treatment protocol.

One fair-quality trial (n=46) conducted in Turkey compared 15 minutes of paraffin wrapping 5 days per week for 3 weeks with a no treatment control group.¹⁶⁶ Both groups received information about joint protection strategies. Methodological limitations included lack of patient blinding, unclear compliance with treatment, and poorly reported analyses.

Physical Modalities Compared With Sham or No Treatment

Low-Level Laser Therapy. In the one good-quality trial of low-level laser treatment versus sham (n=88),¹⁶⁵ there were no differences in short-term function (difference 0.2 on a 0-4 Australian Canadian Osteoarthritis Hand Index [AUSCAN] functional subscale, 95% CI −0.2 to 0.6) or pain (difference 0.1 on a 0-4 AUSCAN pain subscale, 95% CI −0.3 to 0.5) at 4.5 months. Likewise, no difference was seen between groups in improvement based on patient global assessment.

Paraffin Treatment. One fair-quality trial (N=56)¹⁶⁶ of paraffin heat treatment demonstrated no difference compared with no treatment on the AUSCAN function scale (0-36) (difference −4.0, 95% CI −8.6 to 0.6 at short-term [2.25-month] followup). Regarding pain, no clear difference was identified between the groups over the short term as there was inconsistency across measures used and analyses for outcomes were poorly reported; findings were considered insufficient.¹⁶⁶ While heat treatment was slightly favored based on the AUSCAN pain subscale (difference −3 on a 0-20 scale, 95% CI −5.5 to −0.5), the difference was not statistically significant in the authors’ intention-to-treat (ITT) analysis (p=0.07). VAS pain at rest suggested more improvement with heat therapy versus control in the ITT analysis (median 0 vs. 5.0 on a 0-10 scale, p<0.001); however, there was no clear difference between groups on VAS pain during ADL (median 5.0 vs. 7.0, p=0.09 for per protocol analysis, p=0.05 for ITT).

No trial evaluated effects of physical modalities on use of opioid therapies or healthcare utilization.

Physical Modalities Compared With Pharmacological Therapy or With Exercise Therapy

No trial of a physical modality versus pharmacological therapy or versus exercise met inclusion criteria.

Harms

Only the low-level laser therapy trial reported adverse events; no serious harms were reported.¹⁶⁵ One patient (2%) who received low-level laser treatment experienced erythema at the site.

Multidisciplinary Rehabilitation for Osteoarthritis Hand Pain

Key Points

One fair-quality trial of multidisciplinary rehabilitation versus waitlist control found no differences between groups over the short term in function (adjusted difference 0.49, 95% CI −0.09 to 0.37 on 0-36 scale) or pain (adjusted difference 0.40, 95% CI −0.5 to 1.3 on a 0-20 scale), or with regard to the proportion of OARSI OMERACT responders (OR 0.82, 95% CI 0.42 to 1.61) (SOE: low for all outcomes).
Data on harms were insufficient, although no serious adverse events were reported in the one trial of multidisciplinary rehabilitation versus waitlist control (SOE: insufficient).

Detailed Synthesis

One fair-quality trial (n=147) compared four 2.5- to 3-hour group-based sessions, delivered by an occupational therapist and a specialized nurse and consisting of self-management techniques, ergonomic principles, daily home exercises, and an optional splint, versus a waitlist control²⁶¹ (Table 35 and Appendix D). Waitlist control consisted of one 30-minute explanation of OA followed by a 3-month waiting period. Effect estimates were adjusted for baseline function or pain, body mass index (BMI), gender, and presence of erosive arthritis. Methodological limitations included lack of patient blinding and unreported compliance with treatment (Appendix E). This trial was included in the prior AHRQ report.

Of note, this intervention appeared to focus on functional restoration and while it met our broad definition of multidisciplinary rehabilitation (see footnote in Table 1), it was not consistent with how multidisciplinary rehabilitation is generally delivered clinically.

Multidisciplinary Rehabilitation Compared With Waitlist

No short-term (3 months) differences in function on the AUSCAN functional subscale (adjusted difference 0.49, 95% CI −0.09 to 0.37 on 0-36 scale) or on the AUSCAN pain subscale (adjusted difference 0.40, 95% CI −0.5 to 1.3, scale 0-20) were reported.²⁶¹

There was no difference in the proportion of OARSI OMERACT responders (odds ratio [OR] 0.82, 95% CI 0.42 to 1.61) between groups or on any secondary outcome measure, including ADLs (Canadian Occupational Measurement Scales), health-related quality of life (SF-36), arthritis self-efficacy, pain coping, muscle strength, or joint mobility.²⁶¹

The effect of multidisciplinary rehabilitation on use of opioid therapies or healthcare utilization was not evaluated in any of the included studies.

Multidisciplinary Rehabilitation Compared With Pharmacological Therapy or With Exercise Therapy

No trial of a multidisciplinary rehabilitation program versus pharmacological therapy or versus exercise met inclusion criteria.

Harms

No serious adverse events were reported. One patient reported a swollen hand and increased pain after the second treatment session.²⁶¹

Key Question 4. Fibromyalgia

For fibromyalgia, 47 RCTs (in 54 publications) were included in the prior AHRQ report (N=4225). Three trials were rated good quality, twenty trials fair quality, and twenty-four trials poor quality. The prior AHRQ report found exercise, CBT, myofascial release, massage, tai chi, qigong, acupuncture, and multidisciplinary rehabilitation (MDR) associated with small to moderate improvements in function and pain over the short and intermediate term compared with an attention control, sham, no treatment or usual care. Strength of evidence was low to moderate. In the long term, small improvement in function continued for MDR and in pain for massage (low strength of evidence). CBT compared with pregabalin was associated with a small improvement in function but not pain in the short term.

For this update, we identified 11 new RCTs (in 12 publications) (N=1194). Ten were rated fair quality and one was rated poor quality. The new trials evaluated exercise (1 trial), psychological therapies (CBT and electromyography [EMG] biofeedback) (6 trials), mindfulness practices (1 trial), mind-body practices (tai chi) (1 trial) and acupuncture (2 trials). The Key Points summarize the main findings based on the evidence included in the prior report and new trials; the Key Points note where new trials contributed to findings.

Exercise for Fibromyalgia

Key Points

Exercise was associated with a small improvement in function compared with attention control, no treatment, or usual care in the short term (7 trials, pooled difference −7.68 on a 0 to 100 scale, 95% CI −13.04 to −1.84, I²=60%) (SOE: low) and intermediate term (8 trials, pooled difference −6.04, 95% CI −9.25 to −3.01, I²=0%) (SOE: moderate). There were no clear effects in the long term (3 trials, pooled difference −4.33, 95% CI −10.46 to 1.97, I²=0%) (SOE: low).
Exercise was associated with a small improvement in VAS pain (0 to 10 scale) compared with usual care, attention control, or no treatment in the short term (6 trials [excluding outlier trial], pooled difference −0.88, 95% CI −1.33 to −0.27, I²=1.5%), and at intermediate term (8 trials [1 new], pooled difference −0.51, 95% CI −0.92 to −0.06, I²=0%) but no effect long term (4 trials, pooled difference −0.18, 95% CI −0.77 to 0.42, I²=0%) (SOE: moderate for all time frames).
There was insufficient evidence from one small, poor-quality trial to determine the effects of aerobic exercise versus pharmacological therapy (paroxetine) on pain in the intermediate term (SOE: insufficient). There were no data on short- or long-term effects.
Data on harms were insufficient. Most trials of exercise did not report on adverse events at all. One trial reported one nonstudy-related adverse event. Two trials reported no adverse events (SOE: insufficient).

Detailed Synthesis

Twenty-two trials (reported in 24 publications) of exercise therapy for fibromyalgia met inclusion criteria⁷⁶^–⁹⁹ (Table 36 and Appendix D). This included one new trial not included in the prior AHRQ report.⁹⁹ The exercise interventions varied across the trials and included combinations of different exercise types (12 trials),⁷⁷^,⁷⁸^,⁸⁰^,⁸³^,⁸⁵^,⁸⁹^,⁹¹^,⁹²^,⁹⁴^–⁹⁷^,⁹⁹ aerobic exercise (10 trials),⁷⁹^,⁸¹^,⁸²^,⁸⁴^,⁸⁶^–⁸⁸^,⁹⁰^,⁹²^,⁹³^,⁹⁸ muscle performance exercise/strength training (1 trial),⁸⁶ and Pilates (1 trial).⁷⁶ The duration of exercise therapy ranged from 1 to 8 months across the trials and the total number of exercise sessions ranged from 4 to 96 (at a frequency of 1 to 5 times per week). Many trials also included instruction for home exercise practice. Exercise was compared to usual care in nine trials,⁷⁹^,⁸⁰^,⁹⁰^–⁹²^,⁹⁶^–⁹⁹ no treatment in six trials,⁸³^–⁸⁶^,⁸⁹^,⁹⁴^,⁹⁵ attention control in five trials,⁷⁶^,⁷⁸^,⁸¹^,⁸²^,⁸⁷^,⁸⁸ and to waitlist,⁷⁷ sham (i.e., transcutaneous electrical stimulation),⁹³ and pharmacological care⁹³ in one trial each (the latter two control conditions were separate arms of the same trial). Usual care generally included medical treatment for fibromyalgia and continued normal daily activities (which often specifically excluded the exercise intervention being evaluated). Attention control conditions consisted of fibromyalgia education sessions, social support, general guidance on coping strategies, relaxation and stretching exercises, and physical activity planning.

Sample sizes ranged from 32 to 166 across the trials (total sample=1,428). Patient mean age ranged from 35 to 57 years, and the majority were female (89% to 100%). Thirteen trials were conducted in Europe,⁷⁹^,⁸³^,⁸⁵^,⁸⁸^–⁹²^,⁹⁴^–⁹⁹ five in North America,⁷⁸^,⁸⁰^–⁸²^,⁸⁴^,⁸⁷ two in Brazil,⁷⁷^,⁸⁶ and two in Turkey.⁷⁶^,⁹³

Twelve trials were rated fair quality⁷⁶^,⁷⁷^,⁷⁹^–⁸²^,⁸⁶^,⁸⁸^,⁸⁹^,⁹²^,⁹⁶^,⁹⁸^,⁹⁹ and 10 poor quality⁷⁸^,⁸³^–⁸⁵^,⁸⁷^,⁹⁰^,⁹¹^,⁹³^–⁹⁵^,⁹⁷ (Appendix E). Methodological limitations in the fair-quality trials were primarily related to unclear allocation concealment methods and lack of blinding (the nature of interventions precluded blinding of participants and researchers). Additionally, poor-quality trials also suffered from unclear randomization methods and high rates of attrition and/or differential attrition.

Exercise Compared With Usual Care, Waitlist, an Attention Control, or No Treatment

Functional Outcomes. Exercise was associated with a small improvement in function short term compared with usual care, an attention control, or no treatment based on Fibromyalgia Impact Questionnaire (FIQ) total scores, which reflect fibromyalgia impact on function as well as symptoms such as pain, fatigue, stiffness, anxiety, and depression (7 trials, pooled difference −7.68 on a 0 to 100 scale, 95% CI −13.04 to −1.84, I²=59.9%)⁷⁶^,⁷⁷^,⁸⁰^,⁸³^,⁸⁶^,⁸⁷^,⁸⁹ (Figure 44). The estimate across fair-quality trials (i.e., not including the poor-quality trials) was somewhat larger (5 trials, pooled difference −9.91, 95% CI −15.75 to −4.07).⁷⁶^,⁷⁷^,⁸⁰^,⁸⁶^,⁸⁹

Exercise was associated with a small improvement in intermediate-term function versus controls for FIQ total score (8 trials, pooled difference on 0-100 scale, −6.04, 95% CI −9.25 to −3.01, I²= 0%)⁸⁰^,⁸²^–⁸⁴^,⁸⁸^,⁹¹^,⁹²^,⁹⁴ (Figure 44). Estimates were slightly smaller across the fair-quality trials only (4 trials, pooled difference −4.04, 95% CI −7.90 to −0.03).⁸⁰^,⁸²^,⁸⁸^,⁹² Stratification by exercise type yielded similar results for combination exercise (7 trials, pooled difference −5.75, 95% CI −9.29 to −2.54),⁸⁰^,⁸²^,⁸³^,⁸⁸^,⁹¹^,⁹²^,⁹⁴ but there was no clear difference between aerobic exercise and no treatment or usual care (2 trials, pooled difference −8.13, 95% CI −16.24 to 0.28).⁸⁴^,⁹² Estimates were consistent with a slightly greater effect of exercise on function when compared with usual care (3 trials, pooled difference −6.13, 95% CI −11.71 to −1.06)⁸⁰^,⁹¹^,⁹² or no treatment (3 poor quality trials, pooled difference −9.97, 95% CI −16.24 to −3.45),⁸³^,⁸⁴^,⁹⁴ but there was no clear difference in two fair-quality trials using attention controls (pooled difference −3.25, 95% CI −99.32 to 5.20).⁸²^,⁸⁸

Exercise no longer had an effect on long-term function compared with controls based on the FIQ total score (3 trials, pooled difference on 0 to 100 scale, −4.33, 95% CI −10.46 to 1.97, I²= 0%)⁸²^,⁹¹^,⁹⁶ (Figure 44). There were no clear differences in estimates when analyses were stratified according to the type of exercise (2 trials of combination exercise, pooled difference −4.45, 95% CI −14.39 to 6.24),⁸²^,⁹¹ type of comparison (2 trials of usual care, pooled difference −5.34, 95% CI −13.4 to 2.32),⁹¹^,⁹⁶ or after the exclusion of one poor-quality trial (2 trials, pooled difference −3.11, 95% CI −11.26 to 5.86).⁸²^,⁹⁶ Findings are based on a small number of trials.
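As stated in the introduction to the Results, a pooled mean difference is statistically significant when its 95% confidence interval excludes zero. The sketch below applies that rule to the three pooled FIQ total score intervals reported above. The function name `interpret` is hypothetical, and the sketch does not attempt to reproduce the report's further distinction between "no clear difference" and "no difference."

```python
def interpret(lower, upper, null_value=0.0):
    """Label a 95% CI as significant when it excludes the null value."""
    if lower > null_value or upper < null_value:
        return "significant"
    return "not significant"

# Pooled FIQ total score differences (0-100 scale): 95% CI bounds by followup period,
# taken from the functional outcome results for exercise versus controls.
fiq_function = {
    "short term": (-13.04, -1.84),
    "intermediate term": (-9.25, -3.01),
    "long term": (-10.46, 1.97),
}

results = {period: interpret(lo, hi) for period, (lo, hi) in fiq_function.items()}
for period, label in results.items():
    print(f"{period}: {label}")
```

For risk ratios or odds ratios the same check would use `null_value=1.0`.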

Pain Outcomes. Exercise had a moderately greater effect on pain (0 to 10 VAS) in the short term compared with usual care, attention control, or no treatment (7 trials, pooled difference −1.08, 95% CI −1.75 to −0.32, I²=53.1%)⁷⁶^–⁷⁸^,⁸⁰^,⁸³^,⁸⁵^,⁸⁶ (Figure 45). Substantial heterogeneity was noted with one outlier trial of belly dance (combination exercise) versus waitlist control, reporting substantially higher estimates.⁷⁷ Excluding the outlier trial reduced heterogeneity and led to an effect size consistent with a small effect (6 trials, pooled difference −0.88, 95% CI −1.33 to −0.27, I²=1.5%). Estimates were similar when stratified by exercise type and control type. Across the fair-quality trials, the estimate was somewhat larger (4 trials, pooled difference −1.44, 95% CI −2.4 to −0.49, including the outlier).⁷⁶^,⁷⁷^,⁸⁰^,⁸²^,⁸⁶
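To illustrate why removing a single outlying trial can sharply lower I², the sketch below computes I² from Cochran's Q using the commonly used formula I² = max(0, (Q − df)/Q) × 100. The effect sizes and standard errors are hypothetical values chosen for illustration; they are not data from the included trials, and the calculation is not necessarily the one used for the pooled estimates in this report (see the Methods section).

```python
def i_squared(effects, std_errors):
    """Return I-squared (percent) from Cochran's Q with inverse-variance weights."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0

# Hypothetical mean differences on a 0-10 pain scale; the last entry is an outlier.
effects = [-0.9, -0.8, -1.0, -0.7, -0.9, -1.1, -3.0]
std_errors = [0.4] * len(effects)  # hypothetical, equal precision for simplicity

i2_all = i_squared(effects, std_errors)
i2_without = i_squared(effects[:-1], std_errors[:-1])
print(f"I-squared with outlier: {i2_all:.1f}%")
print(f"I-squared without outlier: {i2_without:.1f}%")
```

In this made-up example, most of the between-trial variability comes from the single outlying estimate, so I² falls to zero once it is excluded.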

There was a small improvement in VAS pain with exercise at intermediate term (8 trials [1 new], pooled difference −0.51, 95% CI −0.92 to −0.06, I²=0%)⁸⁰^,⁸²^,⁸³^,⁹⁰^,⁹³^,⁹⁴^,⁹⁷^,⁹⁹ (Figure 45). Removal of poor-quality trials⁸³^,⁹⁰^,⁹³^,⁹⁴ and stratification by exercise and control types yielded similar estimates (pooled differences ranged from −0.40 to −0.71) with no clear differences identified.

There was no effect of exercise on pain long term (4 trials, pooled difference −0.18 on a 0-10 scale, 95% CI −0.77 to 0.42, I²=0%)⁷⁸^,⁸²^,⁹⁶^,⁹⁸ (Figure 45). Similar estimates were obtained and no clear differences were seen following exclusion of one poor-quality trial or for the comparisons of aerobic exercise with usual care or combination exercise with attention control; pooled differences ranged from −0.05 to −0.26.

Other Outcomes. Data on the effects of exercise on anxiety, depression, and quality of life were often poorly reported (Table 35) and results were mixed. Exercise had no clear effect in the short term on measures of mental health, depression, anxiety, psychological distress, or sleep disturbance VAS across five trials,⁷⁶^–⁸⁰ with only one small poor-quality trial favoring exercise on the EQ-5D anxiety/depression scale.⁸⁵ Similarly, exercise had no clear effect on quality of life.

At intermediate term, exercise was associated with a small improvement in depression measured by the Beck Depression Inventory (BDI) compared with no treatment or usual care (4 trials, pooled difference −4.9 on a 0-63 scale, 95% CI −7.55 to −2.47, I²= 33.1%, plot not shown)⁸⁴^,⁹¹^–⁹³; three of the four trials were poor quality. Results were similar for aerobic exercise (3 trials, pooled difference −5.34, 95% CI −8.42 to −3.03) but no difference between groups was seen in the pooled estimate for the two trials using combination exercise or when any exercise was compared with usual care only (2 trials). Across various other measures, exercise had no clear effect on depression in five trials⁷⁸^,⁷⁹^,⁸²^,⁸⁸^,⁹⁰; however, one poor-quality trial favored exercise based on the FIQ depression subscale versus usual care.⁹⁴ Results for anxiety were mixed: two trials (one fair- and one poor-quality)⁸⁸^,⁹⁰ reported no difference between groups while two small, poor-quality trials reported a greater improvement in anxiety on the State-Trait Anxiety Inventory (STAI) and the FIQ anxiety subscale with exercise versus usual care.⁸⁴^,⁹⁴ Exercise was associated with improved quality of life (SF-36 questionnaire) in three small trials,⁹¹^,⁹²^,⁹⁵ but not in a fourth larger fair-quality trial⁸⁸ (Table 35). Exercise had no clear effect on psychological problems in two trials⁷⁸^,⁸⁰ or sleep in three trials.⁷⁸^,⁸³^,⁹⁰ One trial reported no between-group difference in analgesic medication use by 6 months, although patients randomized to aerobic exercise showed a significant reduction from baseline use.⁹³

Long term, exercise had no clear effect on measures of depression, anxiety, or psychological problems in all but one poor-quality trial.⁹¹ This same trial also reported improvement in SF-36 total scores, whereas one larger fair-quality trial did not.⁷⁹ No differences between groups in healthcare utilization were seen in the 2 months prior to the final assessment at 18 months in one trial⁹⁶ (Table 35).

Exercise Compared With Pharmacological Therapy

One small, poor-quality trial (N=32 analyzed) comparing 1.5 months of aerobic exercise (40 minutes on bicycle ergometer three times per week) versus paroxetine 20 mg daily found no between-group difference in pain on VAS at intermediate-term followup (difference −0.26 on a 0-10 scale, 95% CI −1.46 to 0.94). Regarding secondary outcomes, no differences were seen for depression (BDI) or mean analgesic consumption over the intermediate term, although the exercise group showed a greater reduction from baseline in analgesic use compared with the paroxetine group.

Exercise Compared With Other Nonpharmacological Therapies

Findings for exercise versus other nonpharmacological therapies are addressed in the sections for other nonpharmacological therapies.

Harms

Most trials of exercise did not report on adverse events. One trial reported one nonstudy-related adverse event.⁸⁵ Two trials reported no adverse events.⁸⁶^,⁸⁹

Psychological Therapies for Fibromyalgia

Key Points

There was no clear difference between CBT and usual care or waitlist in short-term function (3 trials [1 new], pooled difference −6.14 on 0-100 FIQ total scale, 95% CI −16.86 to 3.74, I²=70.6%). At intermediate term, CBT was associated with a moderate improvement in function (3 trials [1 new], pooled difference −12.82 on 0-100 FIQ total scale, 95% CI −24.07 to −2.44, I²=94.2%) versus waitlist or usual care. CBT was associated with improved function intermediate term (mean difference −1.8 on 0-10 FIQ Physical Impairment Scale, 95% CI −2.9 to −0.70) compared with attention control in an additional trial; however, two new trials found no difference between CBT and waitlist on the Pain Disability Index or West Haven-Yale Multidimensional Pain Inventory (MPI) pain interference subscale. Evidence from two poor-quality trials was insufficient to determine effects on long-term function (SOE: low for short term and intermediate term, insufficient for long term).
CBT was associated with a small improvement in pain (on a 0-10 scale) compared with usual care or waitlist in the short term (4 trials [1 new], pooled difference −0.62, 95% CI −1.08 to −0.14) but not at intermediate term (6 trials [4 new], pooled difference −0.55, 95% CI −1.13 to 0.06). There was no difference in clinically important improvement at intermediate term (≥50% on the Brief Pain Inventory) between CBT (8.3%) or emotional awareness and expression therapy (EAET) (22.5%) and usual care (12%) in one new fair-quality trial. Evidence from one poor-quality trial was insufficient to determine effects on long-term pain (SOE: low for short term and intermediate term, insufficient for long term).
Data were insufficient to determine the effects of EMG biofeedback on function and pain compared with attention controls in the short and long term (1 poor-quality trial and one new fair-quality trial) and with usual care in the intermediate term (1 poor-quality trial), and for the impact of guided imagery versus attention control in the short term (1 poor-quality trial) (SOE: insufficient for all comparisons and time points).
At intermediate term, CBT was associated with a small improvement in function versus pregabalin (plus duloxetine as needed) in two trials [1 new]; differing effect size magnitudes for the trials (−4.0 vs. −15.6, FIQ total score, 0-100 scale) resulted in substantial heterogeneity for the pooled effect estimate, making it unreliable (pooled difference −9.81, 95% CI −23.83 to 4.21, I²=96%) (SOE: low). There was no difference across these trials for VAS pain at intermediate term (2 trials [1 new], pooled difference −0.31 on a 0-10 scale, 95% CI −1.15 to 0.51, I²=63.5%) (SOE: low).
There was insufficient evidence to determine the impact on pain and function for the following: CBT versus pharmacological treatment (amitriptyline) over the short term (fair-quality trial) and electroencephalography (EEG) biofeedback versus pharmacological treatment (escitalopram) over the short and intermediate term (poor-quality trial) (SOE: insufficient). Long-term data were not reported.
There was insufficient evidence to determine the effects of psychological therapies versus exercise on function and pain in the short term (1 small trial of biofeedback), intermediate term (2 trials of CBT and biofeedback), and long term (3 trials of CBT, biofeedback, and relaxation for function; 4 trials of CBT [2], biofeedback, and relaxation for pain). All trials were considered poor quality (SOE: insufficient for function and pain at all time points).
Data on harms were insufficient. Adverse events were poorly reported across the trials but were overall minor and occurred at similar frequencies between groups. In one trial, however, fewer patients randomized to stress management (4.8%) compared with usual care (50%) withdrew from the trial, citing increased depression and worsening of symptoms, respectively. In another (new) trial comparing acceptance and commitment therapy (ACT) with pregabalin (plus duloxetine as needed) several mild adverse events were noted in the pharmacological therapy group, most commonly nausea (25%) and dry mouth (23%) (SOE: insufficient).

Detailed Synthesis

A total of 20 trials (in 22 publications) of psychological therapy for fibromyalgia met inclusion criteria (Table 37 and Appendix D).⁷⁸^,⁹⁷^,⁹⁸^,¹¹³^–¹²⁷^,¹³⁰^,¹³¹^,¹³⁵^,¹³⁶ Fourteen trials (across 15 publications) were included in the previous AHRQ report⁷⁸^,⁹⁷^,⁹⁸^,¹¹³^–¹²⁰^,¹³⁰^,¹³¹^,¹³⁵^,¹³⁶ and six trials (across 7 publications)¹²¹^–¹²⁴ were added for this update. Fourteen trials (5 new trials; across 16 publications) featured a CBT component,⁹⁸^,¹¹³^–¹¹⁷^,¹¹⁹^–¹²⁴^,¹²⁶^,¹²⁷^,¹³⁰^,¹³⁶ four trials included biofeedback (EMG or EEG),⁷⁸^,⁹⁷^,¹²⁵^,¹³¹ and one trial each included relaxation training¹³⁵ and guided imagery¹¹⁸ (Table 36 and Appendix D). The various psychological interventions were compared with usual care, waitlist control or attention control groups (15 trials [5 new], 17 publications),⁷⁸^,⁹⁷^,⁹⁸^,¹¹³^–¹²¹^,¹²⁴^–¹²⁷ pharmacological therapy (4 trials [1 new], 5 publications),¹¹³^,¹²²^,¹²³^,¹³⁰^,¹³¹ or exercise therapy (5 trials).⁷⁸^,⁹⁷^,⁹⁸^,¹³⁵^,¹³⁶

The majority of subjects in all the trials were female (range 90% to 100%; many trials were limited to females) and mean ages ranged from 32 to 56 years. Sample sizes ranged between 32 and 230 subjects (total sample=1,822). Therapy duration and frequency in CBT trials ranged from 6 weekly sessions to 20 sessions over 6 months. CBT was delivered in groups in 12 trials (4 new trials)¹¹³^,¹¹⁵^–¹¹⁷^,¹¹⁹^–¹²⁴^,¹²⁶^,¹³⁰^,¹³⁶ and by telephone¹¹⁴ in another. In one trial,¹²⁷ CBT appeared to be delivered individually. Most CBT trials were of CBT as traditionally delivered for the treatment of pain problems. The exceptions included two trials (in 4 publications) of ACT;¹¹⁶^,¹¹⁹^,¹²²^,¹²³ two trials that evaluated CBT for pain and CBT for pain and insomnia;¹²¹^,¹²⁷ one trial of stress management therapy that included presentations on stress mechanisms and training in pain coping and relaxation strategies;⁹⁸ and one trial of CBT for managing stress and pain.¹²⁶ These interventions were considered to be similar to standard CBT, however. Session lengths ranged from 30 minutes up to 3 hours.

In the six trials of biofeedback and associated interventions, therapy duration ranged from 4 to 16 weeks and was delivered individually in the four biofeedback trials and in groups for the remaining two trials. The frequency ranged from one to five times per week with sessions as short as 25 minutes and as long as 3 hours.

Short-term outcomes (<6 months) were reported by five trials (1 new trial) of CBT,¹¹⁴^–¹¹⁶^,¹¹⁹^,¹²¹^,¹³⁰ three trials (1 new trial) of biofeedback⁷⁸^,¹²⁵^,¹³¹ and one trial of guided imagery.¹¹⁸ Intermediate outcomes (6 to <12 months) were reported by eight CBT trials (4 new trials)¹¹³^,¹¹⁵^,¹¹⁷^,¹²²^–¹²⁴^,¹²⁶^,¹²⁷^,¹³⁶ and one trial of biofeedback.⁹⁷ Long-term outcomes (≥12 months) were reported by four CBT trials,⁹⁸^,¹¹⁷^,¹²⁰^,¹³⁶ one biofeedback trial⁷⁸ and one trial of relaxation therapy.¹³⁵ Studies were conducted in Spain (5 trials),¹¹³^,¹¹⁵^,¹²¹^–¹²³^,¹³⁶ the United States (5 trials),⁷⁸^,¹¹⁴^,¹²⁰^,¹²⁴^,¹²⁷ Sweden (3 trials),¹¹⁶^,¹¹⁹^,¹²⁶^,¹³⁵ the Netherlands (2 trials),⁹⁷^,¹¹⁸ Germany (2 trials),¹¹⁷^,¹²⁵ and one trial each in Brazil,¹³⁰ Norway⁹⁸ and Turkey.¹³¹

Among the 14 CBT trials, seven (4 new trials) were considered fair quality,¹¹³^,¹¹⁶^,¹¹⁹^,¹²²^–¹²⁴^,¹²⁶^,¹²⁷^,¹³⁰ while the remaining seven (1 new trial) were rated poor quality⁹⁸^,¹¹⁴^,¹¹⁵^,¹¹⁷^,¹²⁰^,¹²¹^,¹³⁶ (Appendix E). Among the remaining trials of biofeedback, relaxation, and guided imagery interventions, all were rated poor quality⁷⁸^,⁹⁷^,¹¹⁸^,¹³¹^,¹³⁵ except for one new biofeedback trial, which was considered to be fair quality.¹²⁵ Methodological shortcomings included lack of blinding in fair-quality and poor-quality trials, and unclear allocation concealment methods, poor compliance, and high attrition in the poor-quality trials. In all trials, the nature of the intervention types precluded blinding of participants.

Psychological Therapies Compared With Usual Care, Waitlist, or Attention Control

Fifteen trials (5 new trials) compared psychological interventions versus usual care, waitlist, or attention control.⁷⁸^,⁹⁷^,⁹⁸^,¹¹³^–¹²¹^,¹²⁴^–¹²⁷ Nine trials were considered poor quality and six [5 new trials]¹¹⁶^,¹¹⁹^,¹²²^–¹²⁷ were considered fair quality. ACT is considered a form of CBT and was included in CBT-specific analyses.

Functional Outcomes. Across all types of psychological interventions, two poor quality trials reported on clinically meaningful improvement in short-term function (Table 37). Significantly more patients in the CBT group attained a clinically important improvement (≥14% on the FIQ total, 0-100 scale) from baseline compared with usual care (RR 2.8, 95% CI 1.3 to 6.1) in one trial,¹¹⁵ but there was no significant difference in a smaller trial (RR 2.2, 95% CI 0.5 to 9.3).¹¹⁴
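The significance convention used for these risk ratios (an estimate is statistically significant when its confidence interval excludes the null value, 1 for an RR and 0 for a mean difference) can be illustrated with a minimal sketch. The function name `excludes_null` is hypothetical and introduced only for illustration; the inputs are the two RRs reported above.

```python
def excludes_null(lower, upper, null_value):
    """Return True when the confidence interval [lower, upper] excludes the null value."""
    return not (lower <= null_value <= upper)

# Risk ratios for clinically important short-term functional improvement (null value = 1)
larger_trial = excludes_null(1.3, 6.1, null_value=1.0)   # RR 2.8, 95% CI 1.3 to 6.1
smaller_trial = excludes_null(0.5, 9.3, null_value=1.0)  # RR 2.2, 95% CI 0.5 to 9.3

print(larger_trial, smaller_trial)
```

The same check with `null_value=0.0` applies to the mean differences reported throughout this section.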

Examining mean differences in followup scores short-term, there was no clear difference in function across psychological therapies versus usual care, waitlist or attention control (5 trials [2 new], pooled difference −2.82 on a 0-100 FIQ total scale, 95% CI −9.79 to 2.81, I²=70.6%).¹¹⁵^,¹¹⁶^,¹¹⁸^,¹¹⁹^,¹²¹^,¹²⁵ Analysis confined to CBT trials (including ACT) showed no clear difference in function compared with usual care or waitlist in the short term (3 trials [1 new], pooled difference −6.14 on a 0-100 scale, FIQ total, 95% CI −16.86 to 3.74, I²=70.6%).¹¹⁵^,¹¹⁶^,¹¹⁹^,¹²¹ Two trials were fair quality (Figure 46). Analyses of differences in change scores on the FIQ yielded estimates of similar magnitude (data not shown). The prior AHRQ review reported a small improvement in function with CBT versus usual care or waitlist based on two trials.¹¹⁵^,¹¹⁶^,¹¹⁹ No differences between groups were seen in the trials of guided imagery (difference 1.2 on a 0-100 FIQ total scale, 95% CI −0.2 to 2.6)¹¹⁸ and EMG biofeedback. In one study of EMG biofeedback versus attention control, median change from baseline was 6.0 for both groups on the Arthritis Impact Measurement Scales (AIMS) physical activity subscale (0-10 scale).⁷⁸ In a new fair-quality trial of EMG biofeedback,¹²⁵ there was no difference on the FIQ as compared with an attention control condition.

At intermediate term, one poor quality trial reported that substantially more CBT patients achieved a clinically important functional improvement (≥14% on the FIQ total, 0-100 scale) compared with usual care (RR 2.9, 95% CI 1.9 to 17.8).¹¹⁵ For analysis of mean differences in intermediate term scores, CBT/ACT was associated with moderate improvement in function (3 trials [1 new], pooled difference −12.82 on 0-100 scale, FIQ total, 95% CI −24.07 to −2.44, I²=94.2%)¹¹³^,¹¹⁵^,¹²²^,¹²³ versus waitlist or usual care. All trials favored CBT (2 fair, 1 poor quality) but differed in magnitude of benefit. In the prior report, the pooled effect size across the two trials of CBT versus usual care was attenuated (small improvement with CBT) and not statistically significant due to heterogeneity (pooled difference −9.35, 95% CI −26.95 to 5.02, I²=84.5%).¹¹³^,¹¹⁵ Both trials individually showed CBT had a statistically greater effect on function than usual care, but the effects differed in magnitude, and this was reported as a small improvement in function in the prior report (Figure 46). Findings from an additional trial suggested a greater improvement in function with CBT compared with attention control based on a 0 to 10 FIQ Physical Impairment Scale (difference −1.8, 95% CI −2.9 to −0.70).¹¹⁷ A new fair-quality trial¹²⁷ of CBT for pain and CBT for insomnia versus waitlist found no difference between groups on the Pain Disability Index. A new fair-quality trial¹²⁶ of a CBT stress management program versus waitlist also found no difference on the West Haven-Yale Multidimensional Pain Inventory (MPI) pain interference subscale. There was no clear difference between biofeedback and usual care on function on the Sickness Impact Profile (SIP) physical score in one trial (mean change −1.6, 95% CI −3.4 to 0.2 versus −0.6, 95% CI −2.9 to 1.7, respectively, on a 0-100 scale).⁹⁷

Data from two poor-quality trials were insufficient to determine the long-term effects of psychological therapies on function. One trial reported that CBT resulted in greater improvement compared with attention control on the FIQ Physical Impairment Scale (difference −1.8 on a 0-10 scale, 95% CI −2.85 to −0.745).¹¹⁷ A trial of biofeedback versus usual care reported the same median change in the AIMS Physical Activity subscale (6.0) in both groups.⁷⁸

Pain Outcomes. Psychological interventions (CBT/ACT and EMG biofeedback) were associated with a small improvement in pain compared with usual care, waitlist, or attention control, based on mean differences at short-term followup (5 trials [1 new], pooled difference −0.62, 95% CI −1.02 to −0.20, I²=0%)⁷⁸^,¹¹⁴^–¹¹⁶^,¹¹⁹^,¹²¹ (Figure 47). Results based on the mean difference of change scores were similar, but not statistically significant (data not shown). The estimate was similar when only trials of CBT were considered (4 trials [1 new], pooled difference −0.62, 95% CI −1.08 to −0.14, plot not shown).¹¹⁴^–¹¹⁶^,¹¹⁹^,¹²¹ One poor quality trial reported no difference between CBT and usual care in the proportion of patients with clinically important improvement in pain short-term (≥30% improvement on 0-10 NRS, RR 1.5, 95% CI 0.4 to 5.7).¹¹⁵ The addition of the new poor quality CBT trial¹²¹ resulted in no changes in conclusions from the prior AHRQ report for short term results.

At intermediate term, one poor quality trial reported no difference in the proportion of patients showing a clinically important improvement in pain (≥30% on 0-10 NRS, RR 1.3, 95% CI 0.4 to 4.2)¹¹⁵; similarly, one new fair quality trial reported no differences in clinically important improvement (≥50% on Brief Pain Inventory) with CBT (8.3%) or EAET (22.5%) versus usual care (12%).¹²⁴ In analyses based on mean differences in scores, psychological interventions (CBT, ACT, EMG biofeedback, and combined CBT and EAET) were associated with a small benefit for pain compared with usual care, attention control or waitlist (7 trials [4 new], pooled difference −0.62, 95% CI −1.14 to −0.09, I²=65.7%),⁹⁷^,¹¹³^,¹¹⁵^,¹²²^–¹²⁴^,¹²⁶^,¹²⁷ (Figure 47). Effect sizes at intermediate term were slightly smaller in a subanalysis of therapies versus usual care only (3 trials [1 new], pooled difference −0.52, 95% CI −1.4 to −0.15).⁹⁷^,¹¹³^,¹¹⁵ Pooling only the six CBT trials, the effect was slightly smaller (6 trials [4 new], pooled difference −0.55, 95% CI −1.12 to 0.06)¹¹³^,¹¹⁵^,¹²²^–¹²⁴^,¹²⁶^,¹²⁷ with no clear difference between CBT and usual care, waitlist or attention control. Similarly, there was no clear difference in a subanalysis confined to the five fair quality trials, all of which were of CBT (5 trials [4 new], pooled difference −0.48, 95% CI −1.11 to 0.24).¹¹³^,¹²²^–¹²⁴^,¹²⁶^,¹²⁷ In the prior AHRQ report, there was no clear difference between CBT and usual care across two studies although each tended to favor CBT. The addition of the four new fair quality studies does not change the conclusion of no clear difference. In one new trial, the author-developed EAET, compared with attention control, was not associated with lower pain intensity at intermediate term based on the proportion of patients achieving a 50 percent or greater reduction in pain (22.5% vs. 12.0%, p=0.07) or the mean difference in pain scores using the Brief Pain Inventory 0-10 scale (−0.54, 95% CI −1.2 to 0.1), but was associated with improved fibromyalgia symptoms (difference −2.9, 95% CI −4.9 to −0.8 on the FM symptom scale, scale unclear).¹²⁴

Three trials⁷⁸^,⁹⁸^,¹²⁰ reported long term effects on pain. A pooled analysis of two of these trials found no difference between these psychological therapies (CBT or biofeedback/relaxation training) and attention control or usual care (2 trials, pooled difference 0.04, 95% CI −0.89 to 0.98, I²=0%)⁷⁸^,⁹⁸; however, evidence across these two poor-quality trials was considered insufficient (Figure 47). The third trial found no difference between CBT and usual care in the proportion of participants achieving a clinically meaningful change of 12 points from baseline on the McGill Pain Questionnaire (MPQ) Sensory Scale (RR 0.54, 95% CI 0.14 to 2.2).¹²⁰

Other Outcomes. Results for secondary outcomes were mixed across trials of CBT and ACT (Table 36). Five trials were fair quality;¹¹⁶^,¹¹⁹^,¹²²^–¹²⁴^,¹²⁶^,¹²⁷ the rest were poor quality.

In one fair-quality trial of ACT versus waitlist there were no differences between groups over the short term on the BDI, STAI-State scale or Short-Form-36 (SF-36) PCS; ACT was associated with improvement in the SF-36 MCS.¹¹⁶^,¹¹⁹ In a new fair-quality trial of EMG biofeedback,¹²⁵ there was no difference on SF-36 scores compared with an attention control condition.

Five fair-quality trials of CBT/ACT reported intermediate term outcomes. A comparison of CBT versus usual care found no differences on the Hamilton Depression Rating Scale (HAM-D) and Hamilton Anxiety Rating Scale (HAM-A).¹¹³ A new trial of ACT versus waitlist found a benefit of ACT for the 0-100 EQ5D VAS health status rating (difference 12.2, 95% CI 7.9 to 16.5), Hospital Anxiety and Depression Scale-Anxiety (HADS-A) (difference −3.42, 95% CI −4.7 to −2.1), and Hospital Anxiety and Depression Scale-Depression (HADS-D) (difference −3.5, 95% CI −4.4 to −2.5).¹²²^,¹²³ A new trial of CBT versus education attention control¹²⁴ found no difference on the Short Form-12 Physical scale, Satisfaction with Life Scale, Pittsburgh Sleep Quality Index (PSQI), Positive Affect Negative Affect Schedule (PANAS)-positive score, PANAS-negative score, Center for Epidemiologic Studies Depression Scale (CES-D), Generalized Anxiety Disorder-7, or PROMIS Fatigue Short-Form. A new fair-quality trial of CBT for insomnia, CBT for pain, and waitlist found benefits of both CBT interventions for measures of sleep, but not depression or anxiety.¹²⁷ A new fair-quality trial of CBT stress management versus waitlist found benefits of CBT for measures of affective distress and depression, but not sleep.¹²⁶ Across the poor-quality trials, results were mixed across various secondary outcome measures (Table 36).

Two poor-quality studies compared EMG biofeedback to attention control conditions; neither found differences on secondary outcomes, including the Symptoms Checklist 90-Revised Global Severity Index, SIP psychosocial score, global assessment of well-being, CES-D, and a sleep scale.⁷⁸^,⁹⁷

Psychological Therapies Compared With Pharmacological Therapy

Three fair-quality trials¹¹³^,¹²²^,¹²³^,¹³⁰ and one poor-quality trial¹³¹ compared a psychological therapy with pharmacological treatment. Two small trials reported functional outcomes over the short term with differing results. No effect was seen for CBT (plus amitriptyline) compared with amitriptyline alone at 3 months in one fair-quality trial (difference −4.10, 95% CI −18.40 to 10.20 on the FIQ total score [0 to 100 scale]).¹³⁰ One poor-quality trial, comparing EEG biofeedback with escitalopram, reported improved mean FIQ total scores (0-100 scale) in the biofeedback group at 4 to 5 months followup (difference −29.00, 95% CI −38.58 to −19.42).¹³¹ Substantial heterogeneity of the interventions, the medication comparators and quality of the trials precluded meaningful pooling for this outcome (Figure 48).

Intermediate-term function was reported by two fair-quality trials (1 new trial)¹¹³^,¹²²^,¹²³; both found benefits for CBT (including ACT) compared with pregabalin (plus duloxetine for depressed patients) according to the FIQ Total scale (0-100). One found a small improvement in function favoring CBT (difference −4.00 on a 0-100 scale, 95% CI −7.44 to −0.56)¹¹³; the other found a moderate improvement in function associated with CBT (difference −15.62, 95% CI −19.03 to −12.21).¹²²^,¹²³ The pooled estimate suggests a small improvement in function, although the confidence interval included zero and heterogeneity was substantial owing to the differences in effect magnitudes (pooled difference −9.81, 95% CI −23.83 to 4.21, I²=96%) (Figure 48). It is unclear how many patients in the pharmacological group received concomitant duloxetine for major depressive disorder.
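As a rough illustration of how two such trial estimates combine, the sketch below pools the two reported differences with a DerSimonian-Laird random-effects model. This is an assumption made for illustration, not necessarily the estimator used in this report (see the Methods section); standard errors are back-calculated from the reported 95% CIs on the assumption that they are symmetric normal-approximation intervals. The function name `dersimonian_laird` is hypothetical.

```python
import math

def dersimonian_laird(estimates, ci_bounds, z=1.96):
    """Pool mean differences with a DerSimonian-Laird random-effects model.

    Standard errors are back-calculated from the 95% CIs, which assumes the
    reported intervals are symmetric normal-approximation intervals.
    """
    variances = [((upper - lower) / (2 * z)) ** 2 for lower, upper in ci_bounds]
    w = [1 / v for v in variances]                      # fixed-effect weights
    fixed = sum(wi * y for wi, y in zip(w, estimates)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, estimates))  # Cochran's Q
    df = len(estimates) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)                       # between-trial variance
    w_re = [1 / (v + tau2) for v in variances]          # random-effects weights
    pooled = sum(wi * y for wi, y in zip(w_re, estimates)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0       # I-squared as a proportion
    return pooled, (pooled - z * se, pooled + z * se), i2

# Differences on the 0-100 FIQ total scale reported by the two trials
pooled, ci, i2 = dersimonian_laird(
    estimates=[-4.00, -15.62],
    ci_bounds=[(-7.44, -0.56), (-19.03, -12.21)],
)
print(round(pooled, 2), [round(b, 2) for b in ci], round(100 * i2, 1))
```

Under these assumptions the sketch gives a pooled difference of approximately −9.8 with I² of roughly 95%, close to the reported values. Its Wald-type interval is narrower than the reported −23.83 to 4.21, which would be consistent with the report using a different interval method; both intervals include zero.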

No differences in pain short-term were seen between groups in the trial of CBT versus amitriptyline (difference −0.7 on a 0-10 VAS, 95% CI −2.8 to 1.4),¹³⁰ whereas a moderate improvement was seen for EEG biofeedback compared with escitalopram (difference −2.7 on a 0-10 VAS, 95% CI −3.7 to −1.7) in the poor-quality trial.¹³¹ Trials were not pooled given heterogeneity of both the intervention and medication comparators.

At intermediate term, no difference between CBT/ACT and pregabalin was observed (2 trials [1 new], pooled difference −0.31, 95% CI −1.15 to 0.51, I²=63.5%).¹¹³^,¹²²^,¹²³

Regarding secondary outcomes, EEG biofeedback was associated with significantly better outcomes on various measures of anxiety, depression, and quality of life compared with escitalopram short term in the poor-quality trial.¹³¹ The two fair-quality trials evaluating CBT (versus amitriptyline and versus pregabalin)¹¹³^,¹³⁰ found no differences between groups over the short or intermediate term, with the exception of a benefit of CBT for SF-36 Mental Health scores at short-term followup in one trial (difference 13.7 on a 0-100 scale, 95% CI 0.07 to 27.3).¹³⁰ In the fair quality trial of ACT versus pregabalin (plus duloxetine for patients who were depressed), at intermediate term there was a benefit of ACT on the EQ-5D VAS measure of self-assessed health state (0-100 scale, with higher scores indicating better health; difference 9.6, 95% CI 5.2 to 14.0); the 0-21 HADS-A anxiety scale (difference −1.0, 95% CI −1.8 to −0.06); and the 0-21 HADS-D depression scale (difference −1.7, 95% CI −2.6 to −0.8). Across the two studies of CBT versus pregabalin (plus duloxetine as needed),¹¹³^,¹²²^,¹²³ there was no difference between therapies on depression (measured by the HADS depression scale and the Hamilton Depression scale) intermediate term (difference −0.43, 95% CI −1.13 to 0.28, I²=93%). Two trials examined effects of CBT versus pregabalin (plus duloxetine as needed) on measures of anxiety, with no difference between therapies at intermediate-term followup (difference −0.23, 95% CI −0.69 to 0.23, I²=0%).

Psychological Therapies Compared With Exercise

Five poor-quality trials compared psychological interventions with exercise; two trials evaluated CBT,⁹⁸^,¹³⁶ two trials evaluated biofeedback,⁷⁸^,⁹⁷ and one evaluated relaxation training¹³⁵ (Table 36). All trials were included in the prior AHRQ report.

Data were insufficient from one poor-quality trial to determine the effects of biofeedback versus combination exercise on function. The trial reported improved function based on the AIMS physical activity subscale (median change from baseline 6.0 versus 4.0, p<0.05).⁷⁸ Intermediate-term data from two poor-quality trials were insufficient to determine effects of psychological therapies on function and no clear differences in function were seen for CBT (difference −0.6, 95% CI −12.6 to 11.4 on 0-100 FIQ total score)¹³⁶ or biofeedback (mean change −1.6, 95% CI −3.4 to 0.2 vs. −0.6, 95% CI −2.9 to 1.7 on 0-100 SIP Physical score)⁹⁷ versus combination exercise. Similarly, no clear differences between psychological therapies and exercise were seen across three trials at longer term and evidence was considered insufficient. Results from two trials were not statistically significant (CBT vs. combination exercise [difference 0.1, 95% CI −10.5 to 10.7 on 0-100 FIQ total scale]¹³⁶ and relaxation training versus strength training [difference −1.7, 95% CI −9.3 to 5.9, on 0-100 FIQ Total Score]).¹³⁵ The third trial of biofeedback versus combination exercise reported improvement in function, but limited data were provided (median change from baseline, 6.0 versus 4.0, p<0.05).⁷⁸

Data were insufficient from one poor-quality trial to determine the effects of biofeedback versus combination exercise on pain (median change from baseline, 5.2 vs. 5.4 on 0-10 VAS).⁷⁸ Across two poor-quality trials at intermediate term, no clear differences were seen for CBT (difference −1.0, 95% CI −2.8 to 0.8)¹³⁶ or biofeedback (mean change −0.6, 95% CI −6.5 to 5.3 vs. −5.5, 95% CI −10.9 to −0.1, p=not statistically significant [NS])⁹⁷ compared with combination exercise; evidence was considered insufficient. There were no clear differences between any of the psychological therapies and exercise for pain on a 0 to 10 scale across four trials long term, including CBT versus combination exercise (difference 0.3, 95% CI −2.0 to 1.3)¹³⁶ or aerobic exercise (difference 2, 95% CI −11.6 to 15.6),⁹⁸ biofeedback versus combination exercise (median change: 5.2 vs. 5.5, p=NS),⁷⁸ and relaxation training versus strength training (difference 2.9, 95% CI −5.5 to 11.3).¹³⁵

There were generally no significant differences on measures of mental health, depression or anxiety, or on SF-36 scales, at any time frame across five poor-quality trials.⁷⁸^,⁹⁷^,⁹⁸^,¹³⁵^,¹³⁶ Some trials did not provide data for determination of effect sizes between treatment groups or report results of significance tests (Table 36).

Harms

Only seven trials (3 fair-quality and 4 poor-quality, 2 new) reported harms, which were poorly described in general. Two trials compared CBT with usual care; in one, there were no withdrawals due to adverse events in the CBT group compared with two (3.6%) in the control group (not further described)¹¹³ and in the other there were two withdrawals, one in each group, due to painfulness of the nociceptive flexion reflex test used as an outcome measure (not as part of treatment).¹¹⁴ Two trials compared psychological therapies with attention controls. One trial reported that 4.8 percent of patients in the CBT group versus 50 percent in the control group withdrew from the study (withdrawal attributed to depression [CBT group] and symptom worsening [control group]).¹¹⁷ The other trial (a new trial) reported no adverse events for CBT or attention control (education) but did note that brief symptom exacerbation (i.e., increased pain or sleep problems) was occasionally reported by patients who received the EAET intervention¹²⁴; 4% of patients in the CBT and EAET groups (vs. 2.6% in the control group) withdrew due to treatment not of interest or fit and one (1.3%) patient in the CBT group withdrew after being diagnosed with cancer. In another trial that compared CBT with waitlist,¹²²^,¹²³ 5.9% and 3.9% of CBT patients withdrew due to lack of efficacy or patient decision, respectively, compared with no patients in the waitlist group. One trial of stress management versus usual care reported one withdrawal due to cancer (unrelated to the treatment) in the intervention group compared with no withdrawals or adverse events in the control group.⁹⁸

Two of the above trials also compared psychological therapy to pharmacological therapy, specifically pregabalin (with duloxetine as needed). One trial evaluated CBT and reported no withdrawals due to adverse events in the CBT group compared with three (5.5%) in the pharmacotherapy group (2 due to digestive problems and 1 due to dizziness).¹¹³ An additional new trial compared ACT versus pregabalin and reported withdrawal due to lack of efficacy (5.9% vs. 1.9%, respectively) or patient decision (3.9% vs. 0%, respectively); adverse events, reported in the pregabalin group only, included nausea (25%), dry mouth (23%), drowsiness, headache and fatigue (21% each) and constipation (19%).¹²²^,¹²³

Two trials of psychological therapies versus exercise reported harms. One trial reported no adverse effects with relaxation therapy, but five (7.5%) adverse effect reports following strengthening exercises (due to increased pain), resulting in three withdrawals (out of 67 randomized) from the trial.¹³⁵ The other trial reported one withdrawal due to cancer (unrelated to the treatment) in the intervention group compared with three withdrawals in the exercise group (1 death, 1 gastritis, 1 ischialgia).⁹⁸

Physical Modalities for Fibromyalgia

Key Points

One fair-quality parallel trial found no differences between magnetic mattress pads compared with sham or usual care in intermediate-term function (difference on the 0 to 80 scale FIQ −5.0, 95% CI −14.1 to 4.1 vs. sham and −5.5, 95% CI −14.4 to 3.4 vs. usual care) or pain (difference −0.6, 95% CI −1.9 to 0.7 and −1.0, 95% CI −2.2 to 0.2, respectively on a 0 to 10 NRS) (SOE: low). Data from one small, poor-quality crossover trial were insufficient to determine the effects of a magnetic mattress versus sham on function and pain in the short term (SOE: insufficient).
There were no differences in adverse events between the functional and sham magnetic mattress pad groups (data not reported); none of the events were deemed to be related to the treatments (SOE: low).

Detailed Synthesis

Two trials,¹⁶⁷^,¹⁶⁸ one parallel and one cross-over design, evaluating the efficacy of magnetic fields for the treatment of fibromyalgia met inclusion criteria (Table 38 and Appendix D). Both trials were included in the prior AHRQ report. In both trials, the majority of patients were female (93% and 100%) and the mean ages were 45 and 50 years; symptom duration was 6 years in one trial and was not reported by the other trial. Due to the differences in trial designs we could not pool the data; therefore, these trials are reported separately.

One parallel trial (N=119),¹⁶⁷ conducted in the United States, compared two different magnetic mattress pads (one with a low, uniform magnetic field of negative polarity and the other a low, static magnetic field that varied spatially and in polarity) versus sham (mattress pads with demagnetized magnets) and versus usual care (management by primary care provider). All pads were used for 6 months and outcomes were measured immediately post-treatment. This trial was rated fair quality due to deviations from the randomization protocol and high attrition rate (21%) (Appendix E).

A second small, crossover trial (N=33)¹⁶⁸ evaluated the effects of an extremely low frequency magnetic mattress compared with a sham mattress (no magnetic field delivered). The trial was conducted in Italy. The intervention periods were 1 month and the washout period between the first and second period was 1 month; no further information was provided about the washout period. Outcomes were measured 1 month after the end of each treatment cycle (i.e., at the beginning of the second treatment cycle, after a 1 month washout, and 1 month after the end of the second treatment cycle). This trial was rated poor quality due to unclear randomization sequence generation and allocation concealment, and loss-to-followup of greater than 20% through the second treatment period; additional sources of bias in this crossover trial include no details regarding handling of missing data and no analysis of carryover effect.

Physical Modalities Compared With Usual Care or Sham

The magnetic mattress pads offered no intermediate-term benefit for either function or pain compared with either sham or usual care in the one parallel trial.¹⁶⁷ The difference between groups on the 0 to 80 scale FIQ at 6 months was −5.0 (95% CI −14.1 to 4.1) (versus sham) and −5.5 (95% CI −14.4 to 3.4) (versus usual care). Regarding pain, the between-group differences were −0.6 (95% CI −1.9 to 0.7) and −1.0 (95% CI −2.2 to 0.2), respectively, on a 0 to 10 NRS. When the intervention groups were considered separately, only the magnetic mattress pad designed to expose the body to a uniform magnetic field of negative polarity resulted in lower FIQ and NRS pain scores compared with controls; however, the differences between groups were not statistically significant.

The crossover trial¹⁶⁸ reported statistically significant improvement in both function and pain favoring the magnetic mattress 1 month after the end of both treatment periods (i.e., over the short term); however, the evidence is considered insufficient. For patients who received magnetic therapy during the first and second (i.e., after crossing over) treatment periods, mean FIQ scores were 19.2 and 25.1 on a 0-100 scale, respectively, compared with 57.9 and 53.9 for those receiving sham during the same treatment periods (p<0.001 for both). For VAS pain, respective scores were 2.2 and 3.1 versus 5.3 and 4.6 on a 0-10 scale (p<0.001 for both). Results were similar for both the Fibromyalgia Assessment Scale and the Health Assessment Questionnaire (Table 37).

Physical Modalities Compared With Pharmacological Therapy or Exercise

No trial of physical modality versus pharmacological therapy or versus exercise met inclusion criteria.

Harms

In the parallel trial, there were no differences in adverse events between the magnetic mattress pad and sham pad groups.¹⁶⁷ Type of adverse events was not reported, but none of the events were judged to be due to magnetic treatments. The crossover trial only stated that no side effects were recorded during the study.¹⁶⁸

Manual Therapies for Fibromyalgia

Key Points

Myofascial release therapy was associated with a small improvement in intermediate-term function as measured by the FIQ (mean 58.6 [standard deviation, SD, 16.3] vs. 64.1 [SD 18.1] on a 100 point scale, p=0.048 for the group effect in repeated measures analysis of variance [ANOVA]), but not long-term function (mean 62.8 [SD 20.1] vs. 65.0 [SD 19.8], p=0.329), compared with sham in one fair-quality trial (SOE: low). Short-term function was not reported.
There was insufficient evidence to determine the effects of myofascial release therapy on short-term pain (1 poor-quality trial) and intermediate-term pain (1 fair-quality and 1 poor-quality trial) compared with sham; there were inconsistencies in effect estimates between the intermediate-term trials (SOE: insufficient).
Myofascial release therapy was associated with small improvement in pain long term compared with sham, based on the sensory domain (mean 18.2 [SD 8.3] vs. 21.2 [SD 7.9] on a 0-33 scale, p=0.038 for group by repeated measures ANOVA) and evaluative domain (mean 23.2 [SD 7.6] vs. 26.7 [SD 6.9] on a 0-42 scale, p=0.036) of the MPQ in one fair-quality trial; there were no differences for the affective domain of the MPQ or for VAS pain (SOE: low).
Data were insufficient for harms; however, no adverse effect occurred in one fair-quality trial (SOE: insufficient).

Detailed Synthesis

Two trials (N=64 and 94)¹⁸⁵^,¹⁸⁶ evaluating myofascial release therapy versus sham therapy for fibromyalgia met inclusion criteria (Table 39 and Appendix D). Both trials were included in the prior AHRQ report. Mean patient ages were 48 and 55 years. Baseline pain history characteristics were poorly described in both trials. The duration of myofascial release therapy was 20 weeks in both trials; sessions ranged in length from 60 to 90 minutes and were conducted twice or once a week. The sham conditions included short-wave and ultrasound electrotherapy or sham (disconnected) magnotherapy. Both trials reported intermediate-term outcomes; short-term and long-term outcomes were also reported by one trial each. One trial was rated fair quality and the other poor quality (Appendix E). Unclear allocation concealment methods and lack of blinding were the major methodological shortcomings in both trials. Additionally, the poor-quality trial did not describe the randomization process employed.

Myofascial Release Therapy Compared With Sham

Myofascial release therapy was associated with a small improvement in intermediate-term function compared with sham as measured by the FIQ (mean 58.6 [standard deviation, SD 16.3] vs. mean 64.1 [SD 18.1] on a 100 point scale, p=0.048 for the group by time effect in repeated measures ANOVA) in one fair-quality trial¹⁸⁵; this effect did not persist to the long term (62.8 [SD 20.1] vs. 65.0 [SD 19.8], p=0.329, at 12 months). Function was not reported over the short term.
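To put the intermediate-term FIQ difference on a standardized scale, the sketch below computes the raw difference in means and a Cohen's d from the reported means and SDs. This is only an illustration: group sizes are not given in this passage, so the pooled SD assumes equal group sizes, and the calculation is not the trial's own analysis (which used repeated measures ANOVA). The function name `standardized_mean_difference` is hypothetical.

```python
import math

def standardized_mean_difference(mean_a, sd_a, mean_b, sd_b):
    """Cohen's d using a pooled SD that assumes equal group sizes."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd

# Intermediate-term FIQ (100 point scale): myofascial release vs. sham
raw_difference = 58.6 - 64.1
d = standardized_mean_difference(58.6, 16.3, 64.1, 18.1)

print(round(raw_difference, 1), round(d, 2))
```

Under the equal-group-size assumption, the raw difference of −5.5 points corresponds to a standardized difference of roughly −0.3.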

Regarding pain outcomes, one poor-quality trial reported a small effect for myofascial release compared with sham therapy over the short term (mean 8.4 vs. mean 9.4 on a 0-10 VAS at 1 month, p=0.048 for group by time repeated measures ANOVA).¹⁸⁶ Intermediate-term results were inconsistent across the trials as measured on a 0 to 10 VAS pain scale with one fair-quality trial reporting a small improvement in pain for myofascial release versus sham (mean 8.25 [SD 1.13] vs. mean 8.94 [SD 1.34], p=0.043)¹⁸⁵ at 6 months and the other (poor quality) reporting no significant difference between groups (8.8 vs. 9.7, p=NS) (Figure 49).¹⁸⁶ Additional pain measures were reported over the intermediate-term by the fair-quality trial, all of which showed a small benefit in favor of myofascial release: FIQ pain (8.5 [SD 0.7] vs. 8.0 [SD 1.3], p=0.042 for group by time repeated measures ANOVA) and the MPQ sensory (17.3 [SD 7.8] vs. 20.7 [SD 7.1] on a 0-33 scale, p=0.04), affective (4.5 [SD 2.9] vs. 5.2 [SD 3.8] on a 0-12 scale, p=0.04) and evaluative (21.9 [SD 7.2] vs. 26.2 [SD 6.8] on a 0-42 scale, p=0.02) dimensions.¹⁸⁵ This effect persisted at long-term followup for the sensory and evaluative dimensions of the MPQ only; no differences were seen between groups regarding VAS pain or the affective dimension of the MPQ at long-term followup in this trial (Table 38).

Depression, anxiety, and sleep outcomes were evaluated in one poor-quality trial, with significant improvement seen short term in the myofascial release versus the sham group on some subscales of the Short-Form-36 and on the sleep duration subscale of the PSQI,¹⁸⁶ but no differences between groups on the STAI or BDI (Table 38); at intermediate followup, only PSQI sleep duration was significantly improved following myofascial release versus sham.

Manual Therapy Compared With Pharmacological Therapy or Exercise

No trial of manual therapy versus pharmacological therapy or versus exercise met inclusion criteria.

Harms

In one trial, no patient experienced an adverse effect (details not reported).¹⁸⁵ No information on harms was reported by the other trial.

Mindfulness Practices for Fibromyalgia

Key Points

No clear short-term effects of MBSR were seen on function compared with waitlist or attention control (difference 0 to 0.06 on a 0-10 scale) in two trials (one fair and one poor quality). Clinically meaningful improvement in function (≥14% on the FIQ total, 0-100 scale) was not different for MBSR versus either comparator (SOE: moderate).
No clear short-term effects of MBSR were seen on pain (difference 0.1 on a 0-100 VAS pain scale in one poor quality trial; difference −1.38 to −1.59 on the affective and −0.28 to −0.71 on the sensory dimension [scales not reported] of the Pain Perception Scale in one fair-quality trial) compared with waitlist or attention control in two trials (SOE: moderate). Intermediate-term and long-term outcomes were not reported.
In one new trial, meditation awareness training (MAT) was associated with a small intermediate-term improvement in function (adjusted difference −7.9, 95% CI −8.2 to −4.3 on FIQ 0-100 scale) and a small improvement in pain (adjusted difference −3.0, 95% CI −4.1 to −1.9 on the 0-45 SF-MPQ Pain Perception Index) versus attention control (SOE: low).
No trial of mindfulness practices versus pharmacological therapy or versus exercise met inclusion criteria.
Harms were not reported.

Detailed Synthesis

We identified three trials (4 publications) of mindfulness practices for fibromyalgia that met inclusion criteria (Table 40 and Appendix D).²⁰⁰^–²⁰³ Two trials (3 publications)²⁰⁰^–²⁰² of mindfulness-based stress reduction (MBSR) practices were included in the prior AHRQ report and one new trial²⁰³ of “Meditation Awareness Training” (MAT) was included for this update. In both MBSR trials, the intervention was modeled after the program developed by Kabat-Zinn. The intervention lasted 8 weeks, with weekly 2.5-hour sessions, daily homework assignments, and a single 7-hour session. Sample sizes ranged from 90 to 168 (total sample=406), age ranged from 48 to 53 years, and all participants were female. Both studies compared MBSR versus waitlist control; one trial²⁰¹ also compared MBSR to an attention control group that consisted of education, relaxation, and stretching. Both studies reported only short-term outcomes. One study was conducted in the United States²⁰⁰^,²⁰² and the other in Germany.²⁰¹ The third trial (N=148, mean age 47, 83% female) compared MAT, a mindfulness-based intervention, with an attention control condition (education only).²⁰³ MAT consisted of one 2-hour session per week for 8 weeks plus a CD of guided meditations to facilitate daily practice. Weekly sessions in the attention control condition included a presentation, a facilitated group discussion, and guided educational exercises, with no practice or discussion of meditation. This trial was conducted in England.

Two trials (1 MBSR and 1 MAT) were considered fair quality²⁰¹,²⁰³ and the other MBSR trial was considered poor quality²⁰⁰,²⁰² (Appendix E). Methodological shortcomings in all trials were the lack of long-term followup and the inability to blind patients and providers. The poor-quality study also had a high rate of overall attrition as well as differential attrition between the groups.

Mindfulness Practices Compared With Waitlist or Attention Control

There were no clear short-term effects of MBSR on any function or pain measure reported compared with waitlist or attention control. Both trials compared MBSR to waitlist and reported function using the FIQ; one reported the physical function subscale (difference 0 on a 0-10 scale, 95% CI −0.32 to 0.32)²⁰⁰ and the other reported the total score (difference −0.06 on a 0-10 scale, 95% CI −0.75 to 0.63).²⁰¹ The latter fair-quality trial also reported the proportion of patients who achieved a 14 percent or greater improvement in FIQ total scores: 30 percent versus 22 percent, RR 1.37 (95% CI 0.83 to 1.94).²⁰¹ Regarding pain, one trial reported a mean difference of 0.1 (95% CI −9.96 to 10.16) on a 0 to 100 VAS pain scale²⁰⁰ between the MBSR and waitlist groups, while the other reported on affective (difference −1.59, 95% CI −5.01 to 1.83) and sensory (difference −0.28, 95% CI −2.30 to 1.74) domains of the Pain Perception Scale (scale not reported).²⁰¹ Estimates for function and pain were similar for the comparison of MBSR versus attention control in the fair-quality trial²⁰¹ (Table 40). The new fair-quality trial of MAT versus educational attention control reported only intermediate-term outcomes. There were small improvements in function on the 0-100 FIQ-R (adjusted difference −7.9, 95% CI −8.24 to −4.25) and in pain on the 0-45 SF-MPQ Pain Perception Index (adjusted difference −3.0, 95% CI −4.1 to −1.9) associated with MAT compared with attention control.²⁰³

Secondary outcomes (measures of depression, anxiety, sleep, fatigue) did not differ significantly between MBSR and waitlist or attention control in either trial²⁰⁰–²⁰² (Table 40). The fair-quality trial compared medication use (analgesics, antidepressants, and sleep medication) between baseline and short-term followup; only antidepressant medication was reduced significantly from baseline (46% to 35%, p=0.01) but there was no group effect (data not reported).²⁰¹ In the trial of MAT versus education attention control,²⁰³ there was an intermediate-term benefit for MAT on the 0-21 PSQI sleep measure (adjusted difference −2.3, 95% CI −2.9 to −1.6) and the 0-100 DASS measure of depression, anxiety and stress (adjusted difference −4.9, 95% CI −6.3 to −3.4).

Mindfulness-Based Stress Reduction Therapy Compared With Pharmacological Therapy or Exercise

No trial of MBSR versus pharmacological therapy or versus exercise met inclusion criteria.

Harms

None of the three trials reported harms.

Mind-Body Therapy for Fibromyalgia

Key Points

Over the short term, two trials of mind-body practices reported small improvement in function for qigong compared with waitlist (difference −7.5, 95% CI −13.3 to −1.68) and for tai chi compared with attention control (difference −23.5, 95% CI −30 to −17) based on 0 to 100 scale total FIQ score; heterogeneity may be explained by duration and intensity of intervention and control conditions. Significantly more participants in the tai chi group also showed clinically meaningful improvement on total FIQ (RR 1.6, 95% CI 1.1 to 2.3) consistent with a small effect (SOE: low).
Qigong and tai chi were associated with moderately greater improvement in pain (0-10 scale) compared with waitlist and attention control in the short term (2 trials, pooled difference −1.44, 95% CI −2.96 to −0.23, I²=46%). Significantly more participants in the tai chi group also showed clinically meaningful improvement on VAS pain (RR 2.0, 95% CI 1.1 to 3.8) consistent with a small effect (SOE: low).
There was no evidence regarding effects of mind-body practices versus waitlist or attention control in the intermediate or long term.
In one new trial, compared with aerobic exercise, tai chi was associated with a small improvement in function 3 to 6 months postintervention (difference in change scores −5.5, 95% CI −0.6 to −10.4, FIQ-R 0-100 scale), but the effect did not persist from intermediate to longer term (6-12 months) (difference in change scores −2.7, 95% CI −2.3 to 7.7) (SOE: low). Analyses confined to two 60-minute sessions of tai chi per week for 24 weeks versus a comparable number of aerobic exercise sessions per week suggest moderate functional improvement at intermediate term (difference in change scores −16.2, 95% CI −8.7 to −23.6, 0-100 FIQ-R scale) that was sustained long-term (difference in change scores −11.1, 95% CI −2.7 to −19.6). There were no differences between tai chi overall and exercise with regard to opioid use at intermediate (OR 0.89, 95% CI 0.28 to 2.80) or long term (OR 1.08, 95% CI 0.33 to 3.51).
Data for harms were insufficient. However, one trial reported two adverse events (in two patients) judged to be possibly related to qigong practice: an increase in shoulder pain and plantar fasciitis; neither participant withdrew from the study. One trial of tai chi reported no adverse events while the second (new) trial reported that, across all intensities of tai chi vs. aerobic exercise, there were no severe treatment-related adverse events and 5.3% (8/151) versus 5.3% (4/75) mild/moderate treatment-related adverse events, respectively (SOE: insufficient).

Detailed Synthesis

Three trials²¹⁷,²¹⁸,²²³ that evaluated mind-body therapies for fibromyalgia met inclusion criteria (Table 41 and Appendix D). Two trials were included in the prior AHRQ report²¹⁷,²¹⁸ and one was added for this update.²²³ Sample sizes ranged from 66 to 226 (total sample=392). Across trials, the participants were predominantly female (87% to 96%), with mean ages between 51 and 52 years. Prior to study enrollment, participants in both trials were being treated with several drugs from major analgesic and adjuvant drug groups such as analgesics/NSAIDs (53% to 73%), antidepressants (35% to 48%), and anticonvulsants (21% to 27%); in one trial, approximately 30 percent of participants were taking opioids and many participants had tried a variety of other therapies (including acupuncture, chiropractic, naturopathic/homeopathic/osteopathic therapies, massage therapy, and psychological therapies).²¹⁷

One trial compared Qigong (3 consecutive half-day training sessions, then weekly practice/review sessions for 8 weeks plus daily at-home practice for 45 to 60 minutes) to a waiting list control condition.²¹⁷ Another trial compared tai chi (two 60-minute sessions/week for 12 weeks) to an attention control condition (40 minutes of wellness education and 20 minutes of supervised stretching exercises).²¹⁸ In the Qigong trial, the mean self-reported practice time per week for all participants who completed the trial was 4.9 hours at 2 months, 2.9 hours at 4 months, and 2.7 hours at 6 months.²¹⁷ In the tai chi study, the average percent of sessions attended during the 12-week intervention was 77 percent for the tai chi group and 70 percent for the control group.²¹⁸ The third trial²²³ compared four different intensities (one 60-minute session/week for 12 weeks vs. two 60-minute sessions/week for 12 weeks vs. one 60-minute session/week for 24 weeks vs. two 60-minute sessions/week for 24 weeks) of Yang style tai chi to an aerobic exercise intervention consisting of two 60-minute sessions per week for 24 weeks. Patients in the tai chi group attended 62% of all possible classes (67% vs. 65% vs. 57% vs. 58% by intensity, respectively) and those in the exercise group attended 40%. In all three trials, patients were instructed to continue the practice at home throughout the followup period. The two trials comparing Qigong and tai chi with a waitlist and an attention control reported only short-term outcomes while the third trial comparing tai chi with exercise reported intermediate-term and long-term outcomes. Both tai chi trials were conducted in the United States²¹⁸,²²³ and the Qigong trial in Canada.²¹⁷

All trials were rated fair quality (Appendix E). Due to the nature of the intervention and control groups, blinding was not possible in these trials. Other methodological concerns included unacceptable attrition overall (30% at 12 months) and differential attrition (e.g., 11% in the most frequent tai chi group vs. 24% in the comparable exercise group at 12 months) in the new tai chi trial and differential attrition between groups in the Qigong trial (intervention 19% vs. waitlist 4% at 6 months).²¹⁷

Mind-Body Therapies Compared With Waitlist or Attention Control

Both trials were included in the prior AHRQ report. Short-term improvement in function on 0 to 100 scale total FIQ score was reported for qigong (small improvement, difference −7.51, 95% CI −13.33 to −1.69)²¹⁷ and for tai chi (substantial improvement, difference −23.50, 95% CI −29.98 to −17.02)²¹⁸ compared with waitlist or attention control. Substantial heterogeneity (I²=92%) precluded meaningful pooling for this outcome (Figure 50). Significantly more participants in the tai chi group also showed clinically meaningful improvement (reduction of ≥8.1 points from baseline) on total FIQ (RR 1.6, 95% CI 1.1 to 2.3), consistent with a small effect. Tai chi and qigong were associated with a moderate improvement in pain (0 to 10 scale) compared with waitlist or attention control (2 trials, pooled difference −1.44, 95% CI −2.96 to −0.23, I²=45.6%) (Figure 51). Significantly more participants in the tai chi group also showed clinically meaningful improvement (reduction of ≥2 points from baseline) in VAS pain (RR 2.0, 95% CI 1.1 to 3.8), consistent with a small effect. Heterogeneity may in part be due to differences in duration and intensity of the intervention.
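
The heterogeneity reported for total FIQ can be roughly checked from the two published estimates. The following is a minimal sketch, not the review's own analysis code: it assumes the 95% CIs are symmetric normal-theory intervals, back-calculates standard errors from them, and computes Cochran's Q and I² with inverse-variance weights. The function names `se_from_ci` and `i_squared` are illustrative only.

```python
Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def se_from_ci(lower, upper):
    """Back-calculate a standard error, assuming a symmetric 95% CI."""
    return (upper - lower) / (2 * Z_95)


def i_squared(estimates, std_errors):
    """Return (Q, I^2 in percent) using inverse-variance weights."""
    weights = [1 / se ** 2 for se in std_errors]
    pooled = sum(w * d for w, d in zip(weights, estimates)) / sum(weights)
    q = sum(w * (d - pooled) ** 2 for w, d in zip(weights, estimates))
    df = len(estimates) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, i2


# Total FIQ differences and 95% CIs as reported above (qigong, then tai chi).
diffs = [-7.51, -23.50]
ses = [se_from_ci(-13.33, -1.69), se_from_ci(-29.98, -17.02)]

q, i2 = i_squared(diffs, ses)
print(f"Q = {q:.2f}, I^2 = {i2:.0f}%")
```

Under these assumptions the sketch gives an I² of about 92 percent, in line with the value reported for this outcome.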

Mind-body therapy resulted in significant improvement in most secondary outcomes measured. Tai chi participants showed clinically meaningful improvement in depressive symptoms as measured by the CES-D (RR 1.8, 95% CI 1.1 to 2.9), in sleep quality as measured by the PSQI (RR 2.5, 95% CI 1.1 to 5.6), and in quality of life as measured by the SF-36 PCS (RR 3.4, 95% CI 1.4 to 8.1) and MCS (RR 2.0, 95% CI 1.0 to 4.0) compared with controls; similar results were seen for mean followup scores on these measures (Table 41).²¹⁸ In the second trial,²¹⁷ compared to a waitlist control, qigong resulted in significantly improved quality of life as measured by the SF-36 PCS (difference in change from baseline 4.4, 95% CI 1.5 to 7.3) and in sleep quality as measured by the PSQI (difference in change from baseline −2.2, 95% CI −3.6 to −0.8). The change in SF-36 MCS scores did not differ between groups.

Mind-Body Therapies Compared With Pharmacological Therapy or Exercise

No trials comparing mind-body therapies with pharmacological therapy met inclusion criteria in the prior report; no new studies were identified for this update.

One new trial of different frequencies and durations of tai chi versus aerobic exercise was identified.²²³ Tai chi was associated with a small improvement in function 3 to 6 months postintervention (difference in change scores −5.5, 95% CI −0.6 to −10.4, FIQ-R, 0-100 scale) when all tai chi groups were combined versus twice weekly aerobic exercise at 6 months. At 12 months (6 to 12 months postintervention), there was no difference between the combined tai chi groups and the exercise group (difference in change scores −2.7, 95% CI −2.3 to 7.7). When analysis was confined to two 60-minute sessions of tai chi per week for 24 weeks, a moderate improvement in function based on 0-100 FIQ-R at intermediate term (difference in change scores −16.2, 95% CI −8.7 to −23.6) was seen and improvement was sustained long-term (difference in change scores −11.1, 95% CI −2.7 to −19.6) versus a comparable number of aerobic exercise sessions per week. Once-weekly tai chi for 24 weeks was also associated with improved function at intermediate term and long term versus twice-weekly aerobics for 24 weeks but effect sizes were slightly smaller versus twice-weekly sessions (−7.5 and −1.9 respectively, CIs not reported) and consistent with small improvement in function.

There were no differences between tai chi overall and exercise with regard to opioid use at intermediate (OR 0.89, 95% CI 0.28 to 2.80) or long term (OR 1.08, 95% CI 0.33 to 3.51). Two weekly 60-minute tai chi sessions, versus a comparable number of aerobic exercise sessions, were associated with improved HADS-A anxiety (difference 1.6, 95% CI 0.1 to 3.1) and 0-10 PGAS global assessment (difference 1.5, 95% CI 0.4 to 2.5), but no difference on the SS symptom severity (difference 0.7, 95% CI −0.3 to 1.8), HAQ (difference 1.8, 95% CI −5.9 to 9.4), BDI depression (difference 4.6, 95% CI −0.5 to 9.7), HADS-D depression (difference 1.6, 95% CI 0.0 to 3.2), SF-36 MCS (difference 2.2, 95% CI −2.7 to 7.1), SF-36 PCS (difference 3.0, 95% CI −0.7 to 6.8) or PSQI (difference 0.9, 95% CI −0.7 to 2.5) measures.

Harms

In the trial of qigong,²¹⁷ there were two adverse events judged to be possibly related to the practice. One participant reported an increase in shoulder pain and another experienced plantar fasciitis; neither participant withdrew from the study. In the trial of tai chi, no adverse events were reported.²¹⁸ In the new trial,²²³ across all intensities of tai chi versus aerobic exercise, there were no severe treatment-related adverse events and 5.3% (8/151) versus 5.3% (4/75) mild/moderate treatment-related adverse events, respectively.
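
The adverse event counts from the new trial can be turned into proportions and an approximate risk ratio. The sketch below is illustrative only: the risk ratio and its interval are not reported by the review, the log-normal (Katz) interval is an assumed method, and the helper name `risk_ratio` is hypothetical.

```python
import math


def risk_ratio(events_a, n_a, events_b, n_b, z=1.959964):
    """Risk ratio of group A vs. group B with a log-normal (Katz) 95% CI."""
    rr = (events_a / n_a) / (events_b / n_b)
    se_log = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# Mild/moderate treatment-related adverse events as reported above:
# tai chi 8 of 151, aerobic exercise 4 of 75.
rr, lower, upper = risk_ratio(8, 151, 4, 75)
print(f"tai chi {8 / 151:.1%}, exercise {4 / 75:.1%}")
print(f"RR = {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

With so few events the interval is wide and spans 1, which fits the report's rating of the evidence on harms as insufficient.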

Acupuncture for Fibromyalgia

Key Points

Acupuncture was associated with a small improvement in function compared with sham acupuncture as evaluated by the FIQ Total Score (0 to 100) at short-term (3 trials [1 new], pooled difference −9.21, 95% CI −13.65 to −5.78, I²=0%) and intermediate-term followup (2 trials, pooled difference −9.82, 95% CI −14.35 to −3.01, I²=27.4%) (SOE: moderate).
There was no effect of acupuncture versus sham acupuncture on pain (0 to 10 scale) in the short term (4 trials [1 new], pooled difference −0.86, 95% CI −2.73 to 0.92, I²=88.9%) or intermediate term (3 trials, pooled difference −0.65, 95% CI −1.15 to 0.17, I²=45.5%). Across control conditions (sham or attention control), there was also no effect of acupuncture (5 trials [2 new], pooled difference −1.14, 95% CI −2.66 to 0.33, I²=91.6%) (SOE: low).
Results for secondary outcomes across trials of acupuncture versus sham were inconsistent.
No data on long-term effects were reported.
Discomfort and bruising were the most common adverse events. Across two trials, discomfort was reported by 37% to 70% of those receiving true or sham acupuncture. Across two trials, bruising was reported in 6% (1/16) to 30% (29/96) of patients who received true or sham acupuncture. Vasovagal symptoms (occurring in 4% of participants who received acupuncture in one trial) and dizziness/nausea were less common adverse events associated with acupuncture (SOE: moderate).

Detailed Synthesis

Five trials of acupuncture for fibromyalgia were identified that met inclusion criteria (Table 42 and Appendix D).²⁴⁶–²⁵⁰ Three trials²⁴⁶–²⁴⁸ were included in the prior AHRQ report and two trials²⁴⁹,²⁵⁰ were added for this update. Four trials (2 new trials) evaluated traditional Chinese needle acupuncture²⁴⁶,²⁴⁸–²⁵⁰ and the fifth evaluated acupuncture with electrical stimulation.²⁴⁷ Four studies compared acupuncture to sham²⁴⁶–²⁴⁹; the fifth compared it to an education attention control.²⁵⁰ One study²⁴⁶ employed three different types of sham treatments (needling for an unrelated condition, sham needling, and simulated acupuncture); one employed two different types of sham procedures (sham needling and simulated acupuncture)²⁴⁹; one used sham needling²⁴⁷; and one used simulated acupuncture.²⁴⁸ Sample sizes ranged from 30 to 164 (total sample=412), mean ages from 35 to 56 years, and the proportion of females ranged from 95 percent to 100 percent. The duration of acupuncture treatment ranged from 3 to 12 weeks, with the total number of sessions ranging from 6 to 24. All studies except two reported short-term and intermediate-term outcomes; the two new trials reported only short-term outcomes.²⁴⁹,²⁵⁰ No trial had long-term followup. Three trials were conducted in the United States,²⁴⁶,²⁴⁷,²⁵⁰ one in Spain²⁴⁸ and one in Turkey.²⁴⁹

All trials except two were considered good quality; the two new trials were considered fair quality²⁴⁹,²⁵⁰ (Appendix E). The primary limitation across trials was lack of acupuncturist blinding to treatment allocation; for one new fair-quality trial, the intention-to-treat principle was not followed.²⁴⁹ No trial reported long-term outcomes.

Acupuncture Compared With Sham or Attention Control

Acupuncture was associated with a small improvement in function compared with sham acupuncture as evaluated by the FIQ Total Score (0 to 100) at short-term followup (3 trials, pooled difference −9.21, 95% CI −13.65 to −5.78, I²=0%)²⁴⁷–²⁴⁹ and intermediate-term followup (2 trials, pooled difference on 0-100 scale, −9.82, 95% CI −14.35 to −3.01, I²=27.4%)²⁴⁷,²⁴⁸ (Figure 52). There was, however, no effect of acupuncture versus sham acupuncture on pain (0-10 scale) in the short term (4 trials, pooled difference −0.86, 95% CI −2.73 to 0.92, I²=88.9%)²⁴⁶–²⁴⁹ or intermediate term (3 trials, pooled difference −0.65, 95% CI −1.15 to 0.17, I²=45.5%)²⁴⁶–²⁴⁸ (Figure 53). Results based on mean difference in change scores were similar (data not shown). These conclusions are the same as in the previous report. All trials versus sham, except one, were considered good quality; the new trial²⁴⁹ was considered fair quality. In the new trial, acupuncture was also compared with simulated acupuncture; at short term, a moderate improvement in function (difference −11.9, 95% CI −23.1 to −0.8, FIQ 0-100) and large improvement in pain (difference −3.7, 95% CI −5.1 to −2.4, VAS 0-10) were reported.²⁴⁹ Another new, small trial of group acupuncture versus education attention control found a benefit at short term on VAS pain²⁵⁰; however, across control conditions (sham or attention control), there was no effect of acupuncture short term (5 trials [2 new], pooled difference −1.14, 95% CI −2.66 to 0.33, I²=91.6%).²⁴⁶–²⁵⁰ Substantial heterogeneity was noted and may be due to a variety of factors including differences in intervention delivery across studies and lack of blinding (attention control).

Results for secondary outcomes across trials of acupuncture versus sham were inconsistent. In the trial of acupuncture versus three different types of sham acupuncture,²⁴⁶ there was no significant benefit of acupuncture versus the combined sham groups on the SF-36 MCS score, a measure of sleep quality, or a measure of overall well-being. In the trial of six acupuncture treatments over 2 to 3 weeks, there was a benefit for true versus sham acupuncture at 1 and 7 months on the FIQ subscale of anxiety, but not depression, sleep, or well-being.²⁴⁷ In the trial of one 20-minute session per week for 9 weeks plus pharmacological treatment as prescribed by a general practitioner, there was a benefit for true versus sham acupuncture at 1 month for the SF-12 MCS scale (mean relative change 30.6%, 95% CI 19.7 to 41.5 vs. 13.9%, 95% CI 5.4 to 22.5; Cohen’s d=0.38, p=0.01), and at 9.75 months for the Hamilton Rating Scale for Depression (mean relative change −19.1%, 95% CI −34.2 to −3.9 vs. −5.9%, 95% CI −16.6 to −4.8, Cohen’s d=0.22, p=0.01) and the SF-12 Mental Component scale (mean relative change, 23.0%, 95% CI 13.7 to 32.4 vs. 9.4%, 95% CI 1.9 to 16.9; Cohen’s d=0.36, p=0.01).²⁴⁸ In the new trial of acupuncture versus sham and simulated acupuncture,²⁴⁹ comparing acupuncture versus sham short-term, there was a benefit for acupuncture on the 0-100 NHP sleep measure (difference −38.2, 95% CI −55.9 to −20.6) and the 0-63 BDI depression measure (difference −21.2, 95% CI −29.5 to −13.0). Comparing acupuncture versus simulated acupuncture short-term, there was a benefit of acupuncture on the NHP sleep scale (difference −53.6, 95% CI −71.6 to −35.7) and 0-63 BDI (difference −25.2, 95% CI −32.4 to −18.1).²⁴⁹

Acupuncture Compared With Pharmacological Therapy or Exercise

No trial of acupuncture versus pharmacological therapy or versus exercise met inclusion criteria.

Harms

Discomfort and bruising were the most commonly reported adverse events. In one trial,²⁴⁶ 89 of 96 treated (true or sham acupuncture) participants reported adverse events; 35 of 96 (37%) reported discomfort at needle insertion sites, 29 of 96 (30%) reported bruising, 3 of 96 (3%) reported nausea, and one of 96 (1%) felt faint at some point during the study. Discomfort was significantly less common among patients assigned to simulated acupuncture (five of 19, 29%) than among those assigned to directed acupuncture (14 of 23, 61%), acupuncture for an unrelated condition (15 of 22, 70%), or sham needling (14 of 22, 64%); p=0.02. In one trial,²⁴⁷ two of 50 (4%) experienced mild vasovagal symptoms and one of 50 (2%) experienced a pulmonary embolism believed to be unrelated to treatment. Mild bruising and soreness were reported to be more common in the true acupuncture group, but rates were not reported. In one study,²⁴⁸ 2.6 percent of sessions led to aggravation of fibromyalgia symptoms and 0.5 percent led to headache. In the true acupuncture group, pain, bruising, and vagal symptoms presented after 4.7 percent of sessions. In one new trial, no serious adverse events were reported but some patients experienced discomfort and bruising at the sites of needle insertion.²⁴⁹ In the other new trial, bruising and dizziness were reported in one patient following acupuncture (of 16 randomized or 6%) versus no patients randomized to attention control.²⁵⁰

Multidisciplinary Rehabilitation for Fibromyalgia

Key Points

More multidisciplinary treatment participants experienced a clinically meaningful improvement in FIQ total score (≥14% change) compared with usual care at short (odds ratio [OR] 3.1, 95% CI 1.6 to 6.2), intermediate (OR 3.1, 95% CI 1.5 to 6.4), and long term (OR 8.8, 95% CI 2.5 to 30.9) in one poor-quality trial. Multidisciplinary treatment was associated with a small improvement in function (based on a 0-100 FIQ total score) versus usual care or waitlist in the short term (3 trials, pooled difference −6.08, 95% CI −14.17 to 0.16, I²=49%), and versus usual care at intermediate term (3 trials, pooled difference −7.77, 95% CI −12.22 to −3.83, I²=0%) and long term (2 trials, pooled difference −8.54, 95% CI −15.00 to −1.30, I²=0%) (SOE: low for short, intermediate and long term).
Multidisciplinary treatment was associated with a small improvement in pain compared with usual care or waitlist at intermediate term (3 trials, pooled difference −0.68, 95% CI −1.10 to −0.27, I²=0%); there were no clear differences compared with usual care or waitlist in the short term (2 trials [excluding an outlier trial], pooled difference on a 0-10 scale −0.24, 95% CI −0.63 to 0.15, I²=0%) or with usual care in the long term (2 trials, pooled difference −0.25, 95% CI −0.79 to 0.36, I²=0%) (SOE: low for short, intermediate and long term).
There were no differences between multidisciplinary pain treatment versus aerobic exercise at long term in one trial for function (difference −1.10, 95% CI −8.40 to 6.20, 0-100 FIQ total score) or pain (difference 0.10, 95% CI −0.67 to 0.87, 0-10 FIQ pain scale) (SOE: low).
Data were insufficient for harms. However, one poor-quality study reported on adverse events, stating that 19 percent of participants randomized to multidisciplinary treatment withdrew (versus 0% for waiting list) and two of these 16 patients gave increased pain as the reason. Reasons for other withdrawals were not given and there was no systematic reporting of adverse events (SOE: insufficient).

Detailed Synthesis

We identified six trials (across 8 publications) of multidisciplinary treatments that met inclusion criteria (Table 43 and Appendix D).⁹⁶,²⁶²–²⁶⁸ All the trials were included in the prior AHRQ report. Across trials, sample sizes ranged from 66 to 203 (total sample=801) and participants were predominantly (>90%) female with mean ages between 40 and 50 years. The multidisciplinary treatments included physical therapy or exercise training in all trials, as well as CBT and pharmacological therapy (2 trials)²⁶³,²⁶⁶; CBT and an educational program (1 trial)²⁶⁸; sociotherapy, psychotherapy, and creative arts therapy (1 trial)⁹⁶; relaxation exercises (1 trial)²⁶⁵; and education and group discussions (1 trial).²⁶² All trials compared multidisciplinary treatment with usual care or waitlist; in addition, one trial compared it with exercise.⁹⁶ Treatment duration ranged from 2 to 12 weeks and the frequency of sessions from once a week to daily (total number of sessions ranged from 12 to 24 with durations between 1.5 and 5 hours). One of the trials included two intervention arms.²⁶⁸ The long-term multidisciplinary arm (2 days of education and exercise followed by 10 weeks of CBT) was determined to be most consistent with interventions employed by the other trials and was included in the pooled estimates below; results for the short-term group (2 days of education, exercise, and CBT programs) were similar to those of the long-term group and can be found in Table 43. Three trials reported outcomes over the short term (3 to 5.5 months),²⁶²,²⁶³,²⁶⁸ three over the intermediate term (6 months),²⁶³,²⁶⁵,²⁶⁶ and two over the long term (12 and 18 months).⁹⁶,²⁶³ Five trials were conducted in Europe⁹⁶,²⁶²–²⁶⁷ and one trial in Turkey.²⁶⁸

Three trials were judged to be of fair quality⁹⁶,²⁶²,²⁶⁸ and three trials were rated poor quality²⁶³,²⁶⁵,²⁶⁶ (Appendix E). The nature of the intervention precluded blinding of participants and of people administering the treatments. Additional methodological shortcomings in the poor-quality trials included unclear allocation concealment methods and high rates of overall attrition (21% to 43%) and differential attrition (12% to 13%) between groups.

Multidisciplinary Rehabilitation Compared With Usual Care or Waitlist

Clinically important FIQ improvement (≥14% change) was significantly more common for multidisciplinary treatment compared with usual care at short- (odds ratio [OR] 3.1, 95% CI 1.6 to 6.2), intermediate- (OR 3.1, 95% CI 1.5 to 6.4) and long-term followup (OR 8.8, 95% CI 2.5 to 30.9) in one poor-quality trial.²⁶³ Multidisciplinary treatment for fibromyalgia was associated with a small improvement in function versus usual care or waitlist based on a 0 to 100 FIQ total score in the short term (3 trials, pooled difference −6.08, 95% CI −14.17 to 0.16, I²=48.9%),²⁶²,²⁶³,²⁶⁸ and versus usual care in the intermediate term (3 trials, pooled difference −7.77, 95% CI −12.22 to −3.83, I²=0%)²⁶³,²⁶⁵,²⁶⁶ (Figure 54). The short-term estimate for trials of multidisciplinary treatment versus usual care only was similar (2 trials, pooled difference −9.74, 95% CI −16.38 to −3.83).²⁶³,²⁶⁸ The slightly smaller effect of multidisciplinary rehabilitation versus usual care persisted over the long term (2 trials, pooled difference on 0-100 scale −8.54, 95% CI −15.00 to −1.30, I²=0%).⁹⁶,²⁶³ Only one poor-quality trial reported short-term, intermediate-term, and long-term effects on function, showing a significant result for each time frame.²⁶³

Clinically important improvement in pain (≥30% change on a 0-10 scale) was more common for multidisciplinary treatment compared with usual care at intermediate-term followup in one poor-quality trial (OR 3.4, 95% CI 1.0 to 10.8)²⁶³; no statistically significant differences were seen between groups at short- or long-term followup. There were no clear effects of multidisciplinary treatment for fibromyalgia on pain versus usual care or waitlist in the short term (3 trials, pooled difference on a 0-10 scale −0.84, 95% CI −2.56 to 0.64, I²=83.6%),²⁶²,²⁶³,²⁶⁸ but statistical heterogeneity was very large (Figure 55). Excluding an outlier trial (difference −2.50, 95% CI −3.73 to −1.27)²⁶⁸ reduced the statistical heterogeneity and resulted in an attenuated effect (pooled difference −0.24, 95% CI −0.63 to 0.15, I²=0%). At intermediate term, multidisciplinary treatment was associated with a small improvement in pain compared with usual care (3 trials, pooled difference on a 0-10 scale −0.68, 95% CI −1.10 to −0.27, I²=0%).²⁶³,²⁶⁵,²⁶⁶ Long term, there were no clear effects of multidisciplinary treatment on pain versus usual care (2 trials, pooled difference −0.25, 95% CI −0.79 to 0.36, I²=0%).⁹⁶,²⁶³ Only one poor-quality trial reported short-, intermediate-, and long-term effects on pain, showing a significant result for each time frame.²⁶³
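
The report's Introduction treats a pooled mean difference as statistically significant when its confidence interval does not include 0. As a minimal sketch of that rule applied to the pooled pain estimates above (the function name `classify` and the dictionary labels are illustrative, not taken from the report):

```python
def classify(lower, upper, null=0.0):
    """Label an estimate by whether its 95% CI excludes the null value."""
    if lower > upper:  # tolerate bounds given in reverse order
        lower, upper = upper, lower
    if upper < null or lower > null:
        return "statistically significant"
    return "CI includes null"


# 95% CIs for pooled pain differences (0-10 scale) as reported above.
pain_cis = {
    "short term, 3 trials": (-2.56, 0.64),
    "short term, outlier excluded": (-0.63, 0.15),
    "intermediate term": (-1.10, -0.27),
    "long term": (-0.79, 0.36),
}

labels = {period: classify(*ci) for period, ci in pain_cis.items()}
for period, label in labels.items():
    print(f"{period}: {label}")
```

Only the intermediate-term interval excludes 0, which matches the paragraph's conclusion of a small intermediate-term improvement and no clear short-term or long-term effect. For ratio measures such as RR or OR, the same check applies with `null=1.0`.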

Results were mixed across the six trials for effects of multidisciplinary treatment on secondary outcomes. Three trials were fair quality.⁹⁶,²⁶²,²⁶⁸ Across the three fair-quality trials, there were no significant differences between multidisciplinary treatment and usual care or waitlist on measures of anxiety (Generalized Anxiety Disorder−10, FIQ anxiety subscale) in two trials⁹⁶,²⁶² and depression (Major Depression Inventory, FIQ depression subscale, BDI) in three trials⁹⁶,²⁶²,²⁶⁸ over short-term or long-term followup. Regarding quality of life, two of these trials reported no differences between groups on the SF-36 PCS and MCS and the EQ-5D⁹⁶,²⁶² while the third reported significant improvement on the SF-36 PCS but not the MCS.²⁶⁸ One trial reported no difference in healthcare utilization between groups during the 2 months prior to the final measurement at 18 months.⁹⁶

Multidisciplinary Rehabilitation Compared With Pharmacological Therapy

No trial of multidisciplinary rehabilitation versus pharmacological therapy met inclusion criteria.

Multidisciplinary Rehabilitation Compared With Exercise

There was no clear effect of multidisciplinary pain treatment versus aerobic exercise at long term in one fair-quality trial⁹⁶ for physical function on the FIQ physical function scale (difference 0 on a 0−10 scale, 95% CI −0.79 to 0.79) or the FIQ total score (difference −1.10 on a 0−100 scale, 95% CI −8.40 to 6.20). Similarly, there were no significant differences on the FIQ pain scale (difference 0.10 on a 0−10 scale, 95% CI −0.67 to 0.87), or secondary outcomes of quality of life, depression or anxiety, or healthcare utilization, with the exception of physiotherapist consultations, which were higher for the multidisciplinary group in the 2 months prior to the final measurement at 18 months (Table 43).

Harms

Adverse events were poorly reported by the included trials. One trial that compared multidisciplinary treatment (group pool sessions of physiotherapy, relaxation exercises, and exercise) with usual care (physical therapy, drug treatment and, in some cases, psychotherapy)²⁶⁵ reported that 16 of 84 (19%) multidisciplinary participants withdrew (versus 0% for waiting list) and two of these gave increased pain as the reason. Reasons for other withdrawals were not given and there was no systematic reporting of adverse events.

Key Question 5. Chronic Tension Headache

For this update, we identified no new trials of nonpharmacological treatments for chronic tension headache that met our inclusion criteria.

Psychological Therapies for Chronic Tension Headache

Key Points

There is insufficient evidence from three poor-quality trials to determine the effects of psychological therapies (CBT, relaxation) on short-term or intermediate-term function or pain compared with waitlist, placebo, or attention control (SOE: insufficient).
There is insufficient evidence from two poor-quality trials to determine the effects of CBT on short-term or intermediate-term function or pain compared with antidepressant medication (SOE: insufficient).
No long-term outcomes were reported and no trials comparing psychological therapies to biofeedback were identified that met inclusion criteria.
Data were insufficient for harms. Results were mixed across two poor-quality trials comparing CBT with antidepressant medication, with one trial reporting a lower risk of “at least mild” adverse events in the CBT group (0% vs. 59%), four of which led to withdrawal from the trial, and the second trial reporting a similar low risk of withdrawal due to adverse events (2% to 6% across groups to include placebo) (SOE: insufficient).

Detailed Synthesis

Three trials, all conducted in the United States,¹²⁸,¹²⁹,¹³² of CBT for chronic tension headache met inclusion criteria (Table 44 and Appendix D). Sample sizes ranged from 36 to 104 (total sample=198); the mean age across trials varied from 32 to 42 years and most participants were female (56% to 80%). Duration since the onset of headache pain ranged from 10.7 to 14.5 years. All trials either excluded patients with concomitant migraines or required that they suffer from no more than one migraine per month. Two trials also specifically excluded patients with medication overuse (analgesic-abuse) headaches and required that patients be free from prophylactic headache medication upon study entry.¹²⁹,¹³²

All three trials evaluated some variation of stress management therapy/cognitive coping skills training with a relaxation component; one trial (n=77) also included an additional relaxation-only arm.¹²⁸ In two trials (n=41, 150), patients received three 60-minute sessions of CBT and training in home-based relaxation,¹²⁹,¹³² and in the third trial (n=77), patients underwent 11 sessions (1-2 per week) of CBT plus progressive muscle relaxation training (session duration varied from 45 to 90 minutes).¹²⁸ In all trials, the interventions were administered by a psychologist or counselor over a 2-month period. Two trials compared CBT with placebo (placebo pill),¹²⁹ attention control (pseudomeditation/body awareness training)¹²⁸ and waitlist (monitoring via phone and clinical visits) control groups.¹²⁸ Two trials compared CBT with amitriptyline (25-75 mg/day).¹²⁹,¹³² All trials reported short-term results; one trial also provided outcomes at intermediate-term followup.¹²⁹

All three trials were considered poor quality (Appendix E) due to lack of blinding and large differential attrition between groups (in one trial, overall attrition was also substantial¹²⁹). Additionally, randomization, concealment, and intention-to-treat processes were unclear in one trial.¹³²

Psychological Therapy Compared With Waitlist, Placebo, or Attention Control

There was insufficient evidence from three poor-quality trials to draw conclusions regarding the effects of psychological therapies compared with waitlist, placebo, or attention control over the short term or intermediate term.

CBT plus placebo was associated with a small improvement in both short-term and intermediate-term function compared with placebo alone as measured by the Headache Disability Inventory (HDI) (scale 0−100) in one trial (difference 7.3, 95% CI 1.6 to 13.0 at 1 month and 9.3, 95% CI 3.5 to 15.1 at 6 months).¹²⁹ Long-term function was not reported.

Various pain measures were reported across trials. In general, CBT (plus relaxation), but not relaxation alone, appeared to have a small effect on short-term pain compared with waitlist, placebo, or attention control (Table 43). CBT plus relaxation was associated with a small improvement in pain on the Headache Index (HI) at 1 month compared with waitlist, attention control, or placebo across two trials (pooled SMD −0.40, 95% CI −0.79 to 0.00, I²=0%)¹²⁸,¹²⁹ (Figure 57). Relaxation only conferred no benefit for short-term pain compared with waitlist or attention control in one of these trials (difference −0.21 on a 0-20 HI scale, 95% CI −0.78 to 0.36).¹²⁸ Almost twice as many patients who received CBT plus relaxation achieved at least a 50 percent improvement in headache frequency compared with usual care or waitlist (risk ratio [RR] 1.94, 95% CI 1.03 to 3.66) over the short term in one trial; however, there was no difference between groups when the intervention was relaxation alone (RR 0.98, 95% CI 0.42 to 2.26)¹²⁸ (Figure 56). One trial reported similar favorable results regarding pain over the intermediate term for CBT plus placebo compared with placebo alone (difference −0.65, 95% CI −1.06 to −0.24) (Figure 57), with the exception of “success” (≥50% improvement from baseline in HI score), which did not differ between groups (Table 43).¹²⁹

Medication use did not differ significantly between the CBT and relaxation therapy groups and waitlist, placebo, or attention control groups over the short term in two trials.¹²⁸,¹²⁹ Over the intermediate term, CBT plus placebo resulted in a significant reduction in analgesic use compared with placebo alone (difference 11.8, 95% CI 1.5 to 22.1).¹²⁹

Psychological Therapy Compared With Pharmacological Therapy

There was insufficient evidence from two poor-quality trials to draw conclusions regarding the effect of CBT versus pharmacological therapy through intermediate-term followup.

There was no effect for CBT plus placebo versus antidepressant medication over the short-term or intermediate-term for function as measured by the HDI (scale 0−100) in one trial (difference 0.1, 95% CI −5.6 to 5.7 at 1 month and 2.4, 95% CI −3.3 to 8.0 at 6 months).¹²⁹ Long-term function was not reported.

Regarding short-term pain, two trials reported HI scores with differing results. One trial found that CBT plus placebo resulted in less improvement compared with antidepressant medication at 1 month (SMD 0.50, 95% CI 0.11 to 0.89),¹²⁹ whereas the other trial showed an improvement with CBT versus amitriptyline by 1 month, although the difference did not reach statistical significance (SMD −0.59, 95% CI −1.26 to 0.08)¹³² (Figure 57); due to the significant heterogeneity between trials we did not use the pooled estimate. There were no significant differences between CBT and pharmacological treatment for any other pain outcome reported over the short term in both trials¹²⁹,¹³² or over the intermediate term in one trial¹²⁹ (Table 43).
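The two short-term HI results above can be read against the report's stated rule that an estimate is statistically significant when its 95% CI excludes the null value. The Python sketch below applies that rule and then, purely as an illustration, back-calculates standard errors from the reported CIs to gauge between-trial heterogeneity. It assumes symmetric normal-approximation CIs and a fixed-effect inverse-variance Cochran's Q; the function names are hypothetical, and this is not the analysis the report authors ran.

```python
def excludes_null(ci_low, ci_high, null=0.0):
    """True when the confidence interval does not contain the null value."""
    return not (ci_low <= null <= ci_high)

def se_from_ci(ci_low, ci_high, z=1.96):
    """Approximate standard error, assuming a symmetric normal-theory 95% CI."""
    return (ci_high - ci_low) / (2 * z)

def cochran_q_i2(estimates, ses):
    """Fixed-effect Cochran's Q and I-squared (returned as a proportion)."""
    weights = [1 / se ** 2 for se in ses]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, estimates))
    df = len(estimates) - 1
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, i2

# SMDs and 95% CIs as reported in the text for the two trials
smd_a, ci_a = 0.50, (0.11, 0.89)     # CBT plus placebo vs. antidepressant
smd_b, ci_b = -0.59, (-1.26, 0.08)   # CBT vs. amitriptyline

sig_a = excludes_null(*ci_a)   # this CI lies entirely above 0
sig_b = excludes_null(*ci_b)   # this CI spans 0

q, i2 = cochran_q_i2([smd_a, smd_b], [se_from_ci(*ci_a), se_from_ci(*ci_b)])
print(sig_a, sig_b, round(q, 2), round(i2, 2))
```

Under these assumptions the two estimates point in opposite directions and the approximate I² is high, which is consistent with the decision not to rely on a pooled estimate for this comparison.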

Short-term results were mixed regarding medication use with one trial reporting no difference between CBT and amitriptyline¹³² and the other reporting a significant difference between groups favoring antidepressant therapy¹²⁹; however, this difference did not persist to the intermediate term in the latter trial (Table 43).

Psychological Therapy Compared With Biofeedback

No trial of psychological therapy versus biofeedback met inclusion criteria.

Harms

Harms were reported by the two poor-quality trials comparing CBT with antidepressant medication,¹²⁹,¹³² one of which also included a placebo comparison.¹²⁹ In one trial, no patient who underwent CBT experienced an adverse effect versus 10 of 17 (59%) who took medication;¹³² six events were classified as mild, two as moderate, and two as substantial (no further details provided). Four of these patients withdrew from the trial. The risk of withdrawal due to adverse events was similar across groups in the second trial: CBT (2%) versus antidepressant medication (2%) and placebo (6%); no other information was provided.¹²⁹

Physical Modalities for Chronic Tension Headache

Key Points

There is insufficient evidence from one poor-quality trial to determine the effects of occipital transcutaneous electrical stimulation (OTES) on short-term function or pain compared with sham (SOE: insufficient).
No longer-term outcomes were reported and no trials comparing physical modalities to pharmacological therapy or to biofeedback were identified that met inclusion criteria.
Data were insufficient for harms; however, no adverse events occurred in either the real or the sham OTES group in one poor-quality trial (SOE: insufficient).

Detailed Synthesis

Only one Italian trial¹⁶⁹ was identified that investigated the efficacy of OTES versus sham (Table 45 and Appendix D). Patients were excluded if they had undergone prophylactic treatment in the prior 2 months or had previous treatment with OTES. Acute medication use was permitted during the study period, but other methods of pain control or new preventive treatments were prohibited. At baseline, 46 percent of patients were overusing medications. Identical devices and procedures were used for both the real and the sham OTES, and treatment consisted of 30-minute sessions, three times per day for two consecutive weeks. Limited information on the timing of outcomes was provided, but it was assumed that data were collected at 1 and 2 months post-treatment. This trial was rated poor quality due to unclear randomization sequence, failure to control for dissimilar proportion of females between groups, and no reporting of attrition (Appendix E). The focus of the trial was on allodynia, which was not of interest to this report.

Physical Modalities Compared With Sham

There were insufficient data from one poor-quality trial to determine the short-term effects of OTES compared with sham.¹⁶⁹ OTES resulted in greater improvement in function at 2 months as measured by the Migraine Disability Assessment Questionnaire (difference −35.0, 95% CI −42.6 to −27.4, scale 0-21+) and in pain intensity as measured by VAS (difference −5.0 on a 0−10 scale, 95% CI −5.8 to −4.2). The proportion of patients who achieved a 50 percent or greater reduction in headache days also favored OTES (RR 12.4; 95% CI 3.2 to 47.3). Measures of depression and anxiety were both somewhat better following OTES compared with sham at 2 months; however, the between-group difference was statistically significant only for anxiety (Table 44). The proportion of patients overusing medications at 2 months was also significantly lower in the OTES group.
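Ratio measures such as the RR are usually analyzed on the log scale, so a reported 95% CI tends to be roughly symmetric around the log of the point estimate. As a rough plausibility check on the RR reported above (a sketch that assumes a normal approximation on the log scale, not the trial's own calculation; the function name is hypothetical), the geometric midpoint of the CI limits should fall near the point estimate, and the CI width gives an approximate standard error for log(RR).

```python
import math

def log_scale_summary(ci_low, ci_high, z=1.96):
    """Return (geometric midpoint, approximate SE of log RR) from a 95% CI."""
    midpoint = math.sqrt(ci_low * ci_high)
    se_log = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return midpoint, se_log

# Reported for a >=50% reduction in headache days: RR 12.4, 95% CI 3.2 to 47.3
midpoint, se_log = log_scale_summary(3.2, 47.3)
print(round(midpoint, 1), round(se_log, 2))
```

The midpoint lands close to the reported RR, and the large approximate standard error reflects how imprecise the estimate is.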

Physical Modalities Compared With Pharmacological Therapy or Biofeedback

No trial of physical modalities versus pharmacological therapy or versus biofeedback met inclusion criteria.

Harms

Authors report that neither adverse events nor side effects occurred in either the real or the sham OTES group in one poor-quality trial.¹⁶⁹

Manual Therapies for Chronic Tension Headache

Key Points

Spinal manipulation therapy, compared with usual care, was associated with small and moderate improvements, respectively, in function (difference −5.0, 95% CI −9.02 to −1.16 on the Headache Impact Test, scale 36-78 and difference −10.1, 95% CI −19.5 to −0.64 on the Headache Disability Inventory, scale 0 to 100) and pain intensity (difference −1.4 on a 0-10 NRS scale, 95% CI −2.69 to −0.16) over the short term in one fair-quality trial (SOE: low). Approximately 25 percent of the patients had comorbid migraine.
There is insufficient evidence from one poor-quality trial to determine the effects of spinal manipulation therapy on short-term pain compared with amitriptyline (SOE: insufficient).
No longer-term outcomes were reported and no trials comparing manual therapies to biofeedback were identified that met inclusion criteria.
No adverse events occurred in the trial comparing spinal manipulation to usual care, but significantly fewer adverse events were reported following manipulation versus amitriptyline in the other poor-quality trial (4.3% vs. 82.1%; RR 0.05, 95% CI 0.02 to 0.16). The risk of withdrawal due to adverse events was not significantly different (1.4% vs. 8.9%; RR 0.16, 95% CI 0.02 to 1.33). Common complaints were neck stiffness in the manipulation group and dry mouth, dizziness, and weight gain in the medication group (SOE: low).

Detailed Synthesis

Two trials (n=75 and n=126)¹⁸⁷,¹⁸⁸ that evaluated spinal manipulation therapy (SMT) for the treatment of chronic tension headache met inclusion criteria (Table 46 and Appendix D). The majority of patients in both trials were female (61% to 78%) with mean ages ranging from 40 to 42 years and a mean headache duration of 13 years. Both trials included patients with comorbid migraine as long as their headache problem was determined by a physician to be predominantly tension-type in nature (this included 26% of patients in one trial;¹⁸⁷ the proportion was not reported in the other trial). In one trial, patients were specifically excluded if they met the criteria for medication overuse or if they had received manual therapy in the 2 months prior to enrollment.¹⁸⁷ At baseline, prophylactic medication use was common. Current or past use of other treatments was not reported.

One Dutch trial compared a maximum of nine 30-minute sessions of SMT over 8 weeks with usual care (information, reassurance and advice, discussion of lifestyle changes, and analgesics or NSAIDs provided by a general practitioner).¹⁸⁷ The second trial, conducted in the United States, compared 12 SMT sessions of 20 minutes over a 6-week treatment period versus amitriptyline (maximum dose 30 mg/day).¹⁸⁸ Both trials reported only short-term outcomes. One trial was rated fair quality¹⁸⁷ and one poor quality¹⁸⁸ (Appendix E). Due to the nature of the interventions, blinding of patients and researchers was not possible. Additionally, the poor-quality trial had a high rate of differential attrition (7% SMT and 27% amitriptyline).

Manual Therapies Compared With Usual Care

Only short-term data from one fair-quality trial were reported. Compared with usual care at 4.5 months post-treatment, SMT resulted in a moderate improvement in function on the Headache Disability Inventory (HDI, scale 0 to 100) and a small improvement on the Headache Impact Test (HIT-6, scale 36 to 78) (difference between groups in change scores from baseline, −10.1, 95% CI −19.5 to −0.64 and −5.0, 95% CI −9.02 to −1.16, respectively).¹⁸⁷ Regarding pain outcomes, twice as many patients who received SMT experienced a ≥50% reduction from baseline in the number of headache days (per 2 weeks) compared with usual care: 81.6% versus 40.5%; RR 2.0 (95% CI 1.3 to 3.0).¹⁸⁷ Similarly, a statistically significantly greater reduction in the number of headache days (difference between groups in change scores from baseline, −4.9; 95% CI −6.95 to −2.98) and in headache pain intensity (difference in change scores from baseline, −1.4 on a 0 to 10 NRS scale, 95% CI −2.69 to −0.16) was seen following SMT. Given that 29 percent of SMT patients and 22 percent of usual care patients had comorbid migraine, it is unclear how the coexistence of these headache types may have affected the outcome.

The proportion of patients who used any additional healthcare services (e.g., physical therapy, medical specialists, other) was statistically significantly lower in the SMT group compared with the usual care group (Table 45).¹⁸⁷ Authors report no statistically significant differences between treatments in analgesic or NSAID use; data were not provided.

Manual Therapies Compared With Pharmacological Therapy

The evidence was insufficient from one poor-quality trial to determine the effects of spinal manipulation compared with amitriptyline over the short term.¹⁸⁸ The spinal manipulation group showed more improvement compared with the amitriptyline group in daily headache intensity (adjusted difference −1.4, 95% CI −2.3 to −0.3), weekly headache frequency (adjusted difference −4.2, 95% CI −6.5 to −1.9), Short Form-36 Function score (adjusted difference 4.9, 95% CI 0.4 to 9.4), and over-the-counter medication use (difference −0.9, 95% CI −1.5 to −0.3) at 1 month. Attrition in the amitriptyline group was 27 percent, compared with 7 percent in the manipulation group.

Manual Therapies Compared With Biofeedback

No trial of manual therapies versus biofeedback met inclusion criteria.

Harms

No adverse events occurred in the trial comparing spinal manipulation to usual care.¹⁸⁷ The other, poor-quality trial reported significantly fewer adverse events following spinal manipulation compared with amitriptyline (4.3% vs. 82.1%; RR 0.05, 95% CI 0.02 to 0.16), but the risk of withdrawal due to adverse events was not significantly different (1.4% vs. 8.9%; RR 0.16, 95% CI 0.02 to 1.33).¹⁸⁸ Patients in the manipulation group complained of neck stiffness, which resolved in all cases; common side effects in the amitriptyline group included dry mouth, drowsiness, and weight gain.
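The RR point estimates above follow directly from the reported event percentages. A minimal Python sketch of that arithmetic is shown here; the function name is hypothetical, and the confidence intervals are not reproduced because they depend on the underlying event counts, which are not given in this section.

```python
def risk_ratio(pct_treatment, pct_comparator):
    """Risk ratio (point estimate only) from two event percentages."""
    if pct_comparator == 0:
        raise ValueError("comparator risk is zero; RR is undefined")
    return pct_treatment / pct_comparator

# Percentages as reported: spinal manipulation vs. amitriptyline
rr_any_event = risk_ratio(4.3, 82.1)    # any adverse event
rr_withdrawal = risk_ratio(1.4, 8.9)    # withdrawal due to adverse events
print(round(rr_any_event, 2), round(rr_withdrawal, 2))
```

Rounded to two decimals, these agree with the RRs of 0.05 and 0.16 given in the text.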

Acupuncture for Chronic Tension Headache

Key Points

There is insufficient evidence from two poor-quality trials to determine the effects of traditional Chinese needle acupuncture on short-term (2 trials), intermediate-term (1 trial), or long-term (1 trial) pain compared with sham acupuncture (SOE: insufficient).
Laser acupuncture was associated with a small improvement in pain intensity (median difference −2, IQR 6.3, on a 0-10 VAS scale) and in the number of headache days per month (median difference −8, IQR 21.5) over the short term versus sham in one fair-quality trial (SOE: low).
No trials comparing acupuncture to pharmacological therapy or to biofeedback were identified that met inclusion criteria.
The fair-quality trial evaluating laser acupuncture reported that no adverse events occurred in either group (SOE: low).

Detailed Synthesis

Three small trials (N=30 to 50; total sample=119)²⁵¹–²⁵³ that evaluated acupuncture versus sham treatment for chronic tension headaches met inclusion criteria (Table 47 and Appendix D). Two trials employed traditional Chinese needle acupuncture,²⁵²,²⁵³ while one used low-energy laser acupuncture.²⁵¹ The number of acupoints ranged from 6 to 10 across studies. The duration of treatment ranged from 5 to 10 weeks, with the total number of sessions ranging from 8 to 10 (20 to 30 minutes duration, 1 to 3 times per week). Sham treatment consisted of irrelevant acupuncture (superficial needle insertion in areas without acupuncture points) and sham acupuncture (blunt needle that simulates puncturing of the skin, laser power output set to zero).

Across trials, participants were primarily female (49% to 87%), mean ages ranged from 33 to 49 years, and headache frequency ranged from 18 to 27 days per month. Two trials specifically excluded patients with other causes of chronic headache²⁵¹,²⁵²; the third trial did not note if any of the patients had concomitant headaches.²⁵³ One trial required patients to abstain from all other prophylactic therapies (with the exception of rescue analgesics),²⁵³ and one trial excluded patients who had received any treatment for their headache in the 2 weeks prior to enrollment.²⁵¹ Concomitant (nonnarcotic) medication was permitted in two trials;²⁵²,²⁵³ the third stated that no patient took concomitant analgesics.²⁵¹ All trials assessed outcomes over the short term; one trial additionally provided intermediate- and long-term data.²⁵³

One trial was rated fair quality²⁵¹ and two poor quality²⁵²,²⁵³ (Appendix E). In all three trials, random sequence generation and concealment of allocation were not clearly reported and the care providers were not blinded to treatment. Additional methodological concerns in the poor-quality trials included unclear application of intention-to-treat methods, and failure to control for disproportionate baseline characteristics or to account for loss to followup in one trial each.

Acupuncture Compared With Sham

None of the trials reported on function. All three trials reported pain outcomes, although the specific measures varied across the trials. The results were mixed depending on the type of acupuncture used. No significant differences were found between needle acupuncture and sham for any pain outcome evaluated during the short term in two small poor-quality trials,²⁵²,²⁵³ or at intermediate and long-term followup in one of these trials²⁵³ (Table 46). In the third small fair-quality trial,²⁵¹ laser acupuncture resulted in a significant reduction in the number of headache days per month (median −8, interquartile range [IQR] 21.5), in pain intensity on a 0 to 10 VAS scale (median −2, IQR 6.3), and in the duration of attacks (median −4 hours, IQR 7.5) over the short term compared with the sham group, which reported no improvement from baseline on any outcome at the 3-month followup (p<0.001 for all). Substantial heterogeneity (I²=91%) precluded meaningful pooling for this outcome (Figure 58).

Acupuncture Compared With Pharmacological Therapy or Biofeedback

No trial of acupuncture versus pharmacological therapy or versus biofeedback met inclusion criteria.

Harms

Harms were generally not reported. The trial evaluating laser acupuncture reported that no adverse events occurred in either group.²⁵¹

Key Question 6. Differential Efficacy

RCTs that stratified on patient characteristics of interest, permitting evaluation of factors that might modify the effect of treatment, were considered for inclusion. Factors included age, sex, presence of comorbidities (e.g., emotional or mood disorders), and degree of nociplasticity/central sensitization. If a comparison is not listed below, either no evidence was identified that met the inclusion criteria or the included trials did not provide information on differential efficacy or harms. Studies likely had insufficient sample size to evaluate differential efficacy or harms, and evidence was considered insufficient.

Osteoarthritis Knee Pain

Key Point

There is insufficient evidence from one fair-quality trial (across 3 publications) that age, sex, race, BMI, baseline disability, pain, or depression status modify the effects of exercise in patients with OA of the knee. Sample sizes in the subgroup analyses from the Fitness, Arthritis and Seniors Trial (FAST) were likely inadequate to effectively test for modification.

Exercise Compared With Attention Control

One fair-quality trial (n=439) reported across three publications of the FAST⁵¹,⁵⁷,⁵⁸ included in Key Question 3 compared muscle performance (i.e., resistance training) and aerobic exercise programs to an attention control and formally evaluated factors that may modify treatment in patients with OA of the knee. Details regarding these study populations are available in the Results section for Key Question 3 and in Appendix D. Two of the reports performed formal tests for interaction; none of the demographic or clinical variables evaluated were found to modify the effect of either type of exercise.⁵⁷,⁵⁸ One publication explored whether age, sex, race, BMI, baseline disability, or baseline pain modified the effects of exercise on function based on ADL disability measures in a subgroup of patients who were free of ADL disability upon enrollment; however, no data were provided for evaluation.⁵⁷ A second publication looked at whether the effects of exercise on pain, disability, and depression were modified by baseline depression status, that is, high versus low depressive symptomology according to the Center for Epidemiologic Studies Depression scale over time (using an adjusted repeated measures analysis of variance). However, the authors do not provide results that directly examined modification by baseline depression without the time component.⁵⁸ The third FAST publication stratified on age, sex, race, and BMI and did not perform a formal statistical test for interaction.⁵¹ Upon visual inspection, the point estimates across groups and strata are similar, suggesting that the effect of exercise on physical disability and knee pain was not modified by any patient characteristic evaluated.

Osteoarthritis Hip Pain

Key Point

There is insufficient evidence from one fair-quality trial that age, sex, baseline pain, and the presence of radiographic OA modify the effects of exercise in patients with OA of the hip. Study authors only reported on effects that include evaluation of these factors over time. Sample size was likely inadequate to effectively test for modification.

Exercise Compared With Usual Care

One fair-quality trial (n=203) included for Key Question 3 compared combination exercise therapy (strengthening, stretching, and endurance exercises) to usual care and stratified on age, sex, race, and BMI, but it did not formally test for interaction.⁷⁴ Details regarding this study population are available in the Results section for Key Question 3 and in Appendix D. Age, sex, education, self-reported knee OA, and baseline pain and Kellgren & Lawrence radiographic OA scores were defined a priori as subgroups of interest. Although older patients (age ≥65 years), women, patients with a lower NRS pain score at baseline, and patients with radiographic OA showed somewhat larger effects of exercise therapy on function and pain, data were not systematically reported and, based on the data provided, overlapping confidence intervals suggest that the effect of exercise was not modified by any of these variables.

Fibromyalgia

Key Point

There was insufficient evidence from one poor-quality trial that baseline BMI (normal, overweight, obese) modifies the effects of multidisciplinary rehabilitation in patients with fibromyalgia. Study authors only report on effects that include evaluation of these factors over time. Sample size was likely inadequate to effectively test for modification.

Multidisciplinary Rehabilitation Compared With Usual Care

An additional publication (n=130)²⁶⁴ of a poor-quality trial²⁶³ included for Key Question 4 that compared multidisciplinary rehabilitation to usual care assessed potential modification of treatment based on baseline BMI (normal, overweight, obese). No significant interactions were found for the effect of BMI on exercise over time for any pain or function measure evaluated; however, the authors do not provide results that exclude effects of time. Details regarding this study population are available in the section on efficacy and in Appendix D.

Figures

Figure 2 is a flow chart that outlines the study retrieval and selection process. It begins with the total number of citations retrieved from the literature searches and ends with the number of studies that satisfied the inclusion criteria of the report. This figure is described further in the section entitled “Results of Literature Searches”. Briefly, a total of 8516 potentially relevant citations were identified (4996 from the prior AHRQ report and 3520 for this update) and, after removal of duplicates, 7474 (4276 from the prior AHRQ report and 3198 for this update) underwent title and abstract review. After dual review of abstracts and titles, 5900 articles (3083 from the prior AHRQ report and 2817 for this update) were excluded. The remaining 1574 articles (1193 from the prior AHRQ report and 381 for this update) underwent dual review at the full text level and a total of 233 trials (202 from the prior AHRQ report and 31 for this update) in 252 publications (218 from the prior AHRQ report and 34 for this update) met the inclusion criteria and were included in this report update.

Figure 2. Literature flow diagram

^a Cochrane databases include the Cochrane Central Register of Controlled Trials and the Cochrane Database of Systematic Reviews

^b Other sources include prior reports, reference lists of relevant articles, systematic reviews, etc.

^c Includes 10 trials that were identified in the 2018 report search.

^d Publications may be included or excluded for multiple interventions

^e Studies checked for inclusion

^f One of the publications identified for update report was a followup study to a trial included in the previous report, therefore this study is not counted in the number of included trials but is counted under total publications.
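The counts in the Figure 2 description can be cross-checked arithmetically: the number of records screened at title and abstract should equal the number excluded at that stage plus the number advanced to full-text review. The short Python sketch below does this using only the counts stated in the description; the variable names are illustrative.

```python
# Counts as stated in the Figure 2 description: (prior report, this update)
identified = (4996, 3520)
excluded_at_abstract = (3083, 2817)
full_text_reviewed = (1193, 381)
included_trials = (202, 31)
included_publications = (218, 34)

def total(pair):
    """Sum the prior-report and update counts."""
    return sum(pair)

# Records screened at title/abstract = excluded + advanced to full text
screened = tuple(e + f for e, f in zip(excluded_at_abstract, full_text_reviewed))

print(total(identified))             # 8516
print(screened, total(screened))     # (4276, 3198) 7474
print(total(excluded_at_abstract))   # 5900
print(total(full_text_reviewed))     # 1574
print(total(included_trials), total(included_publications))  # 233 252
```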

Figure 3 is a bar graph depicting the distribution of individual study quality ratings across the report as a whole and by each chronic pain condition. Overall, 6% of trials were considered good quality, 61% were fair quality, and 33% were poor quality. For low back pain, 6% were good quality, 70% fair quality, and 23% poor quality. For neck pain, 4% were good quality, 59% were fair quality, and 37% were poor quality. For osteoarthritis, 8% were good quality, 63% were fair quality, and 29% were poor quality. For fibromyalgia, 5% were good quality, 52% were fair quality, and 43% were poor quality. For tension headache, 22% were fair quality and 78% were poor quality; there were no good-quality studies for headache.

Figure 3. Overview and distribution of quality analysis ratings

Figure 4 is a forest plot. Standardized mean differences were reported or calculated for 11 short-term studies, with a pooled standardized mean difference of −0.51 (95% confidence interval −0.98 to −0.08) and an overall I-squared value of 88.1%. Standardized mean differences were reported or calculated for five intermediate-term studies, with a pooled standardized mean difference of −0.17 (95% confidence interval −0.39 to 0.02) and an overall I-squared value of 0%. Standardized mean difference was reported or calculated for one long-term study (0, 95% confidence interval −0.38 to 0.38).

Figure 4. Exercise versus usual care, an attention control, or a placebo intervention for chronic low back pain: effects on function

AC = attention control; CI = confidence interval; CPGS-BD = Von Korff Chronic Pain Grade Score Back Disability; DP = directional preference; GE = general exercise; MC = motor control; MF = mobility/flexibility; MI = minimal intervention; N = number; NE = no exercise; NM = neuromuscular re-education; ODI = Oswestry Disability Index; PDI = Pain Disability Index; RDQ = Roland-Morris Disability Questionnaire; SD = standard deviation; SMD = standardized mean difference; S. pilates = selective Pilates; Strng = Strength training; UC = usual care; WL = waitlist

Figure 5 is a forest plot. Mean differences were reported or calculated for 11 short-term studies, with a pooled mean difference of −1.27 (95% confidence interval −1.77 to −0.65) and an overall I-squared value of 63.9%. Mean differences were reported or calculated for five intermediate-term studies, with a pooled mean difference of −0.85 (95% confidence interval −1.67 to −0.07) and an overall I-squared value of 50.2%. Mean difference was reported or calculated for one long-term study (−1.55, 95% confidence interval −2.76 to −0.34).

Figure 5. Exercise versus usual care, an attention control, or a placebo intervention for chronic low back pain: effects on pain

AC = attention control; CI = confidence interval; CPGS-BD = Von Korff Chronic Pain Grade Score Back Disability; DP = directional preference; GE = general exercise; MC = motor control; MD = mean difference; MF = mobility/flexibility; MI = minimal intervention; N = number; NE = no exercise; NM = neuromuscular re-education; ODI = Oswestry Disability Index; PDI = Pain Disability Index; RDQ = Roland-Morris Disability Questionnaire; SD = standard deviation; SMD = standardized mean difference; S. pilates = selective Pilates; Strng = Strength training; UC = usual care; WL = waitlist

Figure 6 is a forest plot. Standardized mean differences were reported or calculated for three short-term studies, with a pooled standardized mean difference of −0.24 (95% confidence interval −0.38 to −0.04) and an overall I-squared value of 0%. Standardized mean differences were reported or calculated for three intermediate-term studies, with a pooled standardized mean difference of −0.24 (95% confidence interval −0.38 to −0.10) and an overall I-squared value of 0%. Standardized mean differences were reported or calculated for three long-term studies, with a pooled standardized mean difference of −0.28 (95% confidence interval −0.43 to −0.13) and an overall I-squared value of 0%.

Figure 6. Psychological therapy versus usual care or an attention control for chronic low back pain: effects on function

AC = attention control; CB = cognitive-behavioral therapy; CI = confidence interval; MRDQ = Modified Roland-Morris Disability Questionnaire; N = number; ODI = Oswestry Disability Index; PI = placebo intervention; RDQ = Roland-Morris Disability Questionnaire; RPT = respondent therapy (progressive relaxation); SD = standard deviation; SMD = standardized mean difference; UC = usual care

Figure 7 is a forest plot. Mean differences were reported or calculated for three short-term studies, with a pooled mean difference of −0.75 (95% confidence interval −1.01 to −0.41) and an overall I-squared value of 0.0%. Mean differences were reported or calculated for three intermediate-term studies, with a pooled mean difference of −0.71 (95% confidence interval −0.97 to −0.46) and an overall I-squared value of 0.0%. Mean differences were reported or calculated for three long-term studies, with a pooled mean difference of −0.55 (95% confidence interval −0.92 to −0.23) and an overall I-squared value of 0%.

Figure 7. Psychological therapy versus usual care or an attention control for chronic low back pain: effects on pain

AC = attention control; CB = cognitive-behavioral therapy; CI = confidence interval; N = number; PI = placebo intervention; RPT = respondent therapy (progressive relaxation); SD = standard deviation; UC = usual care.

Figure 8 is a forest plot. Standardized mean differences were reported or calculated for three short-term studies, with a pooled standardized mean difference of −0.34 (95% confidence interval −0.75 to −0.02) and an overall I-squared value of 44.6%. Standardized mean differences were reported or calculated for three intermediate-term studies, with a pooled standardized mean difference of −0.40 (95% confidence interval −0.85 to −0.05) and an overall I-squared value of 65.2%.

Figure 8. Spinal manipulation versus sham manipulation, usual care, an attention control, or a placebo intervention for chronic low back pain: effects on function

AC = attention control; CI = confidence interval; N = number; ODI = Oswestry Disability Index; PI = placebo intervention; RDQ = Roland-Morris Disability Questionnaire; SD = standard deviation; SMD = standardized mean difference; SP = sham manipulation; UC = usual care; UK BEAM = UK Back pain exercise and manipulation trial; VF = Von Korff functional disability

Figure 9 is a forest plot. Mean differences were reported or calculated for three short-term studies, with a pooled mean difference of −0.36 (95% confidence interval −0.62 to 0.25) and an overall I-squared value of 0%. Mean differences were reported or calculated for three intermediate-term studies, with a pooled mean difference of −0.64 (95% confidence interval −0.93 to −0.35) and an overall I-squared value of 0%.

Figure 9. Spinal manipulation versus sham manipulation, usual care, an attention control, or a placebo intervention for chronic low back pain: effects on pain

AC = attention control; CI = confidence interval; N = number; PI = placebo intervention; SD = standard deviation; SP = sham manipulation; UC = usual care; UK BEAM = UK Back pain exercise and manipulation trial

Figure 10 is a forest plot. Standardized mean differences were reported or calculated for three short-term studies, with a pooled standardized mean difference of 0.02 (95% confidence interval −0.28 to 0.30) and an overall I-squared value of 36.7%. Standardized mean differences were reported or calculated for four intermediate-term studies, with a pooled standardized mean difference of 0.01 (95% confidence interval −0.15 to 0.21) and an overall I-squared value of 18.7%.

Figure 10. Spinal manipulation versus exercise for chronic low back pain: effects on function

CI = confidence interval; MRDQ = Modified Roland-Morris Disability Questionnaire; N = number; RDQ = Roland-Morris Disability Questionnaire; SD = standard deviation; SMD = standardized mean difference; UK BEAM = UK Back pain exercise and manipulation trial

Figure 11 is a forest plot. Mean differences were reported or calculated for three short-term studies, with a pooled mean difference of 0.31 (95% confidence interval −0.42 to 1.06) and an overall I-squared value of 34%. Mean differences were reported or calculated for four intermediate-term studies, with a pooled mean difference of 0.23 (95% confidence interval −0.14 to 0.59) and an overall I-squared value of 0%.

Figure 11. Spinal manipulation versus exercise for chronic low back pain: effects on pain

CI = confidence interval; EXE = exercise; N = number; SD = standard deviation; UK BEAM = UK Back pain exercise and manipulation trial

Figure 12 is a forest plot. Standardized mean differences were reported or calculated for six short-term studies, with a pooled standardized mean difference of −0.38 (95% confidence interval −0.63 to −0.20) and an overall I-squared value of 0%. Standardized mean differences were reported or calculated for three intermediate-term studies, with a pooled standardized mean difference of −0.09 (95% confidence interval −0.26 to 0.12) and an overall I-squared value of 0%.

Figure 12. Massage versus sham massage, usual care, or attention control intervention for chronic low back pain: effects on function

AC = attention control; AP = acupressure; CI = confidence interval; FR = foot reflexology; MD = mean difference; MI = minimal intervention; MRDQ = Modified Roland-Morris Disability Questionnaire; MR = myofascial release; N = number; QBDS = Quebec Back Pain Disability Scale; RDQ = Roland-Morris Disability Questionnaire; RS = relaxation/structural; SD = standard deviation; SM = sham massage; SMD = standardized mean difference; UC = usual care

Figure 13 is a forest plot. Mean differences were reported or calculated for five short-term studies, with a pooled mean difference of −0.55 (95% confidence interval −0.88 to −0.23) and an overall I-squared value of 0%. Mean differences were reported or calculated for three intermediate-term studies, with a pooled mean difference of −0.02 (95% confidence interval −0.56 to 0.44) and an overall I-squared value of 0%.

Figure 13. Massage versus sham massage, usual care, or attention control for chronic low back pain: effects on pain

AC = attention control; CI = confidence interval; FR = foot reflexology; MI = minimal intervention; MR = myofascial release; N = number; RS = relaxation/structural; SD = standard deviation; SM = sham massage; UC = usual care

Figure 14 is a forest plot. Standardized mean differences were reported or calculated for four short-term studies, with a pooled standardized mean difference of −0.14 (95% confidence interval −0.51 to 0.02) and an overall I-squared value of 0%. Standardized mean difference was reported or calculated for one intermediate-term study (−0.20, 95% confidence interval −0.46 to 0.06). Standardized mean difference was reported or calculated for one long-term study (−0.09, 95% confidence interval −0.35 to 0.16).

Figure 14. Mindfulness-based stress reduction versus usual care or an attention control for chronic low back pain: effects on function

AC = attention control; CI = confidence interval; MBSR = mindfulness-based stress reduction; N = number; ODI = Oswestry Disability Index; RDQ = Roland-Morris Disability Questionnaire; SD = standard deviation; SMD = standardized mean difference; UC = usual care

Figure 15 is a forest plot. Mean differences were reported or calculated for five short-term studies, with a pooled mean difference of −0.88 (95% confidence interval −1.82 to 0.08) and an overall I-squared value of 88.8%. Mean difference was reported or calculated for one intermediate-term study (−0.75, 95% confidence interval −1.16 to −0.34). Mean difference was reported or calculated for one long-term study (−0.22, 95% confidence interval −0.63 to 0.19).

Figure 15. Mindfulness-based stress reduction versus usual care or an attention control for chronic low back pain: effects on pain

AC = attention control; CI = confidence interval; MBSR = mindfulness-based stress reduction; N = number; SD = standard deviation; UC = usual care

Figure 16 is a forest plot. Standardized mean differences were reported or calculated for eight short-term studies, with a pooled standardized mean difference of −0.45 (95% confidence interval −0.69 to −0.28) and an overall I-squared value of 31.2%. Standardized mean differences were reported or calculated for three intermediate-term studies, with a pooled standardized mean difference of −0.29 (95% confidence interval −0.47 to −0.11) and an overall I-squared value of 0%.

Figure 16. Yoga versus attention control or waitlist for chronic low back pain: effects on function

AC = attention control; CI = confidence interval; CPGS = Von Korff Chronic Pain Grade Score; MI = minimal intervention; MRDQ = Modified Roland-Morris Disability Questionnaire; N = number; NY = no yoga; RDQ = Roland-Morris Disability Questionnaire; SD = standard deviation; SMD = standardized mean difference; UC = usual care; WL = waitlist

Figure 17 is a forest plot. Mean differences were reported or calculated for seven short-term studies, with a pooled mean difference of −0.87 (95% confidence interval −1.49 to −0.24) and an overall I-squared value of 64.1%. Mean differences were reported or calculated for two intermediate-term studies, with a pooled mean difference of −1.16 (95% confidence interval −2.16 to −0.27) and an overall I-squared value of 0%.

Figure 17. Yoga versus attention control or waitlist for chronic low back pain: effects on pain

AC = attention control; CI = confidence interval; MI = minimal intervention; N = number; NY = no yoga; RDQ = Roland-Morris Disability Questionnaire; SD = standard deviation; UC = usual care; WL = waitlist

Figure 18 is a forest plot. Standardized mean differences were reported or calculated for four short-term studies, with a pooled standardized mean difference of −0.04 (95% confidence interval −0.27 to 0.16) and an overall I-squared value of 0%. Standardized mean difference was reported or calculated for one intermediate-term study (−0.01, 95% confidence interval −0.26 to 0.24).

Figure 18. Yoga versus exercise for chronic low back pain: effects on function

CI = confidence interval; CPGS = Von Korff Chronic Pain Grade Score; EXE = exercise; MRDQ = Modified Roland-Morris Disability Questionnaire; N = number; SD = standard deviation; SMD = standardized mean difference

Figure 19 is a forest plot. Mean differences were reported or calculated for five short-term studies, with a pooled mean difference of −0.63 (95% confidence interval −1.68 to 0.45) and an overall I-squared value of 87.5%. Mean difference was reported or calculated for one intermediate-term study (0.30, 95% confidence interval −0.39 to 0.99).

Figure 19. Yoga versus exercise for chronic low back pain: effects on pain

CI = confidence interval; EXE = exercise; N = number; SD = standard deviation

Figure 20 is a forest plot. Standardized mean differences were reported or calculated for four short-term studies, with a pooled standardized mean difference of −0.23 (95% confidence interval −0.35 to −0.04) and an overall I-squared value of 24.8%. Standardized mean differences were reported or calculated for three intermediate-term studies, with a pooled standardized mean difference of −0.08 (95% confidence interval −0.42 to 0.28) and an overall I-squared value of 64.0%. Standardized mean difference was reported or calculated for one long-term study (−0.17, 95% confidence interval −0.47 to 0.13).

Figure 20. Acupuncture versus sham acupuncture, usual care, attention control, or a placebo intervention for chronic low back pain: effects on function

AC = attention control; CI = confidence interval; HFAQ = Hannover Functional Ability Questionnaire; MI = minimal intervention; MRDQ = Modified Roland-Morris Disability Questionnaire; N = number; NE = no exercise; ODI = Oswestry Disability Index; PDI = Pain Disability Index; SA = sham acupuncture; SD = standard deviation; SMD = standardized mean difference; SNA = standard needle acupuncture; UC = usual care; WL = waitlist

Figure 21 is a forest plot. Mean differences were reported or calculated for five short-term studies, with a pooled mean difference of −0.54 (95% confidence interval −0.91 to −0.16) and an overall I-squared value of 25.2%. Mean differences were reported or calculated for five intermediate-term studies, with a pooled mean difference of −0.22 (95% confidence interval −0.67 to 0.21) and an overall I-squared value of 0%. Mean difference was reported or calculated for one long-term study (−0.83, 95% confidence interval −1.53 to −0.13).

Figure 21. Acupuncture versus sham acupuncture, usual care, an attention control, or a placebo intervention for chronic low back pain: effects on pain

AC = attention control; CI = confidence interval; MI = minimal intervention; N = number; NA = needle acupuncture; SA = sham acupuncture; SD = standard deviation; SNA = standard needle acupuncture; UC = usual care; WL = waitlist

Figure 22 is a forest plot. Standardized mean differences were reported or calculated for four short-term studies, with a pooled standardized mean difference of −0.30 (95% confidence interval −0.63 to 0.00) and an overall I-squared value of 58.4%. Standardized mean differences were reported or calculated for four intermediate-term studies, with a pooled standardized mean difference of −0.37 (95% confidence interval −0.69 to −0.08) and an overall I-squared value of 33.7%. Standardized mean differences were reported or calculated for two long-term studies, with a pooled standardized mean difference of −0.04 (95% confidence interval −0.36 to 0.35) and an overall I-squared value of 0%.

Figure 22. Multidisciplinary rehabilitation versus usual care for chronic low back pain: effects on function

CI = confidence interval; DRI = Disability Rating Index; indvl = individual; LBPDI = low back pain disability index; LBPRS = low back pain rating scale; MRDQ = Modified Roland-Morris Disability Questionnaire; N = number; RDQ = Roland-Morris Disability Questionnaire; SD = standard deviation; SMD = standardized mean difference

^a Multidisciplinary rehabilitation intensity: 1 = high, 2 = not high, 3 = unclear or not reported

Figure 23 is a forest plot. Mean differences were reported or calculated for four short-term studies, with a pooled mean difference of −0.53 (95% confidence interval −0.86 to −0.11) and an overall I-squared value of 0%. Mean differences were reported or calculated for four intermediate-term studies, with a pooled mean difference of −0.62 (95% confidence interval −1.06 to −0.18) and an overall I-squared value of 0%. Mean differences were reported or calculated for two long-term studies, with a pooled mean difference of −0.35 (95% confidence interval −1.10 to 0.34) and an overall I-squared value of 0%.

Figure 23. Multidisciplinary rehabilitation versus usual care for chronic low back pain: effects on pain

CI = confidence interval; indvl = individual; N = number; RDQ = Roland-Morris Disability Questionnaire; SD = standard deviation

^a Multidisciplinary rehabilitation intensity: 1 = high, 2 = not high, 3 = unclear or not reported

Figure 24 is a forest plot. Standardized mean differences were reported or calculated for six short-term studies, with a pooled standardized mean difference of −0.21 (95% confidence interval −0.54 to 0.01) and an overall I-squared value of 31.9%. Standardized mean differences were reported or calculated for six intermediate-term studies, with a pooled standardized mean difference of −1.04 (95% confidence interval −2.82 to 0.71) and an overall I-squared value of 95.9%. Standardized mean differences were reported or calculated for three long-term studies, with a pooled standardized mean difference of −1.82 (95% confidence interval −5.90 to 2.24) and an overall I-squared value of 98.3%.

Figure 24. Multidisciplinary rehabilitation versus exercise for chronic low back pain: effects on function

CI = confidence interval; DPQDA = Dallas Pain Questionnaire daily activities; indvl = individual; LBPRS = low back pain rating scale; N = number; ODI = Oswestry Disability Index; QDS = Quebec Disability Scale; RDQ = Roland-Morris Disability Questionnaire; SD = standard deviation; SIP = Sickness Impact Profile; SMD = standardized mean difference

^a Multidisciplinary rehabilitation intensity: 1 = high, 2 = not high, 3 = unclear or not reported

Figure 25 is a forest plot. Mean differences were reported or calculated for six short-term studies, with a pooled mean difference of −0.69 (95% confidence interval −1.15 to −0.22) and an overall I-squared value of 0%. Mean differences were reported or calculated for six intermediate-term studies, with a pooled mean difference of −1.20 (95% confidence interval −2.43 to 0.09) and an overall I-squared value of 95.1%. Mean differences were reported or calculated for three long-term studies, with a pooled mean difference of −1.68 (95% confidence interval −5.25 to 1.97) and an overall I-squared value of 98.2%.

Figure 25. Multidisciplinary rehabilitation versus exercise for chronic low back pain: effects on pain

CI = confidence interval; indvl = individual; N = number; SD = standard deviation

^a Multidisciplinary rehabilitation intensity: 1 = high, 2 = not high, 3 = unclear or not reported.

Figure 26 is a forest plot. Standardized mean differences were reported or calculated for four short-term studies, with a pooled standardized mean difference of −0.73 (95% confidence interval −1.84 to 0.36) and an overall I-squared value of 95.1%. Standardized mean difference was reported or calculated for one intermediate-term study (0.14, 95% confidence interval −0.12 to 0.40). Standardized mean difference was reported or calculated for one long-term study (−0.39, 95% confidence interval −0.74 to −0.03).

Figure 26. Exercise versus no treatment, waitlist, or an attention control for chronic neck pain: effects on function

CI = confidence interval; COM = combination exercise therapy; MP = muscle performance exercise; NDI = Neck Disability Index; NDS = neck disability scale; NT = no treatment; SD = standard deviation; SMD = standardized mean difference; WL = waitlist.

Figure 27 is a forest plot. Mean differences were reported or calculated for four short-term studies, with a pooled mean difference of −1.33 (95% confidence interval −2.68 to 0.07) and an overall I-squared value of 89.4%. Mean differences were reported or calculated for two intermediate-term studies, with a pooled mean difference of −0.25 (95% confidence interval −0.81 to 0.31) and an overall I-squared value of 0%. Mean differences were reported or calculated for three long-term studies, with a pooled mean difference of 0.07 (95% confidence interval −0.51 to 0.88) and an overall I-squared value of 0%.

Figure 27. Exercise versus no treatment, waitlist, or an attention control for chronic neck pain: effects on pain

AC = attention control; CI = confidence interval; COM = combination exercise therapy; MP = muscle performance exercise; MP+NR = muscle performance plus neuromuscular rehabilitation exercise; NT = no treatment; SD = standard deviation; WL = waitlist

Figure 28 is a forest plot. Mean differences were reported or calculated for two short-term studies, with a pooled mean difference of −13.60 (95% confidence interval −26.30 to −6.30) and an overall I-squared value of 0%.

Figure 28. Low-level laser therapy versus sham for chronic neck pain: effects on function

CI = confidence interval; LLL = low-level laser therapy; NPAD = Neck Pain and Disability Scale; SD = standard deviation

Figure 29 is a forest plot. Mean differences were reported or calculated for three short-term studies, with a pooled mean difference of −1.89 (95% confidence interval −3.34 to −0.06) and an overall I-squared value of 60.6%.

Figure 29. Low-level laser therapy versus sham for chronic neck pain: effects on pain

CI = confidence interval; LLL = low-level laser therapy; SD = standard deviation

Figure 30 is a forest plot. Mean differences were reported or calculated for two short-term studies, with a pooled mean difference of −3.66 (95% confidence interval −6.58 to −0.56) and an overall I-squared value of 10.2%.

Figure 30. Massage versus attention control or waitlist for chronic neck pain: effects on function

AC = attention control; CI = confidence interval; CM = classic massage; NDI = Neck Disability Index; SD = standard deviation; SM = Swedish massage; WL = waitlist.

Figure 31 is a forest plot. Standardized mean differences were reported or calculated for five short-term studies, with a pooled standardized mean difference of −0.40 (95% confidence interval −0.67 to −0.14) and an overall I-squared value of 61.0%. Standardized mean differences were reported or calculated for three intermediate-term studies, with a pooled standardized mean difference of −0.19 (95% confidence interval −0.37 to 0.05) and an overall I-squared value of 0%. Standardized mean difference was reported or calculated for one long-term study (−0.23, 95% confidence interval −0.61 to 0.16).

Figure 31. Acupuncture versus sham acupuncture, a placebo intervention, or usual care for chronic neck pain: effects on function

ACP = traditional needle acupuncture; CI = confidence interval; EACP = electroacupuncture; NDI = Neck Disability Index; NPQ = Northwick Park Questionnaire; SD = standard deviation; Sham L = sham laser; SMD = standardized mean difference; UC = usual care.

Figure 32 is a forest plot. Mean differences were reported or calculated for five short-term studies, with a pooled mean difference of −0.66 (95% confidence interval −1.46 to 0.11) and an overall I-squared value of 78.4%. Mean differences were reported or calculated for three intermediate-term studies, with a pooled mean difference of 0.40 (95% confidence interval −0.45 to 1.44) and an overall I-squared value of 18.7%. Mean difference was reported or calculated for one long-term study (−0.35, 95% confidence interval −1.34 to 0.64).

Figure 32. Acupuncture versus sham acupuncture or a placebo intervention for chronic neck pain: effects on pain

ACP = traditional needle acupuncture; CI = confidence interval; EACP = electroacupuncture; SD = standard deviation; Sham L = sham laser; SMD = standardized mean difference; TENS = transcutaneous electrical stimulation; UC = usual care.

Figure 33 is a forest plot. Standardized mean differences were reported or calculated for eight short-term studies, with a pooled standardized mean difference of −0.29 (95% confidence interval −0.46 to −0.11) and an overall I-squared value of 9.9%. Standardized mean differences were reported or calculated for 12 intermediate-term studies, with a pooled standardized mean difference of −0.98 (95% confidence interval −1.86 to −0.13) and an overall I-squared value of 96.5%. Standardized mean differences were reported or calculated for four long-term studies, with a pooled standardized mean difference of −0.22 (95% confidence interval −0.34 to −0.08) and an overall I-squared value of 0%.

Figure 33. Exercise versus usual care, no treatment, sham, or an attention control for osteoarthritis knee pain: effects on function

AC = attention control; APC = Arthritis Impact Measurement Scale (AIMS) physical activity component; CI = confidence interval; COM = combination exercise therapy; KADL = Knee Injury and Osteoarthritis Outcome Score (KOOS) ADL subscore; LI = Lequesne Index; LLFDI = Late Life Function and Disability Index Basic Lower Limb Function Score; ME = mobility exercise; MP = muscle performance exercise; NR = neuromuscular re-education exercise; NT = no treatment; OKS = Oxford Knee Score; SD = standard deviation; SMD = standardized mean difference; UC = usual care; WOMAC = Western Ontario and McMaster Universities Osteoarthritis Index

Figure 34 is a forest plot. Mean differences were reported or calculated for eight short-term studies, with a pooled mean difference of −0.47 (95% confidence interval −0.86 to −0.10) and an overall I-squared value of 41.7%. Mean differences were reported or calculated for 11 intermediate-term studies, with a pooled mean difference of −1.34 (95% confidence interval −2.12 to −0.54) and an overall I-squared value of 89.6%. Mean differences were reported or calculated for four long-term studies, with a pooled mean difference of −0.30 (95% confidence interval −0.49 to −0.00) and an overall I-squared value of 0%.

Figure 34. Exercise versus usual care, no treatment, sham, or an attention control for osteoarthritis knee pain: effects on pain

AC = attention control; CI = confidence interval; COM = combination exercise therapy; ME = mobility exercise; MP = muscle performance exercise; NR = neuromuscular re-education exercise; NT = no treatment; SD = standard deviation; SMD = standardized mean difference; UC = usual care

Figure 35 is a forest plot. Mean differences were reported or calculated for two short-term studies, with a pooled mean difference of −2.09 (95% confidence interval −8.70 to 1.61) and an overall I-squared value of 63.3%.

Figure 35. Psychological therapies versus usual care or no treatment for osteoarthritis knee pain: effects on function

CI = confidence interval; IB CBT = internet-based cognitive-behavioral therapy; IPI = interviewing-based lifestyle physical activity intervention; N = number; NT = no treatment; SD = standard deviation; UC = usual care; WOMAC = Western Ontario and McMaster Universities Osteoarthritis Index.

Figure 36 is a forest plot. Mean differences were reported or calculated for two short-term studies, with a pooled mean difference of −0.60 (95% confidence interval −1.48 to −0.08) and an overall I-squared value of 0%.

Figure 36. Psychological therapies versus usual care or no treatment for osteoarthritis knee pain: effects on pain

CI = confidence interval; IBCBT = internet-based cognitive-behavioral therapy; IPI = interviewing-based lifestyle physical activity intervention; N = number; NT = no treatment; SD = standard deviation; UC = usual care; WOMAC = Western Ontario and McMaster Universities Osteoarthritis Index.

Figure 37 is a forest plot. Mean differences were reported or calculated for three short-term studies, with a pooled mean difference of −2.50 (95% confidence interval −6.37 to 1.22) and an overall I-squared value of 94.0%.

Figure 37. Ultrasound versus sham for osteoarthritis knee pain: effects on function

ADL = activities of daily living; CI = confidence interval; Con US = continuous ultrasound; C+P US = continuous and pulsed ultrasound combined; LI = Lequesne Index; N = number; PFLI US = pulsed frequency low intensity ultrasound; SD = standard deviation.

Figure 38 is a forest plot. Mean differences were reported or calculated for three short-term studies, with a pooled mean difference of −1.20 (95% confidence interval −3.71 to 1.31) and an overall I-squared value of 91.1%.

Figure 38. Ultrasound versus sham for osteoarthritis knee pain: effects on pain

ADL = activities of daily living; CI = confidence interval; Con US = continuous ultrasound; C+P US = continuous and pulsed ultrasound combined; N = number; PFLI US = pulsed frequency low intensity ultrasound; SD = standard deviation; VAS = visual analog scale.

Figure 39 is a forest plot. Mean differences were reported or calculated for two short-term studies, with a pooled mean difference of −2.00 (95% confidence interval −4.15 to 0.04) and an overall I-squared value of 77.3%. Mean differences were reported or calculated for two intermediate-term studies, with a pooled mean difference of −1.04 (95% confidence interval −3.17 to 1.45) and an overall I-squared value of 74.9%.

Figure 39. Low-level laser therapy versus usual care or sham for osteoarthritis knee pain: effects on pain

CI = confidence interval; SD = standard deviation; UC = usual care

Figure 40 is a forest plot. Standardized mean differences were reported or calculated for five short-term studies, with a pooled standardized mean difference of −0.17 (95% confidence interval −0.71 to 0.38) and an overall I-squared value of 86.5%. Standardized mean differences were reported or calculated for four intermediate-term studies, with a pooled standardized mean difference of −0.15 (95% confidence interval −0.31 to 0.02) and an overall I-squared value of 0%.

Figure 40. Acupuncture versus usual care, waitlist, or sham intervention for osteoarthritis knee pain: effects on function

EA = electroacupuncture; LA = laser acupuncture; NR = not reported; SA = sham acupuncture; SNA = standard needle acupuncture; SD = standard deviation; SMD = standardized mean difference; UC = usual care; WL = waitlist; WOMAC = Western Ontario and McMaster Universities Osteoarthritis Index

Figure 41 is a forest plot. Standardized mean differences were reported or calculated for six short-term studies, with a pooled standardized mean difference of −0.27 (95% confidence interval −0.67 to 0.12) and an overall I-squared value of 79.3%. Standardized mean differences were reported or calculated for four intermediate-term studies, with a pooled standardized mean difference of −0.16 (95% confidence interval −0.32 to −0.01) and an overall I-squared value of 0%.

Figure 41. Acupuncture versus usual care, waitlist, or sham intervention for osteoarthritis knee pain: effects on pain

EA = electroacupuncture; LA = laser acupuncture; NR = not reported; SA = sham acupuncture; SNA = standard needle acupuncture; SD = standard deviation; SMD = standardized mean difference; UC = usual care; WL = waitlist; WOMAC = Western Ontario and McMaster Universities Osteoarthritis Index

Figure 42 is a forest plot. Standardized mean differences were reported or calculated for three short-term studies, with a pooled standardized mean difference of −0.33 (95% confidence interval −0.58 to −0.11) and an overall I-squared value of 0%. Standardized mean differences were reported or calculated for two intermediate-term studies, with a pooled standardized mean difference of −0.28 (95% confidence interval −0.55 to 0.02) and an overall I-squared value of 0%. Standardized mean difference was reported or calculated for one long-term study (−0.37, 95% confidence interval −0.74 to −0.01).

Figure 42. Exercise versus usual care for osteoarthritis hip pain: effects on function

CI = confidence interval; COM = combination exercise therapy; HOOS = Hip disability and Osteoarthritis Outcomes Score; SD = standard deviation; SIP = Sickness Impact Profile physical function score; SMD = standardized mean difference; STRG = strength training exercise; UC = usual care; WOMAC = Western Ontario and McMaster Universities Osteoarthritis Index

Figure 43 is a forest plot. Standardized mean differences were reported or calculated for three short-term studies, with a pooled standardized mean difference of −0.30 (95% confidence interval −0.70 to −0.02) and an overall I-squared value of 0%. Standardized mean differences were reported or calculated for two intermediate-term studies, with a pooled standardized mean difference of −0.14 (95% confidence interval −0.40 to 0.12) and an overall I-squared value of 0%. Standardized mean difference was reported or calculated for one long-term study (−0.25, 95% confidence interval −0.62 to 0.11).

Figure 43. Exercise versus usual care for osteoarthritis hip pain: effects on pain

CI = confidence interval; COM = combination exercise therapy; SD = standard deviation; SMD = standardized mean difference; STRG = strength training exercise; UC = usual care; VAS = visual analog scale; WOMAC = Western Ontario and McMaster Universities Osteoarthritis Index

Figure 44 is a forest plot. Mean differences were reported or calculated for seven short-term studies, with a pooled mean difference of −7.68 (95% confidence interval −13.04 to −1.84) and an overall I-squared value of 59.9%. Mean differences were reported or calculated for eight intermediate-term studies, with a pooled mean difference of −6.04 (95% confidence interval −9.25 to −3.01) and an overall I-squared value of 0%. Mean differences were reported or calculated for three long-term studies, with a pooled mean difference of −4.33 (95% confidence interval −10.46 to 1.97) and an overall I-squared value of 0%.

Figure 44. Exercise versus usual care, no treatment, waitlist, or an attention control for fibromyalgia: effects on function

AC = attention control; AR = aerobic exercise; AR & COM = aerobic exercise in one arm and combination exercise in another arm; AR & MP = aerobic exercise in one arm and muscle performance exercise in another arm; CI = confidence interval; COM = combination exercise therapy; MP = muscle performance exercise; MP+NR = muscle performance plus neuromuscular rehabilitation exercise; NT = no treatment; SD = standard deviation; UC = usual care; WL = waitlist

Figure 45 is a forest plot. Mean differences were reported or calculated for seven short-term studies, with a pooled mean difference of −1.08 (95% confidence interval −1.75 to −0.32) and an overall I-squared value of 53.1%. Mean differences were reported or calculated for eight intermediate-term studies, with a pooled mean difference of −0.51 (95% confidence interval −0.92 to −0.06) and an overall I-squared value of 0%. Mean differences were reported or calculated for four long-term studies, with a pooled mean difference of −0.18 (95% confidence interval −0.77 to 0.42) and an overall I-squared value of 0%.

Figure 45. Exercise versus usual care, no treatment, waitlist, attention control, or sham for fibromyalgia: effects on pain

AC = attention control; AR = aerobic exercise; AR & MP = aerobic exercise in one arm and muscle performance exercise in another arm; CI = confidence interval; COM = combination exercise therapy; MP = muscle performance exercise; MP+NR = muscle performance plus neuromuscular rehabilitation exercise; NT = no treatment; SD = standard deviation; UC = usual care; WL = waitlist

Figure 46 is a forest plot. Mean differences were reported or calculated for five short-term studies, with a pooled mean difference of −2.82 (95% confidence interval −9.76 to 2.81) and an overall I-squared value of 70.6%. Mean differences were reported or calculated for three intermediate-term studies, with a pooled mean difference of −12.82 (95% confidence interval −24.07 to −2.44) and an overall I-squared value of 94.2%.

Figure 46. Psychological therapies versus usual care, waitlist, or attention control for fibromyalgia: effects on function

AC = attention control; ACT = acceptance and commitment therapy; CBT = cognitive-behavioral therapy; CI = confidence interval; GI = guided imagery; SD = standard deviation; UC = usual care; WL = waitlist

Figure 47 is a forest plot. Mean differences were reported or calculated for five short-term studies, with a pooled mean difference of −0.62 (95% confidence interval −1.02 to −0.20) and an overall I-squared value of 0%. Mean differences were reported or calculated for seven intermediate-term studies, with a pooled mean difference of −0.62 (95% confidence interval −1.14 to −0.09) and an overall I-squared value of 65.7%. Mean differences were reported or calculated for two long-term studies, with a pooled mean difference of 0.04 (95% confidence interval −0.89 to 0.98) and an overall I-squared value of 0%.

Figure 47. Psychological therapies versus usual care, waitlist, or attention control for fibromyalgia: effects on pain

AC = attention control; ACT = acceptance and commitment therapy; BFP = biofeedback; BFP/RLX = biofeedback with a relaxation component; CBT = cognitive-behavioral therapy; CI = confidence interval; EAET = emotional awareness and expression therapy; SD = standard deviation; UC = usual care; WL = waitlist

Figure 48 is a forest plot. Mean differences were reported or calculated for two short-term studies, with a pooled mean difference of −17.14 (95% confidence interval −41.51 to 7.23) and an overall I-squared value of 87.6%. Mean differences were reported or calculated for two intermediate-term studies, with a pooled mean difference of −9.81 (95% confidence interval −23.83 to 4.21) and an overall I-squared value of 95.5%.

Figure 48. Psychological therapies versus pharmacological therapy for fibromyalgia: effects on function

ACT = acceptance and commitment therapy; Amit = amitriptyline; CBT = cognitive-behavioral therapy; CI = confidence interval; Dulo = duloxetine; EEG BF = electroencephalographic biofeedback; ESCI = escitalopram; Preg = pregabalin; SD = standard deviation

Figure 49 is a forest plot. A mean difference was reported or calculated for one short-term study (−1.00, 95% confidence interval −1.29 to −0.71). Mean differences were reported or calculated for two intermediate-term studies, with a pooled mean difference of −0.26 (95% confidence interval −2.06 to 1.56) and an overall I-squared value of 96.2%. A mean difference was reported or calculated for one long-term study (0.10, 95% confidence interval −0.16 to 0.36).

Figure 49. Myofascial release versus sham for fibromyalgia: effects on pain

CI = confidence interval; MR = myofascial release; SD = standard deviation

Figure 50 is a forest plot. Mean differences were reported or calculated for two short-term studies, with a pooled mean difference of −15.44 (95% confidence interval −31.11 to 0.23) and an overall I-squared value of 92.3%.

Figure 50. Mind-body therapies versus waitlist or attention control for fibromyalgia: effects on function

AC = attention control; CI = confidence interval; FIQ = Fibromyalgia Impact Questionnaire; QG = qigong; SD = standard deviation; TC = tai chi; WL = waitlist

Figure 51 is a forest plot. Mean differences were reported or calculated for two short-term studies, with a pooled mean difference of −1.44 (95% confidence interval −2.96 to −0.23) and an overall I-squared value of 45.6%.

Figure 51. Mind-body therapies versus waitlist or attention control for fibromyalgia: effects on pain

AC = attention control; CI = confidence interval; QG = qigong; SD = standard deviation; TC = tai chi; WL = waitlist

Figure 52 is a forest plot. Mean differences were reported or calculated for three short-term studies, with a pooled mean difference of −9.21 (95% confidence interval −13.65 to −5.78) and an overall I-squared value of 0%. Mean differences were reported or calculated for two intermediate-term studies, with a pooled mean difference of −9.82 (95% confidence interval −14.35 to −3.01) and an overall I-squared value of 27.4%.

Figure 52. Acupuncture versus sham for fibromyalgia: effects on function

ACP = acupuncture; CI = confidence interval; FIQ = Fibromyalgia Impact Questionnaire; SD = standard deviation

Figure 53 is a forest plot. Mean differences were reported or calculated for five short-term studies, with a pooled mean difference of −1.14 (95% confidence interval −2.66 to 0.33) and an overall I-squared value of 91.6%. Mean differences were reported or calculated for three intermediate-term studies, with a pooled mean difference of −0.65 (95% confidence interval −1.15 to 0.17) and an overall I-squared value of 45.5%.

Figure 53. Acupuncture versus sham for fibromyalgia: effects on pain

ACP = acupuncture; CI = confidence interval; SD = standard deviation

Figure 54 is a forest plot. Mean differences were reported or calculated for three short-term studies, with a pooled mean difference of −6.08 (95% confidence interval −14.17 to 0.16) and an overall I-squared value of 48.9%. Mean differences were reported or calculated for three intermediate-term studies, with a pooled mean difference of −7.77 (95% confidence interval −12.22 to −3.83) and an overall I-squared value of 0%. Mean differences were reported or calculated for two long-term studies, with a pooled mean difference of −8.54 (95% confidence interval −15.00 to −1.30) and an overall I-squared value of 0%.

Figure 54. Multidisciplinary rehabilitation versus usual care or waitlist for fibromyalgia: effects on function

CI = confidence interval; MD = multidisciplinary rehabilitation; SD = standard deviation; UC = usual care; WL = waitlist

Figure 55 is a forest plot. Mean differences were reported or calculated for three short-term studies, with a pooled mean difference of −0.84 (95% confidence interval −2.56 to 0.64) and an overall I-squared value of 83.6%. Mean differences were reported or calculated for three intermediate-term studies, with a pooled mean difference of −0.68 (95% confidence interval −1.10 to −0.27) and an overall I-squared value of 0%. Mean differences were reported or calculated for two long-term studies, with a pooled mean difference of −0.25 (95% confidence interval −0.79 to 0.36) and an overall I-squared value of 0%.

Figure 55. Multidisciplinary rehabilitation versus usual care or waitlist for fibromyalgia: effects on pain

CI = confidence interval; MD = multidisciplinary rehabilitation; SD = standard deviation; UC = usual care; WL = waitlist.

Figure 56 is a forest plot. Risk ratios were reported or calculated for one study each of short-term cognitive-behavioral therapy with a relaxation component vs. usual care (RR 1.94, 95% CI 1.03 to 3.66), intermediate-term cognitive-behavioral therapy with a relaxation component vs. usual care (RR 1.19, 95% CI 0.66 to 2.13), short-term relaxation therapy vs. usual care (RR 0.98, 95% CI 0.42 to 2.26), short-term cognitive-behavioral therapy with a relaxation component vs. pharmacologic therapy (RR 2.09, 95% CI 0.64 to 6.82), and intermediate-term cognitive-behavioral therapy with a relaxation component vs. pharmacologic therapy (RR 0.92, 95% CI 0.55 to 1.54). Results were not pooled.

Figure 56. Psychological therapies versus waitlist, attention control, placebo intervention, or pharmacological treatment for chronic tension headache: effects on pain (success)

AC/WL = an attention control arm and a waitlist arm; CBT = cognitive-behavioral therapy; CBT/RLX = cognitive-behavioral therapy with a relaxation component; CI = confidence interval; PB = placebo (pill); PHARM = standard pharmacological therapy; RLX = relaxation therapy; RR = risk ratio; UC = usual care
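Because risk ratios are ratio measures, their confidence intervals are usually built on the log scale. The sketch below, offered as a rough illustration rather than the report's own method, recovers an approximate standard error of log(RR) from a reported 95% CI and forms a Wald z statistic against the null value of 1. It assumes the interval is symmetric on the log scale with a critical value of 1.96; the function names are hypothetical.

```python
import math


def log_rr_standard_error(lower: float, upper: float, z_crit: float = 1.96) -> float:
    """Approximate SE of log(RR) from a 95% CI, assuming symmetry on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2 * z_crit)


def z_statistic(rr: float, lower: float, upper: float) -> float:
    """Wald z statistic for the null hypothesis RR = 1 (that is, log RR = 0)."""
    return math.log(rr) / log_rr_standard_error(lower, upper)


# Short-term CBT with a relaxation component vs. usual care, as reported for Figure 56.
rr, lower, upper = 1.94, 1.03, 3.66
se = log_rr_standard_error(lower, upper)
z = z_statistic(rr, lower, upper)
print(f"SE(log RR) = {se:.3f}, z = {z:.2f}")
```

A |z| above 1.96 corresponds to a 95% CI that excludes 1, which matches the interval-based reading of the figure.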

Figure 57 is a forest plot. Standardized mean differences were reported or calculated for two short-term studies of cognitive-behavioral therapy with a relaxation component vs. usual care, with a pooled standardized mean difference of −0.40 (95% confidence interval −0.79 to 0.00) and an overall I-squared value of 0%. Standardized mean differences were reported or calculated for two short-term studies of cognitive-behavioral therapy with a relaxation component vs. pharmacologic therapy, with a pooled standardized mean difference of 0.03 (95% confidence interval −1.35 to 1.27) and an overall I-squared value of 73.1%. Standardized mean differences were reported or calculated for one study each of intermediate-term cognitive-behavioral therapy with a relaxation component vs. usual care (−0.65, 95% CI −1.06 to −0.24), short-term relaxation therapy vs. usual care (−0.21, 95% CI −0.78 to 0.36), and intermediate-term cognitive-behavioral therapy with a relaxation component vs. pharmacologic therapy (0.11, 95% CI −0.28 to 0.50). Results were not pooled.

Figure 57. Psychological therapies versus waitlist, attention control, placebo intervention, or pharmacological treatment for chronic tension headache: effects on pain (mean difference)

AC/WL = an attention control arm and a waitlist arm; CBT = cognitive-behavioral therapy; CBT/RLX = cognitive-behavioral therapy with a relaxation component; CI = confidence interval; PB = placebo (pill); PHARM = standard pharmacological therapy; RLX = relaxation therapy; SMD = standardized mean difference; UC = usual care

Figure 58 is a forest plot. Weighted mean differences were reported or calculated for two studies, with a pooled weighted mean difference of −1.61 (95% confidence interval −5.20 to 2.26) and an overall I-squared value of 90.5%.

Figure 58. Acupuncture versus sham for chronic tension headache: effects on pain

ACP = standard needle acupuncture; CI = confidence interval; LACP = laser acupuncture; SD = standard deviation; WMD = weighted mean difference

Tables

Table 4. Overview of included studies

Intervention	Comparator	Chronic Low Back Pain	Chronic Neck Pain	Osteoarthritis	Fibromyalgia	Chronic Tension Headache
Exercise	Sham, usual care, waitlist, no treatment, attention	10³¹^–⁴⁰ [4 new trials]	6⁴¹^–⁴⁶	Knee OA: 22 (25)⁴⁷^–⁷¹ [4 new trials] Hip OA: 4⁴⁷^,⁷²^–⁷⁴ Hand OA: 1⁷⁵	22 (24)⁷⁶^–⁹⁹	0
Exercise	Pharmacological therapy	0	2¹⁰⁰^,¹⁰¹ [1 new trial]	1 (2)¹⁰²^,¹⁰³ [1 new trial in 2 publications]	1⁹³	0
Psychological Therapies	Sham, usual care, waitlist, no treatment, attention	5¹⁰⁴^–¹⁰⁸	1⁴⁵	Knee OA: 4¹⁰⁹^–¹¹² [2 new trials] Hip, Hand OA: 0	16 (18)⁷⁸^,⁹⁷^,⁹⁸^,¹¹³^–¹²⁷ [6 new trials in 7 publications]	2¹²⁸^,¹²⁹
	Pharmacological therapy	0	0	0	4 (5)¹¹³^,¹²²^,¹²³^,¹³⁰^,¹³¹ [1 new trial in 2 publications]	2¹²⁹^,¹³²
	Exercise (or biofeedback for CTTH)	1¹³³	1⁴⁵	Knee OA: 1¹³⁴ Hip, Hand OA: 0	5⁷⁸^,⁹⁷^,⁹⁸^,¹³⁵^,¹³⁶	0
Physical Modalities	Sham, usual care, waitlist, no treatment, attention	8¹³⁷^–¹⁴⁴ [1 new trial]	5¹⁴⁵^–¹⁴⁹	Knee OA: 15¹⁵⁰^–¹⁶⁴ [2 new trials] Hip OA: 0 Hand OA: 2¹⁶⁵^,¹⁶⁶	2¹⁶⁷^,¹⁶⁸	1¹⁶⁹
	Pharmacological therapy	0	0	0	0	0
	Exercise (or biofeedback for CTTH)	1¹⁷⁰	0	0	0	0
Manual Therapies	Sham, usual care, waitlist, no treatment, attention	12¹⁰⁸^,¹⁴³^,¹⁷¹^–¹⁸⁰ [2 new trials]	3¹⁸¹^–¹⁸³ [1 new trial]	Knee OA: 2⁴⁷^,¹⁸⁴ Hip OA: 1⁴⁷ Hand OA: 0	2¹⁸⁵^,¹⁸⁶	1¹⁸⁷
	Pharmacological therapy	0	0	0	0	1¹⁸⁸
	Exercise (or biofeedback for CTTH)	5¹⁷⁴^,¹⁸⁹^–¹⁹²	1¹⁸¹	Knee OA: 1⁴⁷ Hip OA: 2⁴⁷^,¹⁹³ Hand OA: 0	0	0
Mindfulness Practices	Sham, usual care, waitlist, no treatment, attention	5 (7)¹⁰⁴^,¹⁹⁴^–¹⁹⁹	0	0	3 (4)²⁰⁰^–²⁰³ [1 new trial]	0
	Pharmacological therapy	0	0	0	0	0
	Exercise (or biofeedback for CTTH)	0	0	0	0	0
Mind-body Practices	Sham, usual care, waitlist, no treatment, attention	11³⁷^,⁴⁰^,²⁰⁴^–²¹² [4 new trials]	1 (2)²¹³^,²¹⁴ [1 new publication]	Knee OA: 2²¹⁵^,²¹⁶ Hip, Hand OA: 0	2²¹⁷^,²¹⁸	0
	Pharmacological therapy	0	0	0	0	0
	Exercise (or biofeedback for CTTH)	7³⁷^,⁴⁰^,²⁰⁵^–²⁰⁷^,²¹⁹^,²²⁰ [2 new trials]	2²²¹^,²²²	0	1²²³ [1 new trial]	0
Acupuncture	Sham, usual care, waitlist, no treatment, attention	8¹⁷⁶^,²²⁴^–²³⁰	8 (9)²¹³^,²¹⁴^,²³¹^–²³⁷ [1 new publication]	Knee OA: 9⁶⁷^,²³⁸^–²⁴⁵ Hip, Hand OA: 0	5²⁴⁶^–²⁵⁰ [2 new trials]	3²⁵¹^–²⁵³
	Pharmacological therapy	0	2²³¹^,²⁵⁴	0	0	0
	Exercise (or biofeedback for CTTH)	0	0	Knee OA: 1⁶⁷ Hip, Hand OA: 0	0	0
Function Restoration Training	Sham, usual care, waitlist, no treatment, attention	0	0	0	0	0
	Pharmacological therapy	0	0	0	0	0
	Exercise (or biofeedback for CTTH)	0	0	0	0	0
Multi-disciplinary Rehabilitation	Sham, usual care, waitlist, no treatment, attention	7²⁵⁵^–²⁶⁰	0	Knee, Hip OA: 0 Hand OA: 1²⁶¹	6 (8)⁹⁶^,²⁶²^–²⁶⁸	0
	Pharmacological therapy	1²⁶⁹	0	0	0	0
	Exercise (or biofeedback for CTTH)	9 (13)¹³³^,²⁷⁰^–²⁸¹	0	0	1⁹⁶	0

CTTH = chronic tension-type headache; OA = osteoarthritis

Table 5. Chronic low back pain: exercise

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Areeudomwong, 2017³⁸

3 months

Duration of pain: Mean 9.0 to 10 months

Fair

A. Proprioceptive Neuromuscular Facilitation (neuromuscular re-education) (n=21): 30 minute sessions 5 times/week for 4 weeks (20 total sessions)

B. Attention control (education) (n=21)

A vs. B

Age: 35 vs. 36 years

Female: 71% vs. 76%

Baseline RDQ (0-24): 4.5 vs. 4.9

Baseline NPS (0-10): 4.1 vs. 4.2

A vs. B

3 months

RDQ: 1.7 vs. 4.8, difference −3.1 (95% CI −3.9 to −2.3), p<0.001

NPS: 1.5 vs. 3.85, difference −2.31 (95% CI −3.4 to −1.2), p<0.001

A vs. B

3 months

SF-36 PCS: 53.7 vs. 44.2, difference 9.6 (95% CI 5.4 to 13.3), p<0.001

SF-36 MCS: 49.5 vs. 48.36, difference 1.2 (95% CI −3.1 to 5.4), p>0.05

GPE: 1.4 vs. 0.7, difference 0.7 (95% CI 0.2 to 1.2), p<0.01

Bramberg, 2017³⁷

4.2 months

Duration of pain: NR

Fair

A. Strength training (n=52): Five 60-minute supervised strength-training sessions over 6 weeks.

B. Attention control (education) (n=55)

A vs. B

Age: 47 vs. 46 vs. 44 years

Female: 72% vs. 62% vs. 80%

Baseline CPGS-BD (0-100): 37.6 vs. 38.6

Baseline CPGS-BP (0-100): 57.7 vs. 55.6

A vs. B

6 months

CPGS-BD: 24.8 vs. 32.8, adjusted difference −9.5 (95% CI −19.3 to 0.4), p>0.05

CPGS-BP: 41.7 vs. 50.2, adjusted difference −9.4 (95% CI −18.1 to −0.8), p<0.05

Work absence (mean days over time period)^b

A vs. B

-1 to 4 months: 5.0 vs. 8.9, difference −3.9 (95% CI −11.4 to 3.6)

-5 to 8 months: 6.4 vs. 12.5, difference −6.1 (95% CI −15.7 to 3.5)

-9 to 12 months: 9.5 vs. 9.2, difference 0.3 (95% CI −10.3 to 10.9);

Proportion absent ≥1 time: 51% vs. 44%; RR 0.95 (95% CI 0.73 to 1.22)

Costa, 2009³¹

4 and 10 months

Duration of pain: Mean 328 to 335 weeks

Fair

A: Neuromuscular re-education (motor control exercise) (n=77), 12 sessions over 8 weeks

B: Placebo (n=77) (detuned shortwave diathermy and detuned ultrasound)

12 sessions, two sessions/week for 4 weeks, then 1 session/week for 4 weeks

A vs. B

Age: 55 vs. 53 years

Female: 58% vs. 62%

Baseline RDQ (0-24): 13.1 vs. 13.4

Baseline pain (0-10 VAS): 6.8 vs. 6.6

A vs. B

4 months

RDQ: 5.3 vs. 4.3, adjusted difference 1.0 (95% CI 0.3 to 1.8)

Pain (0-10 VAS): 5.0 vs. 5.6, adjusted difference 1.4 (95% CI 0.3 to 2.4)

10 months

RDQ: 11.4 vs. 12.3, adjusted difference −1.0 (95% CI −2.8 to 0.8)

Pain: 5.0 vs. 6.3, adjusted difference −1.0 (95% CI −1.9 to −0.1)

A vs. B

4 months

Global impression of recovery (−5 to +5): 1.5 vs. 0.3, adjusted difference 1.4 (95% CI 0.3 to 1.8)

10 months

Global impression of recovery: 1.2 vs. −0.3, adjusted difference 1.6 (95% CI 0.6 to 2.6)

Garcia, 2018³⁹

1.75, 4.75, and 11.75 months

Duration of pain: Mean 36 to 48 months

Good

A. McKenzie Method of Mechanical Diagnosis and Therapy (directional preference) (n=74): In addition to the supervised treatment sessions, patients were instructed to do 10–15 repetitions of exercise, three to five times per day at home.

B. Placebo (n=73) (ultrasound)

A vs. B

Age: 58 vs. 56 years

Female: 78.4% vs. 74%

Baseline PSFS (0-10): 4.0 vs. 3.9

Baseline RDQ (0-24): 13.3 vs. 14.3

Baseline NRS (0-10): 7.2 vs. 7.0

A vs. B

4.75 months

PSFS: 6.2 vs. 5.9, adjusted difference −0.1 (95% CI −0.9 to 0.7), p=0.82

RDQ: 8.3 vs. 9.9, adjusted difference −0.5 (95% CI −2.3 to 1.3), p=0.61

NRS: 4.5 vs. 5.0, adjusted difference −0.8 (95% CI −1.8 to 0.3), p=0.15

11.75 months

PSFS: 5.5 vs. 6.0, adjusted difference 0.66 (95% CI −0.13 to 1.45), p=0.10

RDQ: 7.7 vs. 8.5, adjusted difference 0.5 (95% CI −1.3 to 2.3), p=0.56

NRS: 5.1 vs. 4.9, adjusted difference −0.1 (95% CI −1.0 to 1.1), p=0.88

A vs. B

4.75 months

GPE: 2.10 vs. 1.63, adjusted difference 0.65 (95% CI −0.43 to 1.74), p=0.23

11.75 months

GPE: 1.6 vs. 1.3, adjusted difference 0.0 (95% CI −1.0 to 1.1), p=0.95

Goldby, 2006³²

3, 6, 12 and 24 months

Duration of pain: Mean 11 to 12 years

Fair

A: Neuromuscular re-education (motor control exercise) (n=84), 10 sessions over 10 weeks

B: Attention control (education) (n=40)

A vs. B

Age: 43 vs. 41 years

Female: 68% vs. 68%

Race: 80% vs. 62%

Baseline ODI (0-100): 40.5 vs. 33.5

Baseline LBO (0-75): 43.9 vs. 44.0 vs. 47.6

Baseline back pain (0-100 NRS): 45.8 vs. 37.6

3 months

ODI (0-100): 31.00 vs. 28.1, difference 2.9 (95% CI −3.89 to 9.69)

LBO (0-75): 50.92 vs. 54.4, difference −3.48 (95% CI −9.67 to 2.71)

Back pain (0-100 NRS): 28.81 vs. 34.4, difference −5.59 (95% CI −17.86 to 6.68)

6 months

ODI: 25.81 vs. 23.9, difference 1.91 (95% CI −6.28 to 10.10)

LBO: 55.42 vs. 57.85, difference −2.43 (95% CI −9.14 to 4.28)

Back pain: 23.16 vs. 30.25, difference −7.09 (95% CI −20.22 to 6.04)

12 months

ODI: 24.76 vs. 26.9, difference −2.14 (95% CI −10.14 to 5.86)

LBO: 53.86 vs. 50.95, difference 2.91 (95% CI −4.29 to 10.11)

Back pain: 29.23 vs. 30, difference −0.77 (95% CI −14.13 to 12.59)

24 months

ODI: 27 vs. 27; difference 0.00 (95% CI −11.44 to 11.44)

LBO: 54.7 vs. 55.2, difference −0.5 (95% CI −9.20 to 8.20)

Back pain: 35.4 vs. 50.9, difference −15.50 (95% CI −33.06 to 2.06)

3 months

Nottingham Health Profile: 94.97 vs. 94.32, difference 0.65 (95% CI −36.97 to 38.27)

6 months

Nottingham Health Profile: 76.3 vs. 77.50, difference −1.20 (95% CI −37.76 to 35.36)

12 months

Nottingham Health Profile: 70.06 vs. 87.47, difference −17.41 (95% CI −56.12 to 21.30)

24 months

Nottingham Health Profile: 82 vs. 83, difference −1.00 (95% CI −60.85 to 58.85)

Kankaanpaa, 1999³³

3 and 9 months

Duration of pain: Mean 7 to 9 years

Fair

A. Combined exercise (exercises, stretching, relaxation, muscle function and ergonomic advice) (n=30), 24 sessions over 12 weeks

B. Attention Control (n=24) (thermal therapy and minimal massage)

A vs. B

Age: 40 vs. 39 years

Female: 36.6% vs. 33.3%

Baseline Pain and Disability Index (0-70 PDI): 13.2 vs. 9.5

Baseline back pain (0-100 mm VAS): 55.2 vs. 47.0

3 months

Pain and Disability Index (0-70): 5.7 vs. 12.6, difference −6.9 (95% CI −11.69 to −2.11)

Back pain (0-100 VAS): 26.6 vs. 43.4; difference −16.80 (95% CI −31.12 to −2.47)

9 months

Pain and Disability Index: 5.7 vs. 11.4, difference −5.7 (95% CI −11.31 to −0.09)

Back pain intensity: 23.9 vs. 45.1, difference −21.20 (95% CI −32.69 to −9.71)

Mazloum, 2017⁴⁰

1 month

Duration of pain: Mean 30.8 to 32.4 months

Poor

A. Pilates (n=20): 3 days per week for 6 weeks

B. Exercise (n=20): 3 days per week for 6 weeks

C. Usual care (n=20) (no treatment)

A vs. B vs. C

Age: 37 vs. 43 vs. 39 years

Baseline ODI (0-100): 30.8 vs. 27.2 vs. 26.2

Baseline VAS (0-10): 6.8 vs. 7.2 vs. 6.5

1 month

A vs. C

ODI: 22.9 vs. 26.6, difference −3.7 (95% CI −6.8 to −0.6)

VAS: 3.0 vs. 6.9, difference −3.9 (95% CI −4.8 to −3.0)

B vs. C

ODI: 23.1 vs. 26.6, difference −3.5 (95% CI −8.1 to 1.2)

VAS: 4.8 vs. 6.9, difference −2.1 (95% CI −3.1 to −1.1)

A vs. B

ODI: 22.9 vs. 23.1, difference −0.2 (95% CI −4.5 to 4.1)

VAS: 3.0 vs. 4.8, difference −1.8 (95% CI −2.5 to −1.1)

Miyamoto, 2013³⁵

4.5 months

Duration of pain: Mean 5 to 6 years

Fair

A. Muscle performance (Pilates) (n=43), 12 sessions over 6 weeks

B. Attention control (n=43) (education)

A vs. B

Age: 41 vs. 38 years

Female: 84% vs. 79%

Baseline RDQ: 9.7 vs. 10.5

Baseline pain (0-10 VAS): 6.6 vs. 6.5

4.5 months

RDQ (0-24): 4.5 vs. 6.7, adjusted difference −1.4 (95% CI −3.1 to 0.03)

Patient-Specific Functional Scale (0-10): 6.9 vs. 6.1, adjusted difference 0.2 (95% CI −0.6 to 1.1)

Pain (0-10 VAS): 4.5 vs. 5.3, adjusted difference −0.9 (95% CI −1.9 to 0.1)

4.5 months

Global impression of recovery (−5 to +5): 2.4 vs. 1.7, adjusted difference 0.7 (95% CI −0.4 to 1.8)

Miyamoto, 2018²¹²

4.5 and 11.5 months

Duration of pain: Mean 36 to 57 months

Good

A. Pilates (n=74): 1 session/week for 6 weeks (6 total sessions). Patients attended 81% of sessions.

B. Pilates (n=74): 2 sessions/week for 6 weeks (12 total sessions). Patients attended 85% of sessions.

C. Pilates (n=74): 3 sessions/week for 6 weeks (18 total sessions). Patients attended 82% of sessions.

D. Usual care (n=73) (no treatment)

A vs. B vs. C vs. D

Age: 47 vs. 47 vs. 49 vs. 49 years

Female: 78% vs. 70% vs. 78% vs. 76%

Baseline RDQ (0-24): 11.0 vs. 12.8 vs. 10.6 vs. 12.3

Baseline PSFS (0-10): 3.7 vs. 3.8 vs. 3.9 vs. 3.6

Baseline NRS (0-10): 6.1 vs. 6.4 vs. 6.0 vs. 6.3

A vs. D

4.5 months

RDQ: 8.8 vs. 10.2, adjusted difference 0.0 (−1.7 to 1.8), p>0.05

PSFS: 5.5 vs. 6.0, adjusted difference −0.5 (−1.3 to 0.3), p>0.05

NRS: 5.0 vs. 5.4, adjusted difference −0.3 (−1.3 to 0.6), p>0.05

11.5 months

RDQ: 7.3 vs. 8.9, adjusted difference 0.2 (−1.6 to 2.0), p>0.05

PSFS: 6.1 vs. 6.2, adjusted difference −0.2 (−1.0 to 0.6), p>0.05

NRS: 4.8 vs. 4.9, adjusted difference 0.1 (−0.9 to 1.0), p>0.05

B vs. D

4.5 months

RDQ: 7.9 vs. 10.2, adjusted difference −2.4 (−4.1 to −0.6), p≤0.01

PSFS: 6.5 vs. 6.0, adjusted difference 0.4 (−0.4 to 1.2), p>0.05

NRS: 4.4 vs. 5.4, adjusted difference −1.0 (−2.0 to −0.1), p≤0.05

11.5 months

RDQ: 7.2 vs. 8.9, adjusted difference −1.7 (−3.5 to 0.0), p>0.05

PSFS: 6.9 vs. 6.2, adjusted difference 0.5 (−0.4 to 1.3), p>0.05

NRS: 4.1 vs. 4.9, adjusted difference −0.8 (−1.8 to 0.2), p>0.05

C vs. D

4.5 months

RDQ: 6.4 vs. 10.2, adjusted difference −1.7 (−3.5 to 0.1), p>0.05

PSFS: 6.7 vs. 6.0, adjusted difference 0.3 (−0.5 to 1.2), p>0.05

NRS: 4.3 vs. 5.4, adjusted difference −0.7 (−1.7 to 0.2), p>0.05

11.5 months

RDQ: 5.9 vs. 8.9, adjusted difference −0.7 (−2.5 to 1.1), p>0.05

PSFS: 6.6 vs. 6.2, adjusted difference 0.0 (−0.8 to 0.8), p>0.05

NRS: 4.1 vs. 4.9, adjusted difference −0.4 (−1.4 to 0.6), p>0.05

A vs. D

4.5 months

GPE: 1.5 vs. 1.2, adjusted difference 0.5 (−0.5 to 1.6), p>0.05

SF-6D: 0.80 vs. 0.80, adjusted difference 0.01 (−0.02 to 0.03), p>0.05

11.5 months

GPE: 1.6 vs. 1.9, adjusted difference −0.1 (−1.2 to 1.0), p>0.05

SF-6D: 0.81 vs. 0.80, adjusted difference 0.01 (−0.01 to 0.04), p>0.05

Mean total societal costs (SEM): 574 vs. 649, p>0.05

B vs. D

4.5 months

GPE: 2.1 vs. 1.2, adjusted difference 1.5 (0.4 to 2.6), p≤0.01

SF-6D: 0.82 vs. 0.80, adjusted difference 0.02 (−0.00 to 0.05), p>0.05

11.5 months

GPE: 2.1 vs. 1.9, adjusted difference 0.9 (−0.2 to 1.9), p>0.05

SF-6D: 0.83 vs. 0.80, adjusted difference 0.04 (0.01 to 0.06), p≤0.01

Mean total societal costs (SEM): 824 vs. 649

C vs. D

4.5 months

GPE: 2.6 vs. 1.2, adjusted difference 1.7 (0.6 to 2.8), p≤0.01

SF-6D: 0.84 vs. 0.80, adjusted difference 0.03 (0.00 to 0.06), p≤0.05

11.5 months

GPE: 2.6 vs. 1.9, adjusted difference 1.0 (−0.1 to 2.1), p>0.05

SF-6D: 0.84 vs. 0.80, adjusted difference 0.03 (0.02 to 0.07), p≤0.05

Mean total societal costs (SEM): 880 vs. 649

Nassif, 2011³⁴

4 months

Duration of pain: NR

Poor

A. Combined exercise (n=37) (stretching, stability, coordination, and muscle strengthening exercises), 24 sessions over 8 weeks

B. Usual care (n=38)

A vs. B

Age: 45 vs. 45

Female: 11% vs. 21%

Baseline RDQ: 13.9 vs. 12.3

Baseline pain (0-10 VAS): 4.5 vs. 4.9

4 months

RDQ (0-24): 10.0 vs. 10.6, difference −0.6 (95% CI −3.5 to 2.3)

Quebec Back Pain Disability Questionnaire: 27.2 vs. 30.2, difference −3.0 (95% CI −11.7 to 5.7)

Pain (0-10 NRS): 3.2 (2.3) vs. 3.5 (2.5), difference −0.3 (95% CI −1.6 to 1.0)

4 months

Dallas Pain Questionnaire anxiety and depression: 31.2 vs. 28.9, difference 2.3 (95% CI −8.2 to 12.8)

Natour, 2014³⁶

3 months

Duration of pain: >1 year

Fair

A. Exercise (Pilates) (n=30), 24 sessions over 12 weeks

B. Usual care (n=30) (no treatment)

A vs. B

Age: 48 vs. 48

Female: 80% vs. 77%

Baseline RDQ: 1.1 vs. 10.6

Baseline pain (0-10 VAS): 5.5 vs. 5.8

3 months

RDQ (0-24): 7.0 vs. 10.7, difference −3.6, p<0.001

Pain (0-10 VAS): 4.2 vs. 5.8, difference −1.6, p<0.001

3 months

SF-36 physical function (0-100): 65.4 vs. 59.6, difference 5.8, p=0.026

SF-36 role physical: 56.4 vs. 40.0, difference 16.4, p=0.086

SF-36 bodily pain: 52.2 vs. 43.9, difference 8.3, p=0.030

SF-36 general health: 65.2 vs. 62.1, difference 3.1, p=0.772

SF-36 mental health: 67.9 vs. 65.3, difference 2.6, p=0.243

SF-36 social functioning: 86.0 vs. 80.4, difference 5.6, p=0.09

No differences on other SF-36 subscales

CI = confidence interval; CPGS = Von Korff Chronic Pain Grade Score; GPE = Global Perceived Effect Scale; LBO = Low Back Outcome Score; MCS = Mental Component Score; NPS = numeric pain scale; NR = not reported; NRS = Numeric Rating Scale; ODI = Oswestry Disability Index; PCS = Physical Component Score; PSFS = Patient Specific Functional Scale; RDQ = Roland-Morris Disability Questionnaire; RR = relative risk; SEM = standard error of the mean; SF-36 = Short-Form 36 questionnaire; VAS = visual analog scale
a: Unless otherwise noted, followup time is calculated from the end of the treatment period.
b: For missed work days: time period 1 (months 1-4), time period 2 (months 5-8) and time period 3 (months 9-12).
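The followup times listed in the table map onto the report's followup categories as defined in the Introduction: short term (1 to <6 months), intermediate term (≥6 to <12 months), and long term (≥12 months). A minimal Python sketch of that mapping is given here for orientation; the function name `followup_category` and the handling of followup shorter than 1 month are assumptions made for this example.

```python
def followup_category(months: float) -> str:
    """Map postintervention followup (in months) to the report's followup categories."""
    if months < 1:
        # Assumption: times under 1 month fall outside the categories the report defines.
        return "less than 1 month (outside the report's categories)"
    if months < 6:
        return "short term"
    if months < 12:
        return "intermediate term"
    return "long term"


# Followup times (months) as listed for three of the trials in Table 5.
table_5_followups = {
    "Garcia, 2018": [1.75, 4.75, 11.75],
    "Goldby, 2006": [3, 6, 12, 24],
    "Kankaanpaa, 1999": [3, 9],
}

for study, times in table_5_followups.items():
    labels = [f"{t} months: {followup_category(t)}" for t in times]
    print(study, "->", "; ".join(labels))
```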

Table 6. Chronic low back pain: psychological therapies

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Cherkin, 2016¹⁰⁴

Herman, 2017¹⁹⁶

Cherkin, 2017¹⁹⁵ (2 year data from Cherkin, 2016)

22 months

Duration of pain: >3 months (>1 year in 80% of patients)

Fair

A. CBT (n=112), 8 sessions over 8 weeks

B. Usual care (n=113)

A vs. B

Age: 49 vs. 49 years

Female: 59% vs. 77%

Baseline modified RDQ (0-23): 11.5 vs. 10.9

Baseline pain bothersomeness (0-10): 6.0 vs. 6.0

A vs. B

4.5 months

Modified RDQ (0-23): −4.38 (95% CI −5.3 to −3.47) vs. −2.96 (95% CI −3.79 to −2.14)

Pain (0-10): −1.56 (95% CI −2.02 to −1.11) vs. −0.84 (95% CI −1.21 to −0.46)

10 months

Modified RDQ (0-23): −4.78 (95% CI −5.67 to −3.89) vs. −3.43 (95% CI −4.33 to −2.52)

Pain (0-10): −1.76 (95% CI −2.14 to −1.39) vs. −1.10 (95% CI −1.48 to −0.71)

≥30% improvement in pain: 39.6% (95% CI 31.7 to 49.5) vs. 31.0% (95% CI 23.8 to 40.3)

≥30% improvement in modified RDQ: 58.8% (95% CI 50.6 to 68.4) vs. 48.6% (95% CI 40.3 to 58.6)

22 months

Modified RDQ (0-23): −4.59 (95% CI −5.60 to −3.57) vs. −2.74 (95% CI −3.81 to −1.68)

≥30% improvement in modified RDQ: 62.0% (95% CI 53.5 to 71.7) vs. 42.0% (95% CI 33.8 to 52.2)

Pain: −1.79 (95% CI −2.21 to −1.37) vs. −1.25 (95% CI −1.69 to −0.81)

≥30% improvement in pain: 39.6% (95% CI 31.4 to 49.8) vs. 31.1% (95% CI 23.9 to 40.5)

A vs. B

4.5 months

PHQ-8 (0–24): −1.80 (95% CI −2.35 to −1.26) vs. −0.64 (95% CI −1.23 to −0.06)

SF-12 Physical component (0-100): 3.78 (95% CI 2.56 to 5.00) vs. 3.27 (95% CI 2.09 to 4.44)

SF-12 Mental component (0-100): 2.13 (95% CI 0.86 to 3.40) vs. −1.11 (95% CI −2.39 to 0.17)

10 months

PHQ-8 (0–24): −1.72 (95% CI −2.28 to −1.16) vs. −0.88 (95% CI −1.50 to −0.27)

SF-12 Physical component: 3.79 (95% CI 2.55 to 5.03) vs. 2.93 (95% CI 1.70 to 4.16)

SF-12 Mental component: 1.81 (95% CI 0.59 to 3.03) vs. 0.75 (95% CI −0.58 to 2.08)

Total costs: $6,428 (95% CI $4,676 to $10,262) vs. $6,304 (95% CI $4,193 to $9,805)

Johnson, 2007¹⁰⁵

12 months

Duration of pain: 6 months

Fair

A. CBT (n=116), 8 sessions over 6 weeks

B. Usual care (n=118)

A vs. B

Age: 47 vs. 49

Female: 61% vs. 58%

Baseline RDQ (0-24): 10.6 vs. 10.9

Baseline pain (0-100 VAS): 44.9 vs. 51.6

A vs. B

6 months

RDQ (0-24): 6.5 vs. 8.0, adjusted difference −1.09 (95% CI −2.28 to 0.09)

Pain (0-100 VAS): 26.1 vs. 35.0, adjusted difference −4.60 (95% CI −11.07 to 1.88)

12 months

RDQ (0-24): 6.7 vs. 8.0, adjusted difference −0.93 (95% CI −2.30 to 0.45)

Pain (0-100 VAS): 27.9 vs. 36.4, adjusted difference −5.49 (95% CI −12.43 to 1.44)

A vs. B

6 months

Quality of life (0-1 EQ-5D): 0.75 vs. 0.71, adjusted difference 0.03 (95% CI −0.05 to 0.10)

12 months

Quality of life (0-1 EQ-5D): 0.75 vs. 0.71, adjusted difference 0.03 (95% CI −0.04 to 0.09)

Lamb 2010¹⁰⁶ and 2012¹⁰⁷

34 months

Duration of pain: 13 years

Fair

A. CBT (n=468), 8 sessions over unclear number of weeks

B. Attention control (n=233)

A vs. B

Age: 53 vs. 54 years

Female: 59% vs. 61%

Modified Von Korff disability (0-100): 49 vs. 46

Baseline RDQ (0-24): 9 vs. 9

Baseline pain (0-100 Modified Von Korff): 59 vs. 59

A vs. B

3 months

Modified Von Korff disability (0-100): −13.2 (−15.74 to −10.59) vs. −8.9 (−12.27 to −5.56), adjusted difference −4.2 (−8.10 to −0.40)

RDQ (0-24): −2.0 (−2.43 to −1.58) vs. −1.1 (−1.54 to −0.35), adjusted difference −1.1 (−1.71 to −0.38)

Modified Von Korff pain (0-100): −12.2 (−14.56 to −9.83) vs. −5.4 (−8.40 to −2.49), adjusted difference −6.8 (−10.20 to −3.31)

4.5 months

Modified Von Korff disability: −13.9 (−16.25 to −11.55) vs. −5.7 (−9.22 to −2.28), adjusted difference −8.2 (−12.01 to −4.31)

RDQ: −2.5 (−3.03 to −1.96) vs. −1.0 (−1.67 to −0.40), adjusted difference −1.5 (−2.22 to −0.70)

Modified Von Korff pain: −13.7 (−16.20 to −11.29) vs. −5.7 (−8.99 to −2.41), adjusted difference −8.0 (−11.80 to −4.28)

10.5 months

Modified Von Korff disability: −13.8 (−16.28 to −11.39) vs. −5.4 (−8.90 to −1.99), adjusted difference −8.4 (−12.32 to −4.47)

RDQ: −2.4 (−2.84 to −1.89) vs. −1.1 (−1.72 to −0.39), adjusted difference −1.3 (−2.06 to −0.56)

Modified Von Korff pain: −13.4 (−15.96 to −10.77) vs. −6.4 (−9.66 to −3.14), adjusted difference −7.0 (−10.81 to −3.12)

34 months

Modified Von Korff disability: −16.7 (−19.43 to −13.93) vs. −11.2 (−15.59 to −6.86), adjusted difference −5.5 (−10.64 to −0.27)

RDQ: −2.9 (−3.42 to −2.38) vs. −1.6 (−2.48 to −0.80), adjusted difference −1.3 (−2.26 to −0.27)

Modified Von Korff pain: −17.4 (−20.35 to −14.44) vs. −12.8 (−17.52 to −7.99), adjusted difference −4.6 (−10.28 to 1.00)

A vs. B

3 months

SF-12 PCS (0-100): 3.7 (2.82 to 4.59) vs. 1.5 (0.26 to 2.83), adjusted difference 2.2 (0.74 to 3.57)

SF-12 MCS (0-100): 1.3 (0.19 to 2.42) vs. 0 (−1.45 to 1.46), adjusted difference 1.3 (−0.36 to 2.96)

4.5 months

SF-12 PCS: 3.6 (2.72 to 4.52) vs. 1.8 (0.54 to 3.08), adjusted difference 1.8 (0.37 to 3.25)

SF-12 MCS: 2.5 (1.44 to 3.48) vs. −0.09 (−1.61 to 1.43), adjusted difference 2.6 (0.85 to 4.25)

10.5 months

SF-12 PCS: 4.9 (4.00 to 5.84) vs. 0.8 (−0.52 to 2.11), adjusted difference 4.1 (2.63 to 5.62)

SF-12 MCS: 0.9 (−0.10 to 1.90) vs. 0.7 (−0.75 to 2.20), adjusted difference 0.2 (−1.48 to 1.84)

34 months

EQ-5D: 0.07 (0.04 to 0.10) vs. 0.04 (−0.01 to 0.09), adjusted difference 0.03 (−0.03 to 0.08)

Poole, 2007¹⁰⁸

4.5 months

Duration of pain: 10.6 vs. 9.5 years

Poor

A. Respondent therapy (progressive muscle relaxation) (n=54), 6 sessions over 6-8 weeks

B. Usual care (n=45)

A vs. B

Age: 46 vs. 47

Female: 65% vs. 51%

Baseline Oswestry Disability Index (0-100% ODI): 33.2 vs. 36.6

Baseline pain (0-100 VAS): 40.7 vs. 40.6

A vs. B

4.5 months

ODI (0-100): 31.3 vs. 32.9

Pain (0-100 VAS): 41.3 vs. 42.7

A vs. B

4.5 months

Beck Depression Inventory (0-63): 12.6 vs. 12.8

SF-36 physical functioning (0-100): 57.3 vs. 52.2

SF-36 social functioning (0-100): 66.7 vs. 61.5

SF-36 emotional role limitations (0-100): 63.0 vs. 62.0

SF-36 pain (0-100): 48.8 vs. 44.4

SF-36 mental health (0-100): 64.4 vs. 67.7

SF-36 general health perception (0-100): 52.4 vs. 55.0

Turner, 1990¹³³

12 months

Duration of pain: 12.9 years

Poor

A. Operant therapy (n=25), 8 sessions over 8 weeks

B. Exercise (n=24)

Overall

Age: 44

Female: 48%

A vs. B

Baseline function (0-100 SIP): 7.9 vs. 8.4

Baseline pain (0-78 McGill Pain Rating): 21.0 vs. 19.4

A vs. B

6 months

Sickness Impact Profile (0-100): 7.6 vs. 6.3

McGill Pain Questionnaire Pain Rating Index (0-78): 19.5 vs. 15.7

12 months

Sickness Impact Profile (0-100): 5.3 vs. 4.7

McGill Pain Questionnaire Pain Rating Index: 16.4 vs. 14.9

A vs. B

6 months

CES-D Scale (0-60): 11.4 vs. 9.3

12 months

CES-D Scale: 8.3 vs. 9.3

Abbreviations: CBT = cognitive-behavioral therapy; CES-D = Center for Epidemiologic Studies-Depression; CI = confidence interval; MCS = Mental Component Score; ODI = Oswestry Disability Index; PCS = Physical Component Score; PHQ-8 = Patient Health Questionnaire 8-item depression scale; RDQ = Roland-Morris Disability Questionnaire; SF-12 = Short-Form 12 Questionnaire; SF-36 = Short-Form 36 Questionnaire; SIP = Sickness Impact Profile; VAS = visual analog scale
a: Unless otherwise noted, followup time is calculated from the end of the treatment period

Table 7. Chronic low back pain: physical modalities (ultrasound)

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Ebadi, 2012¹³⁹

1 month

Duration of pain: Mean 6 to 8 years

Fair

A. Ultrasound (n=25), 1.5 W/cm² at 1 MHz, 10 sessions over 4 weeks

B. Sham ultrasound (n=25)

A vs. B

Age: 31 vs. 37 years

Female: 25% vs. 50%

Functional Rating Index (mean, 0-100): 41 vs. 44

Pain intensity (mean, 0-100 VAS): 47 vs. 49

A vs. B

1 month

Functional Rating Index (0-40): 22.8 vs. 30.5; p=0.004

Pain (0-100 VAS): 27.7 vs. 25.5; p=0.48

Licciardone, 2013¹⁴⁰

3 months

Proportion with LBP duration >1 year: 50%

Good

A. Ultrasound (n=233), 1.2 W/cm² at 1 MHz, 6 sessions over 8 weeks

B. Sham ultrasound (n=222)

A vs. B

Age: 38 vs. 43 years

Female: 58% vs. 68%

RDQ (0-24): 5 vs. 5

Pain intensity (0-100 VAS): 44 vs. 44

A vs. B

1 month, median (IQR)

RDQ (0-24): 3 (1-7) vs. 3 (1-7); p=0.93

Pain improved ≥30%: RR 1.03 (95% CI 0.87 to 1.23)

Pain improved ≥50%: RR 1.09 (95% CI 0.88 to 1.35)

Pain improved ≥20 mm (0-100 VAS): RR 1.01 (95% CI 0.80 to 1.26)

2 months

RDQ (0-24): 3 vs. 4; p=0.76

≥50% improvement in pain: RR 1.09 (95% CI 0.88 to 1.35)

3 months

RDQ (0-24): 3 vs. 3; p=0.93

A vs. B

1 month

SF-36 general health (0-100): 72 (52-87) vs. 74 (54-87); p=0.6

Lost 1 or more days work in past 4 weeks because of low back pain: 13% vs. 6%, p=0.11

Prescription drug use for LBP: 16% vs. 18%, p=0.54

SF-36 general health (0-100): 72 (52-87) vs. 74 (54-87), p=0.73

2 months

SF-36 general health (0-100): 72 vs. 72 (57-85); p=0.53

≥50% improvement in pain: RR 1.09 (95% CI 0.88 to 1.35)

3 months

SF-36 general health (0-100): 72 vs. 74, p=0.66

Abbreviations: CI = confidence interval; IQR = interquartile range; LBP = low back pain; MHz = megahertz; NR = not reported; RDQ = Roland-Morris Disability Questionnaire; RR = relative risk; SF-36 = Short-Form 36 Questionnaire; VAS = visual analog scale; W/cm² = Watt per square centimeter
a: Unless otherwise noted, followup time is calculated from the end of the treatment period
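
The rows above report between-group estimates with 95% confidence intervals. As a minimal sketch of how such rows can be read against the significance rule stated in this report's introduction (an interval that excludes 0 for a difference, or 1 for a risk ratio), the helper below checks whether a reported interval excludes its null value. The function name `excludes_null` is hypothetical and is not part of the report's methods; the numbers are copied from rows reported above.

```python
def excludes_null(lower: float, upper: float, null_value: float) -> bool:
    """Return True when the closed interval [lower, upper] does not contain null_value."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    return not (lower <= null_value <= upper)


# Risk ratio from Table 7 (pain improved >=30% at 1 month): RR 1.03 (0.87 to 1.23)
rr_30_percent = excludes_null(0.87, 1.23, null_value=1.0)

# Adjusted differences at 34 months from the CBT vs. attention control rows above
vk_disability = excludes_null(-10.64, -0.27, null_value=0.0)  # difference -5.5
vk_pain = excludes_null(-10.28, 1.00, null_value=0.0)         # difference -4.6

print(rr_30_percent, vk_disability, vk_pain)  # False True False
```

An interval that contains the null value is read in this report as "no clear difference" or "no difference", depending on how close the point estimate is to the null.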

Table 8. Chronic low back pain: physical modalities (interferential therapy)

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Correa, 2016¹⁴⁴

3 months

Duration of pain: Mean 95.3 to 99.4 months

Fair

All groups received: 3 sessions/week for 4 weeks (12 total sessions)

A. 1 kHz Interferential current (n=50)

B. 4 kHz Interferential current (n=50)

C. Placebo interferential current (n=50)

A vs. B vs. C

Age: 51 vs. 54 vs. 49 years

Female: 70% vs. 80% vs. 80%

Baseline RDQ (0-24): 13.3 vs. 14.2 vs. 15.1

Baseline NRS pain score in last 7 days (0-10): 7.5 vs. 7.5 vs. 7.4

A vs. C

3 months

RDQ: 9.0 vs. 10.3, adjusted difference 0.3 (CI unclear) ^b

NRS at rest: 4.6 vs. 4.7, adjusted difference 0.4 (CI unclear)

B vs. C

3 months

RDQ: 9.3 vs. 10.3, adjusted difference 0.2 (CI unclear) ^b

NRS at rest: 4.4 vs. 4.7, adjusted difference 0.2 (CI unclear)

A vs. C

3 months

GPE: 1.7 (3.1) vs. 1.6 (3.1), adjusted difference 0.6 (CI unclear)

Mean number of times that patients needed to take pain medication between treatment sessions: 12.5 (6.0) vs. 30.7 (15.2), p=0.01; difference −18.2 (95% CI −22.79 to −13.61)

B vs. C

3 months

GPE: 1.8 (3.0) vs. 1.6 (3.1), adjusted difference 0.5 (CI unclear)

Mean number of times that patients needed to take pain medication between treatment sessions: 13.1 (6.9) vs. 30.7 (15.2), p=0.014; difference −17.6 (95% CI −22.28 to −12.92)

Abbreviations: CI = confidence interval; GPE = global perceived effect; kHz = kilohertz; NRS = Numerical Rating Scale; RDQ = Roland-Morris Disability Questionnaire
a: Unless otherwise noted, followup time is calculated from the end of the treatment period
b: There appeared to be errors in reporting of the confidence intervals for this study since the confidence intervals did not include the point estimates.
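
Footnote b flags intervals that do not contain their point estimates. A minimal sketch of that kind of consistency check, together with an unadjusted mean difference and a normal-approximation 95% interval computed from group means and standard deviations, is given below. It assumes that the values in parentheses in the medication-use rows are standard deviations and that each group has n=50, as listed in the Intervention column. The function names are hypothetical, and the normal approximation is not expected to reproduce exactly the intervals that the study authors derived with their own methods.

```python
import math


def ci_contains_estimate(estimate: float, lower: float, upper: float) -> bool:
    """Return True when the point estimate lies inside its reported interval."""
    return lower <= estimate <= upper


def mean_difference_ci(mean_a, sd_a, n_a, mean_b, sd_b, n_b, z=1.96):
    """Unadjusted difference in means (A minus B) with a normal-approximation CI."""
    diff = mean_a - mean_b
    # Standard error of the difference between two independent group means
    se = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return diff, diff - z * se, diff + z * se


# Medication use, A vs. C in Table 8: 12.5 (6.0) vs. 30.7 (15.2), assumed n=50 per group
diff, lower, upper = mean_difference_ci(12.5, 6.0, 50, 30.7, 15.2, 50)
print(round(diff, 1), round(lower, 2), round(upper, 2))  # -18.2 -22.73 -13.67

# The reported interval (-22.79 to -13.61) does contain the reported difference (-18.2)
print(ci_contains_estimate(-18.2, -22.79, -13.61))  # True
```

The computed interval is close to, but not identical with, the reported −22.79 to −13.61, which is consistent with the authors having used a slightly different method.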

Table 9. Chronic low back pain: physical modalities (low-level laser therapy)

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Basford, 1999¹⁴²

2 months

Duration of pain: 4.5 vs. 6.5 months

Fair

A. Nd:YAG laser (542 mW/cm², 90 seconds, two sites, applied to eight points along L2 to S3 paraspinal tissues) (n=27)

12 sessions over 4 weeks

B. Sham laser (n=29)

A vs. B

Age: 48 vs. 48 years

Female: 40% vs. 55%

Baseline ODI: 21 vs. 25

Baseline maximal pain, last 24 hours (0-100 VAS): 35.2 vs. 37.4

A vs. B

2 months

ODI (0-100): 14.7 vs. 22.9, difference −8.2 (95% CI −13.6 to −2.8); p=0.004

Maximal pain in last 24 hours (0-100 VAS): 19.1 vs. 35.1, difference −16.0 (95% CI −28.3 to −3.7); p=0.012

A vs. B

2 months

Patient perception of benefit (VAS, lower = less pain): 28.3 vs. 37.8 (95% CI −20.9 to 1.9); p=0.101

Djavid, 2007¹⁷⁰

1.5 months

Duration of pain: 29 months vs. 29 months vs. 25 months

Fair

A. GaAs laser (wavelength 810 nm, 50 mW wave, and 0.2211 cm² spot area laser applied to 8 points along L2 to S2-S3 paraspinal tissues, dose 27 J/cm²) (n=16)

12 sessions over 6 weeks

B. Low-level laser therapy plus exercise (n=19)

C. Exercise plus sham laser (strengthening, stretching, mobilizing, coordination) (n=18)

A vs. B vs. C

Age: 40 vs. 38 vs. 36 years

Female: 5% vs. 7% vs. 2%

Baseline ODI (0-100): 33.0 vs. 31.8

Baseline pain (0-10 VAS): 7.3 vs. 6.3

A vs. C

1.5 months

ODI (0-100): 20.8 vs. 24.1, difference in change from baseline −4.4 (95% CI −11.4 to 2.5)

Pain (0-10 VAS): 4.4 vs. 4.3, difference in change from baseline −0.9 (95% CI −2.5 to 0.7)

A vs. B

1.5 months

ODI (0-100): 20.8 vs. 16.8, difference in change from baseline −4.4 (95% CI −11.4 to 2.5)

Pain (0-10 VAS): 4.4 vs. 2.4, difference in change from baseline −0.9 (95% CI −2.5 to 0.7)

Soriano, 1998¹⁴¹

6 months

Duration of pain: greater than 3 months

Poor

A. GaAs laser (wavelength 904 nm, pulse frequency 10,000 Hz, pulse width 200 nsec, peak power 20 W, average power 40 mW, administered at dose of 4 J/cm² per point to pain areas) (n=38)

10 sessions over 5 weeks

B. Sham laser (n=33)

A vs. B

Age: 63 vs. 64 years

Female: 58% vs. 52%

Baseline function: NR

Baseline pain (1 to 10): 7.9 vs. 8.1

6 months

No pain: 44.7% vs. 15%; p<0.01

Pain recurrence in subgroup of patients with a good or excellent response at end of treatment: 35% vs. 70%; p=NR

Abbreviations: CI = confidence interval; Hz = hertz; J/cm² = Joules per square centimeter; mW = milliwatt; Nd:YAG = neodymium-doped yttrium aluminum garnet; NR = not reported; ODI = Oswestry Disability Index; VAS = visual analog scale
a: Unless otherwise noted, followup time is calculated from the end of the treatment period

Table 10. Chronic low back pain: physical modalities (traction)

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Beurskens, 1997¹³⁷

1.75 and 5 months

Duration of pain: 1.5 months

Fair

A. Continuous traction (n=77)

B. Sham traction (20% body weight) (n=74)

12 sessions, 5 weeks

A vs. B

Age: 39 vs. 42 years

Female: 44% vs. 43%

Baseline RDQ (0-24): 2 vs. 12

Baseline pain (0-100 VAS): 61 vs. 55

A vs. B

1.75 months

RDQ: 4.4 vs. 4.3, difference 0.1 (95% CI −1.8 to 1.9)

Pain at the moment (0-100 VAS): 28.5 vs. 22.8, difference 5.7 (95% CI −4.6 to 15.9)

5 months

RDQ: 4.7 vs. 4.0, difference 0.7 (95% CI −1.1 to 2.6)

Pain at the moment (0-100 VAS): 23.8 vs. 20.1, difference 3.7 (95% CI −8.4 to 15.8)

A vs. B

1.75 months

ADL disability (0 to 100 VAS): 27.1 vs. 29.4, difference −2.4 (95% CI −13.6 to 8.9)

Work absence (days): 23.5 vs. 27.8, difference −4.3 (95% CI −14.7 to 6.1)

Medical consumption: 34% vs. 25%, difference 9% (95% CI −6 to 24)

5 months

ADL disability: 25.7 vs. 25.8, difference 0.1 (95% CI −11.5.0 to 11.2)

Work absence (days): 35.7 vs. 43.7, difference −8.0 (95% CI −27 to 11)

Medical consumption: 45% vs. 42%, difference 3% (95% CI −13% to 19%)

Schimmel, 2009¹³⁸

2 months

Duration of pain: 1 year

Fair

A. Intermittent traction (n=31)

B. Sham traction (<10% body weight) (n=29)

20 sessions, 6 weeks

A vs. B

Age (mean): 42 vs. 46 years

Female: 39% vs. 52%

Baseline ODI: 36 vs. 33

Baseline back pain (0-100 VAS): 61 vs. 53

A vs. B

2 months

ODI (0-100): 25 vs. 23 (SD, p-value not reported)

Pain (0-100 VAS): 32 vs. 36; p=0.70

A vs. B

2 months

SF-36, total (0-100): 66 vs. 65 (SD, p-value not reported)

Abbreviations: ADL = activities of daily living; CI = confidence interval; ODI = Oswestry Disability Index; RDQ = Roland-Morris Disability Questionnaire; SD = standard deviation; SF-36 = Short-Form 36 Questionnaire; VAS = visual analog scale
a: Unless otherwise noted, followup time is calculated from the end of the treatment period

Table 11. Chronic low back pain: physical modalities (short-wave diathermy)

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Gibson, 1985¹⁴³

2 months

Duration of pain: 2 to 12 months

Poor

A. Short wave diathermy (active SWD) (n=34), 12 sessions, 3 sessions per week for 4 weeks

B. Placebo (detuned SWD) (n=34)

A vs. B

Age: 35 vs. 40 years

Female: 47% vs. 32%

Pain (0-100 VAS): 45 vs. 48

A vs. B

2 months

Pain (0-100 VAS, median): 25 vs. 13 (IQR not reported)

Unable to work or with limited activities: 7% vs. 19%, RR 0.40, 95% CI 0.09 to 1.80

A vs. B

2 months

Using analgesics: 7% vs. 22%, RR 0.34, 95% CI 0.08 to 1.50

Abbreviations: CI = confidence interval; IQR = interquartile range; RR = relative risk; SWD = short wave diathermy; VAS = visual analog scale
a: Unless otherwise noted, followup time is calculated from the end of the treatment period

Table 12. Chronic low back pain: manual therapies (spinal manipulation)

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Bronfort, 2011¹⁹⁰

9 months

Duration of pain: 5 years

Fair

A. Standard manipulation (n=100), 12-24 sessions over 12 weeks

B. Exercise (supervised) (n=100)

C. Exercise (home) (n=101)

A vs. B vs. C

Age: 45.2 vs. 44.5 vs. 45.6 years

Female sex: 67% vs. 57% vs. 58%

Baseline Modified RDQ (0-23): 8.7 vs. 8.4 vs. 8.7

Baseline pain (0-10 NRS): 5.4 vs. 5.1 vs. 5.2

A vs. B vs. C

4 months

Modified RDQ (0-23): 4.9 vs. 4.0 vs. 4.2, adjusted difference 0.5 (95% CI −1.0 to 2.1) for A vs. B and 0.7 (95% CI −0.9 to 2.3) for A vs. C

Pain (0-10 NRS): 3.3 vs. 2.9 vs. 3.1, adjusted difference 0.3 (95% CI −0.5 to 1.0) for A vs. B and 0.1 (95% CI −0.6 to 0.9) for A vs. C

9 months

Modified RDQ (0-23): 5.1 vs. 3.8 vs. 4.1, adjusted difference 0.4 (95% CI −1.2 to 2.0) for A vs. B and −0.1 (95% CI −0.7 to 0.5) for A vs. C

Pain (0-10 NRS): 3.3 vs. 2.8 vs. 2.8, adjusted difference 0.3 (95% CI −0.5 to 1.1) for A vs. B and 0.3 (95% CI −0.6 to 1.1) for A vs. C

A vs. B vs. C

4 months

SF-36 PCS (norm-based mean=50): 48.6 vs. 50.6 vs. 49.1, adjusted difference −1.8 (95% CI −4.4 to 0.9) for A vs. B and −0.3 (95% CI −3.0 to 2.4) for A vs. C

SF-36 MCS (norm-based mean=50): 55.9 vs. 54.8 vs. 55.1, adjusted difference 0.4 (95% CI −2.0 to 2.9) for A vs. B and −0.5 (95% CI −3.0 to 2.1) for A vs. C

OTC pain medication use, past week (days): 1.6 vs. 1.4 vs. 1.5, adjusted difference 0.4 (95% CI −0.4 to 1.1) for A vs. B and 0.4 (95% CI −0.3 to 1.2) for A vs. C

9 months

SF-36 PCS (norm-based mean=50): 48.4 vs. 50.4 vs. 49.6, adjusted difference −1.7 (95% CI −4.2 to 0.8) for A vs. B and −1.0 (95% CI −3.5 to 1.5) for A vs. C

SF-36 MCS (norm-based mean=50): 55.2 vs. 53.9 (8.6) vs. 56.0, adjusted difference 2.4 (95% CI −0.2 to 5.0) for A vs. B and −2.2 (95% CI −4.9 to 0.5) for A vs. C

OTC pain medication use, past week (days): 1.8 vs. 1.8 vs. 1.6, adjusted difference 0.1 (95% CI −0.8 to 0.9) for A vs. B and 0.4 (95% CI −0.4 to 1.3) for A vs. C

Ferreira, 2007¹⁹¹

10 months

Duration of pain: Not reported

Fair

A. Standard manipulation and mobilization (n=80), 12 sessions over 8 weeks

B. Exercise (motor control) (n=80)

C. Exercise (general exercise) (n=80)

A vs. B vs. C

Age: 54 vs. 52 vs. 55 years

Female: 70% vs. 66% vs. 70%

Baseline RDQ (0-24): 12.4 vs. 14.0 vs. 14.1

Baseline pain (0-10 VAS): 6.2 vs. 6.3 vs. 6.5

A vs. B vs. C

4 months

RDQ (0-24): 7.7 vs. 8.4 vs. 10.1, difference 0.2 (95% CI −1.5 to 1.9) for A vs. B and −0.9 (95% CI −2.7 to 0.9) for A vs. C

Pain (0-10 VAS): 4.3 vs. 4.3 vs. 4.8, difference 0.0 (95% CI −0.9 to 0.8) for A vs. B and −0.5 (95% CI −1.4 to 0.3) for A vs. C

10 months

RDQ (0-24): 9.2 vs. 8.8 vs. 9.6, difference 1.8 (95% CI 0.0 to 3.6) for A vs. B and 1.2 (95% CI −0.6 to 3.0) for A vs. C

Pain (0-10 VAS): 4.9 vs. 4.9 vs. 5.2, difference 0.1 (95% CI −0.8 to 1.0) for A vs. B and −0.2 (95% CI −1.1 to 0.6) for A vs. C

A vs. B vs. C

4 months

Patient Specific Functional Scale (3-30): 17.3 vs. 16.4 vs. 15.0, difference 0.7 (95% CI −1.3 to 2.7) for A vs. B and 1.7 (95% CI −0.4 to 3.8) for A vs. C

10 months

Patient Specific Functional Scale (3-30): 15.2 vs. 15.7 (6.8) vs. 13.9, difference −0.8 (95% CI −2.9 to 1.2) for A vs. B and 0.3 (95% CI −1.7 to 2.3) for A vs. C

Gibson, 1985¹⁴³

2 months

Duration of pain: 2 to 12 months

Poor

A. Manipulation (technique unclear) and mobilization (n=41), 4 sessions over 4 weeks

B. Placebo (detuned short-wave diathermy) (n=34)

A vs. B

Age: 34 vs. 40 years

Female: 61% vs. 32%

Baseline pain (0-100 VAS): 35 vs. 48

A vs. B

1 month

Pain (median [range], 0-100 VAS): 28 (0-96) vs. 27 (0-80)

3 months

Pain (median [range], 0-100 VAS): 25 (4-90) vs. 6 (10-96); p<0.01

A vs. B

1 month

Using analgesics: 25% vs. 50%

3 months

Using analgesics: 18% vs. 22%

Gudavalli, 2006¹⁹²

11 months

Duration of pain: >3 months

Fair

A. Flexion–distraction manipulation (n=123), 8-16 sessions over 4 weeks

B. Exercise (n=112)

A vs. B

Age: 42 vs. 41 years

Female: 34% vs. 41%

Baseline RDQ (0-24): 6.64 vs. 6.84

Baseline pain (0-100 VAS): 38.00 vs. 35.70

A vs. B

2 months

RDQ (0-24): 3.50 vs. 3.75

Pain (0-100 VAS): 16.52 vs. 12.04

5 months

RDQ (0-24): 3.89 vs. 3.42

Pain (0-100 VAS): 18.26 vs. 8.92

11 months

RDQ (0-24): 3.90 vs. 3.77

Pain (0-100 VAS): 17.10 vs. 12.36

Haas, 2014¹⁷¹

10.5 months

Duration of pain: 11 to 12 years

Fair

A. Standard spinal manipulation (n=100), 6 sessions over 6 weeks

B. Standard manipulation (n=100), 12 sessions over 6 weeks

C. Standard manipulation (n=100), 18 sessions over 6 weeks

D. Attention control (minimal massage) (n=100)

A vs. B vs. C vs. D

Age: 41 vs. 42 vs. 41 vs. 41

Female: 49% vs. 49% vs. 52% vs. 49%

Baseline Modified Von Korff functional disability (0–100): 44.8 vs. 46.1 vs. 45.2 vs. 45.2

Baseline Pain (0–100 VAS): 51.0 vs. 51.6 vs. 51. vs. 52.2

Baseline Von Korff pain intensity (0–100): 51.0 vs. 51.6 vs. 51.5 vs. 52.2

A vs. B vs. C vs. D

4 months

Von Korff functional disability (0-100): 25.6 vs. 24.0 vs. 24.1 vs. 27.1, adjusted difference −1.4 (95% CI −7.2 to 4.5) for A vs. D, −3.4 (95% CI −9.3 to 2.4) for B vs. D, and −2.9 (95% CI −8.8 to 2.9) for C vs. D

Von Korff functional disability improved ≥50%: 51.5% vs. 59.8% vs. 54.0% vs. 49.5%, adjusted difference 2.5% (95% CI −11.5 to 16.5%) for A vs. D, 10.4% (95% CI −3.4 to 24.3%) for B vs. D, and 4.8% (95% CI −9.1 to 18.6%) for C vs. D

Von Korff pain intensity (0-100): 32.5 vs. 33.7 vs. 32.1 vs. 34.9, adjusted difference −1.7 (95% CI −6.9 to 3.4) for A vs. D, −0.8 (95% CI −6.0 to 4.4) for B vs. D, and −2.4 (95% CI −7.6 to 2.9) for C vs. D

10.5 months

Von Korff functional disability (0-100): 22.6 vs. 22.4 vs. 19.1 vs. 28.0, adjusted difference −5.2 (95% CI −10.9 to 0.5) for A vs. D, −5.9 (95% CI −11.8 to −0.1) for B vs. D, and −8.8 (95% CI −14.4 to −3.3) for C vs. D

Von Korff functional disability improved ≥50%: 57.6% vs. 57.7% vs. 62.0% vs. 58.9%, adjusted difference −1.1% (95% CI −14.8 to 12.6%) for A vs. D, −1.4% (95% CI −15.4 to 12.6%) for B vs. D, and 2.7% (95% CI −11.0 to 16.5%) for C vs. D

Von Korff pain intensity (0-100): 30.7 vs. 31.9 vs. 28.7 vs. 36.5, adjusted difference −5.4 (95% CI −11.1 to 0.4) for A vs. D, −4.6 (95% CI −10.3 to 1.2) for B vs. D, and −7.6 (95% CI −13.2 to −2.0) for C vs. D

A vs. B vs. C vs. D

4 months

SF-12 PCS (norm-based mean=50): 50.5 vs. 51.4 vs. 50.9 vs. 50.0, adjusted difference 0.0 (95% CI −2.4 to 2.3) for A vs. D, −0.8 (95% CI −3.2 to 1.6) for B vs. D, and −1.3 (95% CI −3.6 to 1.1) for C vs. D

SF-12 MCS (norm-based mean=50): 52.8 vs. 50.8 vs. 51.3 vs. 51.8, adjusted difference −2.1 (95% CI −4.2 to 0.0) for A vs. D, −0.7 (95% CI −2.8 to 1.3) for B vs. D, and −0.1 (95% CI −2.2 to 2.1) for C vs. D

EuroQoL (0-100): 77.8 vs. 77.0 vs. 74.5 vs. 73.9, difference −2.9 (95% CI −6.9 to 1.0) for A vs. D, −1.4 (95% CI −5.5 to 2.6) for B vs. D, and −1.5 (95% CI −5.8 to 2.7) for C vs. D

10.5 months

SF-12 PCS (norm-based mean=50): 50.8 vs. 52.6 vs. 52.5 vs. 50.7, adjusted difference −0.3 (95% CI −2.1 to 2.7) for A vs. D, −1.4 (95% CI −4.0 to 1.2) for B vs. D, and −2.2 (95% CI −4.5 to 0.2) for C vs. D

SF-12 MCS (norm-based mean=50): 50.4 vs. 50.6 vs. 50.4 vs. 51.3, adjusted difference −0.2 (95% CI −2.7 to 2.3) for A vs. D, −1.1 (95% CI −3.7 to 1.6) for B vs. D, and 0.3 (95% CI −2.3 to 2.9) for C vs. D

EuroQoL (0-100): 77.1 vs. 77.3 vs. 77.2 vs. 74.8, adjusted difference −1.3 (95% CI −5.4 to 2.7) for A vs. D, −0.9 (95% CI −4.9 to 3.1) for B vs. D, and −3.3 (95% CI −7.2 to 0.5) for C vs. D

Hondras, 2009¹⁷²

4.5 months

Duration of pain: Mean 9 to 13 years

Fair

A. Standard manipulation (n=96), 12 sessions over 6 weeks

B. Flexion distraction manipulation (n=95), 12 sessions over 6 weeks

C. Usual care (n=49)

A vs. B vs. C

Age: 64 vs. 62 vs. 63 years

Female: 45% vs. 44% vs. 41%

Baseline RDQ (0-24), mean: 6.5 vs. 6.6 vs. 5.7

Baseline pain (0-100 VAS): 42.1 (23.6) vs. 42.5 (25.2) vs. 42.4 (24.5)

1.5 months

RDQ (0-24): adjusted difference −1.5 (95% CI −3.1 to 0.1) for A vs. C and −2.2 (95% CI −3.7 to −0.6) for B vs. C

Global improvement from baseline (1-10): adjusted difference 1.3 (95% CI 0.2 to 2.3) for A vs. C and 1.6 (95% CI 0.5 to 2.7) for B vs. C

4.5 months

RDQ (0-24): adjusted difference −1.3 (95% CI −2.9 to 0.6) for A vs. C and −1.9 (95% CI −3.6 to −0.2) for B vs. C

Global improvement from baseline (1-10): adjusted difference 1.7 (95% CI 0.5 to 2.8) for A vs. C and 1.8 (95% CI 0.6 to 3.0) for B vs. C

Senna, 2011¹⁷³

9 months

Duration of pain: 18-19 months

Poor

A. Standard manipulation (n=25), 12 sessions over 4 weeks

B. Standard manipulation maintained (n=26), 12 sessions over 4 weeks, plus every 2 weeks for 9 months

C. Sham manipulation (n=37)

A vs. B vs. C

Age: 40 vs. 42 vs. 42 years

Female: 27% vs. 24% vs. 24%

Baseline function (0-100 ODI): 39 vs. 40 vs. 38

Baseline pain (0-100 VAS): 42 vs. 43 vs. 41

A vs. B vs. C

3 months

ODI (0-100): 29.8 vs. 23.1 vs. 33.5; p>0.05

Pain (0-100 VAS): 35.2 vs. 25.9 vs. 35.2; p>0.05

6 months

ODI (0-100): 32.2 vs. 22.4 vs. 35.3; p>0.05

Pain (0-100 VAS): 35.5 vs. 25.4 vs. 36.8; p>0.05

9 months

ODI (0-100): 34.9 vs. 20.6 vs. 37.4

Pain (0-100 VAS): 38.5 vs. 23.5 vs. 38.3

A vs. B vs. C

3 months

SF-36, total (0-100): 29.2 vs. 32.8 vs. 26.4; p>0.05

6 months

SF-36, total (0-100): 27.8 vs. 33.1 vs. 26.1; p>0.05

9 months

SF-36, total (0-100): 27.6 vs. 33.70 vs. 25.9; p>0.05

UK BEAM Trial Team, 2004¹⁷⁴

9 months

Duration of pain: >3 months in 59%

Fair

A. Standard manipulation (n=353), 8 sessions over 12 weeks

B. Usual care (n=338)

C. Exercise (n=310)

A vs. B vs. C

Age: 42 vs. 42 vs. 44

Female: 63% vs. 53% vs. 55%

Baseline RDQ (0-24): 8.9 and 8.9 vs. 9.0 vs. 9.2

Baseline Von Korff Pain (0-100): 61.4 and 61.6 vs. 60.5 vs. 60.8

A vs. B

9 months

RDQ (0-24): 5.15 vs. 6.16, adjusted difference −1.01 (95% CI −1.81 to −0.22)

Von Korff Disability (0-100): 29.85 vs. 35.50, adjusted difference −5.65 (95% CI −9.72 to −1.57)

Von Korff Pain (0-100): 41.68 vs. 47.56, adjusted difference −5.87 (95% CI −10.17 to −1.58)

A vs. C

9 months

RDQ (0-24): 5.15 (0.29) vs. 5.74 (0.31)

Von Korff Disability (0-100): 29.85 (1.50) vs. 29.73 (1.68)

Von Korff Pain (0-100): 41.68 (1.58) vs. 41.54 (1.84)

A vs. B

9 months

SF-36 PCS (0-100): 44.18 vs. 42.50, adjusted difference 1.68 (95% CI 0.18 to 3.19)

SF-36 MCS (0-100): 48.09 vs. 46.41, adjusted difference 1.68 (95% CI −0.21 to 3.57)

A vs. C

9 months

SF-36 PCS (0-100): 44.18 (0.55) vs. 44.39 (0.63)

SF-36 MCS (0-100): 48.09 (0.69) vs. 46.77 (0.81)

Abbreviations: CI = confidence interval; MCS = Mental Component Summary; NR = not reported; NRS = Numeric Rating Scale; ODI = Oswestry Disability Index; OTC = over-the-counter; PCS = Physical Component Score; RDQ = Roland-Morris Disability Questionnaire; SF-12 = Short-Form 12 Questionnaire; SF-36 = Short-Form 36 Questionnaire; VAS = visual analog scale
a: Unless otherwise noted, followup time is calculated from the end of the treatment period

Table 13. Chronic low back pain: manual therapies (massage)

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Ajimsha, 2014¹⁷⁵

1 month

Duration of pain: 2.3 vs. 2.25 years

Fair

A. Myofascial release (n=38)

24 sessions, 3 sessions/week for 8 weeks

B. Sham myofascial release (n=36)

A vs. B

Age: 36 vs. 34 years

Female: 76% vs. 78%

Baseline Quebec Back Disability Scale (0-100): 37.1 vs. 35.3

Baseline pain (0-78 McGill Pain): 23.2 vs. 23.0

A vs. B

1 month

Quebec Back Disability Scale (0-100): 28.7 vs. 32.5, difference −2.02, p<0.005

McGill Pain Questionnaire (0-78): 13.1 vs. 18.3, difference −3.25, p<0.005

Arguisuelas, 2017¹⁷⁹

3 months

Duration of pain: 7.1 to 7.9 years

Fair

A. Myofascial release (n=27): Two 40-minute sessions per week for 2 weeks.

B. Sham myofascial release (n=27)

A vs. B

Age: 47 vs. 46 years

Female: 59% vs. 63%

Baseline RDQ (0-24, MCID = 3 points): 11.1 vs. 11.1

Baseline VAS (0-100, MCID = ≥20 mm): 60.5 vs. 63.3

Baseline SF-MPQ (0-45, MCID = 5 points): 22.3 vs. 23

A vs. B

3 months

RDQ: 8.1 (95% CI 5.4 to 10.9) vs. 11.8 (95% CI 9.1 to 14.5), difference −3.7 (95% CI −7.6 to −0.2), p≤0.05

VAS: 43.0 (95% CI 31.1 to 54.9) vs. 52.0 (95% CI 40.1 to 63.9), difference −9.0 (95% CI −25.8 to 7.9), p≤0.05

SF-MPQ: 15.28 (95% CI 11.1 to 20.6) vs. 23.7 (95% CI 18.9 to 28.4), difference −7.8 (95% CI −14.5 to −1.1), p≤0.05

RDQ improved ≥3 points: 81.8% vs. 25%

Pain (VAS) improved ≥20 points: 50.0% vs. 37.5%

SF-MPQ improved ≥5 points: 59.1% vs. 29.1%

A vs. B

3 months

FABQ: 48.1 (95% CI 38.1 to 58.1) vs. 61.6 (95% CI 51.7 to 71.6), difference −13.5 (95% CI −27.6 to 0.5), p≤0.05

Cherkin, 2001¹⁷⁶

10.5 months

Duration of pain >1 year: 64% vs. 62%

Fair

A. Mixed massage (including Swedish) (n=78), up to 10 sessions over 10 weeks

B. Attention control (self-care education) (n=90)

A vs. B

Age: 46 vs. 44 years

Female: 69% vs. 56%

Baseline modified RDQ (0-23): 11.8 vs. 12.0

Baseline symptom bothersomeness (0-10): 6.2 vs. 6.1

A vs. B

10.5 months

Modified RDQ (0-23): 6.8 vs. 6.4, p=0.03

Symptom bothersomeness (0-10): 3.2 vs. 3.8, p=0.003

A vs. B

10.5 months

Low back pain medication: 2.5 vs. 4.0, p=0.69

SF-12 Mental Component Score: no differences, data not shown

Cherkin, 2011¹⁷⁷

9.5 months

Duration of pain ≥1 year: 77% vs. 72% vs. 78%

Fair

A. Structural massage (n=132): (myofascial, neuromuscular, and other soft-tissue techniques) 10 sessions for 10 weeks

B. Relaxation massage (n=136): 10 sessions for 10 weeks

C. Usual care (n=133)

A vs. B vs. C

Age: 46 vs. 47 vs. 48 years

Female: 66% vs. 65% vs. 62%

Symptom bothersomeness (0-10): 5.6 vs. 5.6 vs. 5.8

Modified RDQ (0-23): 10.1 vs. 11.6 vs. 10.5

A vs. B vs. C

9.5 months

Symptom bothersomeness (0-10): 4.6 (95% CI 4.2 to 5.0) vs. 3.9 (95% CI 3.5 to 4.3) vs. 4.2 (95% CI 3.8 to 4.6)

Modified RDQ (0-23): 7.2 (95% CI 6.4 to 7.9) vs. 6.0 (95% CI 5.2 to 6.9) vs. 7.4 (95% CI 6.6 to 8.3), adjusted difference −0.3 (95% CI −1.4 to 0.9) for A vs. C and −1.4 (95% CI −2.6 to −0.2) for B vs. C

A vs. B vs. C

9.5 months

SF-12 Mental (0-100): 52.4 (95% CI 50.9 to 53.8) vs. 53.5 (95% CI 52.2 to 54.8) vs. 51.9 (95% CI 50.2 to 53.6)

SF-12 Physical (0-100): 37.7 (95% CI 36.8 to 38.7) vs. 37.9 (95% CI 37.0 to 38.7) vs. 37.7 (95% CI 36.8 to 38.6)

Opioid use in last week for LBP: 4.8% (95% CI 3.1 to 7.3) vs. 4.9% (95% CI 3.1 to 7.9) vs. 4.9% (95% CI 2.7 to 8.7)

Global rating of improvement “much better” or “gone”: 26.1% (95% CI 19.8 to 34.6) vs. 36.2% (95% CI 29.1 to 45.0) vs. 20.5% (95% CI 14.5 to 29.0), RR 1.3 (95% CI 0.8 to 2.0) for A vs. C and RR 1.8 (95% CI 1.2 to 2.6) for B vs. C

Healthcare costs (median): $38 (range $0 to $1443) vs. $78 (range $0 to $3,764) vs. $25 (range $0 to $8,082)

Little, 2008¹⁸⁹

11.5 months

Duration of pain: NR

Fair

A. Mixed massage (including Swedish) (n=75), 6 sessions over 6 weeks

B. Usual care (n=72)

C. Exercise (regular exercise) (n=72), 5 times per week

Age: 45-46 years

Female: 64-78%

Baseline RDQ (0-24): 10.8-11.3

Baseline Deyo troublesomeness (1-5): 3.3-3.4

A vs. B

10.5 months

RDQ (0-24): NR vs. 9.23 (5.3), difference −0.45 (95% CI −2.3 to 1.39)

Von Korff disability (0-10): NR vs. 3.32 (2.25), difference 0.46 (95% CI −0.43 to 1.35)

Von Korff pain (0-10): NR vs. 4.74 (2.20), difference 0.29 (95% CI −0.58 to 1.16)

A vs. C

10.5 months

RDQ: −0.45 (−2.3 to 1.39) vs. −1.65 (−3.62 to 0.31)

Von Korff disability: 0.46 (−0.43 to 1.35) vs. 0.05 (−0.92 to 1.02)

Von Korff pain: 0.29 (−0.58 to 1.16) vs. −0.31 (−1.26 to 0.63)

A vs. B

10.5 months

Von Korff overall (0-10): NR vs. 4.19, difference 0.31 (95% CI −0.52 to 1.14)

SF-36 PCS (0-100): NR vs. 56.1 (18.6), difference −1.45 (95% CI −9.04 to 6.15)

SF-36 MCS (0-100): NR vs. 64.8 (17.5), difference −2.11 (95% CI −9.37 to 5.16)

Deyo troublesomeness scale (1-5): NR vs. 3.05 (0.80), difference 0.04 (−0.25 to 0.33)

A vs. C

10.5 months

Von Korff overall: 0.31 (−0.52 to 1.14) vs. −0.19 (−1.09 to 0.72)

SF-36 Physical Component Score: −1.45 (−9.04 to 6.15) vs. −2.08 (−10.6 to 6.40)

SF-36 Mental Component Score: −2.11 (−9.37 to 5.16) vs. 0.72 (−7.38 to 8.81)

Deyo troublesomeness scale: 0.04 (−0.25 to 0.33) vs. −0.21 (−0.52 to 0.09)

Movahedi, 2017¹⁸⁰

1 month

Duration of pain: NR

Poor

A. Acupressure (n=25): Three 14-minute sessions per week for 3 weeks (9 total sessions).

B. Sham acupressure (n=25)

A vs. B

Age: 37 vs. 37 years

Female: 100% vs. 100%

Baseline FSS (9-63): 34.9 (12.3) vs. 34.8 (13.4)

A vs. B

1 month

FSS: 24.3 vs. 36.6, p<0.001; difference −12.2 (95% CI −18.57 to −5.83)

Poole, 2007¹⁰⁸

4.5 months

Duration of pain: 10 vs. 11 vs. 9.5 years

Fair

A. Reflexology (n=77)

6 sessions over 6-8 weeks

B. Usual care (n=75)

A vs. B

Age: 47 vs. 47 years

Female: 62% vs. 51%

Baseline ODI: 33.0 vs. 36.6

Baseline pain (0-100 VAS): 44.5 vs. 40.6

A vs. B

4.5 months

ODI (0-100): 29.0 (20.2) vs. 32.9 (17.6)

Pain (0-100 VAS): 39.8 (29.2) vs. 42.7 (28.4)

A vs. B

4.5 months

Beck Depression Inventory (0-63): 11.6 (10.9) vs. 12.8 (9.2)

SF-36 Physical Functioning: 57.1 (31.8) vs. 52.2 (29.5)

SF-36 Social Functioning: 68.1 (31.8) vs. 61.5 (30.8)

SF-36 Physical Limitations: 48.2 (46.4) vs. 37.8 (42.5)

SF-36 Emotional Limitations: 55.0 (46.5) vs. 62.0 (44.0)

Quinn, 2008¹⁷⁸

1.5 and 3 months

Duration of pain: At least 3 months

Fair

A. Reflexology (pressure massage stimulation) (n=7)

6 sessions over 6 weeks

B. Sham reflexology (n=8)

A vs. B

Age (median): 42 vs. 45

Female: 86% vs. 50%

Baseline RDQ: 5 vs. 7.5

Baseline pain (0-10 VAS): 4.7 vs. 3.4

A vs. B

1.5 months, median (IQR)

RDQ: 4 (3 to 4.5) vs. 4.5 (1 to 7)

Pain (0-10 VAS): 2.1 (1.5 to 4.9) vs. 4.1 (2.7 to 5.1)

McGill Pain Questionnaire (0-77): 11 (6 to 17) vs. 6.5 (5 to 13)

3 months, median (IQR)

RDQ: 4 (2 to 5) vs. 3.5 (1.8 to 4.8)

VAS: 2.2 (1.6 to 3.2) vs. 3.2 (2.6 to 4.6)

McGill Pain Questionnaire (0-77): 6 (4 to 13) vs. 7.5 (3.8 to 9.8)

A vs. B

1.5 months, median (IQR)

SF-36 General health: 52.9 (49 to 54) vs. 42.2 (40 to 51)

SF-36 Physical functioning: 48.6 (47 to 50) vs. 43.4 (40 to 50)

SF-36 Mental health: 47.2 (43 to 56) vs. 47.2 (42 to 53)

3 months, median (IQR)

SF-36 General health: 48.2 (46 to 52) vs. 47.0 (38 to 53)

SF-36 Physical functioning: 50.7 (44 to 51) vs. 45.5 (44 to 50)

SF-36 Mental health: 52.8 (39 to 53) vs. 48.6 (44 to 51)

Abbreviations: CI = confidence interval; FABQ = Fear-Avoidance Beliefs Questionnaire; FSS = Fatigue Severity Scale; IQR = interquartile range; LBP = low back pain; MCID = minimal clinically important difference; MCS = Mental Component Summary; MPQ = McGill Pain Questionnaire; NR = not reported; ODI = Oswestry Disability Index; PCS = Physical Component Summary; RDQ = Roland-Morris Disability Questionnaire; RR = relative risk; SF-12 = Short-Form 12 Questionnaire; SF-36 = Short-Form 36 Questionnaire; VAS = visual analog scale
a: Unless otherwise noted, followup time is calculated from the end of the treatment period

Table 14. Chronic low back pain: mindfulness-based stress reduction

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Banth, 2015¹⁹⁴

1 month

Duration of pain: ≥6 months

Poor

A. Mindfulness-based stress reduction (n=24)

Eight 1.5-hour sessions over 8 weeks

B. Usual care (n=24)

48 of 88 patients were analyzed, n for each group NR

A vs. B (NR)

Age: 40 years

Female: 100%

Baseline function: NR

McGill Pain questionnaire total score (0-45): 26.08 vs. 26.71

A vs. B

1 month

McGill Pain questionnaire total score (0-45): 13.58 vs. 23.60

A vs. B

1 month

SF-12 Mental component (0-100): 31.54 (4.3) vs. 24.29 (5.2)

SF-12 Physical component (0-100): 28.08 (4.2) vs. 21.08 (3.3)

Cherkin, 2016¹⁰⁴

Herman, 2017¹⁹⁶

Cherkin, 2017¹⁹⁵ (2 year data from Cherkin, 2016)

22 months

Duration of pain: >3 months (>1 year in 80% of patients)

Fair

A. Mindfulness-based stress reduction (n=116), 8 2-hour sessions over 8 weeks (optional 6 hour retreat)

B. Usual care (n=113)

A vs. B

Age: 50 vs. 49 years

Female: 61% vs. 77%

Baseline modified RDQ (0-23): 11.8 vs. 10.9

Baseline pain bothersomeness (0-10): 6.1 vs. 6.0

A vs. B

4.5 months

Modified RDQ (0-23), mean change from baseline: −4.33 (95% CI −5.16 to −3.51) vs. −2.96 (95% CI −3.79 to −2.14)

Pain bothersomeness (0-10), mean change from baseline: −1.48 (95% CI −1.86 to −1.11) vs. −0.84 (95% CI −1.21 to −0.46)

≥30% improvement in RDQ: 60.5% (95% CI 52.0 to 70.3) vs. 44.1% (95% CI 35.9 to 54.2)

≥30% improvement in pain bothersomeness: 43.6% (95% CI 35.6 to 53.3) vs. 26.6% (95% CI 19.8 to 35.9)

10 months

Modified RDQ, mean change from baseline: −5.3 (95% CI −6.16 to −4.43) vs. −4.78 (95% CI −5.67 to −3.89) vs. −3.43 (95% CI −4.33 to −2.52)

Pain bothersomeness, mean change from baseline: −1.95 (95% CI −2.32 to −1.59) vs. −1.10 (95% CI −1.48 to −0.71)

≥30% improvement in RDQ: 68.6% (95% CI 60.3 to 78.1) vs. 48.6% (95% CI 40.3 to 58.6)

≥30% improvement in pain bothersomeness: 48.5% (95% CI 40.3 to 58.3) vs. 31.0% (95% CI 23.8 to 40.3)

22 months

Modified RDQ (0-23): −4.09 (95% CI −5.08 to −3.10) vs. −2.74 (95% CI −3.81 to −1.68)

≥30% improvement in modified RDQ: 55.4% (95% CI 46.9 to 65.5) vs. 42.05% (95% CI 33.8 to 52.2)

Pain bothersomeness: −1.57 (95% CI −1.97 to −1.17) vs. −1.25 (95% CI −1.69 to −0.81)

≥30% improvement in pain bothersomeness: 41.2% (95% CI 33.2 to 51.0) vs. 31.1% (95% CI 23.9 to 40.5)

A vs. B

4.5 months

SF-12 MCS, mean change from baseline (0-100): 0.45 (95% CI −0.85 to 1.76) vs. 2.13 (95% CI 0.86 to 3.40) vs. −1.11 (95% CI −2.39 to 0.17)

SF-12 PCS, mean change from baseline (0-100): 3.58 (95% CI 2.15 to 5.01) vs. 3.27 (95% CI 2.09 to 4.44)

Used medications for LBP: 43.4% (95% CI 35.9 to 52.6) vs. 54.2% (95% CI 46.2 to 63.6)

10 months

SF-12 MCS, mean change from baseline: 2.01 (95% CI 0.74 to 3.28) vs. 0.75 (95% CI −0.58 to 2.08)

SF-12 PCS, mean change from baseline: 3.87 (95% CI 2.55 to 5.19) vs. 2.93 (95% CI 1.70 to 4.16)

Used medications for LBP: 46.8% (95% CI 39.2 to 55.9) vs. 52.9% (95% CI 45.1 to 62.0)

Total costs: $5,580 (95% CI $3,465, $8,343) vs. $6,304 (95% CI $4,193, $9,805)

Morone, 2009¹⁹⁸

4 months

Duration of pain: Mean 9.4 to 11 years

Fair

A. Mindfulness-based stress reduction (n=16), 8 1.5-hour sessions over 8 weeks

B. Attention control (education) (n=19)

A vs. B

Age: 78 vs. 73 years

Female: 69% vs. 58%

Baseline RDQ: 8.8 vs. 11.3

Baseline McGill Pain Questionnaire Current Pain (0-10): 2.9 vs. 4.4

A vs. B

4 months

RDQ: 7.6 (95% CI 6.2 to 8.7) vs. 10.0 (95% CI 8.7 to 11.2)

McGill Pain Questionnaire Total Score (0-45): 12.4 (95% CI 10.4 to 14.6) vs. 12.0 (95% CI 10.2 to 13.7)

McGill Pain Questionnaire Current Pain (0-10): 2.3 (95% CI 1.6 to 2.8) vs. 3.7 (95% CI 3.1 to 4.3)

A vs. B

4 months

SF-36 Pain Score (10-62): 41.4 (95% CI 39.8 to 43.1) vs. 40.5 (95% CI 38.7 to 42.2)

Morone, 2016¹⁹⁷

4.5 months

Duration of pain: Mean 11 years

Fair

A. Mindfulness-based stress reduction (n=140), 8 1.5-hour sessions over 8 weeks, with 6 monthly booster sessions

B. Control (health education) (n=142)

A vs. B

Age: 75 vs. 74 years

Female: 66% vs. 66%

Baseline RDQ (0-24): 15.6 vs. 15.4

Baseline Pain (0-20 NRS): 11.0 vs. 10.5

A vs. B

4.5 months

RDQ: 12.2 vs. 12.6, adjusted difference −0.4 (95% CI −1.5 to 0.7)

RDQ improved ≥2.5 points: 49.2% (58/117) vs. 48.9% (66/135), p=0.97

Pain (0-20 NRS): 9.5 vs. 10.6, adjusted difference −1.1 (95% CI −2.2 to −0.01)

Pain improved ≥30%: 36.7% (43/117) vs. 26.7% (36/135), p=0.09

A vs. B

4.5 months

SF-36 Global Health Composite (9-67): 42.4 vs. 41.2, adjusted difference 0.2 (95% CI −1.9 to 2.4)

SF-36 Physical Health Composite (20 to 65): 41.2 vs. 41.2, adjusted difference −0.1 (95% CI −1.9 to 1.8)

Zgierska, 2016¹⁹⁹

4.5 months

Duration of pain: Mean 14 years

Poor

A. Mindfulness-based stress reduction (n=21): 8 weekly 2 hour group sessions plus 30 minutes/day, 6 days/week of at home practice

B. Usual care (n=14)

Overall

Age: 51.8 years

Female: 80%

Baseline ODI (0-100): 68.1 vs. 64.5

Baseline Brief Pain Inventory pain intensity (0-10): 6.3 vs. 4.9

Baseline opioid dose (mg morphine equivalents): 166.9 vs. 120.3

A vs. B

4.5 months

ODI (0-100): −5.0 (95% CI −9.7 to 0.2) vs. 1.6 (95% CI −4.3 to 7.4)

Brief Pain Inventory pain intensity: −0.5 (95% CI −1.1 to 0.02) vs. 0.5 (95% CI 0.2 to 1.2)

A vs. B

4.5 months

Opioid dose (mg morphine equivalents): −10.1 (95% CI −35.5 to 15.2) vs. −0.2 (95% CI −31.4 to 30.9)

: CI = confidence interval; MCS = Mental Component Summary; NR = not reported; NRS = numeric rating scale; ODI = Oswestry Disability Index; PCS = Physical Component Summary; RDQ = Roland-Morris Disability Questionnaire; SF-12 = Short-Form 12 Questionnaire; SF-36 = Short-Form 36 Questionnaire
a: Unless otherwise noted, followup time is calculated from the end of the treatment period

Table 15. Chronic low back pain: mind-body practices (yoga)

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Bramberg, 2017³⁷

4.2 months

Duration of pain: NR

Fair

A. Kundalini yoga (n=52): Two 60-minute yoga classes per week for 6 weeks (12 total yoga classes).

B. Strength training (n=52): Five 60-minute supervised strength-training sessions over 6 weeks.

C. Attention control (self-care advice) (n=55)

A vs. B vs. C

Age: 47 vs. 46 vs. 44 years

Female: 72% vs. 62% vs. 80%

Baseline CPGS-BD (0-100): 37.2 vs. 37.6 vs. 38.6

Baseline CPGS-BP (0-100): 57.1 vs. 57.7 vs. 55.6

A vs. C

6 months

CPGS-BD: 29.4 vs. 32.8, adjusted difference −6.0 (95% CI −15.6 to 3.6), p>0.05

CPGS-BP: 47.0 vs. 50.2, adjusted difference −6.5 (95% CI −14.9 to 1.8), p>0.05

B vs. C

6 months

CPGS-BD: 24.8 vs. 32.8, adjusted difference −9.5 (95% CI −19.3 to 0.4), p>0.05

CPGS-BP: 41.7 vs. 50.2, adjusted difference −9.4 (95% CI −18.1 to −0.8), p<0.05

A vs. B

6 months

CPGS-BD: 29.4 vs. 24.8, adjusted difference −3.5 (95% CI −12.2 to 5.3), p>0.05

CPGS-BP: 47.0 vs. 41.7, adjusted difference −2.9 (95% CI −10.9 to 5.1), p>0.05

Work absence (mean days over time period)^b

A vs. C

1 to 4 months: 4.1 vs. 8.9, difference −4.8 (95% CI −11.4 to 1.8)

5 to 8 months: 4.0 vs. 12.0, difference −8.0 (95% CI −15.8 to −0.2)

9 to 12 months: 3.6 vs. 9.2, difference −5.6 (95% CI −12.7 to 1.5); Proportion absent ≥1 time (%): RR 0.82 (95% CI 0.6 to 1.1)

B vs. C

1 to 4 months: 5.0 vs. 8.9, difference −3.9 (95% CI −11.4 to 3.6)

5 to 8 months: 6.4 vs. 12.5, difference −6.1 (95% CI −15.7 to 3.5)

9 to 12 months: 9.5 vs. 9.2, difference 0.3 (95% CI −10.3 to 10.9); Proportion absent ≥1 time (%): RR 0.95 (95% CI 0.73 to 1.22)

A vs. B

1 to 4 months: 4.1 vs. 5.0, difference −0.9 (95% CI −4.7 to 2.8)

5 to 8 months: 4.0 vs. 6.4, difference −2.4 (95% CI −7.5 to 2.7)

9 to 12 months: 3.6 vs. 9.5, difference −5.9 (95% CI −12.7 to 0.9); ≥1 day: 50% vs. 51%; Proportion absent ≥1 time (%): RR 0.86 (95% CI 0.65 to 1.14)

Groessl, 2017²⁰⁴

3.5 months

Duration of pain: >6 months

Fair

A. Hatha yoga (n=75): Two sessions per week for 12 weeks, 15–20 minutes of home practice on days without sessions

B. Wait list (n=75): Usual care, with yoga started after 6 months

A vs. B

Age: 53 vs. 54 years

Female: 27% vs. 25%

Baseline RDQ (0-24): 9.40 vs. 10.3

Baseline pain (0-10 Brief Pain Inventory): 4.64 vs. 4.68

A vs. B

3.5 months

RDQ (0-24): −3.37 (95% CI −4.51 to −2.23) vs. −0.89 (95% CI −2.02 to 0.23); between-group difference −2.48 (95% CI −4.08 to −0.87)

Pain intensity, Brief Pain Inventory (0-10): −0.44 (95% CI −0.78 to −0.11) vs. 0.15 (95% CI −0.18 to 0.47); between-group difference −0.59 (95% CI −1.05 to −0.13)

A vs. B

3.5 months

Opioid medication use: 9% vs. 7%, p=0.40

Other medical treatments for pain: 39% vs. 37%, p=0.42

Highland, 2017²¹¹

3 and 6 months

Duration of pain: >3 months

Fair

A. Restorative Exercise and Strength Training for Operational Resilience and Excellence (RESTORE) Yoga Program (n=34): Participants completed 2 individual yoga sessions per week in weeks 1 to 4 and then once-weekly sessions in weeks 5 to 8.

B. Usual care (n=34)

A + B

Age: 44 years

Female: 63%

A vs. B

Baseline PROMIS-PF (0-100; MCID = 3 point change): 40.67 vs. 42.03

Baseline RDQ (0-24; MCID = 30% reduction): 9.21 vs. 8.68

Baseline DVPRS (0-10; MCID = 2 point change or 30% reduction): 4.68 vs. 4.32

A vs. B

3 months

PROMIS-PF: 47.34 vs. 43.38, difference 3.96 (95% CI 0.93 to 6.99)

RDQ: 4.43 vs. 7.04, difference −2.61 (95% CI −4.83 to −0.34)

DVPRS: 2.75 vs. 3.35, difference −0.6 (95% CI −1.63 to 0.43)

PROMIS-SB: 49.04 vs. 52.04, difference −3.0 (95% CI −5.82 to −0.18)

6 months

PROMIS-PF: 47.05 vs. 44.06, difference 2.99 (95% CI −0.57 to 6.55)

RDQ: 3.25 vs. 6.52, difference −3.27 (95% CI −5.39 to −1.15)

DVPRS: 2.79 vs. 2.86, difference −0.07 (95% CI −1.13 to 0.99)

PROMIS-SB: 48.11 vs. 51.80, difference −3.69 (95% CI −7.27 to −0.11)

Proportion of patients achieving an MCID (at 6 months)^c

PROMIS-PF: 35% (8/23) vs. 20% (4/20); adjusted p=1.0

RDQ: 79% (19/24) vs. 52% (11/21), adjusted p=0.66

DVPRS: 63% (15/24) vs. 48% (10/21), adjusted p=1.0

A vs. B

3 months

PROMIS-SB: 49.04 vs. 52.04, difference −3.0 (95% CI −5.82 to −0.18)

6 months

PROMIS-SB: 48.11 vs. 51.80, difference −3.69 (95% CI −7.27 to −0.11)

Proportion of patients achieving an MCID (at 6 months)^c

PROMIS-SB (MCID = 5 point change): 83% (19/23) vs. 40% (8/20), adjusted p=0.03

Nambi, 2014²²⁰

5.5 months

Duration of pain: >3 months

Fair

A. Iyengar yoga (29 poses) (n=30)

5 sessions a week for 4 weeks

B. Exercise (stretching exercises for soft tissue flexibility and range of motion) (n=30)

A vs. B

Age: 44 vs. 43 years

Female: 63% vs. 43%

Baseline function, Physically unhealthy days: 18.0 vs. 17.8

Baseline pain (0-10 VAS): 6.7 vs. 6.7

A vs. B

5 months

Physically unhealthy days: 2.6 vs. 6.9, p=0.001

Pain (0-10 VAS): 1.8 vs. 3.8, p=0.001

A vs. B

5 months

Mentally unhealthy days: 2.1 vs. 5.0, p=0.001

Activity limitation (days): 2.0 vs. 5.0, p=0.001

Saper, 2017²⁰⁵

10 months

Duration of pain: >3 months

Fair

A. Hatha yoga (n=127)

12 sessions over 12 weeks, with or without ongoing weekly maintenance sessions

B. Exercise (n=129)

C. Attention control (education) (n=64)

A vs. B vs. C

Age: 46 vs. 46 vs. 44

Female: 57% vs. 70% vs. 66%

Baseline modified RDQ: 13.9 vs. 15.6 vs. 15.0

Baseline pain (0-10 NRS): 7.1 vs. 7.2 vs. 7.0

A1 (no maintenance) vs. A2 (maintenance) vs. C, mean

3.5 months

Modified RDQ (0-23): 10.1 vs. 9.5 vs. 11.6

Pain (0-10 NRS): 4.3 vs. 4.6 vs. 5.5

9 months

Modified RDQ (0-23): 9.2 vs. 8.9 vs. 11.1

Pain (0-10 NRS): 4.3 vs. 4.4 vs. 5.2

A1 vs. A2 vs. B1 vs. B2

3.5 months

Modified RDQ (0-23): 10.1 vs. 9.5 vs. 10.4 vs. 10.1

Pain (0-10 NRS): 4.3 vs. 4.6 vs. 4.7 vs. 4.8

9 months

Modified RDQ (0-23): 9.2 vs. 8.9 vs. 8.9 vs. 9.4

Pain (0-10 NRS): 4.3 vs. 4.4 vs. 4.0 vs. 4.1

Sherman, 2005²⁰⁶

3.5 months

Duration of pain: 3 to 15 months

Fair

A. Viniyoga (n=36)

12 sessions, 1 session/week for 12 weeks

B. Exercise (n=35)

C. Attention control (self-care advice) (n=30)

A vs. B vs. C

Age: 44 vs. 42 vs. 45

Female: 69% vs. 63% vs. 67%

Baseline RDQ: 8.1 vs. 9.0 vs. 8.0

Baseline symptom bothersomeness (0-10): 5.4 vs. 5.7 vs. 5.4

A vs. B

3.5 months

Modified RDQ (0-23): 3 vs. 5 (estimated from graph), adjusted difference −1.5 (−3.2 to 0.2)^d

Reduction in RDQ score ≥50%: 69% vs. 50%, RR 1.4 (95% CI 0.91 to 2.1)

Bothersomeness: 1.8 vs. 3.3 (estimated from graph), adjusted difference −1.4 (95% CI −2.5 to −0.2)^d

Medication use: 21% vs. 50%, RR 0.41 (95% CI 0.20 to 0.87)

A vs. C

3.5 months

Symptom bothersomeness (0-10): 1.8 vs. 4.1, adjusted difference −2.2 (95% CI −3.2 to −1.2)

Modified RDQ (0-23): 3 vs. 7, adjusted difference −3.6 (95% CI −5.4 to −1.8)

Reduction in RDQ ≥50%: 69% vs. 30%, RR 2.3 (95% CI 1.3 to 4.2)

A vs. B

3.5 months

Medication use: 21% vs. 59%, RR 0.35 (95% CI 0.15 to 0.73)

SF-36: No significant differences (data not provided)

Sherman, 2011²⁰⁷

3.5 months

Duration of pain: 3 to 6 months

Fair

A. Viniyoga (n=92)

12 sessions, 1 session/week for 12 weeks

B. Exercise (n=91)

C. Attention control (self-care advice) (n=30)

A vs. B vs. C

Age: 47 vs. 49 vs. 50

Female: 67% vs. 63% vs. 60%

Baseline RDQ: 9.8 vs. 8.6 vs. 9.0

Baseline symptom bothersomeness (0-10): 4.9 vs. 4.5 vs. 4.7

A vs. B

3.5 months

Modified RDQ (0-23): 4.49 (95% CI 3.51 to 5.48) vs. 4.26 (95% CI 3.30 to 5.22), adjusted difference −0.35 (95% CI −1.52 to 0.83)^d

Reduction in RDQ score ≥50%: 60% vs. 51%, RR 1.17 (95% CI 0.88 to 1.54)

Symptom bothersomeness (0-10): 3.59 (95% CI 3.12 to 4.06) vs. 3.34 (95% CI 2.86 to 3.81)

A vs. C

3.5 months

Modified RDQ (0-23): 4.49 vs. 5.73, adjusted difference −1.81 (95% CI −3.12 to −0.50)^d

Reduction in RDQ score ≥50%: 60% vs. 31%, RR 1.90 (95% CI 1.21 to 2.99)

Symptom bothersomeness (0-10): 3.59 (95% CI 3.12 to 4.06) vs. 3.80 (95% CI 3.14 to 4.46)

A vs. B

3.5 months

LBP better, much better, or completely gone: 51% vs. 51%, RR 1.00 (95% CI 0.75 to 1.34)

A vs. C

LBP better, much better, or completely gone: 51% vs. 20%, RR 2.57 (95% CI 1.39 to 4.78)

Tilbrook, 2011²⁰⁸

3 and 6 months

Duration of pain: 96 vs. 72 months

Fair

A. Iyengar yoga (n=152)

12 sessions, 1 session/week for 12 weeks

B. Attention control (self-care advice) (n=147)

A vs. B

Age: 46 vs. 46

Female: 68% vs. 73%

Baseline RDQ (0-24): 7.84 vs. 7.75

Baseline Aberdeen Back Pain Scale (0-100): 25.36 vs. 26.69

A vs. B

Difference in change from baseline (95% CI)

3 months

RDQ (0-24): −1.48 (−2.62 to −0.33)

Aberdeen Back Pain Scale (0 to 100): −1.74 (−4.32 to 0.84)

6 months

RDQ (0-24): −1.57 (−2.71 to −0.42)

Aberdeen Back Pain Scale: −0.73 (−3.30 to 1.84)

A vs. B

Difference in change from baseline (95% CI)

3 months

SF-12 PCS (0-100): 1.24 (−0.83 to 3.33)

SF-12 MCS (0-100): 2.02 (−0.34 to 4.37)

6 months

SF-12 PCS: 0.80 (−1.28 to 2.87)

SF-12 MCS: 0.42 (−1.92 to 2.77)

Williams, 2005²¹⁰

3 months

Duration of pain: 11.3 vs. 11.0 years

Fair

A. Iyengar yoga (n=30), 16 sessions, 1 session/week for 16 weeks

B. Attention control (education) (n=30)

A vs. B

Age: 49 vs. 48

Female: 65% vs. 70%

Pain Disability Index (7-70): 14.3 vs. 21.2

Pain intensity, McGill Pain Questionnaire (0-10 VAS): 2.3 vs. 3.2

A vs. B

3 months

Pain Disability Index (7-70): 3.9 vs. 12.7, p=0.009

Pain, McGill Pain Questionnaire (0-10 VAS): 0.6 vs. 2.0, p=0.039

Present Pain Index (0-5): 0.5 vs. 1.1, p=0.013

A vs. B

3 months

Stopped or decreased medication use: 50% vs. 33%, p=0.007

Williams, 2009²⁰⁹

6 months

Duration of pain: 47 vs. 78 months

Fair

A. Iyengar yoga (n=43)

48 sessions for 24 weeks

B. Waitlist (standard medical care) (n=47)

A vs. B

Age: 48 vs. 48 years

Female: 74% vs. 79%

Oswestry Disability Index (0-100): 25.2 vs. 23.1

Pain (0-100 VAS): 41.9 vs. 41.2

A vs. B

6 months

Oswestry Disability Index (0-100): 19.3 vs. 23.5, p=0.001

Pain (0-100 VAS): 22.2 vs. 38.3, p=0.0009

A vs. B

6 months

Beck Depression Inventory (0-63): 4.6 vs. 7.8, p=0.0004

: CI = confidence interval; CPGS = Von Korff Chronic Pain Grade Score; DVPRS = Defense and Veterans Pain Rating Scale; LBP = low back pain; MCID = minimal clinically important difference; NR = not reported; NRS = numeric rating scale; ODI = Oswestry Disability Index; PROMIS-PF = Patient-Reported Outcomes Measurement Information System Physical Functioning; PROMIS-SB = Patient-Reported Outcomes Measurement Information System Symptom Burden; RDQ = Roland-Morris Disability Questionnaire; RR = relative risk; SF-36 = Short-Form 36 Questionnaire; VAS = visual analog scale
a: Unless otherwise noted, followup time is calculated from the end of the treatment period.
b: For missed work days: time period 1 (months 1-4), time period 2 (months 5-8) and time period 3 (months 9-12).
c: n/N not reported; calculated from Table 3 of article.
d: Adjusted for baseline scores.

Table 16. Chronic low back pain: mind-body practices (qigong)

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Blodt, 2015²¹⁹

3 and 9 months

Duration of pain: Mean 3 years

Fair

A. Qigong (movement exercises and exercise to change “qi”) (n=64)

12 sessions over 12 weeks

B. Exercise (strengthening, stretching and relaxation exercises) (n=63)

A vs. B

Age (mean): 46 vs. 48 years

Female: 91% vs. 70%

Baseline RDQ: 6.2 vs. 5.7

Baseline pain (0-100 VAS): 55.6 vs. 52.1

A vs. B

3 months

RDQ (0-24): 4.1 vs. 3.1, difference 0.9 (95% CI −0.1 to 2.0)

Average low back pain (0-100 VAS): 35.1 vs. 27.4, difference 7.7 (95% CI 0.7 to 14.7)

9 months

RDQ: 4.3 vs. 3.1, difference 1.2 (95% CI 0.1 to 2.3)

Average low back pain (0-100 VAS): 35.9 vs. 28.8, difference 7.1 (95% CI −1.0 to 15.2)

A vs. B

3 months

SF-36 Bodily pain (0-100): 43.0 vs. 44.6, difference 1.5 (95% CI −1.2 to 4.2)

SF-36 Physical component score: 45.8 vs. 46.6, difference −0.8 (95% CI −3.4 to 1.9)

SF-36 Mental component score: 45.4 vs. 46.6, difference −1.2 (95% CI −4.9 to 2.4)

Quality of sleep (0-10): 4.6 vs. 4.5, difference 0.0 (95% CI −0.9 to 1.0)

Sleep satisfaction (0-10): 5.0 vs. 4.8, difference 0.3 (95% CI −0.6 to 1.1)

9 months

SF-36 Bodily pain: 41.4 vs. 43.4, difference −2.0 (95% CI −5.4 to 1.4)

SF-36 Physical component score: 44.8 vs. 46.5, difference −1.8 (95% CI −4.9 to 1.3)

SF-36 Mental component score: 45.0 vs. 45.5, difference −0.5 (95% CI −4.6 to 3.6)

Quality of sleep: 4.5 vs. 4.7, difference −0.2 (95% CI −1.0 to 0.7)

Sleep satisfaction: 5.1 vs. 5.1, difference −0.1 (95% CI −0.9 to 0.8)

: CI = confidence interval; RDQ = Roland-Morris Disability Questionnaire; SF-36 = Short-Form 36 Questionnaire; VAS = visual analog scale
a: Unless otherwise noted, followup time is calculated from the end of the treatment period

Table 17. Chronic low back pain: acupuncture

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Brinkhaus, 2006a²²⁴

4 and 10 months

Duration of pain: 14.7 vs. 13.6 years

Good

A. Needle acupuncture to body acupoints (n=140), 12 sessions over 8 weeks

B. Sham acupuncture (n=70)

A vs. B

Age: 59 vs. 58 years

Female: 64% vs. 75%

Baseline Functional (FFbH-R) score: 57.1 vs. 57.2

Baseline pain (0-100 VAS): 63 vs. 66

Baseline Pain Disability Index (0-70): 28.9 vs. 31.5

A vs. B

4 months

Functional (0-100 FFbH-R, higher scores indicate better function): 66.0 vs. 64.1, difference 1.9 (95% CI −4.2 to 8.0)

Number of days with limited function in past 6 months: 40.9 vs. 59.5, difference −18.6 (95% CI −33.3 to −3.9)

Pain (0-100 VAS): 38.4 vs. 42.1, difference −3.8 (95% CI −12.4 to 4.9)

Pain Disability Index (0-70): 19.3 vs. 21.4, difference −2.1 (95% CI −6.3 to 2.1)

10 months

Functional (0-100 FFbH-R): 66.0 vs. 63.1, difference 2.9 (95% CI −3.2 to 9.0)

Number of days with limited function in past 6 months: 42.4 vs. 52.9, difference −10.5 (95% CI −27.0 to 6.1)

Pain (0-100 VAS): 39.2 vs. 44.9, difference −5.7 (95% CI −14.4 to 3.0)

Pain Disability Index: 19.0 vs. 23.0, difference −4.0 (95% CI −8.1 to 0.1)

A vs. B

4 months

SF-36 bodily pain subscale (0-100): 53.6 vs. 49.6, difference 3.9 (95% CI −2.7 to 10.7)

SF-36 PCS (0-100): 39.3 vs. 37.6, difference 1.7 (95% CI −1.3 to 4.7)

SF-36 MCS (0-100): 49.9 vs. 46.8, difference 3.1 (95% CI −0.5 to 6.6)

Allgemeine Depressionsskala (ADS, t standard): 49.7 vs. 50.3, difference −0.6 (95% CI −2.5 to 3.7)

10 months

SF-36 bodily pain subscale: 52.4 vs. 44.0, difference 8.5 (95% CI 1.7 to 15.2)

SF-36 PCS: 38.9 vs. 36.1, difference 2.8 (95% CI −0.2 to 5.7)

SF-36 MCS: 50.5 vs. 47.2, difference 3.3 (95% CI 0.1 to 6.5)

ADS: 48.2 vs. 50.7, difference −2.5 (95% CI −5.3 to 0.4)

Carlsson, 2001²²⁵

1, 3, 6 months

Duration of pain: 6 months or longer

Poor

A. Needle acupuncture or electroacupuncture (n=34), 8 sessions over 8 weeks, with followup session at 3 and 6 months

B. Placebo (sham transcutaneous electrical nerve stimulation) (n=16)

A vs. B (NR)

Age: 50 years

Female: 66%

Baseline function: NR

Baseline Pain (0-100 VAS): 57 vs. 46

A vs. B

1 month

Pain (0-100 VAS): 50 vs. 60, P not reported

Global assessment “pain improved”: 47% vs. 13%, RR 3.76 (95% CI 0.98 to 14.4)

3 months

Pain (0-100 VAS): 42 vs. 56, P not reported

Global assessment “pain improved”: 44% vs. 13%, RR 6.87 (95% CI 1.87 to 25.1)

≥6 months outcomes

Pain (0-100 VAS): 41 vs. 50, P not reported

Global assessment “pain improved”: 41% vs. 13%, RR 3.29 (95% CI 0.85 to 12.8)

A vs. B

≥6 months

Analgesic intake (tablets per week): 21.4 vs. 21.5

Work full time: 32% vs. 31%

Cherkin, 2001¹⁷⁶

9.5 months

Duration of pain: 3 to 12 months, mean not reported

Fair

A. Needle acupuncture (n=94), 10 sessions over 10 weeks

B. Attention control (education) (n=90)

A vs. B

Age: 54 vs. 44 years

Female: 52% vs. 44%

Baseline symptom bothersomeness (0-10): 6.2 vs. 6.1

Baseline modified RDQ (0-23): 12. vs. 12.0

A vs. B

9.5 months

Symptom bothersomeness (0-10): 4.5 vs. 3.8, adjusted p=0.002

Modified RDQ (0-23): 8.0 vs. 6.4, adjusted p=0.05

A vs. B

9.5 months

≥1 work-loss day due to LBP in past month: No difference (data not reported)

Medication use: 51% vs. 62%, p<0.05

Provider visits: 1.9 vs. 1.5

LBP medication fills: 4.4 vs. 4.0

Imaging studies: 0.2 vs. 0.1

Cost of services (1998 $): 252 vs. 200

Cherkin, 2009²²⁶

4.5 and 10.5 months

Duration of pain: 3 to 12 months, mean not reported

Fair

A. Needle acupuncture (individualized) (n=157), 10 sessions over 7 weeks

B. Needle acupuncture (standardized) (n=158), 10 sessions over 7 weeks

C. Sham acupuncture (n=162)

D. Usual care (n=161)

A vs. B vs. C vs. D

Age: 47 vs. 49 vs. 47 vs. 46 years

Female: 68% vs. 56% vs. 60% vs. 64%

Symptom bothersomeness (0-10): 5.0 vs. 5.0 vs. 4.9 vs. 5.4

Baseline pain (0-10 VAS): 5.0 vs. 5.0 vs. 4.9 vs. 5.3

Baseline modified RDQ (0-23): 10.8 vs. 10.8 vs. 9.8 vs. 11.0

A vs. B vs. C vs. D

4.5 months

Symptom bothersomeness (0-10): 3.8 (2.5) vs. 3.7 (2.6) vs. 3.5 (2.7) vs. 4.4 (2.6)

≥2 point decrease in symptom bothersomeness: 49% vs. 44% vs. 48% vs. 41%

Modified RDQ (0-23): 6.8 (5.5) vs. 6.7 (5.8) vs. 6.4 (6.0) vs. 8.4 (6.0)

10.5 months

Symptom bothersomeness (0-10): 3.7 (2.6) vs. 3.5 (2.7) vs. 3.4 (2.7) vs. 4.1 (2.6)

≥2 point decrease in symptom bothersomeness: 52% vs. 49% vs. 50% vs. 47%

Modified RDQ (0-23): 6.0 (5.4) vs. 6.0 (5.8) vs. 6.2 (5.8) vs. 7.9 (6.5)

≥3 point decrease on RDQ: 65% vs. 65% vs. 59% vs. 50%

>7 days with cutting down on activities due to LBP in the past month: A, B and C 5-7% vs. D 18%, p=0.0005

A vs. B vs. C vs. D

10.5 months

SF-36 PCS: No differences, data not provided

SF-36 MCS: No differences, data not provided

Missed work/school for >1 day in past month: A, B and C 5-10% vs. D 16%, p=0.01

Mean total costs of back-related health services: $160-221 across groups, p=0.65

Cho, 2013²²⁷

1.5 and 4 months

Duration of pain: 3 months

Fair

A. Needle acupuncture (n=57), 12 sessions over 6 weeks

B. Sham acupuncture (n=59)

A vs. B

Age: 42 vs. 42 years

Female: 83% vs. 86%

Baseline ODI (0-100): 28.2 vs. 24.2

Baseline pain (0-10 VAS): 6.5 vs. 6.4

A vs. B

1.5 months

ODI (0-100): 15.5 vs. 15.5

Symptom bothersomeness (0-10 VAS): 2.83 vs. 3.99

Pain (0-10 VAS): 2.78 vs. 4.06

4 months

ODI: 15.3 vs. 15.3

Symptom bothersomeness: 2.85 vs. 3.63

Pain (0-10 VAS): 2.79 vs. 3.52

A vs. B

1.5 months

Beck Depression Inventory (0-63): 6 vs. 7.5

4 months

Beck Depression Inventory: 6 vs. 7

Haake, 2007²²⁸

1.5 and 4.5 months

Duration of pain: Mean 8 years

Fair

A. Needle acupuncture (n=387), 10-15 sessions over 5 weeks

B. Sham acupuncture (n=387)

C. Usual care (n=388)

A vs. B vs. C

Age: 50 vs. 49 vs. 51 years

Female: 57% vs. 64% vs. 58%

Baseline Hannover Functional Ability Questionnaire (0-100): 46.3 vs. 46.3 vs. 46.7

Baseline Von Korff Chronic Pain Grade Scale (0-100): 67.7 vs. 67.8 vs. 67.8

A vs. B vs. C

1.5 months

Hannover Functional Ability (0-100): 65.4 vs. 61.3 vs. 56.0

Von Korff Chronic Pain Grade Scale (0-100): 45.4 vs. 48.5 vs. 54.8

4.5 months

Hannover Functional Ability (0-100): 66.8 vs. 62.2 vs. 55.7

Von Korff Chronic Pain Grade Scale: 40.2 vs. 43.3 vs. 52.3

A vs. B vs. C

1.5 months

SF-12 PCS (0-100): 40.3 vs. 39.2 vs. 36.1

SF-12 MCS (0-100): 50.5 vs. 50.2 vs. 48.6

Treatment response (≥33% improvement in pain or ≥12% improvement in function): 55.0% (213/387) vs. 51.9% (201/387) vs. 41.9% (162/387), RR 1.05 (95% CI 0.93 to 1.21) for A vs. B and RR 1.31 (95% CI 1.13 to 1.52) for A vs. C

4.5 months

SF-12 PCS (0-100): 41.6 vs. 39. vs. 35.8

SF-12 MCS (0-100): 50.7 vs. 50.9 vs. 49.2

Treatment response: 47.6% (184/387) vs. 44.2% (171/387) vs. 27.4% (106/387), RR 1.08 (95% CI 0.92 to 1.25) for A vs. B and RR 1.74 (95% CI 1.43 to 2.11) for A vs. C

Kerr, 2003²²⁹

4.5 months

Duration of pain: Mean 86 vs. 73 months

Poor

A. Needle acupuncture (n=26), 6 sessions over 6 weeks

B. Placebo (sham TENS) (n=20)

A vs. B

Age: 43 vs. 43 years

Female: 50% vs. 35%

Baseline function: NR

Baseline pain (0-100 VAS): 79.7 vs. 76

A vs. B

4.5 months

Pain relief “yes”: 91% vs. 75%, RR 1.19 (95% CI 0.89 to 1.60)

Thomas, 2006²³⁰

9 and 21 months

Duration of pain: Mean 17 weeks

Fair

A. Needle acupuncture (n=159), 10 sessions over 12 weeks

B. Usual care (n=80)

A vs. B

Age: 42 vs. 44

Female: 62% vs. 58%

Baseline ODI (0-100): 33.7 vs. 31.4

Baseline McGill Present Pain Index (0-5): 2.64 vs. 2.70

A vs. B

9 months

ODI (0-100): 20.6 vs. 19.6, adjusted difference −0.5 (−5.1 to 4.2)

McGill Present Pain Index (0-5): 1.43 vs. 1.53, adjusted difference −0.1 (−0.4 to 0.3)

21 months

ODI: 18.3 vs. 21.0, adjusted difference −3.4 (−7.8 to 1.0)

McGill Present Pain Index: 1.42 (1.1) vs. 1.71, adjusted difference −0.2 (−0.6 to 0.1)

A vs. B

9 months

SF-36 bodily pain (0-100): 64.0 vs. 58.3, adjusted difference 5.6 (95% CI −0.2 to 11.4)

21 months

SF-36 bodily pain: 67.8 vs. 59.5, adjusted difference 8.0 (2.8 to 13.2)

Used medication for LBP in the past 4 weeks: 40% vs. 59%, difference −19% (−35 to −3), p=0.03

: CI = confidence interval; FFbH-R = Funktionsfragebogen Hannover-Rücken (Hannover Functional Ability Questionnaire-back); LBP = low back pain; MCS = Mental Component Summary; NR = not reported; ODI = Oswestry Disability Index; PCS = Physical Component Summary; RDQ = Roland-Morris Disability Questionnaire; RR = Relative risk; SF-36 = Short-Form 36 Questionnaire; TENS = transcutaneous electrical nerve stimulation; VAS = visual analog scale
a: Unless otherwise noted, followup time is calculated from the end of the treatment period
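
As an illustration of how the risk ratios (RR) in Table 17 relate to the reported counts, the following minimal Python sketch recomputes the RR and a 95% CI for the Haake, 2007 treatment response at 1.5 months (acupuncture 213/387 vs. usual care 162/387). It assumes the standard large-sample interval on the log scale (Katz method); the table does not state which method was used, and the function name `risk_ratio_ci` is hypothetical.

```python
import math

def risk_ratio_ci(events_a, n_a, events_b, n_b, z=1.959964):
    """Risk ratio of group A vs. group B with a log-scale (Katz) CI.

    Assumption: normal approximation on log(RR); a common textbook
    method, not necessarily the one used for the tabulated values.
    """
    rr = (events_a / n_a) / (events_b / n_b)
    # Standard error of log(RR) from the four counts
    se_log_rr = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper

# Haake, 2007, treatment response at 1.5 months: A (acupuncture) vs. C (usual care)
rr, lower, upper = risk_ratio_ci(213, 387, 162, 387)
print(f"RR {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Under these assumptions the sketch gives RR 1.31 (95% CI 1.13 to 1.52), matching the tabulated A vs. C estimate; because the interval excludes 1, the difference is statistically significant by the convention used in this report.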

Table 18. Chronic low back pain: multidisciplinary rehabilitation

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Abbassi, 2012²⁵⁹

10.25 months

Duration of pain: ~6 years

Poor

A. Multidisciplinary rehabilitation (n=12), 7 sessions over 7 weeks

B. Multidisciplinary pain management (spouse-assisted) (n=10), 7 sessions over 7 weeks

C. Usual care (n=11)

A + B + C

Overall

Age (mean): 45 years

Female: 88%

A vs. B vs. C

Baseline RDQ (0–24): 12.1 vs. 11.2 vs. 8.4

Baseline pain (0-10 VAS): 4.6 vs. 5 vs. 3.6

A vs. B vs. C

10.25 months

RDQ (0–24): 8.8 vs. 8.2 vs. 10.4, p=0.44

Pain (0–10 VAS): 3.7 vs. 2.8 vs. 4.3, p=0.44

Bendix, 1995,²⁷⁰ 1997,²⁸⁰ 1998²⁸¹

60 months

Duration of pain: >6 months

Fair

A. Multidisciplinary rehabilitation (n=40), 18 sessions over 6 weeks (total ~135 hours)

B. Multidisciplinary rehabilitation (n=35), 12 sessions over 6 weeks (total 24 hours)

C. Exercise (n=31)

A vs. B vs. C

Age: 40 vs. 44 vs. 42

Female: 75% vs. 77% vs. 74%

Baseline pain (0-10 NRS): 5.3 vs. 5.9 vs. 5.4

Baseline Low Back Pain Rating Scale (0-30): 15.5 vs. 15.3 vs. 14.4

A vs. B vs. C

3.25 months

Back pain (0-10 NRS): 2.7 vs. 5.6 vs. 4.4, p<0.001

Low Back Pain Rating Scale (0-30): 8.5 vs. 16.1 vs. 13.5, p=0.002

12 months

Back pain (0-10 NRS): 3.3 vs. 6.5 vs. 5.3, p=0.005

Low Back Pain Rating Scale (0-30): 8.9 vs. 16.4 vs. 13.7, p<0.001

24 months

Back pain (0-10 NRS): 3 vs. 6 vs. 5, p=0.08

Low Back Pain Rating Scale (0-30): 10 vs. 17 vs. 14, p=0.003

60 months

Back pain (0-10 NRS): 4 vs. 6 vs. 5, p=0.3

Low Back Pain Rating Scale (0-30): 8 vs. 16 vs. 14, p=0.02

A vs. B vs. C

3.25 months

Days of sick leave: 25 vs. 122 vs. 13, p=0.005

Healthcare system contacts: 0.5 vs. 2.8 vs. 1.3, p=0.05

12 months

Days of sick leave: 52 vs. 295 vs. 100, p=0.002

Healthcare system contacts: 4.5 vs. 12.0 vs. 11.8, p=0.002

24 months

Days of sick leave: 2.5 vs. 37 vs. 11, p=0.06

Healthcare system contacts: 5 vs. 21 vs. 14, p=0.03

Overall assessment (1-5): 2 vs. 3 vs. 3, p=0.005

60 months

Overall assessment (1-5): 2 vs. 3 vs. 3, p=0.004

Increase in proportion able to work: 30% vs. 23% vs. 0%, p=0.001

Days of sick leave: 13 vs. 11 vs. 88, p=0.2

Healthcare system contacts: 15 vs. 10 vs. 24, p=0.2

Back surgery: 5% vs. 10% vs. 10%, p=0.7

Bendix, 1996,²⁵⁵ 1998²⁸¹

60 months

Duration of pain: >6 months

Fair

A. Multidisciplinary rehabilitation (n=55), 18 sessions over 6 weeks (total ~135 hours)

B. Usual care (n=51)

A vs. B

Age: 41 vs. 40 years

Female: 71% vs. 69%

Baseline pain (0-10 NRS): 6.1 vs. 6.1

Baseline Low Back Pain Rating Scale (0-30): 16.9 vs. 15.9

A vs. B

3.25 months

Back pain (0-10 NRS): 5.7 vs. 6.9, p=0.05

Low Back Pain Rating Scale (0-30): 12.1 vs. 16.8, p<0.001

24 months

Back pain (0-10 NRS): 6 vs. 6.5, p=0.5

Low Back Pain Rating Scale (0-30): 16 vs. 15, p=0.9

60 months

Back pain (0-10 NRS): 5 vs. 5, p=1.0

Low Back Pain Rating Scale (0-30): 12 vs. 16, p=0.2

A vs. B

3.25 months

Days of sick leave: 10 vs. 122, p=0.02

Contacts to health-care system: 1.6 vs. 5.3, p<0.001

24 months

Days of sick leave: 15 vs. 123, p<0.001

Healthcare system contacts: 12 vs. 26, p<0.001

60 months

Days of sick leave: 10 vs. 50, p=0.4

Healthcare system contacts: 16 vs. 48, p=0.1

Back surgery: 7% vs. 12%, p=0.4

Bendix, 2000²⁷¹

10 months

Duration of pain: Not reported

Fair

A. Multidisciplinary rehabilitation (n=59), 18 sessions over 8 weeks (total ~139 hours)

B. Exercise (n=68)

A vs. B

Age: 40 vs. 43 years

Female: 66% vs. 65%

Baseline function: NR

Baseline pain: NR

A vs. B

10 months

Back pain (0–10): 5.1 vs. 5.7, p=0.33

Low Back Pain Rating Scale (0–30 ADL): 12 vs. 13, p=0.41

A vs. B

10 months

Overall assessment (1–5): 1.7 vs. 2.7, p=0.03

Work capable: 75% vs. 69%, p=0.64

Healthcare contacts (number): 2.5 vs. 4, p=0.28

Harkapaa, 1989²⁵⁶

1 month

Duration of pain: >2 years

Poor

A. Multidisciplinary rehabilitation (inpatient) (n=156), 3 weeks (number of sessions and total hours unclear)

B. Multidisciplinary rehabilitation (outpatient) (n=150), 15 sessions over 8 weeks (total hours unclear)

C. Usual care (n=153)

A vs. B vs. C

Age: 45 vs. 45 vs. 45 years

Female: 37% vs. 39% vs. 35%

Baseline function, LBP Disability Index (0-45): 16.7 vs. 17.6 vs. 16.7

Baseline Pain Index (0-400): 184.9 vs. 178.6 vs. 175.8

A vs. B vs. C

1 month

LBP Disability Index (0-45): 13.8 vs. 14.7 vs. 17.3, p<0.004 for A vs. C and p<0.01 for B vs. C

Pain Index (0-400): 127 vs. 145 vs. 160, p<0.001 for A vs. C and p<0.04 for B vs. C

Jousset, 2004²⁷²

5 months

Duration of pain: >4 months

Poor

A. Multidisciplinary rehabilitation (n=44), 25 sessions over 5 weeks (total 150 hours)

B. Exercise (n=42)

A vs. B

Age: 41 vs. 40 years

Female: 30% vs. 37%

Baseline function, Quebec Disability Scale (0-100): 34.6 vs. 31.6

Baseline pain (0-10 NRS): 5.0 vs. 4.6

A vs. B

5 months

Quebec Disability Scale (0-100): 22.0 vs. 22.9, p=0.80

Pain (0-10 NRS): 3.1 vs. 4.0, p=0.01

Dallas Pain Questionnaire ADL (0-100): 36.7 vs. 41.5, p=0.36

A vs. B

5 months

Hospital Anxiety Depression Scale (0-21): 12.7 vs. 13.4 (6.4), p=0.62

Dallas Pain Questionnaire Social interest (0-100): 19.6 vs. 24.3, p=0.37

Lambeek, 2010²⁵⁸

9 months

Duration of pain: >4 months

Fair

A. Multidisciplinary rehabilitation (n=66), 26+ sessions over up to 13 weeks (total hours unclear)

B. Usual care (n=68)

A vs. B

Age: 46 vs. 47 years

Female: 44% vs. 40%

Baseline modified RDQ (0-23): 14.7 vs. 15.0

Baseline pain (0-10 VAS): 5.7 vs. 6.3

A vs. B

3 months

Modified RDQ (0-23): 4.8 vs. 5.0 (0.9), adjusted difference 0.06, 95% CI −2.3 to 2.5

Pain (0-10 VAS): 1.3 vs. 2.3, adjusted difference 0.5, 95% CI −0.6 to 1.6

9 months

Modified RDQ (0-23): 7.2 vs. 4.4, adjusted difference −2.9, 95% CI −4.9 to −0.9

Pain (0-10 VAS): 1.6 vs. 1.9, adjusted difference 0.21, 95% CI −0.8 to 1.2

A vs. B

9 months

General practitioner visits (# of patients): 13 vs. 29

Medical specialist visits (# of patients): 13 vs. 29

Total costs (pounds): 13,165 (SD 13,600) vs. 18,475 (SD 13,616), difference −5,310 (95% CI −10,042 to −391)

Monticone, 2013²⁷⁶

23 months

Duration of pain: 25 vs. 26 months

Fair

A. Multidisciplinary rehabilitation (n=45), 26 sessions over 5 weeks (total 26 hours)

B. Exercise (n=45)

A vs. B

Age: 49 vs. 50 years

Female: 60% vs. 56%

Baseline RDQ (0-24): 15.3 vs. 15.0

Baseline pain (0-10 VAS): 7.0 vs. 7.0

A vs. B

11 months

RDQ (0-24): 1.3 (1.6) vs. 11.0 (2.0)

Pain (0-10 VAS): 1.4 (1.1) vs. 5.3 (1.2)

23 months

RDQ (0-24): 1.4 vs. 11.1, difference −9.7, 95% CI −10.4 to −9.0

Pain (0-10 VAS): 1.5 vs. 6.2, difference −4.7, 95% CI −5.1 to −4.3

A vs. B

11 months

SF-36 physical pain (0-100): 79.0 (14.6) vs. 52.0 (16.2)

SF-36 physical functioning (0-100): 85.7 (19.6) vs. 62.1 (19.4)

SF-36 general health (0-100): 85.0 (13.8) vs. 56.4 (15.9)

SF-36 mental health (0-100): 89.8 (13.0) vs. 54.1 (11.9)

23 months

SF-36 physical pain: 80.4 vs. 61.8, difference 18.6, 95% CI 12.8 to 24.3

SF-36 physical functioning (0-100): 87.6 vs. 65.0, difference 22.6, 95% CI 15.0 to 30.1

SF-36 general health: 86.3 vs. 63.1, difference 23.2, 95% CI 17.3 to 29.1

SF-36 mental health: 91.0 vs. 58.8, difference 32.2, 95% CI 27.4 to 37.0

Monticone, 2014²⁷⁷

3 months

Duration of pain: 15 vs. 14 months

Fair

A. Multidisciplinary rehabilitation (n=10), 16 sessions over 8 weeks (total 16 hours)

B. Exercise (n=10)

A vs. B

Age: 59 vs. 57 years

Female: 7% vs. 4%

Baseline function (0-100 ODI): 26 vs. 24

Baseline pain (0-10 NRS): 5 vs. 4

A vs. B

3 months

ODI (0-100): 8 vs. 15, p=0.027

Pain (0-10 NRS): 2 vs. 3, p=1.0

A vs. B

3 months

SF-36 bodily pain (0-100): 65 vs. 55, p=0.261

SF-36 general health (0-100): 71 vs. 55, p=0.018

SF-36 social function (0-100): 81 vs. 61, p=0.001

SF-36 emotional role (0-100): 77 vs. 57, p=0.007

SF-36 mental health (0-100): 88 vs. 67, p=0.001

Nicholas, 1991²⁷³

11 months

Duration of pain: 7 years

Poor

A. Multidisciplinary rehabilitation (cognitive treatment) (n=10)

B. Multidisciplinary rehabilitation (behavioral treatment) (n=10)

C. Multidisciplinary rehabilitation (cognitive treatment and relaxation treatment) (n=8)

D. Multidisciplinary rehabilitation (behavioral treatment and relaxation training) (n=9)

E. Exercise + attention control (psychologist-led group discussions) (n=10)

F. Exercise (n=11)

For all multidisciplinary rehabilitation interventions, 19 sessions over 5 weeks (total 21.5 hours)

Overall

Age: 41 years

Female: 51%

A vs. B vs. C vs. D vs. E vs. F

Baseline function (0-100 Sickness Impact Profile): 37.13 vs. 34.24 vs. 33.41 vs. 20.53 vs. 27.12 vs. 28.06

Baseline pain (0-5 categorical scale): 2.78 vs. 2.96 vs. 3.80 vs. 2.27 vs. 2.84 vs. 2.77

A vs. B vs. C vs. D vs. E vs. F

5 months

Sickness Impact Profile (0-100): 24.42 (11.78) vs. 15.44 (14.12) vs. 25.69 (8.50) vs. 14.86 (9.08) vs. 19.40 (6.89) vs. 29.78 (8.76)

Pain (0-5 categorical scale): 2.18 (0.55) vs. 1.87 (0.73) vs. 3.20 (0.93) vs. 2.22 (0.48) vs. 2.64 (0.90) vs. 3.18 (0.72)

11 months

Sickness Impact Profile (0-100): 23.85 (12.50) vs. 12.80 (8.62) vs. 20.77 (8.29) vs. 12.87 (6.68) vs. 18.94 (12.79) vs. 25.18 (8.08)

Pain (0-5 categorical scale): 2.56 (0.97) vs. 2.66 (1.06) vs. 3.30 (0.83) vs. 1.88 (0.65) vs. 2.70 (0.84) vs. 3.22 (0.69)

A vs. B vs. C vs. D vs. E vs. F

5 months

Spielberger State Anxiety Inventory (20-80): 57.17 (10.30) vs. 37.57 (12.92) vs. 55.71 (10.47) vs. 36.40 (6.28) vs. 41.13 (11.70) vs. 54.00 (12.03)

Beck Depression Inventory (0-63): 18.67 (9.01) vs. 8.14 (5.77) vs. 16.14 (3.80) vs. 9.00 (6.07) vs. 9.88 (5.46) vs. 19.17 (8.78)

Medication use (0-5): 1.50 (1.26) vs. 0.57 (0.73) vs. 1.86 (0.64) vs. 1.60 (1.02) vs. 1.50 (0.71) vs. 1.83 (1.07)

11 months

Spielberger State Anxiety Inventory (20-80): 42.83 (9.42) vs. 37.43 (12.26) vs. 47.17 (17.01) vs. 40.67 (11.81) vs. 46.56 (11.51) vs. 53.40 (18.78)

Beck Depression Inventory (0-63): 18.67 (10.04) vs. 8.00 (5.93) vs. 12.83 (6.69) vs. 13.17 (8.51) vs. 10.56 (5.21) vs. 17.60 (6.09)

Medication use (0-5): 1.17 (1.37) vs. 0.71 (0.88) vs. 1.67 (1.37) vs. 1.33 (0.75) vs. 1.44 (0.96) vs. 1.60 (1.49)

Nicholas, 1992²⁷⁴

5 months

Duration of pain: 5.5 years

Fair

A. Multidisciplinary rehabilitation (n=10), 18 sessions over 5 weeks (total 31.5 hours)

B. Exercise + attention control (psychologist-led group discussions) (n=10)

Overall

Age: 44 years

Female: 45%

A vs. B

Baseline function (0-100 Sickness Impact Profile): 30.87 vs. 32.10

Baseline pain (0-5 categorical scale): 3.13 vs. 2.84

A vs. B

5 months

Pain intensity (0-5 categorical scale): 2.89 (0.64) vs. 2.75 (1.11)

A vs. B

5 months

Beck Depression Inventory (0-63): 14.44 (5.98) vs. 18.50 (9.26)

Using medication: 44% vs. 88%

Roche, 2007,²⁷⁸ 2011²⁷⁹

10.75 months

Duration of pain: >4 months

Fair

A. Multidisciplinary rehabilitation (n=68), 25 sessions over 5 weeks (total 150 hours)

B. Exercise therapy (n=64)

A vs. B

Age: 41 vs. 39 years

Female: 32% vs. 38%

Baseline function (Dallas Pain Questionnaire daily activities, 0-100): 51.8 vs. 51

Baseline pain (0-10 VAS): 4.7 vs. 4.5

A vs. B

10.75 months

Dallas Pain Questionnaire daily activities (0-100): 31.4 vs. 39.1, difference −7.7 (95% CI −16.15 to 0.75)

Pain (0-10 VAS): 2.9 vs. 3.5, difference −0.6 (95% CI −1.49 to 0.29)

A vs. B

10.75 months

Dallas Pain Questionnaire anxiety/depression (0-100): 21.9 vs. 25.5, difference −3.6 (95% CI −12.56 to 5.36)

Strand, 2001²⁶⁰

11 months

Duration of pain: 10 vs. 9 years

Fair

A. Multidisciplinary rehabilitation (n=81), 20 sessions over 4 weeks (total 120 hours)

B. Usual Care (n=36)

A vs. B

Age: 45 vs. 42 years

Female: 59% vs. 64%

Baseline function (0-100 Disability Rating Index): 55.6 vs. 58.3

Baseline pain (0-100 VAS): 48.3 vs. 53.0

A vs. B

11 months

Disability Rating Index (0-100): −27.3 (95% CI −34 to −21) vs. −3.3 (95% CI −10 to 14) vs. −16.4 (95% CI −26 to −7.3) vs. 0.2 (95% CI −14 to 14), difference −3.8 (95% CI −13.9 to 6.3)

Pain (0-100 VAS): −21.1 (95% CI −31 to −11) vs. −2.3 (95% CI −9.4 to 4.8) vs. −23.1 (95% CI −37 to 9.2) vs. 7.1 (95% CI −7.7 to 22), difference −1.0 (95% CI −11.7 to 9.6)

A vs. B

11 months

Working: 47% vs. 58%, difference −11% (95% CI −8 to 30)

Tavafian, 2008²⁶⁹

12 months

Duration of pain: 9 months

Poor

A. Multidisciplinary program (n=37), 5 sessions over 0.5 weeks (total hours unclear)

B. Medications (acetaminophen, NSAID and chlordiazepoxide) (n=37)

A vs. B

Age: 43 vs. 45 years

Female: 100% vs. 100%

Baseline SF-36 Physical (0-100): 41.2 vs. 42.3

Baseline SF-36 Mental (0-100): 47.5 vs. 47.7

A vs. B

3 months

SF-36 PCS (0-100): 76.7 vs. 51.2, difference 25.5 (95% CI 14.69 to 36.31)

SF-36 MCS (0-100): 80.4 vs. 57.4, difference 23.0 (95% CI 10.78 to 35.22)

6 months

SF-36 PCS (0-100): 66.6 vs. 51.2, difference 15.4 (95% CI 2.35 to 28.45)

SF-36 MCS (0-100): 66.9 vs. 57.9, difference 9.0 (95% CI −3.88 to 21.88)

12 months

SF-36 PCS (0-100): 64.7 vs. 51.1, difference 13.6 (95% CI −1.48 to 28.68)

SF-36 MCS (0-100): 65.1 vs. 60.2, difference 4.9 (95% CI −7.57 to 17.37)

Turner, 1990¹³³

12 months

Duration of pain: 12.9 years

Poor

A. Multidisciplinary rehabilitation (n=24), 16 sessions over 2 weeks (total 32 hours)

B. Exercise (n=24)

Overall

Age: 44 years

Female: 48%

A vs. B

Baseline function (Sickness Impact Profile): 8.5 vs. 8.4

Baseline pain (0-78 MPQ): 25.5 vs. 19.4

A vs. B

6 months

Sickness Impact Profile (0-100): 4.5 vs. 6.3

McGill Pain Questionnaire Pain Rating Index (0-78): 13.3 vs. 15.7

12 months

Sickness Impact Profile (0-100): 4.8 vs. 4.7

McGill Pain Questionnaire Pain Rating Index (0-78): 18.2 vs. 14.9

A vs. B

6 months

Center for Epidemiologic Studies-Depression Scale (0-60): 8.3 vs. 9.3

12 months

Center for Epidemiologic Studies-Depression Scale (0-60): 10.0 vs. 9.3

van der Roer, 2008²⁷⁵

10 months

Duration of pain: ~50 weeks

Fair

A. Multidisciplinary rehabilitation (n=60), 30 sessions over 10 weeks (total hours unclear)

B. Exercise (n=54)

A vs. B

Age: 42 vs. 42 years

Female: 55% vs. 48%

Baseline function RDQ (0-24): 11.6 vs. 12.1

Baseline pain (0-10 NRS): 6.2 vs. 5.9

A vs. B

4 months

RDQ (0-24): 7.4 vs. 7.7, adjusted difference 0.13 (95% CI −2.24 to 2.50)

Pain (0-10 NRS): 4.1 vs. 4.8, adjusted difference −0.97 (95% CI −1.88 to −0.06)

10 months

RDQ (0-24): 6.7 vs. 7.1, adjusted difference 0.06 (95% CI −2.22 to 2.34)

Pain (0-10 VAS): 3.9 vs. 4.6, adjusted difference −1.02 (95% CI −2.14 to 0.09)

A vs. B

4 months

Global Perceived Effect positive (%): 38.2% vs. 39.8%, OR 0.93 (95% CI 0.36 to 2.43)

10 months

Global Perceived Effect positive (%): 45.0% vs. 32.3%, OR 1.71 (95% CI 0.67 to 4.38)

Von Korff, 2005²⁵⁷

22.5 months

Duration of pain: >3 months

Fair

A. Multidisciplinary rehabilitation (n=119), 4 sessions over 5 weeks (total 4 hours)

B. Usual care (n=121)

A vs. B

Age: 50 vs. 50 years

Female: 65% vs. 60%

Baseline modified RDQ (0-23): 12.3 vs. 11.4

Baseline pain (0-10 NRS): 5.7 vs. 5.8

A vs. B

4.5 months

Function

Modified RDQ (0-23): 9.2 (6.6) vs. 10.1 (6.4), p=0.0003

>1/3 reduction in RDQ: 42.2% vs. 23.7%, adjusted OR 3.5, p=0.0007

Pain (0-10 NRS): 4.2 (2.0) vs. 4.7 (2.2), p=0.007

10.5 months

Modified RDQ (0-23): 8.4 vs. 9.1, p=0.0063

>1/3 reduction in RDQ: 44.6% vs. 22.7%, adjusted OR 2.1, p=0.03

Pain (0-10 NRS): 4.0 vs. 4.7, p=0.004

22.5 months

Modified RDQ (0-23): 8.1 vs. 9.1, p=0.0078

>1/3 reduction in RDQ: 49.4% vs. 37.0%, adjusted OR 1.8, p=0.08

Pain (0-10 NRS): 4.3 vs. 4.6, p=0.115

A vs. B

4.5 months

SF-36 Social Functioning (0-100): 74.4 vs. 73.6, p=0.26

SF-36 Mental Health (0-100): 70.3 vs. 69.5, p=0.23

10.5 months

SF-36 Social Functioning (0-100): 74.4 vs. 73.6, p=0.26

SF-36 Mental Health (0-100): 70.3 vs. 69.5, p=0.23

22.5 months

SF-36 Social Functioning (0-100): 76.7 vs. 76.3, p=0.28

SF-36 Mental Health (0-100): 71.0 vs. 72.4, p=0.98

Abbreviations: ADL = activity of daily living; CI = confidence interval; LBO = Low Back Outcome Score; LBP = low back pain; MCS = Mental Component Summary; MPQ = McGill Pain Questionnaire; NR = not reported; NRS = Numerical Rating Scale; NSAID = nonsteroidal anti-inflammatory drug; ODI = Oswestry Disability Index; PCS = Physical Component Summary; RDQ = Roland-Morris Disability Questionnaire; SF-36 = Short-Form 36 Questionnaire; VAS = visual analog scale
a: Unless otherwise noted, followup time is calculated from the end of the treatment period

Table 19. Chronic neck pain: exercise therapies

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Andersen, 2008^b,⁴¹

6 and 12 months

Duration of pain: NR

Poor

A. Dynamic strengthening exercise (muscle performance exercise) (n=61): for the neck/shoulder muscles, performed in the workplace; 20-minute sessions, 3 times a week (2 of the 3 weekly sessions were supervised by experienced instructors)

B. Lifestyle physical exercise and activity increase (combination exercise) (n=59): workplace activities such as steppers placed near the copying machines, punch bags in the hall, group sessions of Nordic walking, and strength and aerobic fitness exercise programs

C. Control group (n=62): ergonomics, stress management, organization of work, cafeteria food quality

Treatment lasted 1 year. All groups were allowed 1 hour per week during working time for activities

A + B + C

Age: 45 years

Female: 78%

Office workers: 100%

A vs. B vs. C

Baseline pain (0-10 VAS): 5.0 vs. 5.0 vs. 4.7

A vs. C

6 months

Pain VAS: 3.4 vs. 4.2, difference −0.8 (95% CI −0.9 to −0.7)

12 months^c

Pain VAS: 3.8 vs. 4.6, difference −0.80 (95% CI −0.87 to −0.73)

Days of pain in last 3 months (0-90): 25 vs. 30, p>0.05

B vs. C

6 months

Pain VAS: 3.6 vs. 4.2, difference −0.6 (95% CI −0.7 to −0.5)

12 months^c

Pain VAS: 3.6 vs. 4.6, difference −1.0 (95% CI −1.1 to −0.9)

Days of pain in last 3 months: 26 vs. 30, p>0.05

Aslan Telci, 2012¹⁰⁰

6 months

Duration of pain: 12 months

Poor

A. Combination exercises (n=20): consisting of posture, active range of motion, stretching, isometric and dynamic strengthening and endurance exercises, relaxation and proprioception exercises. Clinic followup once a week for a total of 3 weeks to maintain motivation and check whether exercises were performed correctly, and home exercise for at least another month.

B. NSAIDs and muscle relaxants for 15 days (n=20): all patients received verbal advice regarding pain control, posture, and ergonomics.

A vs. B

Age: 48 vs. 52 years

Female: 85% vs. 75%

BMI: 25 vs. 27

Employed: 50% vs. 40%

Education year: 12 vs. 11

Baseline NDI (0-50): 14.0 vs. 10.7

Baseline pain (0-10 VAS): 6.7 vs. 6.4

A vs. B

3 months

NDI: 9.4 vs. 11.5, difference −2.2 (95% CI −5.8 to 1.5)

Pain VAS: 4.1 vs. 5.1, difference −1.0 (95% CI −2.3 to 0.3)

6 months

NDI: 11.9 vs. 13.7, difference −1.8 (95% CI −5.7 to 2.1)

Pain VAS: 4.5 vs. 5.3, difference −0.8 (95% CI −2.3 to 0.7)

A vs. B

3 months

NHP (0-100): 89.2 vs. 230.0, difference −140.8 (95% CI −214.0 to −67.5)

BDI (0-63): 6.8 vs. 10.7, difference −4.0 (95% CI −8.4 to 0.5)

6 months

NHP (0-100): 122.3 vs. 257.6, difference −135.3 (95% CI −209.1 to −61.5)

BDI (0-63): 8.3 vs. 11.8, difference −3.8 (95% CI −8.5 to 1.0)

de Araujo Cazotti, 2018¹⁰¹

3 months

Duration of pain: mean 69 to 86 months

Fair

[New trial]

A. Pilates (muscle performance exercise) (n=32): 1 hour session, 2 times/week, for 12 weeks. Repetitions/exercise varied from 6 to 12. 91% of participants completed all of the scheduled sessions.

B. Pharmacological treatment (n=32): 750 mg acetaminophen every 6 hours if they were experiencing pain. Participants in group A were also instructed to do the same if they were experiencing pain.

A vs. B

Age: 49 vs. 49 years

Female: 19% vs. 25%

Baseline NDI (0-50): 13.3 vs. 12.8

Baseline NPS (0-10): 6.4 vs. 5.8

A vs. B

3 months

NDI: 4.2 vs. 9.8, difference −5.6 (95% CI −8.4 to −2.8)

NPS: 1.9 vs. 5.0, difference −3.1 (95% CI −4.2 to −2.0)

A vs. B

3 months

SF-36 Physical functioning (0-100): 80.3 vs. 73.1, difference 7.2 (95% CI −2.3 to 16.7)

SF-36 Role physical (0-100): 75.0 vs. 55.6, difference 19.4 (95% CI −2.6 to 41.4)

SF-36 Bodily pain (0-100): 68.6 vs. 50.4, difference 18.2 (95% CI 6.8 to 29.6)

SF-36 General health (0-100): 79.5 vs. 74.8, difference 4.7 (95% CI −7.4 to 16.8)

SF-36 Vitality (0-100): 66.6 vs. 56.6, difference 10 (95% CI −0.6 to 20.6)

SF-36 Social functioning (0-100): 86.7 vs. 76.2, difference 10.5 (95% CI −2.5 to 23.5)

SF-36 Role emotional (0-100): 72.9 vs. 72.9, difference 0 (95% CI −19.4 to 19.4)

SF-36 Mental health (0-100): 77.4 vs. 65.2, difference 12.2 (95% CI 2.5 to 21.9)

Acetaminophen use, Median (IQR): 0 (0 to 39) vs. 3.5 (0 to 159)

Lauche, 2016⁴²

3 months

Duration of pain: NR

Poor

A. Combination exercises (n=37): weekly 60-75 minute session for 12 weeks; ergonomic principles, proprioceptive exercises, and isometric and dynamic mobilization, stretching, strengthening neck and core exercises, and relaxation exercises; illustrated written exercises for home use ≥15 minutes/day.

B. Wait list (n=39): continuing usual activities/therapies

A vs. B

Age: 47 vs. 49 years

Female: 86% vs. 82%

Baseline NDI: NR

Baseline pain, recently (0-100 VAS): 46.2 vs. 51.5

Baseline pain, considered tolerable (0-100 VAS): 20.5 vs. 20.7

A vs. B

3 months

NDI: 25.1 vs. 29.4, difference −4.3 (95% CI −10.2 to 1.6)

Recent pain VAS: 33.1 vs. 44.6, difference −11.5 (95% CI −20.8 to −2.2)

Pain with motion VAS: 34.9 vs. 45.5, difference −10.6 (95% CI −18.5 to −2.7)

A vs. B

3 months

SF-36 PCS (0-100): difference 2.0 (95% CI −1.6 to 5.6)

SF-36 MCS (0-100): difference 0.5 (95% CI −3.9 to 4.9)

Li, 2017⁴³

1.5 months

Duration of pain: 4 years

Fair

A. Progressive resistance training (muscle performance exercise) (n=38): ≥3 sessions per week for 6 weeks. Sessions consisted of four cervical isometric exercises, each repeated 8-12 times. Resistance progressively increased every 2 weeks, starting at 30% of maximal strength and increased to 70%.

B. Fixed resistance training (muscle performance exercise) (n=35): ≥3 sessions per week for 6 weeks. Sessions consisted of four cervical isometric exercises, each repeated 8-12 times. Resistance was fixed at 70% of the participant’s maximal strength.

C. Attention control (n=36): Subjects received information and had weekly discussions about workplace ergonomics, stress management, relaxation, meditation, and diet.

A vs. B vs. C

Age: 36 vs. 34 vs. 34 years

BMI: 21 vs. 22 vs. 22

Years working: 9 vs. 9 vs. 10

Pain duration (years): 3 vs. 4 vs. 4

Work (days/week): 5 vs. 6 vs. 5

Computer use (hours/day): 7 vs. 8 vs. 7

Baseline NDI (0-50): 28.3 vs. 28.9 vs. 27.8

Baseline pain (0-10 VAS): 5.3 vs. 5.4 vs. 5.2

A vs. C

1.5 months

NDI: 14.9 (4.9) vs. 26.6 (5.4), difference −11.7 (95% CI −14.1 to −9.3)

Pain VAS: 1.9 (0.9) vs. 5.1 (1.0), difference −3.2 (95% CI −3.6 to −2.8)

B vs. C

1.5 months

NDI: 15.8 (4.8) vs. 26.6 (5.4), difference −10.8 (95% CI −13.2 to −8.4)

Pain VAS: 2.5 (0.9) vs. 5.1 (1.0), difference −2.6 (95% CI −3.1 to −2.1)

None

Stewart, 2007⁴⁴

1.5 and 12 months

Duration of pain: 9 months

Fair

A. Combination exercise, plus advice (n=66): aerobic, stretching, functional, speed and endurance, trunk and limb strengthening; 1 hour per session for 12 sessions over 6 weeks

B. Advice alone (n=68): included reassurance of a favorable outcome and encouragement to resume light activity

A vs. B

Age: 44 vs. 43 years

Female: 73% vs. 62%

Baseline NDI (0-50): 18.2 vs. 19.7

Baseline PSFS (0-10): 3.9 vs. 4.1

Baseline pain (0-10 VAS): 5.2 vs. 5.3

A vs. B

1.5 months

NDI: 12.0 vs. 15.7, difference −2.7 (95% CI −4.5 to −0.9)

PSFS: 6.4 vs. 5.6, difference 0.9 (95% CI 0.3 to 1.6)

Pain VAS: 3.2 vs. 4.3, difference −1.1 (95% CI −1.8 to −0.3)

12 months

NDI: 12.1 vs. 15.5, difference −2.3 (95% CI −4.9 to 0.3)

PSFS: 6.6 vs. 6.0, difference 0.6 (95% CI −0.1 to 1.4)

Pain VAS: 3.5 vs. 3.8, difference −0.2 (95% CI −1.0 to 0.6)

A vs. B

1.5 months

Bothersomeness (0-10): 3.6 vs. 4.8, p=0.019

SF-36 physical (0-100): 42.1 vs. 38.9, p=0.003

SF-36 mental (0-100): 51.4 vs. 46.4, p=0.005

Global Perceived Effect (−5 to 5): 2.5 vs. 1.5, p=0.006

12 months

Bothersomeness: 4.1 vs. 4.0, p=0.480

SF-36 physical: 42.3 vs. 38.9, p=0.003

SF-36 mental: 48.4 vs. 46.1, p=0.33

Global Perceived Effect: 2.3 vs. 1.9, p=0.48

Viljanen, 2003⁴⁵

3 and 9 months

Duration of pain: 11 years

Fair

A. Dynamic strengthening exercises (muscle performance exercises) (n=135): physical-therapist guided; 3 times per week for 12 weeks, 30 minute sessions

B. No intervention (n=130)

A vs. B

Age: 45 vs. 44 years

Female: 100% vs. 100%

Office workers: 100%

Computer work >6 hours per day: 33% vs. 35%

Baseline neck disability scale^e (0-80): 29 vs. 26

Baseline pain (0-10 VAS): 4.8 vs. 4.1

A vs. B

3 months

Neck disability scale^e: 15 vs. 14, adjusted difference −0.1 (95% CI −3.1 to 2.9)

Pain VAS: 2.9 vs. 2.9, adjusted difference 0.4 (95% CI −0.3 to 1.0)

9 months

Neck disability scale^e: 19 vs. 17, adjusted difference −0.1 (95% CI −3.0 to 2.9)

Pain VAS: 3.1 vs. 3.2, adjusted difference 0.5 (95% CI −0.1 to 1.0)

Waling, 2002^d,⁴⁶

6 and 36 months

Duration of pain: 6.8 years

Poor

A. Strength training (muscle performance exercise) (n=29): for neck and shoulder muscles, 3 times per week for 10 weeks, 1 hour/session

B. Endurance training (muscle performance exercise) (n=28): using arm-cycling and arm exercises, 30 repetition maximum, 3 times per week for 10 weeks, 1 hour/session

C. Coordination training (neuromuscular reeducation exercises) (n=25): focus on balance and postural stability 3 times per week for 10 weeks, 1 hour/session

D. Reference group (n=21): stress management 1 time per week for 10 weeks, 2 hour/session

A vs. B vs. C vs. D

Age: 38 vs. 39 vs. 38 vs. 39 years

Female: 100% all groups

Office workers: 100%

Baseline pain, at present (0-10 VAS): 2.6 vs. 2.8 vs. 3.3 vs. 3.7

A vs. B vs. C vs. D

6 months

Proportion of patients with frequent pain (several times per week or more): 76% vs. 91% vs. 78% vs. 73%, p=0.50

36 months

Pain VAS at present: 3.1 vs. 2.2 vs. 2.7 vs. 1.6, p=0.073

Pain VAS in general (0-10): 3.2 vs. 2.9 vs. 2.9 vs. 2.0, p=0.249

Pain VAS at worst (0-10): 6.1 vs. 5.8 vs. 5.7 vs. 5.8, p=0.902

Frequent pain: 47% vs. 50% vs. 58% vs. 39%, p=0.66

Abbreviations: BDI = Beck Depression Inventory; BMI = body mass index; CI = confidence interval; MCS = Mental Component Summary; NDI = Neck Disability Index; NHP = Nottingham Health Profile; NR = not reported; NSAID = nonsteroidal anti-inflammatory drug; PCS = Physical Component Summary; PSFS = Patient Specific Functional Scale; SF-36 = Short-Form 36 questionnaire; VAS = visual analog scale
a: Unless otherwise noted, followup time is calculated from the end of the treatment period
b: Cluster RCT where clusters were formed from participants working on the same floor
c: Intervention lasted 12 months and followup is at the end of the intervention
d: Cluster RCT where clusters were formed from participants selecting a time that best fit their schedule
e: Neck disability scale was created by investigators from responses to eight questions related to functional limitations due to pain; this scale is not the same as the more common NDI

Table 20. Chronic neck pain: psychological therapies

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Viljanen, 2003⁴⁵

3 and 9 months

Pain duration: 11 years

Fair

A. Physical therapist guided relaxation training (n=128): progressive relaxation, autogenic training, functional relaxation, and systematic desensitization (goal was to teach correct activation and relaxation of muscles used in daily activities); 3 times per week for 12 weeks, 30-minute sessions

B. Physical therapist guided dynamic strengthening exercises of the shoulder and cervical musculature (muscle performance exercises) (n=135): 3 times per week for 12 weeks, 30 minute sessions

C. No intervention (n=130)

A vs. B vs. C

Age: 43 vs. 45 vs. 44 years

Female: 100%

Performing physical activity ≥3x/week: 34% vs. 44% vs. 41%

Duration of office work: 20 vs. 23 vs. 21 years

Sedentary work >6 hours per day: 75% vs. 76% vs. 73%

Computer work >6 hours per day: 39% vs. 33% vs. 35%

Absent from work due to neck pain: 12% vs. 12% vs. 12%

Pain duration: 11 vs. 11 vs. 10 years

Depression index: 16 vs. 16 vs. 16

Baseline neck disability scale^b (0-80): 29 vs. 29 vs. 26

Baseline pain (0-10 VAS): 4.8 vs. 4.8 vs. 4.1

A vs. C

3 months

Neck disability scale^b: 15 vs. 14, adjusted difference 0.1 (95% CI −2.9 to 3.2)

Pain VAS: 3.0 vs. 2.9, adjusted difference 0.2 (95% CI −0.4 to 0.8)

9 months

Neck disability scale^b: 19 vs. 17, adjusted difference 0.2 (95% CI −2.8 to 3.1)

Pain VAS: 3.3 vs. 3.2, adjusted difference 0.2 (95% CI −0.3 to 0.8)

A vs. B

3 months

Neck disability scale^b: 15 vs. 15, adjusted difference 0.2 (95% CI −2.8 to 3.2)

Pain VAS: 3.0 vs. 2.9, adjusted difference −0.2 (95% CI −0.8 to 0.4)

9 months

Neck disability scale^b: 19 vs. 19, adjusted difference 0.2 (95% CI −2.7 to 3.2)

Pain VAS: 3.3 vs. 3.1, adjusted difference −0.2 (95% CI −0.8 to 0.3)

Abbreviations: CI = confidence interval; VAS = visual analog scale
a: Unless otherwise noted, followup time is calculated from the end of the treatment period
b: Neck disability scale was created by investigators from responses to eight questions related to functional limitations due to pain. This scale is not the same as the more common Neck Disability Index (NDI)

Table 21. Chronic neck pain: physical modalities

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Altan, 2005¹⁴⁵

3 months

Pain duration: 4.5 years

Fair

A. GaAs low-level laser treatment (n=26): over the 3 trigger points bilaterally and 1 point in the taut bands in trapezius muscle bilaterally for 2 min over each point once a day for 2 weeks. Laser wavelength of 904 nm.

B. Sham laser treatment (n=27)

A vs. B

Age: 43 vs. 43 years

Female: 87% vs. 48%

Baseline pain (0-10 VAS): 6.9 vs. 6.2

Baseline pain (5-point scale, 0-5): 2.4 vs. 2.2

A vs. B

3 months

Pain (VAS): 3.2 vs. 3.8, difference −0.6 (95% CI −1.0 to −0.3)

Pain (5 point scale): 1.1 vs. 1.2, difference −0.1 (95% CI −0.2 to 0.05)

Chiu, 2011¹⁴⁶

1.5 months

Pain duration: NR

Poor

A. Cervical Traction (intermittent) (n=39): ranging from 10-20% of patient body weight, holding time 10-25 seconds; resting time 20-50% of holding time; twice/week for 6 weeks; sessions lasting 20 minutes.

B. Infrared Irradiation Control (n=40): via infrared lamp positioned so that patients reported minimal warmth over the back of their neck; twice/week for 6 weeks; sessions lasting 20 minutes.

A vs. B

Age: 50.9 vs. 46.8 years

Female: 65.2% vs. 76.5%

Baseline NPQ (0-100%): 46.1 vs. 38.5

Baseline NPS (0-10): 5.8 vs. 5.2

A vs. B

1.5 months

NPQ Disability^b: 31.4 vs. 29.6; p>0.05, 95% CI 29.7 to 37.5, power=0.15

NPS Pain Severity^b: 3.5 vs. 2.8; p>0.05, 95% CI 3.3 to 4.5, power=0.17

Chow, 2006¹⁴⁷

1 month

Pain duration: 15 years

Good

A. Low-level laser therapy (n=45): 2x/week for 7 consecutive weeks, maximum half hour per treatment. Up to 50 tender points in the neck were treated for 30 seconds per point. Laser wavelength of 830 nm.

B. Sham laser (n=45)

A vs. B

Age: 57 vs. 55 years

Female: 64% vs. 67%

Baseline NPQ (0-100%):

Baseline NPAD (0-100):

Baseline pain (0-10 VAS): 5.9 vs. 4.0

MPQ VAS (1-5):

A vs. B

1 month

NPQ: −3.5 vs. −0.6, difference −3.0 (95% CI −5.0 to −0.9)

NPAD: −15.2 vs. −3.1, difference −12.1 (95% CI −19.3 to −4.8)

Proportion with improved pain >3 points (%): 40% vs. 7%, RR 6.0 (95% CI 1.9 to 19.0)

Pain VAS: −2.7 vs. 0.3, difference −3.0 (95% CI −3.8 to −2.1)

MPQ VAS: −2.1 vs. 0.1, difference −2.2 (95% CI −3.5 to −0.9)

A vs. B

1 month

SF-36 PCS (0-100): 3.2 vs. −1.3, difference 4.5 (95% CI 0.7 to 8.2)

SF-36 MCS (0-100): 2.4 vs. 5.4, difference −2.9 (95% CI −7.2 to 1.3)

MPQ sensory (0-33): −3.4 vs. −1.9, difference −1.5 (95% CI −4.5 to 1.5)

MPQ affective (0-12): −1.3 vs. −0.7, difference −0.6 (95% CI −2.3 to 1.1)

Gur, 2004¹⁴⁸

2.5 months

Pain duration: 43 months

Fair

A. Active Ga-As low-level laser therapy (n=30): daily for 2 weeks, 3 minutes each myofascial tender point. Laser wavelength of 904 nm.

B. Sham laser (n=30)

A vs. B

Age: 32 vs. 31 years

Female: 82% (total pop only)

Employed: 12% vs. 17%

Baseline NPAD (0-100): 65.4 vs. 68.5

Baseline pain at rest (0-10 VAS): 7.4 vs. 6.9

Baseline pain at movement (0-10 VAS): 7.4 vs. 7.2

A vs. B

2.5 months

NPAD: 41.1 vs. 63.3, difference −22.2 (95% CI −36.7 to −7.6)

VAS pain at rest: 4.2 vs. 6.3, difference −2.1 (95% CI −3.8 to −0.4)

VAS pain at movement: 5.3 vs. 7.3, difference −2.0 (95% CI −3.3 to −0.7)

A vs. B

2.5 months

BDI (0-63): 14.72 vs. 21.38, difference −6.66 (95% CI −13.24 to −0.08)

NHP (0-100): 56.41 vs. 72.48, difference −16.1 (95% CI −30.9 to −1.3)

Trock, 1994¹⁴⁹

1 month

Pain duration: 7.5 years

Poor

A. Pulsed electromagnetic fields (n=42): extremely low frequency (<2 A, 120 V) applied with stepwise energy characteristics as follows: 5 Hz, 0-15 gauss for 10 minutes; 10 Hz, 15-25 gauss for 10 minutes; and 12 Hz, 15-25 gauss for 10 minutes. Maximum number of pulses/burst was 20.

B. Sham (n=39)

Treatments were given for 30 minute periods, 3-5 times per week for 18 treatments.

A vs. B

Age: 61 vs. 67 years

Female: 71% vs. 67%

Weight (lb): 161 vs. 162

Duration of symptoms: 7 vs. 8 years

Baseline ADL difficulty (0-24): 11.9 vs. 11.5

Baseline pain (0-10 VAS): 7.2 vs. 6.2

A vs. B

1 month

ADL difficulty: 3.8 vs. 2.1, difference 1.6 (95% CI −1.5 to 4.8)

Pain: 2.6 vs. 1.5, difference 1.1 (95% CI −0.3 to 2.6)

A vs. B

1 month

Patients’ assessment of improvement (0-100): 41.2 vs. 40.0, difference 1.2 (95% CI −15.2 to 17.6)

Abbreviations: ADL = activity of daily living; BDI = Beck Depression Inventory; CI = confidence interval; Ga-As = Gallium Arsenide; MPQ = McGill Pain Questionnaire; NHP = Nottingham Health Profile; NPAD = Neck Pain and Disability Scale; NPQ = Northwick Park Questionnaire; NPS = numeric pain scale; NR = not reported; RR = risk ratio; SF-36 MCS = Short-Form 36 Questionnaire Mental Component Score; SF-36 PCS = Short-Form 36 Questionnaire Physical Component Score; VAS = visual analog scale
a: Unless otherwise noted, followup time is calculated from the end of the treatment period
b: Results of two-way repeated measures analysis of variance (ANOVA).

Table 22. Chronic neck pain: manual therapies (massage)

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Pach, 2018¹⁸³

1 and 3 months

Duration of pain: mean 11.2 to 11.5 years

Fair

[New trial]

A. Tuina massage (n=46): two 30-minute sessions/week for 3 weeks (6 sessions total). Authors report high adherence, but data are not provided.

B. No intervention waitlist (n=46)

A vs. B

Age: 46 vs. 45 years

Female: 89.1% vs. 84.8%

Baseline NDI (0-50): 45.5 vs. 46.5

Baseline NPDS (0-100): 42.7 vs. 42.7

Baseline pain during previous 7 days (0-100 VAS): 55.8 vs. 59.5

A vs. B

3 months

NDI: 36.6 (95% CI 33.5 to 39.6) vs. 46.1 (95% CI 42.9 to 49.3), adjusted difference −9.6 (95% CI −14.0 to −5.1)

NPDS: 30.2 (95% CI 25.8 to 34.6) vs. 42.3 (95% CI 37.7 to 46.8), adjusted difference −12.1 (95% CI −18.4 to −5.8)

Mean VAS score during previous 7 days: 30.1 (95% CI 23.8 to 36.4) vs. 48.1 (95% CI 41.5 to 54.6), adjusted difference −17.9 (95% CI −27.1 to −8.8)

A vs. B

3 months

SF-12 Physical health (0-100): 48.1 (95% CI 45.8 to 50.3) vs. 42.4 (95% CI 40.1 to 44.7), adjusted difference 5.6 (95% CI 2.4 to 8.9)

SF-12 Mental health (0-100): 48.3 (95% CI 45.4 to 51.1) vs. 45.7 (95% CI 42.8 to 48.5), adjusted difference 2.6 (95% CI −1.4 to 6.6)

Proportion of patients using medication for neck pain during the previous 4 weeks: 0.4 (95% CI 0.2 to 0.6) vs. 0.5 (95% CI 0.3 to 0.7), adjusted difference −0.1 (95% CI −0.4 to 0.1)

Rudolfsson, 2014¹⁸¹

6 months

Duration of pain: median 84 to 123 months

Fair

A. Massage, classical (n=36): upper body including the back, neck and shoulders.

B. Neck coordination exercise (n=36): performed with a newly developed training device designed to improve the fine movement control of the cervical spine.

C. Strength training (n=36): isometric and dynamic exercises targeting the neck and shoulder regions.

All 3 interventions consisted of 22 individually supervised single treatment sessions, 30 min each, distributed over 11 weeks

A vs. B vs. C

Age: 51 vs. 52 vs. 51 years

Female: 100% vs. 100% vs. 100%

Baseline pain (0-10 NRS): 5 vs. 6 vs. 6 (median)

Baseline NDI: 26 vs. 29 vs. 31

SF-36 PCS (0-100): 43 vs. 39 vs. 39 (median)

SF-36 MCS (0-100): 49 vs. 52 vs. 47 (median)

A vs. B

6 months

Pain NRS (0-10): 4.0 vs. 3.8, difference 0.2 (95% CI −0.8 to 1.2)

A vs. C:

6 months

Pain NRS (0-10): No data given at 6 months; however, authors state no difference among A, B, or C.

Sherman, 2009¹⁸²

2.5 and 6.5 months

Duration of pain >1 year: 81%

Fair

A. Massage (n=32): Swedish and clinical techniques and self-care recommendations; 10 massage treatments over a 10-week period

B. Self-care book (n=32): information on potential causes of neck pain, neck-related headaches, whiplash, recommended strengthening exercises, body mechanics and posture, conventional treatment, complementary therapies for neck pain, and first aid for intermittent flare-ups.

A vs. B

Age: 47 vs. 46 years

Female: 69% vs. 69%

White: 87% vs. 81%

Smoker: 9% vs. 6%

Pain lasted > 1 year: 81% vs. 81%

Baseline NDI (0-50): 14.2 vs. 14.2

A vs. B

2.5 months

NDI, % ≥5 points: 39% vs. 14%, RR 2.7 (95% CI 0.99 to 7.5)

NDI (0-50): difference −2.3 (95% CI −4.7 to 0.15)

6.5 months

NDI, % ≥5 points: 57% vs. 31%, RR 1.8 (95% CI 1.0 to 3.5)

NDI: difference: −1.9 (95% CI −4.4 to 0.6)

A vs. B

2.5 months

Bothersome score (0-10): difference −1.2 (95% CI −2.5 to 0.1)

Bothersome improvement ≥30%: 55% vs. 25%, RR 2.1 (95% CI 1.04 to 4.2)

SF-36 PCS (0-100): 52.8 vs. 53.3, p=0.982

SF-36 MCS (0-100): 45.9 vs. 45.3, p=0.444

6.5 months

Bothersome score: difference −0.14 (95% CI −1.5 to 1.2)

Bothersome improvement ≥30%: 43% vs. 39%, RR 1.1 (95% CI 0.6 to 2.0)

SF-36 PCS and MCS: data not given, no statistical difference

Medication use: No change in group A, 14% increase in group B

: CI = confidence interval; NDI = Neck Disability Index; NR = not reported; NRS = numeric rating scale; SF-36 MCS = Short-Form 36 Questionnaire Mental Component Scale; SF-36 PCS = Short-Form 36 Questionnaire Physical Component Scale; VAS = Visual Analog Scale
a: Unless otherwise noted, followup time is calculated from the end of the treatment period.
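
As a reading aid for the massage table above, the report's significance rule (an interval that excludes 0 for mean differences or 1 for risk ratios) can be sketched in Python. This is a minimal illustration, not part of the report's analysis: the function name `excludes_null` and the row labels are hypothetical, and the interval bounds are copied from rows of the table above.

```python
def excludes_null(lower, upper, null_value):
    """Return True when the interval [lower, upper] does not contain null_value."""
    return not (lower <= null_value <= upper)


# (label, lower bound, upper bound, null value); bounds copied from the table above.
rows = [
    ("Massage vs. waitlist, NDI adjusted difference, 3 months", -14.0, -5.1, 0.0),
    ("Sherman 2009, NDI >=5 points, RR, 2.5 months", 0.99, 7.5, 1.0),
    ("Sherman 2009, bothersome improvement >=30%, RR, 2.5 months", 1.04, 4.2, 1.0),
]

for label, lower, upper, null_value in rows:
    verdict = "excludes null" if excludes_null(lower, upper, null_value) else "includes null"
    print(f"{label}: {verdict}")
```

Under this rule, an interval such as 0.99 to 7.5 for a risk ratio includes 1, so that row would be read as showing no clear difference even though the point estimate favors massage.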

Table 23. Chronic neck pain: mind-body practices

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Lansinger, 2007²²¹

6 and 12 months

Pain duration: >5 years, 45%

Poor

A. Qigong (n=72): 10-12 group sessions of 10-15 people done 1-2 times per week over 3 months. Sessions were 1 hour and consisted of information on the philosophy of medical qigong followed by exercises based on the Biyun method

B. Exercise (n=67): 10-12 sessions 1-2 times per week over 3 months. Sessions were 1 hour and individualized to target 30%-70% of a person’s maximal voluntary capacity, with exercises aiming to maintain/increase circulation, endurance, and strength.

All patients: Ergonomic instructions and a pamphlet containing written information on neck pain

A vs. B

Age: 45 vs. 43

Female: 73% vs. 67%

Physical activity:

No to light exercise: 67% vs. 65%

Med to hard exercise: 33% vs. 35%

Baseline NDI (0-100), median: 26 vs. 22

Baseline pain (VAS, 0-10), median: 45 vs. 39

A vs. B

6 months

NDI, median: 22 vs. 18, p>0.05

Neck pain VAS (0-10), median: 2.6 vs. 2.3, p>0.05

12 months

NDI, median: 22 vs. 18, p>0.05

Neck pain VAS, median: 2.8 vs. 2.1, p>0.05

MacPherson, 2015²¹³, Essex, 2017²¹⁴

ATLAS trial

1, 7, and 12 months

Duration of pain, 7 years

Fair

[Essex – New publication reporting healthcare utilization]

A. Alexander Technique group (n=172): up to 20 one-to-one lessons of 30 minutes’ duration (600 minutes total) plus usual care, delivered weekly, with the option of being delivered twice per week initially and every 2 weeks later.

B. Usual care (n=172) including general and neck pain–specific treatments routinely provided to primary care patients, such as prescribed medications and visits to physical therapists and other healthcare professionals.

Treatment was 12 sessions over 5 months lasting 50 minutes.

A vs. B

Age: 52 vs. 54 years

Female: 69% vs. 69%

White: 93% vs. 89%

Employed: 61% vs. 62%

Baseline NPQ (0-100%): 39.6 vs. 40.5

A vs. B

1 month

NPQ: 35.4 vs. 40.9, difference −5.6 (95% CI −8.3 to −2.8)

7 months

NPQ: 37.1 vs. 41.0, difference −3.9 (95% CI −6.9 to −1.0)

A vs. B

1 month

SF-12v2 physical: data NR, p=NS

SF-12v2 mental: data NR, p=NS

7 months

SF-12v2 physical: 0.68 (95% CI −1.1 to 2.4), p=0.44

SF-12v2 mental: 1.76 (95% CI 0.2 to 3.4), p=0.033

12 months^b

Mean utilization of NHS resources^c: p>0.05, data NR

Mean utilization of private healthcare (additional sessions):

-: Acupuncture: 0.2 vs. 0.1, p>0.05
-: Alexander Technique: 0.5 vs. 0, p<0.05
-: Other private appointments: 1.0 vs. 2.1, p>0.05

Mean days off work due to neck pain: 1.4 vs. 2.3, p>0.05

Mean total NHS cost (2012/13 UK £): 1200 (95% CI 1000 to 1400) vs. 484 (95% CI 371 to 598), adjusted difference,^d 667 (95% CI 472 to 896); p<0.001

Seferiadis, 2015²²²

3 months

Pain duration: 9.5 years

Fair

A. Basic body awareness therapy (n=57): 1.5 hour sessions twice a week for 10 weeks. Sessions consisted of exercises based on activities of daily living, meditation, and tai chi inspired exercises aiming to improve posture and increase efficient movement patterns

B. Exercise (n=56): 1.5 hour sessions twice a week for 10 weeks. Sessions consisted of 45 minutes of muscle strengthening, 15 minutes of stretching, and 20 minutes of progressive muscle relaxation

A vs. B

Age: 47 vs. 49

Female: 66% vs. 77%

WAD classification:

1: 0% vs. 2%

2: 23% vs. 28%

3: 77% vs. 70%

Baseline NDI (0-50): 20 vs. 18.8

A vs. B

3 months

NDI: Difference from baseline −2.0 (95% CI −3.5 to −0.5) vs. −1 (95% CI −2.5 to 0.4), p>0.05

A vs. B

3 months

SF-36v2 physical functioning (0-100): Difference from baseline 7.1 (95% CI 3.7 to 11.4) vs. 0.5 (95% CI −3.2 to 4.1), p>0.05

SF-36 role-physical (0-100): Difference from baseline 17.5 (95% CI 5.9 to 29) vs. 19 (95% CI 9.3 to 28.6), p>0.05

SF-36 bodily pain (0-100): Difference from baseline 12.2 (95% CI 6.9 to 17.6) vs. 4.9 (95% CI −0.1 to 9.8), p=0.044

SF-36 general health (0-100): Difference from baseline 7.5 (95% CI 2.4 to 12.6) vs. 4.5 (95% CI −0.1 to 9), p>0.05

SF-36 vitality (0-100): Difference from baseline 7.3 (95% CI 1.0 to 13.6) vs. 5.6 (95% CI −0.5 to 11.6), p>0.05

SF-36 social functioning (0-100): Difference from baseline 13.3 (95% CI 6.6 to 19.9) vs. 3.5 (95% CI −3 to 9.9), p=0.037

SF-36 role-emotional (0-100): Difference from baseline 9.3 (95% CI −2.3 to 21) vs. 4 (95% CI −8.3 to 16.4), p>0.05

SF-36 mental health (0-100): Difference from baseline 2.8 (95% CI −2 to 7.6) vs. 1.2 (95% CI −3.6 to 5.9), p>0.05

: CI = confidence interval; NDI = Neck Disability Index; NHS = National Health Service; NPQ = Northwick Park Neck Pain Questionnaire; NR = not reported; SF-12 = Short-Form 12 Questionnaire; SF-36 = Short-Form 36 Questionnaire; UK = United Kingdom; VAS = visual analog scale; WAD = Whiplash Associated Disorders
a: Unless otherwise noted, followup time is calculated from the end of the treatment period.
b: 12 month data are health utilization data only from a subset of patients from the ATLAS trial (publication Essex 2017) who had full economic data N=293 (57%) [to include the acupuncture arm; details in the Acupuncture section]; no demographic data provided for the subset
c: Across all appointment types and prescription medications. National Health Service (NHS) appointment types included general practitioner appointments, physiotherapy visits, hospital outpatient visits, accident and emergency admissions, hospital day case admissions, and other hospital admissions. NHS prescription medication included all prescription medication and prescription items specifically for neck pain. For neck pain prescriptions, the t-test comparing usual care and acupuncture was of borderline significance (p=0.06).
d: Adjusted for baseline NHS healthcare costs and practice size.
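
Where a row above reports a difference with a 95% CI, an approximate standard error and two-sided p-value can be back-calculated from the interval. The sketch below is an illustration under a normal-approximation assumption and is not the method used by the trial authors or by this report; the function names `se_from_ci` and `z_and_p` are hypothetical. It uses the ATLAS SF-12v2 mental score at 7 months from the table above (difference 1.76, 95% CI 0.2 to 3.4, reported p=0.033).

```python
import math


def se_from_ci(lower, upper, z_crit=1.96):
    """Approximate standard error from a symmetric 95% CI (normal assumption)."""
    return (upper - lower) / (2 * z_crit)


def z_and_p(estimate, lower, upper):
    """Return (se, z, two-sided p) for an estimate tested against a null of 0."""
    se = se_from_ci(lower, upper)
    z = estimate / se
    # Two-sided normal p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# ATLAS, SF-12v2 mental at 7 months: difference 1.76 (95% CI 0.2 to 3.4).
se, z, p = z_and_p(1.76, 0.2, 3.4)
print(f"SE ~ {se:.2f}, z ~ {z:.2f}, two-sided p ~ {p:.3f}")
```

Because the published bounds are rounded, the back-calculated p-value is only expected to be close to the reported one, not identical.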

Table 24. Chronic neck pain: acupuncture

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Birch, 1998²³¹

3 months

Duration of pain, 7.5 years

Poor

A. Relevant acupuncture, Japanese technique (n=15): using bilateral needles on hands and feet known to be associated with treatment for neck pain and followed by infrared lamp.

B. Irrelevant acupuncture (n=16): using bilateral needles on hands and feet in areas not associated with treatment for neck pain and followed by light.

C. NSAIDs only (n=15): 500mg per day of Trilisate

30 minute treatment twice per week for 4 weeks, then once per week for 4 weeks, total 14 treatments

A vs. B vs. C

Age: 41 vs. 38 vs. 39 years

Female: 86% vs. 77% vs. 86%

Employed: 86% vs. 69% vs. 77%

Baseline pain (CPEQ, 0-10): 4.8 vs. 4.7 vs. 4.9

A vs. B

3 months

SF-MPQ^b (0-33): 9.0 vs. 15.1, p=NS

A vs. C

3 months

SF-MPQ: 9.0 vs. 18.0, p=NS

Cho, 2014²⁵⁴

1 month

Duration of pain, NR

Poor

A. Active acupuncture, traditional Chinese (n=15): 3x/week for 3 weeks (length of time for each intervention not reported)

B. Zaltoprofen (80mg) alone (n=15) 3x/day for 3 weeks.

A vs. B

Age: 38 vs. 39 years

Female: 60% vs. 80%

Baseline NDI (0-50): 22.3 vs. 26.3

Baseline Pain (0-10 VAS): 6.1 vs. 7.1

A vs. B

1 month

NDI: 17.3 vs. 17.7, difference −0.40 (95% CI −4.6 to 3.8)

Pain VAS: 4.5 vs. 3.8, difference 0.7 (95% CI −0.7 to 2.1)

A vs. B

1 month

BDI (0-63): 28.5 vs. 27.2, p=NS

SF-36 (0-100): 88.6 vs. 84.3, p=NS

EQ-5D (scale unclear): 7.3 vs. 6.7, p=NS

Ho, 2017²³²

1 month

Duration of pain: 6 years

Fair

A. Acupuncture (n=77): 30 sessions of abdominal acupuncture 3 times a week for 2 weeks. The acupuncture points CV12, CV4, KI17, and ST24 were needled for 30 minutes with infrared therapeutic lamp placed 30 cm above the navel.

B. Sham acupuncture (n=77): 30 sessions of sham abdominal acupuncture 3 times a week for 2 weeks. Blunt sham needles were nonpenetrative and administered at nonacupuncture points.

A vs. B

Age: 46 vs. 45

Female: 81% vs. 83%

Use of pain medications: 15% vs. 13%

Previous acupuncture use: 42% vs. 44%

Baseline NPQ (0-100%): 41.3 vs. 41.0

Baseline pain (0-10 VAS): 6.4 vs. 6.1

A vs. B

1 month

NPQ, mean ∆ (95% CI): −11.9 (−14.6 to −9.2) vs. −3.3 (−5.5 to −1.0), difference −8.7 (95% CI −12.1 to −5.2) p<0.001

Pain VAS, mean ∆ (95% CI): −2.4 (−2.8 to −1.9) vs. −0.6 (−0.9 to −0.2), difference −1.8 (95% CI −2.4 to −1.2) p<0.001

A vs. B

1 month

SF-36 PCS, mean ∆ (95% CI): 4.1 (3.0 to 5.3) vs. 1.3 (0.1 to 2.5), difference 2.8 (95% CI 1.2 to 4.5), p=0.003

SF-36 MCS, mean ∆ (95% CI): 2.0 (0.5 to 3.5) vs. −0.3 (−2.0 to 1.4), difference 2.3 (95% CI −0.0 to 4.5) p=NR

Liang, 2011²³³

3 months

Duration of pain: NR

Fair

A. Active acupuncture, traditional Chinese, (n=93)

B. Sham acupuncture (n=97)

Treatment was 3x/week for 3 weeks (9 treatments total) lasting 20 minutes after needling

Both groups received infrared

A vs. B

Age: 37 vs. 37 years

Female: 72% vs. 73%

Baseline NPQ (0-100%): 32.7 vs. 33.0

Baseline Pain (0-10 VAS): 5.3 vs. 5.5

A vs. B

3 months

NPQ: 19.1 vs. 25.5, difference −6.4 (95% CI −9.9 to −2.9)

Pain VAS: 2.9 vs. 3.2, difference −0.3 (95% CI −0.75 to 0.15)

A vs. B

3 months

SF-36 physical functioning (0-100): 84.3 vs. 85.9, p=0.447

SF-36 mental (0-100): 67.1 vs. 61.6, p=0.001

MacPherson, 2015²¹³, Essex, 2017²¹⁴

ATLAS trial

1, 7, and 12 months

Duration of pain: 7 years

Fair

[Essex – New publication reporting healthcare utilization]

A. Active acupuncture, traditional Chinese, (n=173): plus usual care 2 weeks later.

B. Usual care (n=172): including general and neck pain–specific treatments routinely provided to primary care patients, such as prescribed medications and visits to physical therapists and other healthcare professionals.

Treatment was 12 sessions over 5 months lasting 50 minutes

A vs. B

Age: 52 vs. 54 years

Female: 69% vs. 69%

White: 93% vs. 89%

Employed: 61% vs. 62%

Baseline NPQ (0-100%): 39.64 vs. 40.46

A vs. B

1 month

NPQ: 35.4 vs. 40.9, difference −5.6 (95% CI −8.3 to −2.8)

7 months

NPQ: 37.07 vs. 41.0, difference −3.9 (95% CI −6.9 to −1.0)

A vs. B

1 month

SF-12v2 physical: data NR, p=NS

SF-12v2 mental: data NR, p=NS

7 months

SF-12v2 physical (0-100): difference 0.7 (95% CI −1.1 to 2.4)

SF-12v2 mental (0-100): difference 1.8 (95% CI 0.2 to 3.4)

12 months^c

Mean utilization of NHS resources^d: p>0.05, data NR

Mean utilization of private healthcare (additional sessions):

-: Acupuncture: 1.5 vs. 0.1, p<0.001
-: Alexander Technique: 0 vs. 0, p>0.05
-: Other private appointments: 0.9 vs. 2.1, p>0.05

Mean days off work due to neck pain: 0.4 vs. 2.3, p>0.05

Mean total NHS cost (2012/13 UK £): 947 (95% CI 800 to 1094) vs. 484 (95% CI 371 to 598), adjusted difference,^e 451 (95% CI 285 to 634); p<0.001

Sahin, 2010²³⁴

3 months

Duration of pain: NR

Fair

A. Electro-acupuncture (n=15)

B. Sham acupuncture (n=16)

Treatment was 10 sessions, 3 sessions per week, lasting 30 minutes

A vs. B

Age: 39 vs. 35 years

Female: 100% vs. 81%

University graduate: 54% vs. 94%

BMI: 23.9 vs. 24.6

Baseline pain with motion (0-10 VAS): 7.4 vs. 6.2

Baseline pain at rest (0-10 VAS): 4.0 vs. 5.3

A vs. B

3 months

Pain with motion VAS: 4.50 vs. 5.38, difference −0.9 (95% CI −2.7 to 0.9)

Pain at rest VAS: 4.0 vs. 3.5, difference 0.5 (95% CI −1.9 to 2.8)

Vas, 2006²³⁵

6 months

Duration of pain: 3.8 years

Fair

A. Active acupuncture, traditional Chinese, (n=61)

B. Sham TENS (n=62)

Treatment was 5 sessions over 3 weeks lasting 30 minutes

A vs. B

Age: 46 vs. 47 years

Female: 75% vs. 89%

Baseline pain with motion (0-10 VAS): 6.9 vs. 7.2

NPQ (0-100%): 52.7 vs. 56.5

A vs. B

6 months

(Mean change from baseline)

Pain VAS with motion: 4.1 vs. 2.7, difference 1.4 (95% CI 0.3 to 2.6)

A vs. B

6 months

SF-36 PCS (0-100): 9.3 vs. 5.3, p=0.054

SF-36 MCS (0-100): 8.0 vs. 5.2, p=0.351

Rescue medication (none or occasional): 87% (39/45) vs. 68% (27/40), RR 1.28 (95% CI 1.01 to 1.64)

White, 2004²³⁶

2, 6, 12 months

Duration of pain: 6 years

Fair

A. Active acupuncture, Western technique based on tender local and distal points (n=70)

B. Sham electro-acupuncture (n=65)

Treatment was 8 sessions over 4 weeks lasting 20 minutes

A vs. B

Age: 54 vs. 53 years

Female: 66% vs. 63%

Baseline NDI (0-50): 16.8 vs. 17.2

Baseline pain (0-10 VAS): 5.0 vs. 5.4

A vs. B

2 months

NDI: 11.0 vs. 12.7, difference −1.7 (95% CI −4.3 to 0.9)

Pain VAS: 1.7 vs. 2.3, difference −0.6 (95% CI −1.3 to 0.1)

6 months

NDI: 9.9 vs. 10.6, difference −0.7 (95% CI −3.6 to 2.2)

Pain VAS: 1.9 vs. 2.1, difference −0.2 (95% CI −1.1 to 0.7)

12 months

NDI: 8.9 vs. 10.7, difference −1.8 (95% CI −4.84 to 1.24)

Pain VAS: 2.1 vs. 2.4, difference −0.3 (95% CI −1.4 to 0.6)

A vs. B

2 months

SF-36 PCS (0-100): 42.5 vs. 43.8, p=NS

SF-36 MCS (0-100): 52.5 vs. 50.3, p=NS

Zhang, 2013²³⁷

3 and 6 months

Duration of pain: 6.3 years

Fair

A. Electro-acupuncture, traditional Chinese (n=103)

B. Sham laser acupuncture (n=103): via a mock laser pen

2 minutes, with the pen at a distance of 0.5 to 1 cm from the skin.

Treatment 3x/week for 3 weeks, 45 min for electro-acupuncture and 2 min per point for sham laser

A vs. B

Age: 46 years (whole population)

Female: 70% (whole population)

Baseline NPQ (0-100%): 40.7 vs. 41.1

Baseline pain with motion (0-10 NPS): 5.5 vs. 5.2

A vs. B

3 months

NPQ: mean 32.9 (95% CI 30.3 to 35.4) vs. mean 33.3 (95% CI 30.1 to 36.5), p=0.664

Pain with motion VAS: mean 4.7 (95% CI 4.2 to 5.1) vs. mean 4.5 (95% CI 4.1 to 5.0), p=0.617

6 months

NPQ: mean 33.6 (95% CI 30.7 to 36.4) vs. mean 34.3 (95% CI 31.1 to 37.6), p=0.808

Pain with motion: mean 4.7 (95% CI 4.2 to 5.2) vs. mean 4.4 (95% CI 3.9 to 4.8), p=0.813

A vs. B

3 months

SF-36 PCS (0-100): mean 52.8 (95% CI 53.0 to 53.7) vs. mean 53.3 (95% CI 52.4 to 54.2), p=0.982

SF-36 MCS (0-100): mean 45.9 (95% CI 46.0 to 46.8) vs. mean 45.3 (95% CI 44.2 to 46.4), p=0.444

6 months

SF-36 PCS: mean 53.0 (95% CI 52.0 to 53.9) vs. mean 53.2 (95% CI 52.3 to 54.0), p=0.559

SF-36 MCS: mean 45.4 (95% CI 44.5 to 46.3) vs. mean 44.4 (95% CI 43.4 to 45.4), p=0.246

: ∆ = change; BDI = Beck Depression Inventory; CI = confidence interval; CPEQ = Comprehensive Pain Evaluation Questionnaire; EQ-5D = Euroqol 5-D; NDI = Neck Disability Index; NHS = National Health Service; NPQ = Northwick Park Neck Pain Questionnaire; NR = not reported; NS = not statistically significant; NSAID = nonsteroidal anti-inflammatory drug; SF-36 MCS = Short Form-36 questionnaire Mental Component Score; SF-36 PCS = Short Form-36 questionnaire Physical Component Score; SF-MPQ = McGill Pain Questionnaire Short Form; TENS = Transcutaneous electrical nerve stimulation; VAS = visual analog scale
a: Unless otherwise noted, followup time is calculated from the end of the treatment period
b: Estimated from Figure 1 in Birch et al.²³¹
c: 12 month data are health utilization data only from a subset of patients from the ATLAS trial (publication Essex 2017) who had full economic data N=293 (57%) [to include the acupuncture arm; details in the Acupuncture section]; no demographic data provided for the subset
d: Across all appointment types and prescription medications. National Health Service (NHS) appointment types included general practitioner appointments, physiotherapy visits, hospital outpatient visits, accident and emergency admissions, hospital day case admissions, and other hospital admissions. NHS prescription medication included all prescription medication and prescription items specifically for neck pain. For neck pain prescriptions, the t-test comparing usual care and acupuncture was of borderline significance (p=0.06).
e: Adjusted for baseline NHS healthcare costs and practice size.
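
Several rows above report a risk ratio alongside raw counts. As a worked check, the sketch below recomputes a risk ratio and a Wald-type 95% CI on the log scale from the counts in the Vas 2006 rescue medication row (39/45 vs. 27/40). The log-scale Wald interval is an assumption made for illustration; the report does not state which interval method was used, and the function name `risk_ratio` is hypothetical.

```python
import math


def risk_ratio(events_a, n_a, events_b, n_b, z_crit=1.96):
    """Risk ratio (group A over group B) with a Wald CI computed on the log scale."""
    rr = (events_a / n_a) / (events_b / n_b)
    # Standard error of log(RR) for two independent binomial proportions.
    se_log = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - z_crit * se_log)
    upper = math.exp(math.log(rr) + z_crit * se_log)
    return rr, lower, upper


# Vas 2006, rescue medication (none or occasional): 39/45 vs. 27/40.
rr, lower, upper = risk_ratio(39, 45, 27, 40)
print(f"RR {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

With these counts the sketch gives RR 1.28 (95% CI 1.01 to 1.64), matching the tabulated row to two decimals. The formula requires nonzero event counts in both groups, so it does not apply to rows with zero events in one arm.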

Table 25. Osteoarthritis knee pain: exercise

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Abbott, 2013⁴⁷

9.75 months

Duration of diagnosis: Mean 2.5 to 2.8 years

Fair

A. Exercise (n=51/29 knee OA): 7 sessions of strengthening, stretching, and neuromuscular control over 9 weeks, with 2 booster sessions at week 16. Individual exercises prescribed as needed. Home exercise prescribed 3 times weekly

B. Usual care (n=51/28 knee OA)

A vs. B (total population, includes hip OA)

Age: 67 vs. 66 years

Female: 52% vs. 58%

Percent hip OA: 43% vs. 45%

Percent knee OA: 57% vs. 55%

Percent both hip OA and knee OA: 20% vs. 26%

Baseline WOMAC (0−240): 95.5 vs. 93.8

A vs. B (knee OA only)

A vs. C

9.75 months

WOMAC mean change from baseline: −12.7 vs. −31.5

None

Allen, 2018⁶⁸

3, 6, and 12 months

Duration of pain: NR

Fair

[New trial]

A. PT (n=140): Up to 8 sessions over 4 months

B. IBET (n=142): Strength and stretching (3 times/week) and daily aerobic exercises

C. WL (n=68)

All patients: continued to receive usual care

A vs. B vs. C

Age: 66 vs. 65 vs. 64 years

Female: 71% vs. 69% vs. 78%

Mean duration of chronicity: 11.6 vs. 14.1 vs. 14.2 years

Baseline WOMAC-Total (0-96): 32 vs. 31.3 vs. 33.6

Baseline WOMAC-ADL (0-68): 22.6 vs. 21.8 vs. 23.9

Baseline WOMAC-Pain (0-20): 6.1 vs. 6.0 vs. 6.2

Baseline PASE-Total (0-400): 121.4 vs. 132.3 vs. 126.9

Baseline PASE-Household: 70.4 vs. 81.6 vs. 71.8

Baseline PASE-Leisure: 20.9 vs. 22.4 vs. 21.5

Baseline PASE-Work: 29.1 vs. 30.5 vs. 34.2

A vs. C

8/12 months^b

LSM Δ in WOMAC Total (N=348): −4.4 (95% CI −6.7 to −2.2) vs. −2.8 (95% CI −5.9 to 0.3); difference −1.6 (95% CI −5.3 to 2.1), p=0.390

LSM Δ in WOMAC-ADL (N=348): −3.3 (95% CI −4.9 to −1.7) vs. −1.5 (95% CI −3.8 to 0.7); difference −1.8 (95% CI −4.4 to 0.9), p=0.190

LSM Δ in WOMAC-Pain (N=350): −0.7 (95% CI −1.2 to −0.2) vs. −0.6 (95% CI −1.4 to 0.1); difference −0.1 (95% CI −0.9 to 0.8), p=0.900

LSM Δ in PASE-Total (N=340): 8.3 (95% CI −2.0 to 18.6) vs. 1.2 (95% CI −13.1 to 15.5); difference 7.1 (95% CI −9.7 to 23.9), p=0.410

LSM Δ in PASE-Leisure (N=344): 8.7 (95% CI 4.3 to 13.1) vs. −0.1 (95% CI −6.3 to 6.0); difference 8.8 (95% CI 1.5 to 16.1), p=0.020

LSM Δ in PASE-Household (N=345): 2.3 (95% CI −3.6 to 8.2) vs. −3.4 (95% CI −11.6 to 4.8); difference 5.7 (95% CI −3.9 to 15.4), p=0.250

LSM Δ in PASE-Work (N=349): −2.6 (95% CI −9.6 to 4.3) vs. 5.2 (95% CI −4.5 to 15); difference −7.9 (95% CI −19.4 to 3.6), p=0.180

B vs. C

12 months

LSM Δ in WOMAC-Total (N=348): −5.5 (95% CI −7.8 to −3.1) vs. −2.8 (95% CI −5.9 to 0.3); difference −2.6 (95% CI −6.4 to 1.1), p=0.170

LSM Δ in WOMAC-ADL (N=348): −3.4 (95% CI −5.1 to −1.7) vs. −1.5 (95% CI −3.8 to 0.7); difference −1.9 (95% CI −4.6 to 0.8), p=0.170

LSM Δ in WOMAC-Pain (N=350): −1.2 (95% CI −1.7 to −0.6) vs. −0.6 (95% CI −1.4 to 0.1); difference −0.5 (95% CI −1.4 to 0.4), p=0.260

LSM Δ in PASE-Total (N=340): 8.2 (95% CI −3.0 to 19.4) vs. 1.2 (95% CI −13.1 to 15.5); difference 7.0 (95% CI −10.3 to 24.4), p=0.430

LSM Δ in PASE-Leisure (N=344): 7.7 (95% CI 2.9 to 12.4) vs. −0.1 (95% CI −6.3 to 6.0); difference 7.7 (95% CI 0.3 to 15.3), p=0.040

LSM Δ in PASE-Household (N=345): −3.7 (95% CI −10.1 to 2.7) vs. −3.4 (95% CI −11.6 to 4.8); difference −0.3 (95% CI −10.3 to 9.7), p=0.950

LSM Δ in PASE-Work (N=349): 5.3 (95% CI −2.2 to 12.7) vs. 5.2 (95% CI −4.5 to 15); difference 0.00 (95% CI −11.8 to 11.8), p=1.000

None

Bennell, 2005⁴⁸

3 months

Duration of pain: 9.6 vs. 8.7 years

Fair

A. Neuromuscular Re-education (Physiotherapy) (n=73): Knee taping; exercises to retrain the quadriceps, hip, and back muscles; balance exercises; thoracic spine mobilization; and soft tissue massage. Individual sessions lasting 30 to 45 minutes once weekly for 4 weeks, then fortnightly for 8 weeks. Thrice-daily standardized home exercises.

B. Control (n=67)

Placebo: sham ultrasound and topical nontherapeutic gel. 30 to 45 minutes once weekly for 4 weeks, then fortnightly for 8 weeks.

A vs. B

Age: 67 vs. 70 years

Female: 68% vs. 66%

Baseline WOMAC Physical Function (0-68): 27.6 vs. 28.4

Baseline WOMAC Pain (0-20): 8.2 vs. 8.0

Baseline VAS Pain on movement (0-10): 5.3 vs. 5.2

Baseline KPS (0-36): 16.6 vs. 16.4

Baseline KPS Frequency (0-30): 23.5 vs. 22.8

A vs. B

3 months

Responders (≥1.75 point change), global improvement in pain on VAS (since start of trial): 59% (35/59) vs. 50% (33/65), RR 1.2 (95% CI 0.8 to 1.6)

Responders (≥1.75 point change), VAS pain on movement (prior week): 58% (34/59) vs. 42% (27/65), RR 1.4 (95% CI 1.0 to 2.0)

WOMAC, Physical Function: 20.0 vs. 21.7, difference −0.9 (95% CI −4.4 to 2.7)

WOMAC, Pain: 5.8 vs. 6.0, difference −0.4 (95% CI −1.5 to 0.7)

VAS pain on movement: 3.2 vs. 3.5, difference −0.5 (95% CI −1.2 to 0.3)

KPS, Severity: 13.5 vs. 14.3, difference −1.0 (95% CI −2.5 to 0.6)

KPS, Frequency: 19.4 vs. 20.3, difference −1.7 (95% CI −3.5 to 0.1)

A vs. B

3 months

SF-36, Physical Function (0-100): 50.5 vs. 46.2, difference 4.3 (95% CI −1.8 to 10.4)

SF-36, Bodily Pain (0-100): 60.4 vs. 61.8, difference 1.8 (95% CI −6.7 to 10.3)

SF-36, Role Physical (0-100): 47.0 vs. 46.5, difference 1.6 (95% CI −11.1 to 14.3)

AQoL (−0.04 to 1.0): 0.52 vs. 0.48, difference 0.05 (95% CI 0.01 to 0.10)

Withdrawals: 18% (13/73) vs. 3% (2/67); RR 6.0 (95% CI 1.4 to 25.5)

Group A: Minor skin irritation (48%), increased pain with exercises (22%), pain with massage (1%)

Group B: Increased pain (2%), itchiness and pain with application of gel (2%)

(All were minor and short-lived)

Chen, 2014⁴⁹

6 months

Duration of pain: 10-144 months

Poor

A. Exercise (n=30): 3 sessions per week for 8 weeks. Sessions consisted of 20 minutes of hot packs and 5 minutes of passive range of motion exercises on a stationary bike, followed by an isokinetic muscle-strengthening exercise program

B. Control (n=30): Details NR

A + B

Age: 63

Females: 85%

A vs. B

Baseline Lequesne Index (0-26): 7.8 vs. 8.0

Baseline pain VAS (0-10): 5.5 vs. 5.6

A vs. B

6 months

Lequesne Index: 5.4 vs. 7.6, (difference −2.2, 95% CI −3.1 to −1.3)

Pain VAS: 4.0 vs. 6.5, (difference −2.5, 95% CI −3.3 to −1.7)

A vs. B

6 months

Intolerable knee pain: 10% (3/30) vs. 0% (0/30)

RR=infinity, p=0.08

Dias, 2003⁵⁰

6 months

Duration of pain: NR

Poor

A. Exercise (n=25): 12 exercise sessions twice a week for the 6 month study period in addition to three supervised walks of 40 minutes each week. Exercise sessions consisted of stretching, concentric and eccentric isotonic progressive resistance exercises, and closed kinetic chain weight-bearing exercises

B. Control group (n=25): Subjects were instructed to follow the instructions given at an educational session that all participants attended (see information below)

All patients: One-hour educational session consisting of a lecture on disease characteristics, joint protection, pain management, and strategies to overcome difficulties in activities of daily life

A vs. B

Age, median: 74 vs. 76

Female: 84% vs. 92%

Baseline Lequesne Index, median (0-24): 12 vs. 12.5

Baseline HAQ, median (0-3): 1 vs. 1

A vs. B

6 months

Lequesne Index, median: 4.3 vs. 13, p=0.001

HAQ, median: 0.3 vs. 1.1, p=0.006

A vs. B

6 months

SF-36 functional capacity, median (0-100): 77.5 vs. 40, p<0.001

SF-36 physical role limitation, median (0-100): 92.5 vs. 75, p=0.001

SF-36 bodily pain, median (0-100): 100 vs. 0, p=0.002

SF-36 general health, median (0-100): 100.5 vs. 51, p=0.021

SF-36 vitality, median (0-100): 93.5 vs. 87, p=0.027

Adverse Events: NR

Ettinger, 1997⁵¹ (index trial)

Penninx, 2002⁵⁸ (substudy looking at baseline depressive symptoms)

FAST trial

6 months, 15 months

Duration of pain: NR

Fair

A. Aerobic Exercise Program (n=144): 3-month facility-based walking program of 3 times per week for 1 hour. Each session consisted of a 10-minute warm-up and cool-down phase, including slow walking and flexibility stretches, and a 40-minute period of walking at an intensity equivalent to 50% to 70% of the participants’ heart rate reserve. Followed by 15-month home-based walking program.

B. Resistance Exercise Program (n=146): 3-month supervised facility-based program, with three 1-hour sessions per week, and a 15-month home-based program. Each session consisted of a 10-minute warm-up and cool-down phase and a 40-minute phase consisting of 2 sets of 12 repetitions of 9 exercises.

C. Attention Control (n=149): attended, during the first 3 months, monthly group sessions on education related to arthritis management, including time for discussions and social gathering. Later, participants were called bimonthly (months 4-6) or monthly (months 7-18) to maintain health updates and provide support

A vs. B vs. C

Age: 69 vs. 68 vs. 69 years

Female: 69% vs. 73% vs. 69%

African-American: 24% vs. 28% vs. 26%

Baseline function: NR

A vs. C

Average across all time-points:

FAST Physical Disability Scale

Total: 1.7 vs. 1.9

Ambulation subscale: 2.2 vs. 2.6

Transfers subscale: 1.8 vs. 1.9

Pain: 2.1 vs. 2.4

B vs. C

Average across all time-points:

FAST Physical Disability Scale

Total: 1.7 vs. 1.9

Ambulation subscale: 2.7 vs. 2.6

Transfers subscale: 1.7 vs. 1.9

Pain: 2.2 vs. 2.4

A vs. B vs. C

Adverse Events: Falls: 1.4% (2/144) vs. 1.4% (2/146) vs. 0% (0/149); p=0.15 for both A vs. C and B vs. C

Death: 0% (0/144) vs. 0% (0/146) vs. 0.7% (1/149)

CES-D (average across all time-points)

CES-D: 2.12 vs. 2.59 vs. 2.80; A vs. C, p<0.001; B vs. C, p=0.27

Penninx, 2001⁵⁷

FAST trial (same trial as Ettinger 1997 and Penninx 2002 above): substudy in only patients with no baseline ADL disability

6 and 15 months

Duration of pain: NR

Fair

A. Aerobic Exercise Program (n=88): 3-month facility-based walking program of 3 times per week for 1 hour. Each session consisted of a 10-minute warm-up and cool-down phase, including slow walking and flexibility stretches, and a 40-minute period of walking at an intensity equivalent to 50% to 70% of the participants’ heart rate reserve. Followed by 15-month home-based walking program.

B. Resistance Exercise Program (n=82): 3-month supervised facility-based program, with three 1-hour sessions per week, and a 15-month home-based program. Each session consisted of a 10-minute warm-up and cool-down phase and a 40-minute phase consisting of 2 sets of 12 repetitions of 9 exercises.

C. Attention Control (n=80): attended, during the first 3 months, monthly group sessions on education related to arthritis management, including time for discussions and social gathering. Later, participants were called bimonthly (months 4-6) or monthly (months 7-18) to maintain health updates and provide support

A vs. B vs. C

Age: 70 vs. 69 vs. 69 years

Female: 66% vs. 72% vs. 66%

African-American: 25% vs. 21% vs. 28%

Baseline disability (scale NR): 1.7 vs. 1.7 vs. 1.6

Baseline pain intensity (1-6): 2.2 vs. 2.1 vs. 2.1

A vs. B vs. C

15 months

ADL Disability (overall): 36.4% vs. 37.8% vs. 52.5%

Disability in transferring from a bed to a chair: 29.5% vs. 36.6% vs. 50.0%

Disability in bathing: 12.5% vs. 13.4% vs. 27.5%

Disability in toileting: 19.4% vs. 13.4% vs. 25.0%

Disability in dressing: 5.7% vs. 7.3% vs. 17.5%

Disability in eating: 0% vs. 1.2% vs. 5.0%, p=0.02

15 months

ADL Disability (overall)

A vs. C: adjusted RR 0.53 (95% CI 0.33 to 0.85)

B vs. C: adjusted RR 0.60 (95% CI 0.38 to 0.97)

Disability in transferring from a bed to a chair

A vs. C: adjusted RR 0.46 (95% CI 0.28 to 0.76)

B vs. C: adjusted RR 0.68 (95% CI 0.42 to 1.09)

Disability in bathing

A vs. C: adjusted RR 0.31 (95% CI 0.15 to 0.68)

B vs. C: adjusted RR 0.44 (95% CI 0.21 to 0.93)

Disability in toileting

A vs. C: adjusted RR 0.58 (95% CI 0.29 to 1.15)

B vs. C: adjusted RR 0.61 (95% CI 0.28 to 1.31)

Disability in dressing

A vs. C: adjusted RR 0.20 (95% CI 0.07 to 0.64)

B vs. C: adjusted RR 0.46 (95% CI 0.17 to 1.22)

Disability in eating: incidence too small to calculate risks.

A vs. B vs. C

15 months

Increased severity of knee OA leading to withdrawal: n=3 (not reported by exercise group)

Holsgaard-Larsen, 2017/2018¹⁰²,¹⁰³

10 months

Duration of pain: NR

Fair

[New trial]

A. NEMEX (n=47): 8 weeks of twice weekly 60-minute sessions.

B. Standard Pharmaceutical Care (PHARMA) (n=46): Standard recommendations of analgesics and anti-inflammatory drugs (acetaminophen and oral NSAIDs – including prescription if needed)

A vs. B

Age: 58 vs. 58

Female: 62% vs. 54%

Baseline KOOS-ADL (0-100): 68.2 vs. 68.4

Baseline KOOS-Symptoms (0-100): 66.1 vs. 66.6

Baseline KOOS-Sports/Recreation (0-100): 35.3 vs. 42.6

Baseline UCLA (0-10): 7.1 vs. 6.8

Baseline KOOS-Pain (0-100): 61.6 vs. 60.1

A vs. B

10 months

Mean Δ in KOOS-ADL: 11.4 vs. 7.9; difference −3.6 (95% CI −9.2 to 2.1) p=0.216

MCID KOOS-ADL: number of responders (>10 improvement): 47% (22/47) vs. 28% (13/46); RR 1.7, 95% CI 1.0 to 2.9, p=0.06

Mean Δ in KOOS-Sports/Recreation: 9.4 vs. 6.5; difference −2.9 (95% CI −11.4 to 5.5) p=0.492

Mean Δ in KOOS-Symptoms: 10.9 vs. 3.3; difference −7.6 (95% CI −12.7 to −2.6) p=0.004

Mean Δ in UCLA: 0.0 vs. 0.1; difference 0.1 (95% CI −0.6 to 0.7) p=0.852

Mean Δ in KOOS-Pain: 13.6 vs. 9.4; difference −4.2 (95% CI −10.0 to 1.6) p=0.153

A vs. B

10 months

Mean Δ in KOOS-QoL (0-100): 10.0 vs. 8.7; difference −1.3 (95% CI −7.5 to 4.9) p=0.682

Mean Δ in EuroQol-5D Health State (scale unclear): 0.3 vs. 2.9; difference 2.6 (95% CI −2.9 to 8.1) p=0.347

Huang, 2003⁵³

10 months

Duration of pain: range, 0.33 (4 months) to 9 years

Poor

A. Isokinetic Strengthening (n=33): 3 sessions per week for 8 weeks. The initial dose of isokinetic exercise was 60% of average peak torque. An increasing dose program was used in the first to fifth sessions (1 set to 5 sets), and a dose of 6 sets was applied from the 6th to the 24th sessions. Each set consisted of 5 repetitions of concentric and eccentric contraction at angular velocities of 30°/second and 120°/second for extensors, and 5 repetitions of eccentric and concentric contraction at angular velocities of 30°/second and 120°/second for flexors.

B. Isotonic Strengthening (n=33): same protocol as in the isokinetic exercise; the isotonic muscle strengthening exercise program consisted of 5 repetitions of concentric and eccentric contraction at the maximum velocity that the lever arm could achieve.

C. Isometric Strengthening (n=33): protocol as in the isokinetic exercise; the speed of passive forward or backward motion was set at 30°/second.

All intervention groups exercised 3 times weekly for 8 weeks. The patients in all groups also received 20 minutes of hot packs and passive range motion exercise by an electric stationary bike (20 cycles per minute) for 5 minutes to both knees before muscle strengthening exercise.

D. Control (n=33)

Description NR

A+B+C+D

Age: 62 years

Female: 70%

A vs. B vs. C vs. D

Baseline Lequesne Index (0-26): 6.9 vs. 7.1 vs. 6.8 vs. 7.2

Baseline VAS pain (0-10): 4.8 vs. 4.6 vs. 4.7 vs. 4.6

A vs. D

10 months

Lequesne Index: 3.1 vs. 7.6, difference −4.5 (95% CI −5.3 to −3.7)

VAS Pain: 2.5 vs. 6.1; p<0.05

B vs. D

10 months

Lequesne Index: 3.1 vs. 7.6, difference −3.6 (95% CI −4.4 to −2.8)

VAS Pain: 2.0 vs. 6.1; p<0.05

C vs. D

10 months

Lequesne Index: 4.8 vs. 7.6, difference −2.8 (95% CI −3.6 to −2.0)

VAS Pain: 3.2 vs. 6.1; p<0.05

A vs. B vs. C vs. D

10 months

Withdrawals: 3% (1/33) vs. 6% (2/33) vs. 3% (1/33) vs. 18% (6/33)

Withdrawals RR (95% CI):

A vs. D: 0.17 (0.02, 1.3)

B vs. D: 0.33 (0.07, 1.53)

C vs. D: 0.17 (0.02, 1.3)

Stopped therapeutic exercise due to intolerable pain during exercise: 12.1% (4/33) vs. 6.1% (2/33) vs. 6.1% (2/33)

Huang, 2005⁵⁴

10 months

Duration of pain: 0.42 (5 months) to 12 years

Fair

A. Isokinetic Exercise (n=35):

3 times per week for 8 weeks. Began with 60% of the mean peak torque; an increasing dose program was used in the first 5 sessions (1 set to 5 sets), and a dose of 6 sets was applied from the 6th to 24th sessions, with the density rising from 60% to 80% of the mean peak torque as the patient was able. Each set consisted of 5 repetitions of concentric contraction in angular velocities of 30°/second and 120°/second for extensors, and 5 repetitions of eccentric and concentric (Ecc/Con) contractions in angular velocities of 30°/second and 120°/second for flexors.

B. Control (n=35):

Warm-up exercises only

A+B

Age: 65 years

Female: 81%

A vs. B

Baseline Lequesne Index (1-26): 7.6 vs. 7.4

Baseline VAS pain (0-10): 5.3 vs. 5.4

A vs. B

10 months

Lequesne Index: 5.8 vs. 8.1, difference −2.3 (95% CI −3.2 to −1.4)

VAS Pain: 3.9 vs. 6.6, p<0.05

A vs. B

10 months

Withdrawals: 11% (4/35) vs. 11% (4/35)

Discontinuation of exercise due to intolerable pain during exercise: 14% (5/35) vs. NA

Huang 2005⁵²

10 months

Duration of pain: 0.5 (6 months) to 11 years

Fair

A. Isokinetic Exercise (n=30):

3 times per week for 8 weeks. Began with 60% of the average peak torque. Intensity of isokinetic exercise increased from 1 set to 5 sets during the first through fifth sessions and remained at 6 sets for the remaining 6th through 24th sessions. Each set consisted of 5 repetitions of concentric contraction in angular velocities of 30°/s and 120°/s for extensors, and 5 repetitions of eccentric and concentric contractions in angular velocities of 30°/s and 120°/s for flexors.

B. Control (n=30):

Heat for 20 minutes and 5 minutes of passive range of motion on bike only.

A+B

Age: 62 (range, 42-72) years

Female: 81%

A vs. B

Baseline Lequesne Index (1-26): 6.7 vs. 7.0

Baseline VAS pain (0-10): 4.9 vs. 4.8

A vs. B

10 months

Lequesne Index: 5.1 vs. 7.8, difference −2.7 (95% CI −3.8 to −1.6)

VAS Pain: 3.5 vs. 6.0; p<0.05

A vs. B

10 months

Withdrawals: 13% (4/30) vs. 13% (4/30)

Discontinuation of exercise due to intolerable pain during exercise: 17% (5/30) vs. NA

Lund, 2008⁵⁵

3 months

Duration of pain: 8.5 vs. 7.8 vs. 4.5

Fair

A. Aquatic Exercise (n=27): 2x per week for 8 weeks. Warm-up, strengthening and endurance exercise, balance exercise and stretching exercise. Each session lasted 50 min, comprising 10 min warm-up, 20 min resistance exercises, 10 min balance and stabilizing exercises, 5 min lower limb stretches and 5 min cool-down period. Compliance was 92%.

B. Land-based Exercise (n=25): 2x per week for 8 weeks. Warm-up, strengthening/endurance exercise, balance exercise and stretching exercise. Each session lasted 50 min, comprising 10 min warm-up, 20 min resistance exercises, 10 min balance and stabilizing exercises, 5 min lower limb stretches and 5 min cool-down period. Compliance was 85%.

C. Control (n=27): No exercise

All 3 groups were asked to continue any other treatment as usual.

A vs. B vs. C

Age: 65 vs. 68 vs. 70 years

Female: 83% vs. 88% vs. 66%

Baseline KOOS symptom (0-100): 50.5 vs. 50.9 vs. 50.1

Baseline KOOS pain (0-100): 47.1 vs. 41.0 vs. 37.9

Baseline KOOS Activities of Daily Living (0-100): 44.7 vs. 40.6 vs. 39.6

Baseline KOOS Sport (0-100): 79.1 vs. 75.6 vs. 70.0

Baseline KOOS Quality of Life (0-100): 63.7 vs. 57.0 vs. 60.8

Baseline VAS pain at rest (0-100): 29.8 vs. 23.3 vs. 15.5

Baseline VAS pain during walking (0-100): 59.8 vs. 53.0 vs. 48.5

A vs. C

3 months

KOOS symptom: 64.1 vs. 63.7; difference 0.5 (95% CI −6.6 to 7.6)

KOOS Activities of Daily Living: 63.0 vs. 61.4; difference 1.6 (95% CI −5.7 to 8.9)

KOOS sport: 24.2 vs. 23.5; difference 0.7 (95% CI −9.3 to 10.7)

KOOS quality of life: 42.8 vs. 41.4; difference 1.7 (95% CI −5.4 to 8.2)

KOOS pain: 60.7 vs. 62.6; difference −1.5 (95% CI −8.7 to 5.8)

VAS pain at rest: 18.1 vs. 23.8; difference −5.7 (95% CI −13.3 to 2.0)

VAS pain during walking: 52.9 vs. 58.3; difference −5.4 (95% CI −16.2 to 5.4)

B vs. C

3 months

KOOS symptom: 66.1 vs. 63.7; difference 2.4 (95% CI −4.8 to 9.5)

KOOS Activities of Daily Living: 63.9 vs. 61.4; difference 2.5 (95% CI −5.0 to 9.9)

KOOS sport: 31.6 vs. 23.5; difference 8.1 (95% CI −2.0 to 18.2)

KOOS quality of life: 43.1 vs. 41.4; difference 1.7 (95% CI −5.3 to 8.7)

KOOS pain: 62.0 vs. 62.6; difference −0.3 (95% CI −7.5 to 7.0)

VAS pain at rest: 15.6 vs. 23.8; difference −8.1 (95% CI −15.8 to −0.4)

VAS pain during walking: 50.1 vs. 58.3; difference −8.2 (95% CI −19.7 to 2.7)

A vs. B vs. C

3 months

Withdrawals: 4% (1/27) vs. 20% (5/25) vs. 7% (2/27)

A vs. C: RR 0.5 (95% CI 0.05, 5.2)

B vs. C: RR 2.5 (95% CI 0.6, 12.7)

Increased pain during and after exercise: 11% (3/27) vs. 32% (8/25) vs. NR

Swollen knees: 0% (0/27) vs. 12% (3/25) vs. NR

Withdrawals due to adverse events: 0% (0/27) vs. 12% (3/25) vs. NR

Mat, 2017⁷⁰

Immediately post-treatment (6 months)

Duration of pain: NR

Fair

[New trial]

A. Home Based Balance and Exercise Program [Modified Otago Exercise Program] (n=17): Encouraged to train 3 times/week, in 30 minute sessions for 6 month period.

B. Usual Care (n=24)

A vs. B

Age: 76 vs. 72, p=0.02

Female: 82.4% vs. 82.4%

Baseline KOOS-ADL (0-100): 65.1 vs. 79.7

Baseline KOOS-Sport/Recreation (0-100): 33.8 vs. 57.1

Baseline KOOS-Symptoms (0-100): 70.5 vs. 75.9

Baseline KOOS-Pain (0-100): 73.3 vs. 80.3

A vs. B

6 months

KOOS-ADL: 75.0 vs. 80.4; difference 9.2 (95% CI NR), p=0.230

KOOS-Sport/Rec: 44.1 vs. 62.3; difference 5.0 (95% CI NR), p=0.620

KOOS-Symptoms: 80.4 (18.8) vs. 80.6; difference 5.1 (95% CI NR), p=0.430

KOOS-Pain: 81.2 vs. 80.0; difference 8.2 (95% CI NR), p=0.210

A vs. B

6 months

Short FES-I (7-28): 13.9 vs. 13.6; difference −5.2 (95% CI NR), p=0.020

KOOS-QoL (0-100): 55.9 vs. 62.0; difference 6.6 (95% CI NR), p=0.460

Messier, 2004⁵⁶

Rejeski, 2002⁶⁰

3, 6, and 18 months

Duration of pain: NR

Fair

A. Exercise (n=80):

Three 1-hour sessions per week done at the study facility for 4 months. Option to undergo a 2 month transition phase alternating between facility and home sessions, after which they carried out the program at home. Sessions consisted of 15 minutes of aerobic exercises, 15 minutes of resistance-training, an additional 15 minutes of aerobic exercises, and a 15 minute cool down phase.

B. Control (n=78):

1 hour sessions monthly for three months consisting of presentations on OA, obesity, and exercise and a question and answer session. Monthly phone contact was maintained for months 4-6 and bimonthly phone contact was maintained for months 7-18.

All subjects: Instructed to continue use of all medications and other treatments as prescribed by their personal physicians

A vs. B

Age: 69 vs. 69

Female: 74% vs. 68%

Baseline WOMAC physical function (0-68): 24.0 vs. 26.0

Baseline WOMAC pain (0-20): 6.6 vs. 7.3

A vs. B

6 months

WOMAC physical function*: 22.0 vs. 22.0

WOMAC pain: 6.2 vs. 6.2, difference 0.0 (95% CI −0.2 to 0.2)

18 months

WOMAC physical function: 21.0 vs. 22.6

WOMAC physical function, mean change: 3.1 vs. 3.4

WOMAC pain: 6.2 vs. 6.0, difference 0.2 (95% CI 0.04 to 0.4)

A vs. B

3 months

Accident related to treatment: 1% (1/80) vs. 0% (0/78)

6-18 months (average; reported by Rejeski 2002)

SF-36 PCS: 37.1 vs. 34.4

SF-36 PCS, adjusted mean: 37.6 vs. 35.3

SF-36 MCS: 52.9 vs. 53.5

SF-36 MCS, adjusted mean: 54.1 vs. 53.7

Quilty, 2003⁵⁹

2.5 months, 10.5 months

Duration of pain: NR

Fair

A. Combination (Physiotherapy) (n=40):

9 sessions over a 10 week period. Sessions consisted of patellar taping, 7 individualized exercises, posture correction, and footwear advice. All exercises were performed 10 times each, 5 times a day

B. Control (n=43):

Baseline discussion with the physiotherapist concerning diagnosis, prognosis, footwear, weight reduction, and activity. General exercise was encouraged but no specific quadriceps exercises were advised

A vs. B

Age: 69 vs. 67 years

Baseline WOMAC Function (0-68): 27.4 vs. 27.8

Baseline VAS pain (0-100): 51.0 vs. 53.4

A vs. B

2.5 months

WOMAC function: 26.5 vs. 27.5; Adjusted difference −0.6 (95% CI −3.7, 2.4)

VAS Pain: 42.8 vs. 50.5; Adjusted difference −6.4 (95% CI −15.3, 2.4)

10.5 months

WOMAC function: 29.7 vs. 28.3; Adjusted difference 1.7 (95% CI −1.8, 5.2)

VAS Pain: 48.1 vs. 54.1; Adjusted difference −4.9 (95% CI −13.6, 3.8)

A vs. B

Withdrawals: 2% (1/43) vs. 0% (0/44)

Adverse Events: None

de Rooij, 2017⁶⁹

3 months

Duration of symptoms: Mean 8.6 to 9.4 years

Fair

[New trial]

A. Individualized Exercise Therapy (n=63):

2 sessions of 30–60 minutes per week under the supervision of a PT for 20 weeks. Training consisted of muscle-strength training of the lower extremity, aerobic training, and training of daily activities. 86% (54/63) of patients received ≥27 of 40 sessions.

B. Usual Care and Waitlist (n=63)

A vs. B

Age: 63 vs. 64 years

% Female: 78% vs. 73%

Baseline WOMAC physical functioning (0-68): 35.1 vs. 31.0

Baseline SF-36 physical functioning (0-20): 18.4 vs. 18.8

Baseline patient-specific functioning list (PSFL) (0-10): NR

Baseline NRS knee pain (0-10): 6.4 vs. 5.9

Baseline WOMAC pain (0-20): 10.1 vs. 9.4

A vs. B

3 months

WOMAC physical functioning (0–68): 23.5 vs. 31.4, difference −9.3 (95% CI −12.8 to −5.8)

SF-36 physical functioning (0–20): 21.4 vs. 18.9, difference 2.1 (95% CI 0.9 to 3.3)

PSFL (performance of activities 0-10): 4.1 vs. 5.9, difference −1.7 (95% CI −2.5 to −1.0)

NRS knee pain severity (0–10): 4.7 vs. 6.2, difference −1.6 (95% CI −1.6 to −1.0)

WOMAC pain (0–17): 6.6 vs. 8.6, difference −2.0 (95% CI −3.1 to −0.8)

Rosedale, 2014⁶¹

2.5 months

Duration of pain: NR

Fair

A. Exercise (n=120):

Given end-range exercises in the direction they had responded to, to be performed 10 times every 2 to 3 hours. A nonresponder subgroup was given exercises to strengthen quadriceps and aerobic exercises. All subjects in the exercise group attended 4 to 6 physiotherapy sessions, 2 to 3 assessment sessions lasting up to 1 hour and the rest followup sessions lasting 20 minutes, over a 2 week period.

B. Waiting list (n=60):

Subjects were followed up in the orthopedic department at the surgeon’s discretion and continued receiving their usual care.

A vs. B

Age: 66 vs. 64

Female: 56% vs. 60%

Median comorbidities: 3 vs. 3

Baseline KOOS function (0-100): 56 vs. 51

Baseline KOOS function in sport and recreation (0-100): 22 vs. 20

Baseline KOOS knee symptoms (0-100): 50 vs. 48

Baseline KOOS quality of life (0-100): 28 vs. 27

Baseline KOOS pain (0-100): 51 vs. 46

Baseline P4 pain scale: 21 vs. 23

A vs. B

2.5 months

KOOS function: 61 vs. 52, (adjusted difference 5, 95% CI 1 to 9)

KOOS function in sport and recreation: 31 vs. 24, (adjusted difference 6, 95% CI 0 to 11)

KOOS pain: 56 vs. 46, (adjusted difference 7, 95% CI 3 to 11)

P4 pain scale: 24 vs. 21, (adjusted difference −2, 95% CI −4 to 1)

KOOS knee symptoms: 56 vs. 52, (adjusted difference 2, 95% CI −2 to 6)

KOOS quality of life: 34 vs. 32, (adjusted difference 1, 95% CI −3 to 6)

Segal, 2015⁶²

3 and 9 months

Duration of pain: NR

Fair (3 months)

Poor (9 months)

A. Gait Training (n=24):

Guided strategies to optimize knee movements during treadmill walking; computerized motion analysis with visual biofeedback; individualized home programs from physical therapist; twice-weekly sessions (45 minutes) for 12 weeks (24 total sessions)

B. Usual Care (n=18):

Received usual care for knee OA and were not asked to make changes in their lifestyle (e.g., annual visit to their physician, use of pain medications, knee surgery and/or physical therapy); asked to keep a diary

A vs. B

Age: 70 vs. 69 years

Female: 76% vs. 53%

Race: NR

Baseline LLFDI basic lower limb function score: 65.8 vs. 63.5

Baseline KOOS Symptoms: 60.1 vs. 63.0

Baseline KOOS Pain: 62.7 vs. 59.8

A vs. B, between group difference in change score compared with baseline

3 months

LLFDI basic lower limb function score: 2.3 (95% CI −1.8 to 6.3)

KOOS Pain: 3.7 (95% CI −4.7 to 12.1)

KOOS Symptoms: 6.2 (95% CI −2.9 to 15.4)

9 months

LLFDI basic lower limb function score: 1.0 (95% CI −7.4 to 9.4)

KOOS Pain: 7.2 (95% CI −2.0 to 16.5)

KOOS Symptoms: 6.0 (95% CI −6.2 to 18.2)

Sullivan, 1998⁶³

10 months

Duration of pain: NR

Poor

A. Exercise (n=52):

3 group sessions of 10-15 subjects per week were done for 8 weeks. Sessions were structured as a hospital-based supervised fitness walking and supportive patient education program. Sessions consisted of stretching and strengthening exercises, expert speakers, group discussions, instructions in safe walking techniques, and up to 30 minutes of walking. At the end of the 8 week treatment period, subjects were encouraged to continue walking and given guidelines for managing individualized programs of fitness walking.

B. Usual care (n=50):

Subjects continued to receive the standard routine medical care they had been receiving prior to enrollment in the study. Subjects were interviewed weekly during the 8 week treatment period about their functional and daily activities.

A vs. B

Age: 71 vs. 68

Female: 77% vs. 90%

Baseline AIMS physical activity subscale (0-10): 6.3 vs. 6.4

Baseline AIMS arthritis impact subscale (0-10): 4.6 vs. 4.5

Baseline AIMS pain subscale (0-10): 4.9 vs. 5.5

Baseline AIMS general health perception subscale (0-10): NR

Baseline pain VAS (0-10): 4.1 vs. 6.3

A vs. B

10 months

AIMS physical activity subscale: 6.1 vs. 6.2, difference −0.1, (95% CI −1.7 to 1.5)

AIMS arthritis impact subscale: 3.3 vs. 3.8, difference −0.5, (95% CI −1.8 to 0.8)

AIMS pain subscale: 4.6 vs. 5.5, difference −0.9, (95% CI −2.2 to 0.4)

Pain VAS: 5.0 vs. 5.4, difference −0.4, (95% CI −2.0 to 1.2)

AIMS general health perception subscale: 3.7 vs. 3.3, difference 0.4 (95% CI −1.0 to 1.8)

Thomas, 2002⁶⁴

6 months, 12 months, 18 months, 24 months

Duration of pain: NR

Poor

A. Exercise (n=470):

Two year, self-paced program that started with four 30 minute visits in the first 2 months followed by visits every 6 months. Designed to maintain and improve strength of muscles around the knee, range of motion at the knee joint, and locomotor function. 121 of the 470 patients also received attention control which consisted of monthly phone calls by a study researcher that sought to monitor symptoms and offer simple advice on knee pain management. 114 of the 470 patients received the attention control and a placebo tablet in addition to the exercise program. The remaining 235 participated in the exercise program only.*

B. Control (n=316):

160 subjects received attention control, which consisted of monthly phone calls by a study researcher that sought to monitor symptoms and offer simple advice on knee pain management. 78 subjects took a placebo tablet. 78 patients had no contact with the researchers between assessment visits.

A vs. B

Age: 62 vs. 62

Female: 63% vs. 66%

Baseline WOMAC physical function score (0-68): 23.2 vs. 23.0

Baseline WOMAC pain score (0-20): 7.15 vs. 7.35

A vs. B

6 months

WOMAC physical function: difference NR

WOMAC pain: difference −0.6 (95% CI −1.0 to −0.2)

24 months

WOMAC physical function: difference −2.6 (95% CI −4.1 to −1.1)

WOMAC pain: difference −0.82 (95% CI −1.3 to −0.3)

A vs. B

6 months

HADS: NR

SF-36: NR

24 months

HADS: NR (NS)

SF-36: NR (NS)

Thorstensson, 2005⁶⁵

5 months

Duration of pain: NR

Fair

A. Exercise (n=30):

1 hour group exercise sessions of 2 to 9 participants, twice a week for 6 weeks. Sessions consisted of weight-bearing exercises to increase postural control and to increase endurance and strength in the lower extremity. Patients were given daily exercises to perform at home.

B. Control group (n=31):

Subjects were told not to make any lifestyle changes. Subjects met with the physical therapist at baseline, at 6 weeks, and at 6 months

A vs. B

Age: 55 vs. 57

Female: 50% vs. 52%

Baseline KOOS ADL (0-100): 69 vs. 71

Baseline KOOS Symptoms (0-100): 63 vs. 66

Baseline KOOS sports and recreation (0-100): 34 vs. 37

Baseline KOOS Pain (0-100): 60 vs. 64

A vs. B

5 months

KOOS ADL, mean change: 0.9 vs. −1.9, p=0.61

KOOS pain, mean change: 3.1 vs. −1.1, p=0.32

KOOS symptoms, mean change: 1.0 vs. −3.4, p=0.31

KOOS sports and recreation, mean change: 0.5 vs. −8.3, p=0.32

A vs. B

5 months

KOOS QOL, mean change (0-100): 5.1 vs. −2.3, p=0.02

SF-36 PCS, mean change (0-100): 3.0 vs. −0.7, p=0.09

SF-36 MCS, mean change (0-100): 0.7 vs. −0.7, p=0.40

Adverse Events:

A vs. B

Increased knee pain: 3% (1/30) vs. 0% (0/31)

Waller, 2017⁷¹

12 months

Duration of pain: NR

Fair

[New trial]

A. Aquatic Exercise (n=43):

Aquatic resistance training sessions (1 hour long) 3 times per week for 16 weeks (48 sessions total). Variable resistance equipment used to progress intensity

B. Usual Care (n=44):

Asked to continue regular leisure activities, offered two sessions (1 hour each) of light stretching, relaxation and social interaction during 12 week intervention period.

A vs. B

Age: 64 vs. 64

Female: 100% vs. 100%

Baseline KOOS-Symptoms (0-100): 74.4 vs. 74.8

Baseline KOOS-ADL (0-100): 84.5 vs. 85.2

Baseline KOOS-Sport/Recreation (0-100): 63.6 vs. 64.8

Baseline KOOS-Pain (0-100): 80.6 vs. 82.1

A vs. B

12 months

KOOS-ADL: 89.2 vs. 88.3; difference 1.0 (95% CI −2.6 to 4.3), p=0.397

KOOS-Sport/Rec: 71.0 vs. 68.7; difference 2.5 (95% CI −4.8 to 9.0), p=0.396

KOOS-Symptoms: 81.4 vs. 77.9; difference 3.31 (95% CI −1.2 to 7.3), p=0.119

KOOS-Pain: 86.8 vs. 85.1; difference 1.5 (95% CI −2.7 to 5.7), p=0.187

A vs. B

12 months

KOOS-QoL (0-100): 75.0 vs. 76.4; difference 1.21 (95% CI −6.0 to 8.0), p=0.308

Weng, 2009⁶⁶

10 months

Duration of pain: 42.5 months

Poor

A. Isokinetic exercise (n=33):

3 sessions a week for 8 weeks. Sessions consisted of sets of concentric and eccentric contractions at varying angular velocities and start and stop angles. Hot packs for 10 minutes and passive range of motion exercises

B. No intervention (n=33):

Warm-up cycling for 10 minutes. Hot packs for 10 minutes and passive range of motion exercises

A+B

Age: 64

Female: 75%

A vs. B

Baseline Lequesne Index (0-24): 7.3 vs. 7.1

Pain VAS (0-10): 4.7 vs. 4.5

A vs. B

10 months

Lequesne Index: 6.3 vs. 7.3

Pain VAS: 3.6 vs. 5.0

A vs. B

10 months

Treatment related pain causing withdrawal: 9% (3/33) vs. 0% (0/33)

RR=infinity, p=0.08

Williamson, 2007⁶⁷

1.5 months

Duration of pain: NR

Poor

A. Combination (Physiotherapy) (n=60):

Groups of 6–10 patients, hourly, once a week for 6 weeks. Exercise circuit of static quadriceps contractions; inner range quadriceps contractions; straight leg raises; sit to stands, stair climbing; calf stretches; theraband resisted knee extensions; wobble board balance training; knee flexion/extension sitting on gym ball and free standing pedal revolutions.

B. Control (n=61):

Usual Care (home exercise and advice leaflet)

A vs. B

Age: 70 vs. 70 years

Female: 52% vs. 54%

Baseline OKS (0-48): 39.3 vs. 40.5

Baseline WOMAC (unclear scale): 50.2 vs. 51.1

Baseline VAS pain (0-10): 6.8 vs. 6.9

A vs. B

1.5 months

OKS: 38.8 vs. 40.8

WOMAC: 49.4 vs. 52.3

VAS Pain: 6.4 vs. 7.2

A vs. B

1.5 months

HADS Anxiety (0-21): 7.1 vs. 6.5

HADS Depression (0-21): 6.8 vs. 7.1

Withdrawals:

17% (10/60) vs. 0% (0/61)

Adverse Events: None

Abbreviations: ADL = activity of daily living; AIMS = Arthritis Impact Measurement Scale; AQoL = Assessment of Quality of Life; CES-D = Center for Epidemiologic Studies Depression; CI = confidence interval; HADS = Hospital Anxiety and Depression Scale; HAQ = Health Assessment Questionnaire; IBET = internet-based exercise training; ITT = intention-to-treat; KOOS = Knee Injury and Osteoarthritis Outcome Score; KPS = Knee Pain Scale; LLFDI = Late-Life Function and Disability Instrument; LSM = Least squares mean; MCS = Mental Component Score; NA = not applicable; NEMEX = neuromuscular exercise; NR = not reported; NS = not statistically significant; OA = osteoarthritis; OKS = Oxford Knee Score; PASE = Physical Activity Scale for the Elderly; PCS = Physical Component Score; PT = physical therapy; RR = relative risk; QoL = quality of life; SF-36 = Short-Form-36; VAS = visual analog scale; WL = waitlist; WOMAC = Western Ontario and McMaster Universities Osteoarthritis Index
a: Unless otherwise noted, followup time is calculated from the end of the treatment period.
b: Group A ceased treatment after 4 months, whereas Groups B and C continued their protocols until the 12 month F/U, therefore the ‘4 month F/U’ for group A is actually the beginning of post-treatment, and their 12 month f/u is therefore 8 months post-treatment. For intermediate followup, only group A’s ‘8 month F/U’ is compared with group C’s last F/U (12 months). Long-term followup is the comparison of 12 month followups for groups B and C only.
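
The "Withdrawals RR (95% CI)" rows in the table above (e.g., Huang, 2003) can be reproduced from the reported event counts. The following is a minimal sketch, assuming the common large-sample interval on the log risk ratio; the report does not state which method was used, and the function name `risk_ratio_ci` is hypothetical.

```python
import math


def risk_ratio_ci(events_a, n_a, events_b, n_b, z=1.96):
    """Risk ratio (group A vs. group B) with a large-sample CI on the log scale.

    Assumption: the log-RR (Katz) interval; not confirmed as the report's method.
    Returns (rr, lower, upper).
    """
    if events_a == 0 or events_b == 0:
        # With zero events in either arm the log-RR interval is undefined
        # (e.g., 3/33 vs. 0/33 gives an infinite RR).
        raise ValueError("log-RR interval is undefined when an arm has zero events")
    rr = (events_a / n_a) / (events_b / n_b)
    se_log_rr = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Huang, 2003 withdrawals: A (1/33) vs. D (6/33) and B (2/33) vs. D (6/33)
for label, counts in {"A vs. D": (1, 33, 6, 33), "B vs. D": (2, 33, 6, 33)}.items():
    rr, lower, upper = risk_ratio_ci(*counts)
    print(f"{label}: RR {rr:.2f} (95% CI {lower:.2f}, {upper:.2f})")
```

Under this assumption the sketch gives values that agree, after rounding, with the tabulated 0.17 (0.02, 1.3) and 0.33 (0.07, 1.53).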

Table 26. Osteoarthritis knee pain: psychological therapies

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Bennell, 2016¹³⁴

5 and 9 months

Duration of pain: 6 years

Fair

A. Pain coping skills training (n=74):

10, 45-minute sessions over 12 weeks; consisted of pain education and cognitive and behavioral pain coping skills training

B. Exercise (n=75):

10, 25 minute sessions over 12 weeks; consisted of 6 strengthening exercises.

A vs. B

Age, years: 63 vs. 63

Female: 61% vs. 59%

Radiographic disease severity:

Grade 2: 45% vs. 40%

Grade 3: 28% vs. 25%

Grade 4: 27% vs. 35%

Opioid use: 4% vs. 1%

Baseline WOMAC physical function (0-68): 35.0 vs. 34.3

Baseline WOMAC pain (0-20): 8.7 vs. 8.6

Baseline pain overall VAS (0-100): 58.7 vs. 59.1

Baseline pain with walking VAS (0-100): 61.3 vs. 60.9

A vs. B

5 months

WOMAC physical function: 23.4 vs. 21.4, difference 2.0 (95% CI −2.4 to 6.4)

WOMAC pain: 6.2 vs. 6.3, difference −0.1 (95% CI −1.2 to 1.0)

Pain overall VAS: 35.7 vs. 36.0, difference −0.3 (95% CI −9.0 to 8.4)

Pain with walking VAS: 39.1 vs. 42.3, difference −3.2 (95% CI −12.4 to 6.0)

9 months

WOMAC physical function: 21.3 vs. 18.1, difference 3.2 (95% CI −0.6 to 7.0)

WOMAC pain: 5.8 vs. 5.4, difference 0.4 (95% CI −0.8 to 1.6)

Pain overall VAS: 34.8 vs. 34.5, difference 0.3 (95% CI −7.8 to 8.4)

Pain with walking VAS: 37.3 vs. 37.5, difference −0.2 (95% CI −9.1 to 8.7)

A vs. B

5 months

DASS21 depression scale (0-42): 4.3 vs. 5.5, difference −1.2 (95% CI −4.0 to 1.6)

DASS21 anxiety scale (0-42): 4.0 vs. 4.9, difference −0.6 (95% CI −3.0 to 1.2)

AQoL-6D (−0.04 to 1.0): 0.79 vs. 0.76, difference 0.03 (95% CI −0.02 to 0.09)

9 months

DASS21 depression scale: 3.5 vs. 4.9, difference −1.4 (95% CI −3.6 to 0.8)

DASS21 anxiety scale: 3.0 vs. 4.6, difference −1.6 (95% CI −3.4 to 0.2)

AQoL-6D: 0.81 vs. 0.78, difference 0.03 (95% CI −0.02 to 0.08)

Percent of patients using opioids: 10% (7/72) vs. 13% (9/71), RR 0.77 (95% CI 0.3 to 1.9)

Gilbert, 2018¹¹¹

3, 6, 12, and 24 months

Mean duration of pain: NR

Fair

[New trial]

A. IMPAACT Motivational Interviewing (MI) (n=76)

1 initial session (45 to 60 minutes long), and 5 additional sessions (10 to 15 minutes long)

B. No treatment (n=79)

All patients: received brief physician consultation with recommendation to increase physical activity

A vs. B

Age: 61 vs. 65

Female: 58% vs. 62%

Mean duration of Chronicity: 9.6 vs. 12.1 years

Baseline WOMAC Function (0-68): 18.0 vs. 17.4

Baseline WOMAC Pain (0-20): 5.9 vs. 5.5

A vs. B

3 months

WOMAC Function: 16.5 (95% CI 14.7 to 18.4) vs. 17.8 (95% CI 16.3 to 19.4); difference 1.3 (95% CI −1.1 to 3.7)

WOMAC Pain: 5.2 (95% CI 4.6 to 5.8) vs. 6.1 (95% CI 5.6 to 6.7); difference 1.0 (95% CI 0.2 to 1.8)

6 months

WOMAC Function: 15.1 (95% CI 13.1 to 17.2) vs. 16.7 (95% CI 15.1 to 18.3); difference 1.6 (95% CI −1.0 to 4.2)

WOMAC Pain: 5.3 (95% CI 4.6 to 6.0) vs. 5.5 (95% CI 4.9 to 6.0); difference 0.18 (95% CI −0.7 to 1.1)

12 months

WOMAC Function: 13.4 (95% CI 11.1 to 15.7) vs. 16.6 (95% CI 14.6 to 18.6); difference 3.2 (95% CI 0.1 to 6.2)

WOMAC Pain: 4.8 (95% CI 4.0 to 5.5) vs. 5.7 (95% CI 5.0 to 6.4); difference 0.9 (95% CI −0.1 to 1.9)

24 months

WOMAC Function: 12.5 (95% CI 10.1 to 14.9) vs. 15.3 (95% CI 12.6 to 18.1); difference 2.8 (95% CI −0.8 to 6.4)

WOMAC Pain: 4.0 (95% CI 3.2 to 4.7) vs. 4.7 (95% CI 3.8 to 5.7); difference 0.8 (95% CI −0.4 to 2.0)

A vs. B

3 months

SF-36 PCS (0-100): 46.0 (95% CI 44.7 to 47.3) vs. 44.7 (95% CI 43.2 to 46.2); difference 1.4 (95% CI −0.6 to 3.4)

SF-36 MCS (0-100): 54.0 (95% CI 52.3 to 55.6) vs. 54.6 (95% CI 52.8 to 56.4); difference −0.6 (95% CI −3.1 to 1.8)

6 months

SF-36 PCS: 45.0 (95% CI 43.6 to 46.5) vs. 44.8 (95% CI 43.5 to 46.2); difference 0.23 (95% CI −1.8 to 2.2)

SF-36 MCS: 54.3 (95% CI 52.5 to 56.1) vs. 54.1 (95% CI 52.2 to 55.9); difference 0.3 (95% CI −2.3 to 2.8)

12 months

SF-36 PCS: 46.0 (95% CI 44.6 to 47.5) vs. 44.3 (95% CI 42.6 to 46.0); difference 1.7 (95% CI −0.5 to 3.9)

SF-36 MCS: 54.1 (95% CI 51.9 to 56.2) vs. 54.7 (95% CI 52.9 to 56.4); difference −0.6 (95% CI −3.4 to 2.1)

24 months

SF-36 PCS: 45.4 (95% CI 43.4 to 47.5) vs. 44.7 (95% CI 42.3 to 47.0); difference 0.78 (95% CI −2.3 to 3.9)

SF-36 MCS: 54.2 (95% CI 52.0 to 56.3) vs. 52.8 (95% CI 50.0 to 55.6); difference 1.3 (95% CI −2.16 to 4.8)

Helminen, 2015¹⁰⁹

1.5 to 10.5 months

Duration of pain: 7.8 years

Fair

A. Cognitive-Behavioral Training plus usual care (n=55):

2-hour group sessions, weekly for 6 weeks (6 sessions total); included attention diversion methods (relaxation, imagery, distraction), activity-rest cycling and pleasant activity scheduling, cognitive restructuring, and homework assignments

B. Usual Care (n=56)

A vs. B

Age: 64.5 vs. 63 years

Female: 71% vs. 68%

BMI: 30 vs. 30 kg/m²

Bilateral knee OA: 33% vs. 30%

Kellgren-Lawrence grade 2: 60% vs. 61%

Duration of Chronicity: 6.6 vs. 8.9 years

Baseline WOMAC Function (0-100): 53.0 vs. 48.4

Baseline WOMAC Pain (0-100): 57.6 vs. 56.4

Baseline NRS pain (0-10), average past week: 6.6 vs. 6.4

Baseline NRS pain (0-10), worst past week: 8.0 vs. 7.5

Baseline NRS pain (0-10), average 3 months: 6.8 vs. 6.6

Baseline NRS pain (0-10), worst 3 months: 8.2 vs. 8.0

A vs. B

Post-Treatment Average (1.5 to 10.5 months)

WOMAC Function: 36.5 vs. 36.7, difference −0.3 (95% CI −8.3 to 7.8)

WOMAC Pain: 35.6 vs. 39.5, difference −3.9 (95% CI −11.8 to 4.0)

NRS pain, average past week: 5.0 vs. 4.9, difference 0.02 (95% CI −0.89 to 0.93)

NRS pain, worst over week: 6.1 vs. 5.9, difference 0.1 (95% CI −0.8 to 1.1)

NRS pain, average 3 months: 5.2 vs. 5.4, difference −0.2 (95% CI −1.0 to 0.6)

NRS pain, worst 3 months: 6.4 vs. 6.6, difference −0.1 (95% CI −0.9 to 0.7)

A vs. B

Post-Treatment Average (1.5 to 10.5 months)

WOMAC Stiffness (0-100): 46.2 vs. 49.0, difference −2.7 (95% CI −11.4 to 5.9)

BDI (0−63): 5.8 vs. 5.9, difference −0.1 (95% CI −2.2 to 2.0)

BAI (0−63): 8.0 vs. 7.1, difference 0.9 (95% CI −1.3 to 3.1)

HRQoL, 15D (scale NR): 0.82 vs. 0.85, difference −0.03 (95% CI −0.06 to 0.00)

SF-36 Physical Functioning (scale NR): 48.0 vs. 49.4, difference −1.4 (95% CI −10.2 to 7.3)

SF-36 Role-Physical: 44.4 vs. 44.5, difference −0.09 (95% CI −14.4 to 14.3)

SF-36 Bodily Pain: 57.3 vs. 57.4, difference −0.1 (95% CI −8.0 to 7.7)

SF-36 General Health: 53.1 vs. 58.2, difference −5.0 (95% CI −12.3 to 2.3)

SF-36 Vitality: 62.7 vs. 67.5, difference −4.8 (95% CI −12.6 to 3.1)

SF-36 Social Functioning: 75.0 vs. 82.8, difference −7.8 (95% CI −16.4 to 0.81)

SF-36 Role-Emotional: 67.9 vs. 74.7, difference −6.7 (95% CI −20.2 to 6.8)

SF-36 Emotional Well-Being: 75.3 vs. 78.5, difference −3.2 (95% CI −9.5 to 3.1)

SF-36 Health Change: 46.6 vs. 47.4, difference −0.8 (95% CI −9.2 to 7.6)

O’Moore, 2018¹¹²

3 months

Duration of pain: NR

Fair

[New trial]

A. iCBT (n=43)

B. Usual Care (n=24)

A vs. B

Age: 63 vs. 60 years

Female: 86% vs. 68%

Baseline WOMAC-ADL (0-68): 32.3 vs. 30.0

Baseline WOMAC-Stiffness (0-8): 4.5 vs. 4.2

Baseline WOMAC-Pain (0-20): 9.9 vs. 9.4

A vs. B

3 months

WOMAC-ADL: 24.1 vs. 30.34; difference −6.3 (95% CI −11.9 to −0.7), p<0.05

WOMAC-Stiffness: 3.3 vs. 4.4; difference −1.1 (95% CI −2.0 to −0.3), p<0.05

WOMAC-Pain: 7.4 vs. 9.8; difference −2.34 (95% CI −4.2 to −0.5), p<0.05

Somers, 2012¹¹⁰

6-12 months

Duration of pain: NR

Poor

A. Pain Coping Skills Training (n=60): 1-hour group sessions, weekly for 12 weeks then every other week for 12 weeks (total of 18 sessions over 24 weeks); consisted of informational lectures, problem solving, skills training, relaxation exercises, homework assignments, and feedback

B. Usual Care (n=51)

A vs. B

Age: 58 vs. 58 years

Female: 67% vs. 78%

Caucasian: 62% vs. 61%

Mean Duration of Chronicity: NR

Kellgren-Lawrence score (0-4): 2.5 vs. 2.3

Baseline WOMAC function subscale (0-100): 46.2 vs. 46.1

Baseline WOMAC pain subscale (0-100): 42.8 vs. 43.4

A vs. B

Post-treatment Average (6-12 months)

WOMAC function: 35.2 vs. 37.5, p=NS

AIMS physical disability subscale: 1.5 vs. 1.4, p=NS

WOMAC pain subscale: 34.5 vs. 38.0, p=NS

AIMS pain subscale: 4.4 vs. 4.7, p=NS

A vs. B

Post-treatment Average (6-12 months)

WOMAC stiffness subscale (0-100): 44.5 vs. 46.4, p=NS

AIMS psychological subscale (0-10): 2.6 vs. 2.5, p=NS

Abbreviations: ADL = Activities of Daily Living; AIMS = Arthritis Impact Measurement Scale; AQoL = Assessment of Quality of Life; BAI = Beck Anxiety Inventory; BDI = Beck Depression Inventory; CI = confidence interval; DASS21 = Depression, Anxiety, and Stress Scales 21-item questionnaire; HRQoL = health-related quality of life; iCBT = internet-based cognitive-behavioral therapy; NR = not reported; NRS = numeric rating scale; NS = not statistically significant; OA = osteoarthritis; SF-36 MCS = Short-Form 36 Mental Component Score; SF-36 PCS = Short-Form 36 Physical Component Score; RR = risk ratio; VAS = Visual Analog Scale; WOMAC = Western Ontario and McMaster Universities Osteoarthritis index
a: Unless otherwise noted, followup time is calculated from the end of the treatment period
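
The report treats an estimate as statistically significant when its confidence interval excludes the null value (0 for mean differences, 1 for risk ratios). The following is a minimal sketch of that check applied to rows from the table above; the helper name `excludes_null` is hypothetical and is not part of the report's analysis code.

```python
def excludes_null(ci_low, ci_high, null_value):
    """Return True when the confidence interval does not contain the null value."""
    return not (ci_low <= null_value <= ci_high)


# (label, CI lower, CI upper, null value) taken from rows reported above
rows = [
    ("O'Moore 2018, WOMAC-ADL difference at 3 months", -11.9, -0.7, 0.0),
    ("Bennell 2016, opioid use RR at 9 months", 0.3, 1.9, 1.0),
    ("Gilbert 2018, WOMAC Function difference at 12 months", 0.1, 6.2, 0.0),
]

for label, low, high, null_value in rows:
    verdict = "excludes" if excludes_null(low, high, null_value) else "includes"
    print(f"{label}: 95% CI {low} to {high} {verdict} the null value {null_value}")
```

An interval that includes the null value corresponds to the report's "no clear difference" or "no difference" interpretations, depending on how far the point estimate lies from the null.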

Table 27. Osteoarthritis knee pain: physical modalities

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Al Rashoud, 2014¹⁵⁰

1.5 and 6 months

Duration of pain: 11 years

Fair

A. Low-level laser therapy (n=26): continuous laser (30 mW, 830 nm wavelength) applied to 5 acupuncture points over approximately 10 sessions

B. Placebo laser (n=23): placebo laser applied to 5 acupuncture points over approximately 10 sessions

A vs. B

Age: 52 vs. 56 years

Female: 62% vs. 65%

Baseline Saudi Knee Function Scale (SKFS) (0-112), median: 61.0 vs. 60.0

Baseline pain on movement VAS (0-10): 6.4 vs. 5.9

A vs. B

1.5 months

Pain on movement VAS: 3.0 vs. 4.2^b

SKFS, median: 31 vs. 40, median difference −10 (95% CI −23 to −4) p=0.054

6 months

Pain on movement VAS: 3.4 vs. 5.2^b

SKFS, median: 31 vs. 51, median difference −21 (95% CI −34 to −7) p=0.006

Battisti, 2004¹⁵¹

1 month

Duration of pain: 11 years

Poor

A. Therapeutic Application of Musically Modulated Electromagnetic Field (TAMMEF) (n=30):

The anatomical region treated is placed between opposing faces of low frequency electromagnets (3x4 cm). The current from amplifier B feeds a loud speaker that plays music. The music modifies parameters (frequency, intensity, waveform) of the electromagnetic field in time, randomly varying within respective ranges. 15 consecutive daily sessions, 30 minutes each

B. Extremely Low Frequency (ELF) (n=30):

Similar treatment as Intervention A except the electromagnetic field is stabilized at a frequency of 100Hz in a sinusoidal waveform. 15 consecutive daily sessions, 30 minutes each

C. Simulated (Sham) Frequency Field (n=30):

Functionally similar operation to the other groups except a simulated (noneffective) field is used, but the patients remain blinded to its effectiveness. 15 consecutive daily sessions, 30 minutes each

A + B + C

Age: 58.9 (7.4)

Female: 70%

Race: NR

Mean Duration of Chronicity: 11 (3.1)

A vs. B vs. C

Baseline Mean Lequesne Function Score (0-10)^c: 3.65 vs. 4.28 vs. 3.48

Baseline Mean Lequesne Pain Score (0-10)^c: 6.88 vs. 6.28 vs. 6.15

A vs. C

1 month

Mean Lequesne Functionality: 6.5 vs. 3.8

Mean Lequesne Pain Score: 1.4 vs. 6.9

B vs. C

1 month

Mean Lequesne Functionality: 7.1 vs. 3.8

Mean Lequesne Pain Score: 1.4 vs. 6.9

Brouwer, 2006¹⁵²

6 and 12 months

Duration of pain: 6.7 vs. 4.9 years

Poor

A. Brace (n=60):

Device: Oasys brace, Innovation Sports, Irvine, CA, USA, brace allowed medial or lateral unloading; patients also received usual care

B. Usual Care (n=57):

patient education (adaptation of activities and/or weight loss), and (if needed) physical therapy and analgesics

A vs. B

Age^f: 59.2

Female: 48% vs. 51%

Race: NR

Baseline HSS Knee Function Score (0-100): 64.9 vs. 69.0

Baseline VAS pain severity (0-10): 6.6 vs. 5.5

A vs. B

6 months

HSS Knee Function: difference 3.2 (95% CI −0.6 to 7.0)

VAS Pain Severity: difference −0.6 (95% CI −1.5 to 0.3)

12 months

HSS Knee Function: difference 3.0 (95% CI −1.1 to 7.1)

VAS Pain Severity: difference −0.8 (95% CI −1.8 to 0.1)

A vs. B

6 months

EQ-5D: difference 0.01 (95% CI −0.08 to 0.10)

12 months

EQ-5D: difference 0.01 (95% CI −0.08 to 0.10)

Cakir, 2014¹⁵³

6 months

Duration of pain: Mean 4.0 to 5.1 years

Fair

A. Continuous ultrasound (n=20): 5 times a week for 2 weeks

B. Pulsed ultrasound (n=20): 5 times a week for 2 weeks

C. Sham (n=20): 5 times a week for 2 weeks

All patients performed home exercise program 3 days a week for 8 weeks

A vs. B vs. C

Age: 57 vs. 58 vs. 57 years

Female: 70% vs. 80% vs. 85%

Baseline WOMAC physical mean function (0-68): 55.7 vs. 52.4 vs. 52.5

Baseline WOMAC pain (0-20): 15.9 vs. 14.5 vs. 14.9

Baseline WOMAC stiffness (0-8): NR

Baseline pain at rest VAS (0-100): 57.9 vs. 55.7 vs. 53.6

Baseline pain on movement VAS (0-100): 75.5 vs. 73.0 vs. 72.2

Baseline disease severity VAS (0-100): 73.9 vs. 67.9 vs. 68.4

A vs. C

6 months

WOMAC physical function: 32.6 vs. 35.5, difference −2.9 (95% CI −9.2 to 3.4)

WOMAC pain: 9.5 vs. 11.1, difference −1.6 (95% CI −3.3 to 0.1)

Pain at rest VAS: 21.4 vs. 22.3, difference 1.2 (95% CI −9.1 to 11.5)

Pain on movement VAS: 38.7 vs. 38.1, difference 0.6 (95% CI −13.7 to 14.9)

Disease severity VAS: 30.0 vs. 29.5, difference 0.5 (95% CI −6.7 to 7.7)

B vs. C

6 months

WOMAC physical function: 37.1 vs. 35.5, difference 1.6 (95% CI −3.0 to 6.2)

WOMAC pain: 11.3 vs. 11.1, difference 0.2 (95% CI −1.3 to 1.7)

Pain at rest VAS: 20.2 vs. 22.3, difference −2.1 (95% CI −11.2 to 7.0)

Pain on movement VAS: 37.5 vs. 38.1, difference −0.6 (95% CI −17.0 to 15.8)

Disease severity VAS: 32.5 vs. 29.5, difference 3.0 (95% CI −4.0 to 10.0)

Fary, 2011¹⁵⁴

6.5 months

Duration of pain: 12 years

Good

A. Pulsed electrical stimulation (TENS) (n=34): pulsed electrical stimulator worn 7 hours a day daily for 26 weeks

B. Placebo electrical stimulation (n=36): placebo pulsed electrical stimulator worn 7 hours a day daily for 26 weeks

A vs. B

Age: 71 vs. 69 years

Female: 50% vs. 44%

Baseline WOMAC total (0-100): 36 vs. 34

Baseline WOMAC function (0-100): 35 vs. 34

Baseline WOMAC stiffness (0-100): 45 vs. 41

Baseline WOMAC pain (0-100): 35 vs. 36

Baseline pain VAS (0-100): 51 vs. 52

A vs. B

6.5 months

Proportion of patients who achieved MCID (≥9.1) in WOMAC function: 38% vs. 39%, RR 1.2 (95% CI 0.6 to 2.2)

Proportion of patients who achieved MCID (≥20) in pain VAS: 56% vs. 44%, RR 1.3 (95% CI 0.8 to 2.0)

Mean change in WOMAC total: 6 vs. 7, MCD −1.3 (95% CI −8.8 to 6.3)

Mean change in WOMAC function: 5 vs. 7, MCD −1.9 (95% CI −9.7 to 5.9)

Mean change in WOMAC stiffness: 9 vs. 5, MCD 3.7 (95% CI −6.0 to 13.5)

Mean change in WOMAC pain: 5 vs. 10, MCD −5.6 (95% CI −14.9 to 3.6)

Mean change in pain VAS: 20 vs. 19, MCD 0.9 (95% CI −11.7 to 13.4)

A vs. B

6.5 months

Mean change in SF-36 physical component score (0-100): −1.0 vs. −2.6, MCD 1.7 (95% CI −1.5 to 4.8)

Mean change in SF-36 mental component score (0-100): −1.2 vs. −2.4, MCD 1.2 (95% CI −2.9 to 5.4)

Fukuda, 2011¹⁵⁵

12 months

Duration of pain: NR

Poor

A. Low-dose PSW (n=32): Three 19-minute applications per week for 3 weeks (9 total); Total Energy: 17 kJ; Frequency: 27.12 MHz; Mean Power Output: 14.5 W; Pulse Duration: 400 microseconds; Pulse Frequency: 145 Hz

B. High-dose PSW (n=31): Treatment characteristics were identical to Group A except that length of treatment (and total energy received) was doubled. Three 38-minute applications per week for 3 weeks (9 total); Total Energy: 33 kJ

C. Sham (n=23): Treatment characteristics were identical to Group A except the device was kept in standby mode without any electrical current applied. Three, 19 min applications per week for 3 weeks (9 total)

A vs. B vs. C

Age: 62 vs. 63 vs. 57

Female: 100%

Race: NR

Baseline Knee Injury and Osteoarthritis Outcome Score Symptoms Subscale (0-100): 46.5 vs. 47.0 vs. 42.0

Baseline KOOS Daily Activities Subscale (0-100): 45.8 vs. 51.7 vs. 45.7

Baseline KOOS Recreational Activities Subscale (0-100): 16.6 vs. 15.3 vs. 18.2

Baseline KOOS Pain Subscale (0-100): 37.4 vs. 42.5 vs. 38.0

Baseline NRS Pain (0-10): 7.1 vs. 6.7 vs. 7.7

A vs. C

12 months

KOOS Symptoms Subscale: 61.6 vs. 40.7, difference 20.9 (95% CI 8.92 to 32.88)

KOOS Daily Activities Subscale: 68.9 vs. 41.6, difference 27.30 (95% CI 13.73 to 40.87)

KOOS Recreational Activities Subscale: 24.6 vs. 11.0, difference 13.6 (95% CI −0.73 to 27.93)

KOOS Pain Subscale: 57.5 vs. 33.0, difference 24.5 (95% CI 12.12 to 36.88)

NRS Pain: 5.7 vs. 7.5, difference −1.8 (95% CI −3.60 to 0.00)

B vs. C

12 months

KOOS Symptoms Subscale: 54.9 vs. 40.7, difference 14.2 (95% CI 1.21 to 27.19)

KOOS Daily Activities Subscale: 51.9 vs. 41.6, difference 10.30 (95% CI −1.24 to 21.84)

KOOS Recreational Activities Subscale: 15.9 vs. 11.0, difference 4.9 (95% CI −5.32 to 15.12)

KOOS Pain Subscale: 57.6 vs. 33.0, difference 24.6 (95% CI 14.59 to 34.61)

NRS Pain: 5.2 vs. 7.5, difference −2.3 (95% CI −3.68 to −0.92)

A vs. C

12 months

KOOS Quality of Life Subscale (0-100): 31.8 vs. 33.0

B vs. C

12 months

KOOS Quality of Life Subscale: 41.2 vs. 33.0

A vs. B vs. C

Adverse Events:

Went on to have a Total Knee Replacement during 12 month followup: 3.1% (1/32) vs. 6.5% (2/31) vs. 4.3% (1/23)

Giombini, 2011¹⁵⁶

3 months

Duration of pain: 3 years

Fair

A. Microwave diathermy (n=29): hyperthermic treatment 3 times a week for 4 weeks

B. Sham diathermy (n=25): sham hyperthermic treatment 3 times a week for 4 weeks

A vs. B

Age: 67 vs. 67 years

Female: 66% vs. 68%

Baseline WOMAC total (0-120): 103.1 vs. 101.3

Baseline WOMAC pain (0-25): 19.2 vs. 18.5

Baseline WOMAC stiffness (0-10): 9.7 vs. 9.7

Baseline WOMAC ADL (0-85): 74.3 vs. 73.1

A vs. B

3 months

Mean change in WOMAC total: −46.8 vs. −0.4, difference −46.4 (95% CI −58.3 to −34.5)

Mean change in WOMAC pain: −8.6 vs. −0.6, difference −8.1 (95% CI −10.7 to −5.3)

Mean change in WOMAC ADLs: −33 vs. 0.3, difference −33.2 (95% CI −42.0 to −24.6)

Mean change in WOMAC stiffness: −5.2 vs. −0.1, difference −5.1, p<0.01

Hegedus, 2009¹⁵⁷

2 months

Duration of pain: NR

Poor

A. Low-Level Laser Therapy (n=18): 50 mW, continuous wave laser (wavelength 830 nm). Total dose of 48 J/cm² per session. Twice a week for 4 weeks.

B. Placebo (n=17): Placebo probe (0.5 mW power output) used twice a week for 4 weeks.

Age: 49

Female: 81%

A vs. B

Baseline pain VAS (0-10): 5.8 vs. 5.6

A vs. B

2 months

Pain VAS: 1.2 vs. 4.1, difference −2.9

(no estimate of variability provided or calculable)

Jia, 2016¹⁶³

1 and 3 months

Duration of pain: NR

Good

[New trial]

A. Focused Low-Intensity Pulsed Ultrasound + diclofenac sodium (FLIPUS) (n=53): 20 minute sessions, once daily for 10 days applied to both knees.

B. Sham Ultrasound + Diclofenac Sodium (FLIPUS) (n=53)

A vs. B

Age: 63 vs. 61 years

Female: 73.6% vs. 69.8%

Baseline LI (0-24): 7.56 vs. 7.10

Baseline VAS (0-10): 6.98 vs. 6.76

A vs. B

Short-term (3 months)

LI: 6.8 vs. 7.8, p=0.006; difference −1.1 (95% CI −1.9 to −0.3), p<0.01

VAS pain: 6.4 vs. 7.2, p=0.007

Laufer, 2005¹⁵⁸

3 months

Duration of pain: NR

Poor

A. Low Intensity Pulsed Shortwave Diathermy (n=38): Three, 20 min sessions per week for 3 weeks (9 total); Pulse Duration: 82 μs; Pulse Frequency: 110 Hz; Peak Power: 200 W (mean 1.8W)

B. High Intensity Pulsed Shortwave Diathermy (n=32): Treatment protocol identical to Group A except with a higher intensity (pulse duration and frequency); Pulse Duration: 300 μs; Pulse Frequency: 300 Hz; Peak Power: 200 W (mean 18 W)

C. Sham Shortwave Diathermy (n=33): Identical treatment except the apparatus was turned on but the power output was not raised.

A vs. B vs. C

Age: 75 vs. 73 vs. 73

Female: 82% vs. 91% vs. 67%

Baseline WOMAC Overall: 5.1 vs. 4.6 vs. 5.0

Baseline WOMAC Stiffness: 4.9 vs. 4.3 vs. 4.92

Baseline WOMAC Activities of Daily Living: 5.2 vs. 4.7 vs. 5.1

Baseline WOMAC Pain: 4.9 vs. 4.4 vs. 5.0

A vs. C

3 months

WOMAC Overall: 4.8 vs. 4.6, difference 0.2 (95% CI −1.5 to 2.0)

WOMAC Pain: 4.5 vs. 4.3, difference 0.2 (95% CI −1.6 to 1.9)

WOMAC Stiffness: 4.4 vs. 3.6, difference 0.8 (95% CI −1.0 to 2.6)

WOMAC Activities of Daily Living: 5.0 vs. 4.8, difference 0.2 (95% CI −1.5 to 1.8)

B vs. C

3 months

WOMAC Overall: 4.6 vs. 4.6, difference −0.04 (95% CI −1.8 to 1.7)

WOMAC Pain: 4.1 vs. 4.3, difference −0.2 (95% CI −2.0 to 1.5)

WOMAC Stiffness: 3.8 vs. 3.6, difference 0.2 (95% CI −1.6 to 2.0)

WOMAC Activities of Daily Living: 4.8 vs. 4.8, difference −0.02 (95% CI −1.7 to 1.6)

A vs. B vs. C

Adverse Events:

No adverse reactions to the treatment were reported by the subjects.

Mazzuca, 2004¹⁵⁹

1 month

Duration of pain: NR

Fair

A. Superficial Heat (sleeve) (n=25): Cotton and lycra sleeve with a heat retaining polyester and aluminum substrate, minimum 12 hours/day; continue usual pain medication(s).

B. Placebo Sleeve (n=24): Placebo sleeves did not contain the heat retaining substrate layer.

A + B

Age: 62.7

Female: 77%

Race: 67% white

Baseline WOMAC Function (17-85)^e: 51.8 (11.8)

Baseline WOMAC Stiffness (2−10)^e: 6.5 (1.4)

Baseline WOMAC Pain (5-25)^d: 15.2 vs. 14.7*

A vs. B

1 month

WOMAC Pain: 13.7 vs. 13.9

Tascioglu, 2004¹⁶⁰

6 months

Duration of pain: 7 years

Poor

A. Active laser 3 joule (n=20): continuous laser therapy (50 mW, 830 nm wavelength) applied to 5 painful points 5 days a week for 2 weeks

B. Active laser 1.5 joule (n=20): continuous laser therapy (50 mW, 830 nm wavelength) applied to 5 painful points 5 days a week for 2 weeks

C. Placebo laser (n=20): sham laser therapy applied to 5 painful points 5 days a week for 2 weeks

A vs. B vs. C

Age: 63 vs. 60 vs. 64 years

Female: 70% vs. 75% vs. 65%

Baseline WOMAC function (0-68): 36.6 vs. 38.0 vs. 39.5

Baseline WOMAC stiffness (0-8): 4.1 vs. 4.6 vs. 4.5

Baseline WOMAC pain (0-20): 10.3 vs. 11.6 vs. 9.6

Baseline pain at rest VAS (0-100): 39.1 vs. 41.6 vs. 37.9

Baseline pain at activation VAS (0-100): 68.0 vs. 65.7 vs. 63.9

A vs. C

6 months

WOMAC function: 34.8 vs. 38.7, difference −3.8 (95% CI −9.8 to 2.1)

WOMAC stiffness: 3.9 vs. 4.2, difference −0.3 (95% CI −1.6 to 0.9)

WOMAC pain: 10.4 vs. 9.9, difference 0.6 (95% CI −1.5 to 2.7)

Pain at rest VAS: 38.7 vs. 38.9, difference −0.3 (95% CI −9.8 to 9.3)

Pain at activation VAS: 66.8 vs. 62.0, difference 4.8 (95% CI −4.9 to 14.5)

B vs. C

6 months

WOMAC function: 38.5 vs. 38.7

WOMAC stiffness: 4.5 vs. 4.2

WOMAC pain: 11.3 vs. 9.9

Pain at rest VAS: 40.0 vs. 38.9

Pain at activation VAS: 61.8 vs. 62.0

Thamsborg, 2005¹⁶¹

1.5 months

Duration of pain: 8 years

Fair

A. Pulsed Electromagnetic Fields (n=42): ±50V in 50Hz pulses changing voltage in 3 ms intervals; 2-hour sessions, daily, 5 days per week for 6 weeks (30 total)

B. Sham Electromagnetic Field (n=41): noneffective placebo electromagnetic field; 2 hour sessions, daily, 5 days per week for 6 weeks (30 total)

A vs. B

Age: 60 vs. 60

Female: 47.6% vs. 61%

Race: NR

Baseline WOMAC Activities of Daily Living (0-85): 43.83 vs. 46.49

Baseline WOMAC Stiffness (0-10): 5.74 vs. 5.85

Baseline WOMAC Joint Pain (0-25): 13.15 vs. 14.49

A vs. B

1.5 months

WOMAC Activities of Daily Living: 37.9 vs. 41.3, difference −3.5 (95% CI −4.4 to −2.5)

WOMAC Stiffness: 4.8 vs. 5.2, difference −0.3 (95% CI −0.5 to −0.2)

WOMAC Joint Pain: 11.4 vs. 12.2, difference −0.8 (95% CI −1.1 to −0.6)

A vs. B

Adverse Events:

throbbing sensation, warming sensations or aggravation of pain 28.5% (12/42) vs. 14.6% (6/41)

Yegin, 2017¹⁶⁴

1 month

Duration of pain: NR

Poor

[New trial]

A. Continuous Ultrasound (n=30): 8 minutes to each knee (16 minutes total), 5 days a week for 2 weeks (10 sessions total)

B. Sham Ultrasound (n=32): Identical protocol but with device in off mode, and out of view of patient

All patients: use of analgesics was avoided during treatment until end of first month following completed treatment.

No population details provided

Baseline WOMAC-ADL (0-170): 27.3 vs. 27.7

Baseline WOMAC-Stiffness (0-20): 3 vs. 3.5

Baseline LI-ADL (0-24): 4.5 vs. 5

Baseline VAS-Mobility (0-10): 5 vs. 5.5

Baseline VAS-At Rest (0-10): 1.6 vs. 2.5

Baseline LI-Pain (0-10): 5 vs. 4.5

Baseline WOMAC-Pain (0-50): 8.5 vs. 9.3

A vs. B

Short Term (1 month)

Mean WOMAC-ADL: 18.0 vs. 21.2; mean Δ −9.3 vs. −6.5, p=0.414

Median WOMAC-Stiffness: 1.0 vs. 1.5; median Δ −1.0 vs. 1.0, p=0.614

Median LI-ADL: 3.8 vs. 4.5; median Δ −1.0 vs. −0.5, p=0.490

Median VAS-Mobility: 3.5 vs. 3.0; median Δ −1.0 vs. −2.0, p=0.680

Median VAS-At Rest: 0.1 vs. 0.3; median Δ 0.0 vs. 0.0, p=0.513

Median LI-Pain: 3.0 vs. 3.0; median Δ −1.5 vs. −0.5, p=0.153

Mean WOMAC-Pain: 5.6 vs. 6.6; mean Δ −2.9 vs. −2.6, p=0.77

A vs. B

Short Term (1 month)

SF-36 PCS (0-100): 43.0 vs. 40.0; mean Δ 7.9 vs. 6.1, p=0.466

SF-36 MCS (0-100): 45.2 vs. 46.7; mean Δ −0.3 vs. −0.1, p=0.949

SF-36 Pain: 44.3 vs. 41.4; mean Δ 8.3 vs. 5.4, p=0.247

SF-36 Emotional Role: 55.3 vs. 55.3; median Δ 0.0 vs. 0.0, p=0.790

SF-36 Energy-Vitality: 43.2 vs. 44.8; mean Δ 0.6 vs. 0.7, p=0.943

SF-36 Physical Function: 44.6 vs. 44.6; median Δ 5.3 vs. 2.1, p=0.383

SF-36 Physical Role: 56.2 vs. 56.2; median Δ 0.0 vs. 0.0, p=0.597

SF-36 General Health: 40.6 vs. 40.6; median Δ 0.0 vs. 0.0, p=0.556

SF-36 Mental Health: 44.75 vs. 40.25; median Δ 4.6 vs. 0.0, p=0.072

SF-36 Social Function: 54.4 vs. 57.1; median Δ 0.0 vs. 0.0, p=0.785

Yildiz, 2015¹⁶²

2 months

Duration of pain: Mean 2.8 to 5.1 years

Fair

A. Continuous ultrasound (n=30): 5 times a week for 2 weeks

B. Pulsed ultrasound (n=30): 5 times a week for 2 weeks

C. Sham (n=30): 5 times a week for 2 weeks

All patients performed home exercise program 3 days a week for 8 weeks

A vs. B vs. C

Age: 56 vs. 55 vs. 58 years

Female: 83% vs. 80% vs. 87%

Baseline Lequesne Index score (0-24): 13.2 vs. 12.9 vs. 12.4

Baseline pain at rest VAS (0-10): NR

Baseline pain on movement VAS (0-10): 9.0 vs. 8.6 vs. 8.9

A vs. C

2 months

Lequesne Index: 5.5 vs. 11.7, difference −6.2 (95% CI −8.4 to 4.2)

Pain at rest VAS: NR

Pain on movement VAS: 3.9 vs. 7.2, difference −3.3 (95% CI −4.6 to −2.0)

B vs. C

2 months

Lequesne Index: 6.0 vs. 11.7, difference −5.7 (95% CI −7.7 to −3.7)

Pain at rest VAS: NR

Pain on movement VAS: 3.8 vs. 7.2, difference −3.4 (95% CI −4.7 to −2.0)

Abbreviations: ADL = activity of daily living; CI = confidence interval; EQ-5D = EuroQol Quality of Life Instrument 5-D; HSS = Hospital for Special Surgery; Hz = hertz; J/cm² = joules per square centimeter; kJ = kilojoules; KOOS = Knee Injury and Osteoarthritis Outcome Score; LI = Lequesne Index; MCID = minimal clinically important difference; MHz = megahertz; mW = milliwatts; nm = nanometer; NR = not reported; NRS = numeric rating scale; PSW = pulsed short wave; RR = risk ratio; SKFS = Saudi Knee Function Scale; SF-36 MCS = Short Form 36 Questionnaire Mental Component Score; SF-36 PCS = Short Form 36 Questionnaire Physical Component Score; TENS = transcutaneous electrical nerve stimulation; VAS = visual analog scale; W = watts; WOMAC = Western Ontario and McMaster Universities Osteoarthritis Index; μs = microsecond
a: Unless otherwise noted, followup time is calculated from the end of the treatment period
b: Values estimated from graph
c: The study separated outcome values into slight, moderate, and severe disease patient groups for each treatment arm. These are combined values for each intervention group, estimated from graphs in the study.
d: Values estimated from graph
e: Separate group baseline values not given for stiffness and function subscales
f: Age only reported for population as a whole
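
As a reading aid for the "difference" and RR entries in these tables: the report treats an estimate as statistically significant when its 95% CI excludes 0 for mean differences or 1 for risk ratios. The following is a minimal Python sketch of that check, applied to three rows copied from Table 27. The `Estimate` type and `excludes_null` helper are illustrative names introduced here, not part of the report's analysis, and the sketch assumes the reported intervals are two-sided 95% CIs.

```python
from typing import NamedTuple


class Estimate(NamedTuple):
    label: str
    point: float
    lower: float   # lower bound of the reported 95% CI
    upper: float   # upper bound of the reported 95% CI
    null: float    # 0 for mean differences, 1 for risk ratios


def excludes_null(e: Estimate) -> bool:
    """True when the CI lies entirely on one side of the null value.

    A bound exactly equal to the null value is treated as including it.
    """
    return e.lower > e.null or e.upper < e.null


# Values copied from Table 27 (hypothetical selection for illustration).
rows = [
    Estimate("Brouwer 2006, HSS knee function, 6 months", 3.2, -0.6, 7.0, 0.0),
    Estimate("Giombini 2011, change in WOMAC total, 3 months", -46.4, -58.3, -34.5, 0.0),
    Estimate("Fary 2011, MCID in pain VAS (RR), 6.5 months", 1.3, 0.8, 2.0, 1.0),
]

for e in rows:
    verdict = "CI excludes null" if excludes_null(e) else "CI includes null"
    print(f"{e.label}: {e.point} ({e.lower} to {e.upper}) -> {verdict}")
```

Under this rule, only the Giombini row has an interval that excludes its null value; the other two would be read as showing no clear difference between groups.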

Table 28. Osteoarthritis knee pain: manual therapies

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Abbott, 2013⁴⁷

9.75 months

Duration of diagnosis: 2.6 years

Fair

A. Manual therapy (n=54/30 knee OA): 7 sessions in 9 weeks with 2 additional booster sessions

B. Exercise (n=51/29 knee OA): 7 exercise sessions in 9 weeks with 2 additional booster sessions

C. Usual care (n=51/28 knee OA)

A vs. B vs. C (total population, includes hip OA)

Age: 67 vs. 67 vs. 66 years

Female: 49% vs. 52% vs. 58%

Percent knee OA: 56% vs. 57% vs. 55%

Percent hip OA: 44% vs. 43% vs. 45%

Percent both hip OA and knee OA: 22% vs. 20% vs. 26%

Baseline WOMAC (0-240): 114.8 vs. 95.5 vs. 93.8

A vs. C (knee OA only)

9.75 months

WOMAC mean change from baseline: −31.5 vs. 1.6, p=NR

A vs. B

9.75 months

WOMAC mean change from baseline: −31.5 vs. −12.7, p=NR

Perlman, 2012¹⁸⁴

4 months

Duration of pain: NR

Fair

A1. Massage Therapy Group 1 (MT) (n=25): standard Swedish massage strokes, and specified time allocated to various body regions (therapists agreed not to deviate from protocol); one, 30-minute session per week for 8 weeks (8 total sessions)

A2. MT Group 2 (n=25): Identical to group A1 except differing ‘dosage’ of massage; two, 30-min sessions per week for 4 weeks, then once weekly for 4 weeks (12 total sessions)

A3. MT Group 3 (n=25): Identical to group A1 except differing ‘dosage’ of massage; one, 60-min per week for 8 weeks (8 total sessions)

A4. MT Group 4 (n=25): Identical to group A1 except differing ‘dosage’ of massage; two, 60-min sessions per week for 4 weeks, then once weekly for 4 weeks (12 total sessions)

B. Usual Care (n=25): Continued current treatment without the addition of massage therapy.

A1 vs. A2 vs. A3 vs. A4 vs. B

Age: 70 vs. 62 vs. 63 vs. 64 vs. 64

Female: 60% vs. 72% vs. 76% vs. 68% vs. 76%

Race: 92% vs. 88% vs. 76% vs. 80% vs. 88% white

Baseline WOMAC Total (0-100): 52.9 vs. 50.2 vs. 53.6 vs. 48.0 vs. 53.2

Baseline WOMAC Physical Function (0-100): 52.9 vs. 49.5 vs. 49.8 vs. 48.3 vs. 50.5

Baseline WOMAC Pain (0-100): 52.3 vs. 42.4 vs. 52.5 vs. 44.4 vs. 46.3

Baseline VAS Pain (0-100): 61.2 vs. 64.0 vs. 66.4 vs. 59.2 vs. 57.6

A1 vs. A2 vs. A3 vs. A4 vs. B

4 months:

WOMAC Total, mean change from baseline (95% CI): −14.3 (−22.9 to −5.7) vs. −7.0 (−15.6 to 1.6) vs. −14.2 (−23.4 to −5.0) vs. −15.1 (−25.1 to −5.1) vs. −6.0 (−12.6 to 0.5)

WOMAC Physical Function, mean change from baseline (95% CI): −15.3 (−24.5 to 26.1) vs. −7.4 (−14.8 to 0) vs. −12.1 (−22.0 to −2.1) vs. −14.4 (−23.4 to −5.4) vs. −4.2 (−11.1 to 2.7)

WOMAC Pain, mean change from baseline (95% CI): −12.2 (−22.4 to −2.0) vs. −3.9 (−12.7 to 4.9) vs. −13.7 (−23.4 to −4.0) vs. −14.2 (−24.5 to −3.8) vs. −7.5 (−16.0 to 1.1)

VAS Pain, mean change from baseline (95% CI): −14.4 (−25.9 to −2.8) vs. −14.0 (−24.7 to −3.3) vs. −18.5 (−29.0 to −8.1) vs. −22.8 (−35.5 to −10.1) vs. −11.5 (−21.0 to −2.0)

A1 vs. A2 vs. A3 vs. A4 vs. B

4 months:

WOMAC Stiffness (0-100), mean change from baseline (95% CI): −15.4 (−26.4 to −4.5) vs. −9.6 (−20.6 to 1.3) vs. −16.9 (−28.5 to −5.2) vs. −16.8 (−29.7 to −3.9) vs. −6.4 (−13.2 to 0.4)

Abbreviations: CI = confidence interval; NR = not reported; OA = osteoarthritis; VAS = visual analog scale; WOMAC = Western Ontario and McMaster Universities Osteoarthritis Index
a: Unless otherwise noted, followup time is calculated from the end of the treatment period

Table 29. Osteoarthritis knee pain: mind-body therapies

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Brismee, 2007²¹⁵

1.5 months

Duration of pain: NR

Poor

A. Tai chi (n=18): group tai chi classes for 6 weeks followed by 6 weeks of home video tai chi practice; 40 minute sessions, 3x/week for 12 weeks (36 total)

B. Attention Control (n=13): group lectures and discussions covering health-related topics, no further activity past the 6-week group period; 40-minute sessions, 3x/week for 6 weeks (18 total)

A vs. B

Age: 71 vs. 69

Female: 86.4% vs. 78.9%

Race: NR

WOMAC Total (26−130): 64.6 vs. 59.6

WOMAC Physical Function (17-85): 42.7 vs. 37.6

WOMAC Pain (7−35): 16.5 vs. 16.9

VAS Pain (0−10): 4.7 vs. 4.2

WOMAC Stiffness (2−10): 5.6 vs. 5.1

A vs. B

1.5 months

WOMAC Total: 60.3 vs. 57.7, p=NS

WOMAC Physical Function: 38.6 vs. 37.6, p=NS

WOMAC Pain: 16.4 vs. 16, p=NS

VAS Pain: 3.5 vs. 3.2, p=NS

WOMAC Stiffness: 5.3 vs. 4.5, p=NS

Wang, 2009²¹⁶

3 and 9 months

Duration of pain: 9.7 years

Fair

A. Tai chi (n=20): group tai chi classes, 10 forms from the classic Yang style tai chi; home tai chi practice at least 20 minutes per day with a DVD. Home practice continued after group sessions ended until the 48 week followup.

B. Attention Control (n=20): group classes on nutritional and medical information paired with 20 minutes of stretching. Instruction to practice at least 20 minutes of stretching exercises per day at home.

In both groups, treatments were 2x/week for 12 weeks (24 total), 60 minute sessions

A vs. B

Age: 63 vs. 68

Female: 80% vs. 70%

Race: NR

Baseline WOMAC Physical Function (0−1,700): 707.6 vs. 827

Baseline WOMAC Pain (0-500): 209.3 vs. 220.4

Baseline VAS Patient-Assessed Pain (0−10): 4.2 vs. 4.8

Baseline VAS Physician-Assessed Pain (0−10): 4.8 vs. 5.8

Baseline WOMAC Stiffness (0-200): 105.7 vs. 120.7

A vs. B

3 months

(mean change from baseline)

WOMAC Physical Function: −440.5 (95% CI −574.4 to −306.6) vs. −257.3 (95% CI −391.2 to −123.4); difference −183.2 (95% CI −372.6 to 6.2)

WOMAC Pain: −131.6 (95% CI −177.4 to −85.7) vs. −64.6 (95% CI −110.5 to −18.7); difference −70.0 (95% CI −131.8 to −2.1)

VAS Patient Assessed Pain: −2.4 (95% CI −3.5 to −1.2) vs. −1.7 (95% CI −2.9 to −0.5); difference −0.7 (95% CI −2.3 to 1.0)

VAS Physician Assessed Pain: −2.6 (95% CI −3.3 to −1.9) vs. −2.1 (95% CI −2.8 to −1.3); difference −0.5 (95% CI −1.6 to 0.5)

WOMAC Stiffness: −65.0 (95% CI −86.3 to −43.7) vs. −50.2 (95% CI −71.5 to −28.9); difference −14.8 (95% CI −44.9 to 15.3)

9 months

WOMAC Physical Function: −405.9 (95% CI −539.8 to −271.9) vs. −300.6 (95% CI −434.5 to −166.6); difference −105.3 (95% CI −294.7 to −84.1)

WOMAC Pain: −115.4 (95% CI −161.2 to −69.5) vs. −69.2 (95% CI −115.1 to −23.3); difference −46.2 (95% CI −111.0 to 18.7)

VAS Patient Assessed Pain: −1.7 (95% CI −2.8 to −0.5) vs. −1.7 (95% CI −2.9 to −0.5); difference 0.04 (95% CI −1.6 to 1.7)

VAS Physician-Assessed Pain: −2.5 (95% CI −3.3 to −1.8) vs. −1.5 (95% CI −2.3 to −0.8); difference −1.0 (95% CI −2.1 to 0.02)

WOMAC Stiffness: −64.2 (95% CI −85.5 to −42.8) vs. −60.5 (95% CI −81.8 to −39.2); difference −3.7 (95% CI −33.8 to 26.5)

A vs. B

3 months

(mean change from baseline)

SF-36 PCS (0−100): 10.8 (95% CI 7.3 to 14.3) vs. 6.3 (95% CI 2.8 to 9.8); difference 4.5 (95% CI −0.4 to 9.5)

SF-36 MCS (0−100): 4.4 (95% CI −0.11 to 8.9) vs. 4.5 (95% CI 0.0 to 9.0); difference −0.1 (95% CI −6.5 to 6.3)

CES-D (0-60): −6.4 (95% CI −9.9 to −2.9) vs. −1.1 (95% CI −4.6 to 2.4); difference −5.3 (95% CI −10.2 to −0.4)

9 months

SF-36 PCS: 10.4 (95% CI 6.9 to 13.9) vs. 4.1 (95% CI 0.6 to 7.6); difference 6.3 (95% CI 1.4 to 11.3)

SF-36 MCS: 5.8 (95% CI 1.3 to 10.3) vs. 1.0 (95% CI −3.5 to 5.5); difference 4.8 (95% CI −1.6 to 11.1)

CES-D: −7.3 (95% CI −10.7 to −3.8) vs. 1.7 (95% CI −1.8 to 5.1); difference −8.9 (95% CI −13.8 to −4.0)

Abbreviations: CES-D = Center for Epidemiologic Studies Depression Scale; CI = confidence interval; MCS = Mental Component Score; NR = not reported; NS = not statistically significant; SF-36 MCS = Short-Form 36 Questionnaire Mental Component Score; SF-36 PCS = Short-Form 36 Questionnaire Physical Component Score; VAS = Visual Analog Scale; WOMAC = Western Ontario and McMaster Universities Osteoarthritis Index
a: Unless otherwise noted, followup time is calculated from the end of the treatment period

Table 30. Osteoarthritis knee pain: acupuncture

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Berman, 1999²³⁸

1 month

Duration of pain: mean 7.2 years

Fair

A. Acupuncture + usual care (n=36): 20 minute treatments, 2/week for 8 weeks using Traditional Chinese Medicine theory; 9 acupoints (5 local, 4 distal) with elicitation of de qi; electrical stimulation was used at local points (2.5 to 4 Hz, pulses of 1.0 ms); no new physiotherapy or exercise programs

B. Usual care alone (n=37): current level of oral therapy throughout the trial

A vs. B

Age: 66 vs. 66

Female: 47% vs. 72%

Caucasian: 92% vs. 74%

BMI: 32 vs. 32

Duration of symptoms: 7.5 vs. 6.9 years

Baseline WOMAC total (scale unclear): 48.4 vs. 51.4

Baseline WOMAC function (scale unclear): 34.3 vs. 34.4

Baseline Lequesne Index (0-24): 11.7 vs. 12.3

Baseline WOMAC pain (scale unclear): 9.6 vs. 9.9

A vs. B

1 month

WOMAC total: 31.6 vs. 50.4, difference −18.9 (95% CI −26.5 to −11.2)

WOMAC function: 23.2 vs. 36.8, difference −13.6 (95% CI −19.4 to −7.8)

Lequesne Index: 9.3 vs. 12.4, difference −3.1 (95% CI −4.8 to −1.3)

WOMAC pain: 5.6 vs. 9.5, difference −4.0 (95% CI −5.5 to −2.4)

Berman, 2004²³⁹

6 months

Duration of pain: NR

Fair

A. Acupuncture (n=186): electrical stimulation at knee acupoints (5 local and 4 distal) at low frequency (8 Hz) with square biphasic pulses (0.5 ms pulse width) for 20 minutes.

B. Sham acupuncture (n=183): modified combined insertion (at sham points in abdominal area) and noninsertion (at 3 local and 4 distal points on the knee) procedure; mock electric stimulation was attached to sham needles at the knee for 20 minutes.

Both groups received 8 weeks of 2 sessions per week, followed by 2 weeks of 1 session per week, 4 weeks of 1 session every other week, and 12 weeks of 1 session per month. Total of 26 weeks, 25 possible sessions.

A vs. B

Age: 65 vs. 66 years

Female: 63.2% vs. 61.8%

non-Hispanic white: 70% vs. 70.7%

Bilateral OA: 25.0% vs. 28.9%

Length of diagnosis of OA

<5 years: 53.8% vs. 53%

6−10 years: 19.9% vs. 18.0%

>10 years: 25.8% vs. 29.0%

Using opioids: 5.5% vs. 5.0%

Baseline WOMAC Function (0-68): 31.3 vs. 31.3

Baseline WOMAC Pain (0-20): 8.9 vs. 8.9

A vs. B

6 months

Δ from baseline, WOMAC Function: −12.4 vs. −9.9, p<0.01

Δ from baseline, WOMAC Pain: −3.8 vs. −2.9, p<0.01

A vs. B

6 months

Δ from baseline, SF-36 Physical Health Score: 10.7 vs. 8.2, p=0.21

Δ from baseline, Patient Global Assessment: 0.5 vs. 0.2, p=0.02

Hinman, 2014²⁴⁰

9 months

Duration of pain: mean 7.2 years

Good (sham)

Fair (no treatment)

A. Needle acupuncture (n=70): combination of Western and traditional Chinese acupuncture; maximum of 6 points (4 on study limb and 2 distal points) at initial session, in other sessions points were added at therapist’s discretion. Needles were left in while patient rested.

B. Laser acupuncture (n=71): combination of Western and traditional Chinese acupuncture; delivered to selected points using standard Class 3B laser devices (measured output 10mW and energy output 0.2 J/point)

C. No treatment (n=71): did not receive acupuncture; continued in an observational study, unaware they were in an acupuncture trial

D. Sham laser acupuncture (n=70): same as true laser but no laser was emitted, only red nonlaser light at the probe tip lit up.

For all acupuncture and sham groups, sessions were 20 minutes in duration, 1-2 times per week for 12 weeks (8 to 12 sessions total)

A vs. B vs. C vs. D

Age: 64 vs. 63 vs. 63 vs. 64 years

Female: 46% vs. 39% vs. 56% vs. 56%

Duration of symptoms ≥ 10 years: 41% vs. 38% vs. 27% vs. 50%

Bilateral symptoms: 64% vs. 66% vs. 51% vs. 63%

Opioid use: 1% vs. 3% vs. 1% vs. 1%

Previous acupuncture for knee pain: 7% vs. 13% vs. 7% vs. 3%

Baseline WOMAC function (0-68): 31.3 vs. 27.0 vs. 26.1 vs. 27.5

Baseline NRS activity restriction (0-10): 5.0 vs. 4.3 vs. 4.1 vs. 4.5

Baseline WOMAC pain (0-20): 9.0 vs. 8.3 vs. 7.8 vs. 8.6

Baseline NRS average pain overall (0-10): 5.3 vs. 4.9 vs. 5.1 vs. 5.0

Baseline NRS pain on walking (0-10): 5.5 vs. 4.8 vs. 4.8 vs. 5.2

Baseline NRS pain on standing (0-10): 4.6 vs. 3.8 vs. 4.1 vs. 4.3

A vs. C

9 months

WOMAC function: 22.4 vs. 23.6; adjusted difference −3.7 (95% CI −8.2 to 0.8)

Activity restriction, NRS: 3.4 vs. 4.1; adjusted difference −1.1 (95% CI −2.1 to −0.2)

WOMAC pain: 6.7 vs. 7.4; adjusted difference −1.4 (95% CI −2.7 to 0.0)

Overall Pain, NRS: 4.0 vs. 4.6; adjusted difference −0.7 (95% CI −1.6 to 0.2)

Pain on walking, NRS: 4.1 vs. 4.4; adjusted difference −0.6 (95% CI −1.5 to 0.4)

Pain on standing, NRS: 3.7 vs. 4.0; adjusted difference −0.5 (95% CI −1.4 to 0.5)

B vs. C

9 months

WOMAC function: 22.6 vs. 23.6; adjusted difference −0.6 (95% CI −1.5 to 0.3)

Activity restriction, NRS: 3.7 vs. 4.1; adjusted difference −0.4 (95% CI −1.4 to 0.5)

WOMAC pain: 7.1 vs. 7.4; adjusted difference −0.4 (95% CI −1.8 to 1.0)

Overall Pain, NRS: 4.0 vs. 4.6; adjusted difference −0.6 (95% CI −1.5 to 0.3)

Pain on walking, NRS: 4.1 vs. 4.4; adjusted difference −0.3 (95% CI −1.2 to 0.7)

Pain on standing, NRS: 3.8 vs. 4.0; adjusted difference −0.2 (95% CI −1.1 to 0.8)

B vs. D

9 months

WOMAC function: 22.6 vs. 21.6; adjusted difference 1.1 (95% CI −4.8 to 7.0)

Activity restriction, NRS: 3.7 vs. 3.9; adjusted difference −0.1 (95% CI −1.1 to 1.0)

WOMAC pain: 7.1 vs. 6.9; adjusted difference 0.0 (95% CI −1.9 to 1.9)

Overall pain, NRS: 4.0 vs. 3.9; adjusted difference 0.0 (95% CI −0.9 to 1.0)

Pain on walking, NRS: 4.1 vs. 4.2; adjusted difference 0.0 (95% CI −1.0 to 1.1)

Pain on standing, NRS: 3.8 vs. 3.5; adjusted difference 0.5 (95% CI −0.7 to 1.6)

A vs. C

9 months

AQoL-6D (−0.04 to 1.00): 0.74 vs. 0.77; adjusted difference: −0.01 (95% CI −0.07 to 0.05)

SF-12 PCS (0-100): 41.7 vs. 38.9; adjusted difference 2.3 (95% CI −1.7 to 6.3)

SF-12 MCS (0-100): 51.1 vs. 54.4; adjusted difference −0.9 (95% CI −5.2 to 3.4)

Opioid use: 0% (0/70) vs. 1% (1/71)

B vs. C

9 months

AQoL-6D: 0.73 vs. 0.77; adjusted difference: 0.01 (95% CI −0.05 to 0.06)

SF-12 PCS: 38.8 vs. 38.9; adjusted difference −0.4 (95% CI −4.4 to 3.6)

SF-12 MCS: 52.1 vs. 54.4; adjusted difference −0.9 (95% CI −5.5 to 3.7)

Opioid use: 2% (1/71) vs. 1% (1/71)

B vs. D

9 months

AQoL-6D: 0.73 vs. 0.74; adjusted difference 0.01 (95% CI −0.05 to 0.08)

SF-12 PCS: 38.8 vs. 38.2; adjusted difference 0.4 (95% CI −3.8 to 4.5)

SF-12 MCS: 52.1 vs. 52.8; adjusted difference −0.6 (95% CI −5.4 to 4.2)

Opioid use: 2% (1/71) vs. 0% (0/70)

Jubb, 2008²⁴¹

1 month

Duration of pain: mean 10 years

Fair

A. Acupuncture (n=34): manual acupuncture (10 minutes, total of 9 points; depth of 1-1.5 cm; elicitation of de qi) and electro-acupuncture (10 minutes each on anterior and posterior part of the knee (20 minutes total); low frequency, delivered at 6 Hz at a constant current)

B. Sham (n=34): sham needles, did not penetrate the skin; electrical stimulation apparatus produced sound signals but no electrical current.

Both groups received 30 minute treatments, 2/week for 5 weeks, with 10 sessions in total

A vs. B

Age: 64 vs. 66 years

Female: 85% vs. 76%

Caucasian: 74% vs. 85%

Duration of symptoms: 10 vs. 9.6 years

Baseline WOMAC function (0-1700): 1028 vs. 979

Baseline WOMAC pain (0-500): 294 vs. 261

Baseline Total body pain, VAS (0-100): 49 vs. 49

Baseline Night pain knee, VAS (0-100): 61 vs. 52

Baseline Overall pain knee, VAS (0-100): 63 vs. 53

Baseline Weight-bearing pain knee, VAS (0-100): 71 vs. 60

Baseline EuroQoL VAS (0-100): 63 vs. 54

A vs. B

1 month

WOMAC function: change from baseline, 137 (95% CI 20 to 255) vs. 134 (95% CI 9 to 258); difference, 4 (95% CI −163 to 171)

WOMAC pain: change from baseline, 59 (95% CI 16 to 102) vs. 13 (95% CI −22 to 50); difference, 46 (95% CI −9 to 100)

Weight-bearing knee pain (VAS), change from baseline, 19 (95% CI 9 to 30) vs. 8 (95% CI −1 to 16); difference, 11 (95% CI −2 to 25)

Overall knee pain (VAS), change from baseline, 14 (95% CI 5 to 24) vs. 2 (95% CI −6 to 10); difference, 12 (95% CI −1 to 24)

Nighttime knee pain (VAS), change from baseline, 10 (95% CI −1 to 22) vs. 5 (95% CI −3 to 14); difference, 5 (95% CI −9 to 19)

General body pain (VAS), change from baseline, 5 (95% CI −5 to 15) vs. −8 (95% CI −1 to 18); difference: 13 (95% CI 0 to 27)

EuroQoL-VAS: mean 63 vs. 52, p=0.98

Lansdown, 2009²⁴²

9.5 months

Duration of pain NR

Poor

A. Acupuncture + usual care (n=15): once per week for up to 10 weeks, with maximum of 10 sessions, which varied in length and content (mean number of acupoints was 12, range 4-24; de qi was usually elicited; variety of stimulation methods used including tonification and reduction; retention time for needles ranged from 10-30 minutes); auxiliary treatment included moxibustion (3/14, 21%) and acupressure massage (3/14, 21%); life style advice 11/14 (79%)

B. Usual care (n=15): any appointments, medications (prescribed or over the counter), and interventions sought by participants from any health practitioner

A vs. B

Age: 63 vs. 64 years

Female: 60% vs. 60%

Caucasian: 100% vs. 100%

Duration of symptoms: NR

Baseline WOMAC total (0-96): 31 vs. 37.5

Baseline WOMAC function (0-68): 20.5 vs. 26.3

Baseline OKS (12-60): 30.9 vs. 30.6

Baseline WOMAC pain (0-20): 7.3 vs. 7.4

A vs. B

9.5 months

WOMAC total: 24.8 vs. 25.6, adjusted difference −2.9 (95% CI −15.4 to 9.5)

WOMAC function: 17.4 vs. 17.6, adjusted difference −1.4 (95% CI −11.4 to 8.7)

WOMAC pain: 4.7 vs. 5.3 (3.9), adjusted difference −1.4 (95% CI −3.6 to 0.8)

OKS: 24.5 vs. 28.1; difference −3.6 (95% CI −9.8 to 2.6)

A vs. B

9.5 months

(SF-36 scales are 0-100 for all)

SF-36 physical functioning: 54.2 vs. 55.6, difference −1.4 (95% CI −21.8 to 19.0)

SF-36 social functioning: 81.3 vs. 76.6, difference 4.7 (95% CI −10.6 to 20.0)

SF-36 role physical: 71.4 vs. 57.8, difference 13.6 (95% CI −6.3 to 33.5)

SF-36 role mental: 79.2 vs. 67.7, difference 11.5 (95% CI −5.8 to 28.8)

SF-36 mental health: 73.1 vs. 65.0, difference 8.1 (95% CI −5.4 to 21.6)

SF-36 vitality: 58.2 vs. 46.9, difference 11.3 (95% CI −0.22 to 22.8)

SF-36 pain: 65.2 vs. 65.9, difference −0.7 (95% CI −15.6 to 14.2)

SF-36 general health: 67.7 vs. 62.4, difference 5.3 (95% CI −4.8 to 15.4)

EQ5D: 0.7 vs. 0.63, difference 0.03 (95% CI −0.13 to 0.19)

Suarez-Almazor, 2010²⁴³

1.5 months

Duration of pain: mean 8 years

Good (sham)

Fair (waitlist)

A. Electro-acupuncture (n=153): Traditional Chinese Medicine points; TENS equipment emitted a dense disperse wave (50 Hz, dispersed at 15 Hz, 20 cycles/minute); voltage increased from 5V to 60V until maximal tolerance achieved. Patients rested for 20 minutes with the needles retained and with continuing TENS.

B. Sham (n=302): 40 Hz adjustable wave; voltage increased until the patient could feel it and then immediately turned off. Patients rested for 20 minutes with the needles retained, but without TENS stimulation; nonrelevant acupoints used and depth of needle placement was shallow

C. Waitlist (n=72)

A vs. B vs. C

Age: 65 vs. 65 vs. 64

Female: 66% vs. 65% vs. 58%

Caucasian: 70% vs. 68% vs. 65%

Mean duration of chronicity: 9.2 vs. 8.6 vs. 11.5 years

Baseline WOMAC function (0-100): 42.9 vs. 44.6 vs. 40.1

Baseline WOMAC pain (0-100): 44.5 vs. 45.0 vs. 44.1

Baseline VAS pain (0-100): 58.3 vs. 57.4 vs. 54.6

Baseline J-MAP (1-7): 4.4 vs. 4.4 vs. 4.3

A vs. B

1.5 months

WOMAC function: 31.2 vs. 32.1; difference −0.9 (95% CI −4.4 to 2.6)

WOMAC pain: 30.8 vs. 31.0; difference −0.2 (95% CI −3.8 to 3.4)

VAS pain: 36.2 vs. 36.7; difference −0.5 (95% CI −6.1 to 5.1)

J-MAP: 3.3 vs. 3.4; difference −0.1 (95% CI −0.39 to 0.19)

A vs. C

1.5 months

WOMAC function: 31.2 vs. 41.7; difference −10.5 (95% CI −15.6 to −5.5)

WOMAC pain: 30.8 vs. 42.4; difference −11.6 (95% CI −16.5 to −6.7)

VAS pain: 36.2 vs. 53.2; difference −17.0 (95% CI −24.7 to −9.3)

J-MAP: 3.3 vs. 4.2; difference −0.9 (95% CI −1.3 to −0.5)

A vs. B

1.5 months

SF-12 PCS (0-100): 39.5 vs. 38.7; difference 0.8 (95% CI −1.1 to 2.7)

SF-12 MCS (0-100): 54.1 vs. 53.2; difference 0.9 (95% CI −0.8 to 2.6)

A vs. C

1.5 months

SF-12 PCS: 39.5 vs. 35.8; difference 3.7 (95% CI 1.0 to 6.4)

SF-12 MCS: 54.1 vs. 51.6; difference 2.5 (95% CI 0.04 to 5.0)

Williamson, 2007⁶⁷

1.5 months

Duration of symptoms: NR

Poor

A. Acupuncture (n=60): conducted by a physiotherapist in a group setting (6-10 patients); needles inserted into 7 acupoints until de qi was achieved and left in place for 20 minutes; treatments were once per week for 6 weeks, with 6 sessions in total

B. Combination Exercise (Physiotherapy) (n=60): supervised group (6-10 people) exercise comprising strengthening, aerobic, stretching, and balance training; 60 minutes, once per week for 6 weeks

C. Usual care (n=61): exercise and advice leaflet; told they were enrolled in the “home exercise group”

A vs. B vs. C

Age: 72 vs. 70 vs. 70 years

Female: 55% vs. 52% vs. 54%

BMI: 30.9 vs. 32.8 vs. 32.7

Baseline WOMAC total (scale unclear): 50.9 vs. 50.2 vs. 51.1

Baseline OKS (12-60): 40.2 vs. 39.3 vs. 40.5

Baseline pain VAS (0-10): 7.3 vs. 6.8 vs. 6.9

Baseline HAD Anxiety (0-21): 7.3 vs. 7.5 vs. 6.7

Baseline HAD Depression (0-21): 7.1 vs. 7.1 vs. 7.4

A vs. B

1.5 months

WOMAC: 48.4 vs. 49.4, difference −1.0 (95% CI −6.7 to 4.7)

OKS: 38.1 vs. 38.8, difference −0.7 (95% CI −3.5 to 2.1)

Pain VAS: 6.6 vs. 6.4, difference 0.22 (95% CI −0.67 to 1.11)

A vs. C

1.5 months

WOMAC: 48.4 vs. 52.3, difference −3.9 (95% CI −9.5 to 1.6)

OKS: 38.1 vs. 40.8, difference −2.6 (95% CI −5.4 to 0.1)

Pain VAS: 6.6 vs. 7.2, difference −0.66 (95% CI −1.45 to 0.12)

A vs. B

1.5 months

HAD Anxiety: 6.9 vs. 7.1, difference −0.20 (95% CI −1.89 to 1.49)

HAD Depression: 6.7 vs. 6.8, difference −0.03 (95% CI −1.30 to 1.24)

A vs. C

1.5 months

HAD Anxiety: 6.9 vs. 6.5, difference 0.34 (95% CI −1.11 to 1.8)

HAD Depression: 6.7 vs. 7.1, difference −0.41 (95% CI −1.63 to 0.8)

Witt, 2005²⁴⁴

4 and 10 months

Duration of pain: mean 9.4 years

Fair

A. Acupuncture (n=150): semi-standardized; patients received at least 6 local and at least 2 distant Traditional Acupuncture points; elicitation of de qi; needles stimulated manually at least once during each session

B. Minimal acupuncture (n=76): superficial insertion of needles at nonacupuncture sites away from the knee; manual stimulation of the needles and provocation of de qi were avoided

Both groups underwent 12 sessions of 30 minutes duration, administered over 8 weeks

A vs. B

Age: 65 vs. 63 years

Female: 70% vs. 65%

Duration of symptoms: 9.1 vs. 9.9 years

Bilateral OA: 74% vs. 77%

Previous acupuncture: 9% vs. 7%

Baseline WOMAC total (scale unclear): 50.8 vs. 52.5

Baseline PDI (Disability) (0-70): 27.9 vs. 27.8

Baseline VAS pain (0-100): 64.9 vs. 68.5

A vs. B

4 months

WOMAC total: 30.4 vs. 36.3; difference −5.8 (95% CI −12.0 to 0.3)

WOMAC physical function: 30.4 vs. 36.5; difference −6.2 (95% CI −12.4 to 0.1)

PDI: 18.6 vs. 22.8; difference −4.2 (95% CI −8.3 to −0.0)

WOMAC pain: 28.9 vs. 33.8; difference −4.8 (95% CI −11.2 to 1.6)

10 months

WOMAC Total: 32.7 vs. 38.4; difference −5.7 (95% CI −12.1 to 0.7)

WOMAC physical function: 33.0 vs. 38.9; difference −5.9 (95% CI −12.5 to 0.7)

PDI: 20.0 vs. 23.6; difference −3.6 (95% CI −7.7 to 0.5)

WOMAC pain: 30.0 vs. 33.5; difference −3.5 (95% CI −10.0 to 3.0)

A vs. B

4 months

SF-36 Physical: 35.1 vs. 33.0; difference 2.1 (95% CI −0.5 to 4.8)

SF-36 Mental: 52.6 vs. 51.7; difference 0.9 (95% CI −2.3 to 4.2)

ADS (Depression): 48.2 vs. 48.7; difference −0.5 (95% CI −3.6 to 2.5)

10 months

SF-36 Physical: 35.0 vs. 32.8; difference 2.2 (95% CI −0.6 to 5.1)

SF-36 Mental: 52.9 vs. 51.1; difference 1.9 (95% CI −1.3 to 5.1)

ADS: 48.6 vs. 49.8; difference −1.2 (95% CI −4.3 to 1.8)

Yurtkuran, 2007²⁴⁵

3 months

Duration of pain: mean 5.4 years

Fair

A. Laser acupuncture (n=28): applied to the medial side of the knee to the acupuncture point on the sural nerve; infrared 27 GaAs diode laser instrument (output 4 mW, 10 mW/cm² power density, 120-sec treatment time and 0.48 J dose per session); irradiation was pulsed (duration of 1 pulse was 200 nanoseconds), and only one point was treated with contact application technique.

B. Sham laser acupuncture (n=27): performed in the same location and under the same conditions as the true laser acupuncture; patients could see a red light but the machine was turned off

Both groups: 20-minute sessions, 5 days per week for 2 weeks (total duration of therapy was 10 days, 10 sessions total); in addition, all patients received a home-based, standardized exercise program

A vs. B

Age: 52 vs. 53 years

Female: 96% vs. 96%

Duration of symptoms: 5.2 vs. 5.6 months

Baseline WOMAC total: 66.5 vs. 51.3

Baseline WOMAC physical function: 47.5 vs. 35.3

Baseline WOMAC pain: 13.7 vs. 11.6

Baseline VAS pain on movement (0-10): 6.5 vs. 6.1

A vs. B

2.5 months

WOMAC total: 62.4 vs. 50.6, difference 11.8 (95% CI −1.0 to 24.6)

WOMAC physical function: 44.2 vs. 35.3, difference 11.9 (95% CI 2.9 to 20.9)

WOMAC pain: 13.5 vs. 11.5, difference 2.0 (95% CI −1.3 to 5.3)

VAS pain on movement: 5.6 vs. 4.8, difference 0.8 (95% CI −0.9 to 2.5)

A vs. B

2.5 months

NHP (0-38): 7.6 vs. 6.4, difference 1.2 (95% CI −2.1 to 4.4)

AQoL = Assessment of Quality of Life; ADS = Anxiety and Depression Scale; BMI = Body Mass Index; CI = confidence interval; HAD = Hospital Anxiety and Depression Scale; J-MAP = Joint-specific Multidimensional Assessment of Pain; NHP = Nottingham Health Profile; NR = not reported; NRS = numeric rating scale; OA = osteoarthritis; OKS = Oxford Knee Score; SF-12 MCS = Short-Form 12 Questionnaire Mental Component Score; SF-12 PCS = Short-Form 12 Questionnaire Physical Component Score; SF-36 = Short-Form 36 Questionnaire; V = volt; VAS = Visual Analog Scale; WOMAC = Western Ontario and McMaster Universities Osteoarthritis Index
a: Unless otherwise noted, followup time is calculated from the end of the treatment period

Table 31. Osteoarthritis hip pain: exercise

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Abbott, 2013⁴⁷

9.75 months

Duration of pain: 9 months

Fair

A. Exercise therapy (n=51/22 hip OA): 7 sessions of strengthening, stretching, and neuromuscular control over 9 weeks, with 2 booster sessions at week 16. Individual exercises prescribed as needed. Home exercise prescribed 3 times weekly

B. Usual care (n=51/23 hip OA): Routine care provided by patient’s own GP and other healthcare providers

A vs. B (total population, includes knee OA)

Age: 67 vs. 66

Females: 49% vs. 63%

% hip OA: 43.1% vs. 45.1%

WOMAC (0-240): 95.5 vs. 93.8

A vs. B (hip OA only)

9.75 months

WOMAC mean change from baseline: −12.4 vs. 6.6

Juhakoski, 2011⁷²

3, 9, and 21 months

Duration of pain: Mean 8.3 to 8.5 years

Fair

A. Exercise + usual care (n=57): 12 strengthening and stretching exercise sessions of 45 minutes once per week, with 4 booster sessions 1 year later

B. Usual care (n=56): normal routine care offered by patient’s own GP.

All patients attended an hour-long session on basic principles of nonoperative treatment of hip OA

A vs. B

Age: 67 vs. 66 years

Female: 68% vs. 72%

Duration of pain: 8.3 to 8.5 years

Baseline WOMAC function (0-100): 24.7 vs. 28.9

Baseline WOMAC pain (0-100): 21.5 vs. 29.1

A vs. B

3 months

WOMAC function: 22.6 vs. 30.1, (difference −7.5, 95% CI −13.9 to −1.0)

WOMAC pain: 23.4 vs. 28.9 (difference −5.5, 95% CI −13.0 to 2.0)

9 months

WOMAC function: 24.6 vs. 27.6 (difference −3.0, 95% CI −9.2 to 3.2)

WOMAC pain: 22.9 vs. 25.0 (difference −2.1, 95% CI −9.2 to 5.0)

21 months

WOMAC function: 24.4 vs. 30.0 (difference −5.6, 95% CI −12.9 to 1.7)

WOMAC pain: 24.1 vs. 27.9 (difference −3.8, 95% CI −12.0 to 4.4)

A vs. B

3 months

Weak opioid^b use (p=0.73):

Not using: 82.5% vs. 87.7%

1-6 times/week: 10.5% vs. 8.8%

Daily: 7.0% vs. 3.5%

9 months

Mean doctor visits for hip OA: 0.5 vs. 0.8, p=0.07

Mean physiotherapy visits for hip OA: 1.3 vs. 2.0, p=0.05

Weak opioid^b use (p=0.12):

Not using: 81.0% vs. 93.1%

1-6 times/week: 10.4% vs. 1.7%

Daily: 8.6% vs. 5.2%

21 months

Mean doctor visits (between 9 and 21 month followup) for hip OA: 0.5 vs. 1.1, p=0.05

Mean physiotherapy visits (between 9 and 21 month followup) for hip OA: 0.4 vs. 1.3, p<0.001

Weak opioid^b use (p=0.70):

Not using: 80.7% vs. 85.2%

1-6 times/week: 12.3% vs. 7.4%

Daily: 7.0% vs. 7.4%

Tak,^c 2005⁷³

6 months, 3 years

Duration of pain: NR

Poor

A. Exercise (n=45): Eight weekly group sessions of strength training, information on a home exercise program, ergonomic advice, and dietary advice

B. Usual care (n=49): Subject-initiated contact with GP. Reference group (n=NR) consisting of weekly stress management sessions for 10 weeks

A vs. B

Age: 68 vs. 69

Female: 64% vs. 71%

Baseline HHS (0-100): 71.1 vs. 71.0

Baseline GARS (18-72): 22.8 vs. 25.3

Baseline SIP-136 physical (0-100): 7.2 vs. 7.6

Baseline pain VAS (0-10): 3.8 vs. 4.2

Baseline HHS pain subscale (0-44): 27.9 vs. 28.8

A vs. B

3 months

HHS: 75.4 vs. 71.1, (difference 4.3, 95% CI −2.2 to 10.8)

GARS: 23.7 vs. 26.3, (difference −2.6, 95% CI −6.0 to 0.8)

SIP-136 physical: 5.1 vs. 8.4, (difference −3.3, 95% CI −5.3 to −1.3)

Pain VAS: 3.5 vs. 5.1, (difference −1.6, 95% CI −2.6 to −0.6)

HHS pain subscale: 29.6 vs. 26.9, (difference −0.9, 95% CI −4.7 to 2.9)

A vs. B

3 months

QoL VAS (0-10): 5.0 vs. 4.2, (difference 1.4, 95% CI −0.2 to 3.0)

HRQoL (7-39): 28.6 vs. 27.3, (difference 0.9, 95% CI −0.4 to 2.2)

Teirlinck, 2016⁷⁴

3 and 9 months

Duration of pain: Median 1 year

Fair

A. Exercise therapy (n=101): 12 sessions over 3 months consisting of strengthening, stretching, and aerobic exercise

B. Usual care (n=102): Routine care provided by patient’s own GP

A vs. B

Age: 64 vs. 67

Females: 62% vs. 55%

Pain duration median (IQR): 365 (810) vs. 365 (819) days

Baseline HOOS function (0-100): 35.4 vs. 32.2

Baseline HOOS pain (0-100): 37.6 vs. 38.9

Baseline ICOAP constant pain (0-20): 5.4 vs. 5.8

Baseline ICOAP intermittent pain (0-24): 8.0 vs. 8.4

Baseline ICOAP total pain (0-100): 30.4 vs. 32.2

A vs. B

3 months

HOOS function: 30.8 vs. 35.3, (adjusted difference −2.4, 95% CI −6.7 to 1.9)

HOOS pain: 34.4 vs. 37.2, (adjusted difference −2.2, 95% CI −6.2 to 1.7)

ICOAP constant pain: 4.0 vs. 5.3, (adjusted difference −0.9, 95% CI −1.9 to 0.1)

ICOAP intermittent pain: 7.0 vs. 7.9, (adjusted difference −0.6, 95% CI −1.7 to 0.6)

ICOAP total pain: 24.9 vs. 29.8, (adjusted difference −3.3, 95% CI −8.0 to 1.4)

9 months

HOOS function: 26.8 vs. 34.2, (adjusted difference −3.0, 95% CI −6.7 to 0.2)

HOOS pain: 31.6 vs. 34.6, (adjusted difference −1.6, 95% CI −6.2 to 3.0)

ICOAP constant pain: 3.6 vs. 4.7, (adjusted difference −0.7, 95% CI −1.7 to 0.4)

ICOAP intermittent pain: 6.1 vs. 7.2, (adjusted difference −0.6, 95% CI −1.8 to 0.6)

ICOAP total pain: 22.2 vs. 27.0, (adjusted difference −2.8, 95% CI −7.6 to 2.0)

A vs. B

3 months

EuroQol 5D-3L (−0.329 to 1.0): 0.77 vs. 0.76, (adjusted difference −0.01, 95% CI −0.06 to 0.04)

9 months

EuroQol 5D-3L: 0.78 vs. 0.78, (adjusted difference −0.01, 95% CI −0.06 to 0.04)

Total hip replacements: 6 vs. 9

CI = confidence interval; GARS = Groningen Activity Restriction Scale; GP = general practitioner; HHS = Harris Hip Score; HOOS = hip disability and osteoarthritis outcome score; HRQoL = Health Related Quality of Life; ICOAP = intermittent and constant pain score; IQR = Inter-quartile range; NR = not reported; OA = osteoarthritis; QoL = quality of life; SIP-136 = Sickness Impact Profile-136; VAS = visual analog scale; WOMAC = Western Ontario and McMaster Universities Osteoarthritis Index
a: Unless otherwise noted, followup time is calculated from the end of the treatment period
b: Authors defined weak opioids as tramadol or codeine
c: Cluster RCT where clusters were formed from participants selecting a time that best fit their schedule
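The followup times listed in these tables map onto the report's followup categories (short term, 1 to <6 months; intermediate term, ≥6 to <12 months; long term, ≥12 months). The sketch below is illustrative only: the function name `followup_category` and the label returned for followup under 1 month are assumptions, not part of the report's methods.

```python
def followup_category(months):
    """Classify postintervention followup (in months) using the report's cutoffs.

    Hypothetical helper for illustration; the cutoffs come from the report's
    Results introduction, the handling of <1 month is an assumption.
    """
    if months < 1:
        return "under 1 month (outside the report's categories)"
    if months < 6:
        return "short term"
    if months < 12:
        return "intermediate term"
    return "long term"


# Followup times reported for Juhakoski, 2011 in Table 31
for m in (3, 9, 21):
    print(m, "months:", followup_category(m))
```

Applied to the 3-, 9-, and 21-month assessments in Juhakoski, 2011, this yields short-term, intermediate-term, and long-term followup, respectively.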

Table 32. Osteoarthritis hip pain: manual therapy

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Abbott, 2013⁴⁷

9.75 months

Duration of diagnosis: 2.6 years

Fair

A. Manual therapy (n=54/24 hip OA): 7 manual therapy sessions in 9 weeks with 2 additional booster sessions

B. Exercise (n=51/22 hip OA), 7 exercise sessions in 9 weeks with 2 additional booster sessions

C. Usual care (n=51/23 hip OA)

A vs. B vs. C (total population, includes knee OA)

Age: 67 vs. 67 vs. 66 years

Female: 49% vs. 52% vs. 58%

Percent knee OA: 56% vs. 57% vs. 55%

Percent hip OA: 44% vs. 43% vs. 45%

Percent both hip OA and knee OA: 22% vs. 20% vs. 26%

Baseline WOMAC (0-240): 114.8 vs. 95.5 vs. 93.8

A vs. B (hip OA only)

9.75 months

WOMAC, mean change from baseline: −22.9 vs. −12.4, p=NR

A vs. C (hip OA only)

9.75 months

WOMAC, mean change from baseline: −22.9 vs. 6.6, p=NR

None

Hoeksma, 2004¹⁹³

3 and 6 months

Duration of symptoms: NR

Fair

A. Manual therapy (n=56): Sessions consisted of stretching followed by traction manipulation in each limited position (high velocity thrust technique).

B. Exercise therapy (n=53): Sessions implemented exercises for muscle functions, muscle length, joint mobility, pain relief, and walking ability and were tailored to the specific needs of the patient. Instructions for home exercises were given.

Both groups received 2 sessions per week for 5 weeks (9 sessions in total).

Age: 72 vs. 71 years

Females: 68% vs. 72%

Symptom duration of 1 month to 5 years: 76% vs. 81%

Severe OA on radiography: 45% vs. 38%

Baseline HHS (0-100): 54 vs. 53

Baseline pain at rest VAS (0-100): 22.5 vs. 23.0

Baseline pain walking VAS (0-100): 34.0 vs. 28.8

A vs. B

3 months

HHS: 68.4 vs. 56.0, adjusted difference 11.1, 95% CI 4.0 to 18.6

Pain at rest VAS: 19.1 vs. 26.9, adjusted difference −7.2, 95% CI −13.8 to −0.5

Pain walking VAS: 16.4 vs. 23.7, adjusted difference −12.1, 95% CI −22.9 to −2.5

6 months

HHS: 70.2 vs. 59.7, adjusted difference 9.7, 95% CI 1.5 to 17.9

Pain at rest VAS: 14.0 vs. 21.6, adjusted difference −7.0, 95% CI −20.3 to 5.9

Pain walking VAS: 17.0 vs. 24.3, adjusted difference −12.7, 95% CI −24.0 to −1.9

A vs. B

3 months

SF-36 physical function (0-100): 45.3 vs. 46.6, adjusted difference −2.1, 95% CI −11.7 to 7.7

SF-36 role physical function: 25.4 vs. 29.8, adjusted difference −23.5 to 10.2

SF-36 bodily pain: 47.4 vs. 46.1, adjusted difference −3.2, 95% CI −13.1 to 6.8

6 months

SF-36 physical function: 50.4 vs. 45.3, adjusted difference 3.1, 95% CI −4.1 to 10.5

SF-36 role physical function: 36.7 vs. 32.4, adjusted difference 2.2, 95% CI −16.8 to 21.1

SF-36 bodily pain: 51.4 vs. 49.9, adjusted difference −1.5, 95% CI −11.1 to 7.7

CI = confidence interval; HHS = Harris Hip Score; NR = not reported; OA = osteoarthritis; SF-36 = Short Form 36 Questionnaire; VAS = visual analog scale; WOMAC = Western Ontario and McMaster Universities Osteoarthritis Index
a: Unless otherwise noted, followup time is calculated from the end of the treatment period

Table 33. Osteoarthritis hand pain: exercise

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Osteras, 2014⁷⁵

3 months

Duration of pain: NR

Poor

A. Exercise (n=46): ROM/strength exercises, 4 group sessions supplemented by instructions for home exercise 3 times per week for 12 weeks

B. Usual care (n=64): Subjects received no particular attention, referral, or treatment from the study.

A vs. B

Age: 67 vs. 65 years

Females: 89% vs. 91%

Fulfillment of ACR criteria for hand OA: 91% vs. 91%

Self-reported hip OA: 39% vs. 46%

Self-reported knee OA: 40% vs. 51%

Other rheumatic disease: 13% vs. 15%

Severe mental distress: 17% vs. 39%

Baseline FIHOA (0-30): 10.8 vs. 9.8

Baseline PSFS (0-10): 3.5 vs. 3.9

Baseline hand pain NRS (0-10): 4.2 vs. 3.9

A vs. B

3 months

FIHOA: 10.9 vs. 10.5; adjusted difference −0.5 (95% CI −1.9 to 0.8)

Hand pain NRS: 4.3 vs. 4.3; adjusted difference −0.2 (95% CI −0.8 to 0.3)

OARSI OMERACT no. of responders: 30% vs. 28% (NS)

A vs. B

3 months

PSFS (0-10): 4.3 vs. 4.4; adjusted difference 0.1 (95% CI −0.7 to 1.0)

Patient global assessment of disease activity (0-10): 4.2 vs. 4.1; adjusted difference 0.1 (95% CI −0.5 to 0.7)

Patient global assessment of disease activity affecting ADL: 3.8 vs. 3.8; adjusted difference −0.2 (95% CI −0.8 to 0.4)

ACR = American College of Rheumatology; ADL = activity of daily living; CI = confidence interval; FIHOA = Functional Index for Hand OsteoArthritis; NR = not reported; NRS = numeric rating scale; NS = not statistically significant; OA = osteoarthritis; OARSI OMERACT = Osteoarthritis Research Society International Outcome Measures in Rheumatology; PSFS = patient-specific function scale; ROM = range of motion
a: Unless otherwise noted, followup time is calculated from the end of the treatment period

Table 34. Osteoarthritis hand pain: physical modalities

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Brosseau, 2005¹⁶⁵

4.5 months

Duration of pain: NR

Good

A. Low-level laser therapy (n=42): 3 J/cm² applied for 1 second each to the skin overlying the radial, median, and ulnar nerves (total of 15 points irradiated); 3 sessions lasting 20 minutes per week for 6 weeks

B. Sham low-level laser therapy (n=46): same procedure as the active treatment but a sham laser probe was used.

A vs. B

Age: 64 vs. 65 years

Female: 74% vs. 83%

Medication use: 60% vs. 61%

Diagnosis of OA: 7.5 vs. 8.5 years

Baseline AUSCAN function (0-4)^b: 2.2 vs. 2.1

Baseline AUSCAN pain (0-4)^b: 2.4 vs. 2.1

Baseline pain intensity VAS (0-100): 56.9 vs. 49.4

A vs. B

4.5 months

AUSCAN function: 1.9 vs. 1.7, difference 0.2 (95% CI −0.2 to 0.6)

AUSCAN pain: 1.9 vs. 1.8, difference 0.1 (95% CI −0.3 to 0.5)

Pain VAS: NR

A vs. B

4.5 months

Patient global assessment:

Fully improved: 0% vs. 3%

Partially improved: 40% vs. 33.3%

No improvement: 60% vs. 52%

Dilek, 2013¹⁶⁶

2.25 months

Duration of pain: Mean 5.5 years

Fair

A. Dip-wrap paraffin bath therapy (n=24): patients dip both hands into 50°C paraffin bath 10 times, paraffin left on for 15 minutes, treatment administered 5 days per week for 3 weeks

B. Control group (n=22): Details NR; assumed to be no treatment

Only paracetamol intake was permitted during the study

A vs. B

Age: 59 vs. 60 years

Female: 83% vs. 91%

Baseline AUSCAN function (0-36)^c: 16.2 vs. 17.1

Baseline AUSCAN pain (0-20)^c: 10.7 vs. 9.8

Baseline Pain at rest, median (VAS 0-10): 5.0 vs. 4.0

Baseline Pain during ADL, median (VAS 0-10): 7.0 vs. 8.0

A vs. B

2.25 months

AUSCAN function: 13.8 vs. 17.8, difference −4.0 (95% CI −8.6 to 0.6)

AUSCAN pain: 6.5 vs. 9.5, difference −3 (95% CI −5.5 to −0.5)

Pain VAS at rest, median: 0.0 vs. 5.0, p<0.001

Pain VAS during ADL, median: 5.0 vs. 7.0, p=0.05

ADL = activity of daily living; AUSCAN = Australian Canadian Osteoarthritis Hand Index; CI = confidence interval; NR = not reported; VAS = visual analog scale
a: Unless otherwise noted, followup time is calculated from the end of the treatment period
b: Data for the AUSCAN was presented as an average of all responses, on a 5-point Likert scale (0-4), for both the physical function (9 items) and pain (5 items) subscale
c: Data for the AUSCAN was presented as a sum of the values across all items within the physical function (9 items) and pain (5 items) subscales; a 5-point Likert scale (0-4) was used to rate each item resulting in score ranges of 0-36 and 0-20, respectively
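Because the two trials in this table report AUSCAN on different scales (item average in footnote b, item sum in footnote c), the arithmetic relating them can be sketched as follows. This is a minimal illustration that assumes complete item responses; the function names are hypothetical, and only the item counts (9 function items, 5 pain items) and the 0-4 Likert range come from the footnotes above.

```python
# Item counts per AUSCAN subscale and Likert maximum, as stated in footnotes b and c
AUSCAN_ITEMS = {"function": 9, "pain": 5}
LIKERT_MAX = 4


def max_sum(subscale):
    """Upper bound of the summed score for a subscale (items x Likert maximum)."""
    return AUSCAN_ITEMS[subscale] * LIKERT_MAX


def sum_to_average(total, subscale):
    """Convert a summed subscale score to the 0-4 item-average scale."""
    return total / AUSCAN_ITEMS[subscale]


def average_to_sum(mean_score, subscale):
    """Convert a 0-4 item-average score to the summed subscale scale."""
    return mean_score * AUSCAN_ITEMS[subscale]


print(max_sum("function"), max_sum("pain"))  # score ranges of 0-36 and 0-20
# Dilek, 2013 baseline function (sum scale), group A, on the item-average scale
print(round(sum_to_average(16.2, "function"), 1))
# Brosseau, 2005 baseline pain (average scale), group A, on the sum scale
print(round(average_to_sum(2.4, "pain"), 1))
```

Such a conversion only rescales the reported means for side-by-side reading; it does not make the two trials' populations or measurement procedures comparable.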

Table 35. Osteoarthritis hand pain: multidisciplinary rehabilitation

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Stukstette, 2013²⁶¹

3 months

Duration of pain: Mean 4 years

Fair

A. Multidisciplinary treatment program (n=75): 4 group based therapy sessions of 2.5-3 hours duration (time period NR), supervised by a specialized nurse and occupational therapist

B. Waiting list (n=72)

All patients: 30 minute explanation of written information about OA

A vs. B

Age: 60 vs. 58

Female: 18% vs. 16%

Mean duration of diagnosis: 4 vs. 4 years

Proportion taking opioids: 3% vs. 4%

Baseline AUSCAN function (0-36): 21.0 vs. 21.8

Baseline AUSCAN pain (0-20): 10.4 vs. 10.2

A vs. B

3 months

AUSCAN function: 18.6 vs. 18.8, adjusted difference 0.5 (95% CI −0.09 to 0.4)

AUSCAN pain: 9.4 vs. 9.0, adjusted difference 0.4 (95% CI −0.5 to 1.3)

OARSI OMERACT responders: 33% vs. 37%, OR 0.8 (95% CI 0.4 to 1.6)

A vs. B

3 months

Patient global assessment (0-100): 60.4 vs. 66.0, adjusted difference −5.2 (95% CI −11.4 to 1.0)

SF-36 PCS (0-100): 39.8 vs. 39.9, adjusted difference −0.14 (95% CI −1.62 to 1.35)

SF-36 MCS (0-100): 50.3 vs. 51.6, adjusted difference 0.27 (95% CI −2.13 to 2.67)

AUSCAN = Australian Canadian Osteoarthritis Hand Index; OA = osteoarthritis; OARSI-OMERACT = Osteoarthritis Research Society International Outcome Measures in Rheumatology; OR = Odds ratio; SF-36 MCS = Short-Form 36 Questionnaire Mental Component Score; SF-36 PCS = Short-Form 36 Questionnaire Physical Component Score
a: Unless otherwise noted, followup time is calculated from the end of the treatment period.

Table 36. Fibromyalgia: exercise therapies

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Altan, 2009⁷⁶

3 months

Duration of pain: NR

Fair

A. Pilates (n=25): 1 hour session 3 times per week for 3 months: Pilates postural education, search for neutral position, sitting, antalgic, stretching, and proprioception improvement exercises, and breathing education

B. Attention control (n=25): Instructions in home exercise relaxation/stretching program of 1 hour sessions 3 times per week for 3 months

All patients: Education session about available diagnosis and treatment of FM

A vs. B

Age: 48 vs. 50 years

Female: 100% vs. 100%

Baseline FIQ (0-100): 80.8 vs. 80.1

Baseline pain VAS (0-10): 6.1 vs. 6.3

A vs. B

3 months:

FIQ: 69.3 vs. 77.6, difference −8.3 (95% CI −21.8 to 5.2)

Pain VAS: 5.2 vs. 6.5, difference −1.3 (95% CI −2.6 to 0.03)

A vs. B

3 months:

NHP (0-100): 224.2 vs. 246.3, difference −22.1 (95% CI −96.0 to 51.8)

Baptista, 2012⁷⁷

4 months

Duration of pain: NR

Fair

A. Belly dance (n=40): One hour belly dance classes twice a week for 16 weeks (combination exercise)

B. Waiting list control (n=40): dance offered at end of the study

A vs. B:

Age: 50 vs. 49 years

Female: 100% vs. 100%

Race: NR

Baseline FIQ (0-10): 5.9 vs. 6.3

Baseline pain VAS (0-10): 7.7 vs. 7.5

A vs. B

4 months

FIQ: 4.3 vs. 5.9; difference −1.6 (95% CI −2.5 to −0.8)

Pain VAS: 4.7 vs. 7.3; difference −2.6 (95% CI −3.6 to −1.6)

A vs. B

4 months

BDI (0-63): 23.1 vs. 23.5; difference −0.40 (95% CI −7.09 to 6.29)

STAI part 1: 49.4 vs. 51.8; difference −2.40 (95% CI −6.87 to 2.07)

STAI part 2: 49.8 vs. 54.1; difference −4.3 (95% CI −8.72 to 0.12)

SF-36 function (0-100): 56.3 vs. 39.1; difference 17.2 (95% CI 7.55 to 26.85)

SF-36 limitation due to physical aspects (0-100): 36.5 vs. 13.8; difference 22.7 (95% CI 9.06 to 36.34)

SF-36 pain (0-100): 46.0 vs. 29.1; difference 16.9 (95% CI 7.62 to 26.18)

SF-36 mental (0-100): 52.3 vs. 46.2; difference 6.1 (95% CI −3.89 to 16.09)

Buckelew, 1998⁷⁸

3 and 24 months

Duration of symptoms: 11 years

Poor

A. Combination exercise (n=30): included active range of motion exercises, strengthening exercises, low to moderate intensity aerobic exercise, proper posture and body mechanics instruction, and instructions on use of heat, cold, and massage; one 90 minute session per week for 1.5 months and instructions to train 2 additional times independently per week then 24 months of monthly 1-hour groups.

B. Attention control (n=30): one 90-180 minute education session weekly for 1.5 months

A vs. B

Age: 46 vs. 44 years

Female: 93% vs. 90%

Duration of symptoms: 12 vs. 10 years

Duration of diagnosis: 3.0 vs. 2.5 years

Baseline AIMS physical activity subscale (0-10): median 4.0 vs. 6.0

Baseline pain VAS (0-10): median 6.3 vs. 5.9

A vs. B

3 months:

AIMS physical activity subscale: median 4.0 vs. 6.0; median change from baseline 0 vs. 0

Pain VAS: median 5.4 vs. 5.8, median change from baseline −0.8 vs. −0.5

24 months

AIMS physical activity subscale: median 4.0 vs. 6.0, median change from baseline 0 vs. 0

Pain VAS: median 5.5 vs. 5.4, median change from baseline −1.2 vs. −0.6

A vs. B

3 months:

SCL-90-R Global Severity Index (0-90): median 65.5 vs. 65.0, median change from baseline −3 vs. 0

CES-D (0-60): median 13.5 vs. 13.0, median change from baseline −2.5 vs. 3

Sleep scale (0-12), median 8.0 vs. 5.0, median change from baseline 0 vs. 0

24 months

SCL-90-R Global Severity Index: median 65.5 vs. 67.0, median change from baseline −2.5 vs. −1

CES-D: median 11.5 vs. 12.0, median change from baseline −3.5 vs. −2

Sleep scale: median 7.5 vs. 6.0, median change from baseline 0 vs. 0

Clarke-Jenssen, 2014⁷⁹

3 and 12 months

Duration of symptoms: 14 years

Fair

A. Aerobic exercise (n=44): conducted on land and in warm water provided in a warm climate; also stretching, relaxation, and education; provided in groups 5 days per week for 4 weeks

B. Aerobic exercise (n=44): on land and in warm water provided in a cold climate; also stretching, relaxation, education, provided in groups 5 days per week for 4 weeks

C. Usual Care (n=44): no intervention

A vs. B vs. C:

Age: 46 vs. 46 vs. 45 years

Female: 88% vs. 93% vs. 96%

Symptom duration: 17 vs. 13 vs. 12 years

Baseline pain VAS (mean, 0-10): 6.6 vs. 6.9 vs. 6.6

A vs. C, between-group difference in change from baseline:

3 months

FIQ: data NR, p=NS

Pain VAS: −1.2 (95% CI −2.2 to −0.1)

12 months

FIQ: data NR, p=NS

Pain VAS: 0.1 (95% CI −0.9 to 1.1)

B vs. C, between-group difference in change from baseline:

3 months

FIQ: data NR, p=NS

Pain VAS: −0.9 (95% CI −1.9 to 0.2)

12 months

FIQ: data NR, p=NS

Pain VAS: 0 (95% CI −1 to 1)

A vs. C, between-group difference in change from baseline:

3 months

HADS: data NR, p=NS

SF-36 Physical: data NR, p=NS

SF-36 Mental: data NR, p=NS

12 months

HADS: data NR, p=NS

SF-36 Physical: data NR, p=NS

SF-36 Mental: data NR, p=NS

B vs. C, between-group difference in change from baseline:

3 months

HADS: data NR, p=NS

SF-36 Physical: data NR, p=NS

SF-36 Mental: data NR, p=NS

12 months

HADS: data NR, p=NS

SF-36 Physical: data NR, p=NS

SF-36 Mental: data NR, p=NS

Da Costa, 2005⁸⁰

3 and 9 months

Duration of symptoms: 11 years

Fair

A. Combination Exercise (n=39): aerobic exercise, stretching, and strength exercises; 4 visits (initial 90 minutes, others 30 minutes) over 12 weeks with exercise physiologist; individualized home-based program.

B. Usual care (n=41): subjects asked to record exercise activity weekly during the 12-week intervention phase and monthly thereafter.

A vs. B

Age, years: 49 vs. 52

Female: 100% vs. 100%

Symptom duration: 10.5 vs. 11.2 years

Baseline FIQ (0-100): 55.1 vs. 48.6

Baseline upper body pain VAS (0-100): 49.5 vs. 47.4

Baseline lower body pain VAS (0-100): 47.0 vs. 47.0

A vs. B, mean change from baseline

3 months:

FIQ: −7.8 (95% CI −13.9 to −1.7) vs. −0.04 (95% CI −5.2 to 5.1), p=0.05

Pain VAS, upper body: −10.6 (95% CI −17.8 to −3.4) vs. −1.9 (95% CI −6.9 to 3.2), p=0.048

Pain VAS, lower body: −8.21 (95% CI −15.7 to −0.74) vs. −2.0 (95% CI −9.4 to 5.4), p=0.24

9 months:

FIQ: −10.1 (95% CI −16.1 to −4.0) vs. −0.024 (95% CI −4.4 to 3.9), p=0.009

Pain VAS, upper body: −7.9 (95% CI −14.3 to −1.4) vs. 2.4 (95% CI −3.7 to 8.5), p=0.02

Pain VAS, lower body: −5.6 (95% CI −13.3 to 2.2) vs. −0.29 (95% CI −8.6 to 8.0), p=0.35

A vs. B, mean change from baseline

3 months:

SCL 90-R GSI (30-81): −0.02 (95% CI −0.3 to −0.04) vs. −0.07 (95% CI −0.2 to 0.05), p=0.26

9 months:

SCL 90-R GSI (30-81): −0.16 (95% CI −0.28 to 0.35) vs. −0.09 (95% CI −0.21 to 0.03), p=0.39

Fontaine, 2010, 2011⁸¹^,⁸²

6 and 12 months

Duration of fibromyalgia: Mean 7.4 years

Fair

A. Aerobic Exercise (n=30): Lifestyle Physical Activity; 6, 60-minute group sessions over 3 months with the goal to increase moderate-intensity physical exercise by accumulating short bursts of physical activity throughout the day to 30 minutes 5-7 days per week.

B. Attention control (n=23): FM education, monthly sessions for 3 months. Included education about FM and social support.

A vs. B

Age: 46 vs. 49 years

Female: 94% vs. 100%

Race, white: 78% vs. 82%

Years since diagnosis: 5.9 vs. 9.6

Baseline FIQ (scale NR): 67.5 vs. 69.7

Baseline pain VAS (0-100): 54.6 vs. 58.9

A vs. B

6 months:

FIQ: 65.3 vs. 63.9, difference 1.4 (95% CI −10.0 to 12.8)

Pain VAS: 54.9 vs. 49.4, difference 5.5 (95% CI −7.8 to 18.8)

12 months:

FIQ: 64.4 vs. 65.1, difference −0.7 (95% CI −13.6 to 12.2)

Pain VAS: 51.6 vs. 50.9, difference 0.7 (95% CI −12.9 to 14.3)

A vs. B

6 months:

CES-D (scale NR): 18.1 vs. 19.9, difference −1.8 (95% CI −7.5 to 3.9)

12 months:

CES-D: 19.8 vs. 20.6, difference −0.8 (95% CI −7.1 to 5.5)

Giannotti, 2014⁸³

1 and 6 months

Duration of pain: NR

Poor

A. Combination exercise (n=21): stretching, strengthening, active and passive mobilization, spine flexibility, and aerobic training plus education 2 days a week (60 minutes per session) for 10 weeks; instructions to perform at home the exercise program at least 3 times per week.

B. No intervention (n=20)

A vs. B

Age: 53 vs. 51 years

Female: 95% vs. 92%

Baseline FIQ (0-100): 62.7 vs. 59.1

Baseline pain VAS (0-10): 6.1 vs. 6.1

A vs. B

1 month:

FIQ: 55.5 vs. 50.9, difference 4.6 (95% CI −6.38 to 15.58)

Pain VAS: 5.3 vs. 5.5, difference −0.20 (95% CI −1.87 to 1.47)

6 months

FIQ: 48.8 vs. 56.9, difference −8.1 (95% CI −20.33 to 4.13)

Pain VAS: 5.8 vs. 5.4, difference 0.4 (95% CI −1.4 to 2.2)

A vs. B

1 month

Sleep VAS (0-10): 4.6 vs. 5.0, difference −0.40 (95% CI −2.51 to 1.71)

6 months

Sleep VAS (0-10): 6.3 vs. 6.1, difference 0.20 (95% CI −2.15 to 2.55)

Gowans, 2001⁸⁴

6 months

Duration of symptoms: 9 years

Poor

A. Aerobic exercise (n=30): 3 pool and walking exercise classes (plus stretching) per week for 6 months

B. Control group (n=27): continued ad libitum activity

A vs. B

Age: 45 vs. 50 years

Female: 89% vs. 87%

Baseline FIQ (0-80): 57.7 vs. 56.6

A vs. B

6 months:

FIQ: 48.6 vs. 54.9, p<0.05; difference −6.3 (95% CI −14.8 to 2.2)

A vs. B

6 months:

BDI (0-63): 16.9 vs. 21.3, p<0.05; difference −4.4 (95% CI −10.4 to 1.6), p=0.15

STAI (20-80): 41.3 vs. 51.7, p<0.05; difference −10.4 (95% CI −18.2 to −2.6), p=0.01

Gusi, 2006⁸⁵

3 months

Duration of symptoms: 22 years

Poor

A. Combination exercise (n=18): 1-hour pool exercise (warm up, aerobic exercise, mobility and lower-limb strength exercises, cool down) 3 times per week for 12 weeks (subjects instructed to avoid physical exercise for the next 12 weeks)

B. Control (n=17): Normal daily activities, which did not include any exercise related to those in the therapy.

A vs. B

Age, years: 51 vs. 51

Female: 100% vs. 100%

Baseline pain VAS (0-100): 63.1 vs. 63.9

A vs. B

Change from baseline

3 months

Pain VAS: −1.6 (95% CI −12.7 to 0.9) vs. 0.9 (95% CI −7.3 to 9.2), p=0.69

A vs. B

Change from baseline

3 months

EQ-5D (0-1): 0.14 (95% CI −0.03 to 0.32) vs. −0.02 (95% CI −0.17 to 0.13), p=0.14

EQ-5D Pain/discomfort (1-3): −0.1 (95% CI −0.4 to 0.3) vs. 0 (95% CI −0.3 to 0.3), p=0.79

EQ-5D Anxiety/depression (1-3): −0.5 (95% CI −0.8 to −0.1) vs. 0 (95% CI −0.2 to 0.2), p=0.01

Kayo, 2012⁸⁶

3 months

Duration of symptoms: 5 years

Fair

A. Aerobic exercise (n=30): Walking program, 60 minutes 3 times per week for 16 weeks, supervised by physical therapist.

B. Muscle strengthening exercise (n=30): 60 minutes 3 times per week for 16 weeks, supervised by physical therapist.

C. No treatment (n=30)

A vs. B vs. C:

Age: 48 vs. 47 vs. 46 years

Symptom duration: 4.0 vs. 4.7 vs. 5.4 years

Baseline FIQ total (0-100): 63.1 vs. 67.3 vs. 63.8

Baseline pain VAS (0-10): 8.6 vs. 8.7 vs. 8.4

A vs. C

3 months

FIQ: 38.5 vs. 57.7; overall group X time interaction p=NS

Pain VAS: 4.8 vs. 6.7; overall group X time interaction p=NS

B vs. C

3 months

FIQ: 50.5 vs. 57.7; overall group X time interaction p=NS

Pain VAS: 5.9 vs. 6.7; overall group X time interaction p=NS

King, 2002⁸⁷

3 months

Duration of symptoms: 8.5 years

Poor

A. Aerobic exercise (n=30): aerobic land and water activities; three, 10-40 minute supervised exercise sessions per week for 3 months

B. Control (n=18): instructions on stretches and coping strategies and contacted 1-2 times during the 3 month treatment period to answer any questions

A vs. B

Age: 45 vs. 47 years

Female: 100% vs. 100%

Duration of symptoms: 7.8 vs. 9.6 years

Baseline FIQ (0-80): 52.4 vs. 55.2

A vs. B

3 months

FIQ: 47.5 vs. 51.5, difference −4.0 (95% CI −12.2 to 4.2)

Mannerkorpi, 2009⁸⁸

6-7 months

Duration of pain: NR

Fair

A. Aerobic exercise (n=81): One 45 minute pool aerobic exercise session per week for 20 weeks, stretching exercise also, plus six 1 hour weekly sessions of strategies to cope with FM symptoms, plan for physical activity for the following week and short relaxation exercise

B. Education control (n=85): six 1 hour weekly sessions of strategies to cope with FM symptoms, plan for physical activity for the following week and short relaxation exercise

A vs. B

Age: 45 vs. 47 years

Female: 100% vs. 100%

Baseline FIQ (0-100): 61.6 vs. 66.6

Baseline FIQ pain subscale (0-100): 67.7 vs. 70.4

A vs. B

6-7 months

FIQ: mean change from baseline: −3.9 vs. −4.5, p=0.04

FIQ pain: mean change from baseline: −6.5 vs. −2.5, p=0.018

A vs. B

6-7 months

HADS depression scale (0-21): mean change from baseline −0.4 vs. 0.0, p=0.99

HADS anxiety scale (0-21): mean change from baseline −0.7 vs. 0.4, p=0.15

SF-36 PCS (0-100): mean change from baseline 2.9 vs. 1.3, p=0.13

SF-36 MCS (0-100): mean change from baseline 0.5 vs. 1.3, p=0.15

SF-36 physical functioning (0-100): mean change from baseline 2.2 vs. 1.3, p=0.70

SF-36 role-physical (0-100): mean change from baseline 12.1 vs. 9.3, p=0.72

SF-36 bodily pain (0-100): mean change from baseline 5.0 vs. 3.6, p=0.24

Paolucci, 2015⁸⁹

3 months

Duration of symptoms: NR

Fair

A. Combination exercise (n=19): Low-impact aerobic training, agility training balance and postural exercises, hip flexor strengthening, static stretching, diaphragmatic breathing, and relaxation; 10, 60-minute sessions, twice a week for 5 weeks

B. Control (n=18): No rehabilitation interventions, continued normal activities

A vs. B

Age: 50 vs. 48 years

Female: 100% vs. 100%

Baseline FIQ total (0-100): 64.8 vs. 63.9

A vs. B

3 months:

FIQ total: 53.8 vs. 64.3, difference −10.5 (95% CI −17.8 to −3.2)

Sanudo, 2010⁹²

6 months

Duration of pain: NR

Fair

A. Combination exercise (n=21): supervised aerobic, muscle strengthening, and flexibility exercises; twice-weekly sessions for 24 weeks

B. Aerobic exercise (n=22): warm-up, aerobic exercise, cool down; two, 45-60 minute sessions/week for 6 months

C. Usual care control (n=21): medical treatment for FM and continued normal daily activities, which did not include aerobic exercise.

A vs. B vs. C

Age: 56 vs. 56 vs. 57 years

Baseline FIQ (0-100): 62.2 vs. 60.9 vs. 60.5

A vs. C

6 months

FIQ: mean change from baseline −8.8 vs. NR; p<0.01

B vs. C

6 months

FIQ: mean change from baseline −8.8 vs. NR; p<0.05

A vs. C

6 months

BDI (0-63): mean change from baseline −6.4 vs. NR; p<0.01

SF-36 total (0-100): mean change from baseline 8.4 vs. NR; p<0.01

B vs. C

6 months

BDI: −8.5 vs. NR; p<0.01

SF-36 total: 8.9 vs. NR; p<0.05

Sanudo, 2012⁹¹

6, 18 and 30 months

Duration of pain: NR

Poor

A. Combination exercise (n=21): Twice-weekly 45- to 60-minute sessions of exercise (warm up, aerobic exercise, muscle strengthening exercise, flexibility exercises) for 6 months.

B. Usual care (n=20): alternated between 6 months of training and 6 months with no exercise intervention (asked not to participate in any structured exercise program) for 30 months.

A vs. B

Female: 100% vs. 100%

Baseline FIQ (0-80): 58.6 vs. 55.6

A vs. B

6 months:

FIQ: 48.5 vs. 55.4, p<0.0005; difference −6.9 (95% CI −14.4 to 0.6), p=0.07

18 months:

FIQ: 45.6 vs. 51.3, p=NR; difference −5.7 (95% CI −14.6 to 3.2), p=0.20

30 months

FIQ: 38.5 vs. 49.5, p=NS; difference −11.0 (95% CI −19.9 to −2.1), p=0.02

A vs. B

6 months:

SF-36 (0-100): 49.5 vs. 37.9, p=0.13; difference 4.68 (95% CI 0.096 to 21.104), p=0.02

BDI (0-63): 14.7 vs. 16.6, p=0.18; difference −1.9 (95% CI −6.5 to 2.7), p=0.41

18 months:

SF-36: 51.8 vs. 41.3, p=NR; difference 10.5 (95% CI 0.5 to 20.5), p=0.04

BDI: 14.3 vs. 14.2, p=NR; difference 0.10 (95% CI −5.4 to 5.6), p=0.97

30 months

SF-36: 60.5 vs. 42.0, p=NS

BDI: 9.7 vs. 17.9, p=NS

Sanudo, 2015⁹⁰

6 months

Duration of pain: NR

Poor

A. Aerobic exercise (n=16): consisted of warm up, steady state exercise at 60-65% of predicted maximum heart rate, interval training at 75-80% of predicted maximum heart rate, and cool-down; 2, 45-60 minute sessions per week for 6 months

B. Usual care (n=16): normal activities, which did not include structured exercise.

A vs. B

Age: 55 vs. 58 years

Female: 100% vs. 100%

Baseline pain VAS (0-10): 7.4 vs. 7.2

A vs. B

6 months:

Pain VAS: 6.7 vs. 7.0, difference −0.3 (95% CI −6.3 to 5.7)

A vs. B

6 months

Anxiety VAS (0-10): 5.7 vs. 7.5, difference −1.8 (95% CI −10.8 to 7.2)

Depression VAS (0-10): 5.6 vs. 6.7 (2.2), difference −1.1 (95% CI −10.1 to 7.9)

Sleep disturbance VAS (0-10): 7.2 vs. 8.6 (1.9), difference −1.4 (95% CI −8.9 to 6.1)

Sencan, 2004⁹³

6 months

Duration of pain: 5.4 years

Poor

A. Exercise group (n=14): three 40-minute aerobic exercise sessions per week for 6 weeks

B. Paroxetine (n=18): 20 mg paroxetine/day for 6 weeks

C. Sham (n=20): placebo TENS with electrodes applied to two most painful tender points for 20 minutes, 3 times/week for 6 weeks.

All patients instructed to take paracetamol as a rescue medication throughout the study.

A vs. B vs. C

Age: 35 vs. 36 vs. 36 years

Female: 100% vs. 100% vs. 100%

BMI: 24 vs. 24 vs. 15

Duration of symptoms: 4.7 vs. 6.5 vs. 5.1 years

Baseline VAS (0-10): 6.85 vs. 6.62 vs. 7.70

Baseline Beck Depression Inventory (BDI, 0-60): 16.20 vs. 20.80 vs. 18.50

A vs. C

6 months

VAS: 4.75 vs. 5.01, difference −0.3 (95% CI −1.5 to 0.9)

A vs. B

6 months

VAS: 4.75 vs. 5.84, difference −1.1 (95% CI −2.4 to 0.2)

A vs. C

6 months

BDI: 9.95 vs. 15.15, difference −5.2 (95% CI −7.41 to −2.99)

Analgesic Consumption: 1.15 vs. 4.35, difference −3.17 (95% CI −3.79 to −2.55)

A vs. B

6 months

BDI: 9.95 vs. 10.12, difference −0.17 (95% CI −2.09 to 1.75)

Analgesic Consumption: 1.15 vs. 2.40, difference −1.25 (95% CI −1.39 to −1.11)

Tomas-Carus, 2008/2009⁹⁴^,⁹⁵

8 months

Duration of symptoms: 20 years

Poor

A. Combination exercise (n=17): Pool exercise in 1 hour sessions 3 times per week for 8 months (warm up, aerobic exercise, mobility and lower limb strength exercises using water resistance and upper limb strength exercises without water resistance, cool down)

B. Control (n=16): normal activities for 8 months, which did not include exercise similar to that in group A.

A vs. B

Age: 51 vs. 51 years

Female: 100% vs. 100%

Baseline FIQ Total (0-10): 6.1 vs. 6.3

Baseline FIQ Physical Function (0-10): 3.0 vs. 3.7

Baseline FIQ Pain (0-10): 5.6 vs. 6.4

A vs. B

8 months

FIQ Total: 5.2 vs. 6.5, difference −1.3 (95% CI −2.3 to −0.3)

FIQ Physical Function: 2.4 vs. 3.7, difference −1.3 (95% CI −2.7 to 0.09)

FIQ Pain: 5.3 vs. 6.6, difference −1.3 (95% CI −2.5 to −0.09)

A vs. B

8 months

FIQ Anxiety (0-10): 4.7 vs. 6.6, difference −1.9 (95% CI −3.7 to −0.1)

FIQ Depression (0-10): 4.0 vs. 6.1, difference −2.1 (95% CI −4.1 to −0.1)

STAI State Anxiety (20-80): 37.5 vs. 44.4, difference −6.9 (95% CI −13.2 to −0.6)

SF-36 physical function (0-100): 54.1 vs. 36.6, difference 17.5 (95% CI 3.4 to 31.6)

SF-36 bodily pain (0-100): 51.7 vs. 27.1, difference 24.6 (95% CI 11.6 to 37.6)

SF-36 Mental Health (0-100): 67.3 vs. 49, difference 18.3 (95% CI 2.5 to 34.0)

van Eijk-Hustings, 2013⁹⁶

18 months

Duration of pain: NR

Fair

A. Aerobic exercise (n=47): two group sessions per week for 12 weeks (warm up, aerobic exercise, resistance training to strengthen muscles, cool down). Subjects were asked to practice exercises at home with videodisc once a week.

B. Usual care (n=48): individualized FM education and lifestyle advice within 1-2 consultations, plus care as usual

A vs. B

Age: 44 vs. 43 years

Female: 100% vs. 98%

Baseline FIQ total (0-100): 60.0 vs. 55.4

Baseline FIQ physical function (0-10): 3.6 vs. 3.4

Baseline FIQ Pain (0-10): 6.2 vs. 5.5

A vs. B

18 months:

FIQ total: 52.0 vs. 56.2, ES=0.22 (95% CI −0.20 to 0.61)

FIQ physical function: 3.6 vs. 3.9, ES=0.11 (95% CI −0.29 to 0.52)

FIQ pain: 5.2 vs. 5.3, ES=0.05 (95% CI −0.36 to 0.44)

A vs. B

18 months:

FIQ Depression (0-10): 5.0 vs. 4.2, ES=0.09 (95% CI −0.31 to 0.49)

FIQ Anxiety (0-10): 5.0 vs. 4.8, ES=−0.06 (95% CI −0.46 to 0.34)

EQ-5D (−0.59 to 1): 0.54 vs. 0.51, ES=0.10 (95% CI −0.31 to 0.50)

GP consultations^b: 1.0 vs. 0.7, ES=−0.10 (95% CI −0.48 to 0.32)

Medical specialist consultations^b: −0.4 vs. 0.2, ES=−0.29 (95% CI −0.58 to 0.22)

Physiotherapist consultations^b: 0.4 vs. 2.8, ES=−0.29 (95% CI −0.58 to 0.22)

Other paramedical professional consultations^b: 2.1 vs. 0.2, ES=−0.68 (95% CI −1.00 to −0.18)

van Santen, 2002⁹⁷

6 months

Duration of symptoms: 12 years

Poor

A. Combination exercise (n=58): group sessions (60 minutes) twice a week for 24 weeks (aerobic exercises, stretching, general flexibility and balance exercises, and isometric muscle strengthening); encouraged to attend a third, unsupervised, 60 minute session weekly and to use sauna or swimming pool after all sessions.

B. Usual care (n=29): analgesics, NSAIDs, or tricyclic antidepressants, if appropriate; GPs informed that aerobic exercises and relaxation should not be prescribed or encouraged

A vs. B

Age: 46 vs. 43 years

Female: 100% vs. 100%

Duration of symptoms: 9.7 vs. 15.4 years

Baseline SIP physical score (mean, 0-100): 11.3 vs. 9.8

Baseline SIP total score (mean, 0-100): 14.4 vs. 11.4

Baseline AIMS (mean, 0-10): 1.9 vs. 5.4

Baseline Pain VAS (mean, 0-100): 66.8 vs. 62.4

A vs. B, mean change from baseline

6 months:

SIP physical score: −1.7 (95% CI −3.7 to 0.3) vs. −0.6 (95% CI −2.9 to 1.7), p=NS

SIP total score: −1.9 (95% CI −3.9 to 0.1) vs. −1.4 (95% CI −3.4 to 0.6) p=NS

AIMS: 0.1 (95% CI −0.6 to 0.8) vs. 0.8 (95% CI −1.8 to −0.2), p=NS

Pain VAS: −5.5 (95% CI −10.9 to −0.1) vs. 1.3 (95% CI −4.5 to 7.1), p=NS

A vs. B, mean change from baseline

6 months:

SCL-90-R Global Severity Index (scale unclear): −6.8 (95% CI −20.1 to 6.5) vs. −8.1 (95% CI −19.8 to 3.6), p=NS

SIP psychosocial score (0-100): −3.2 (95% CI −6.2 to 0.2) vs. −3.5 (95% CI −7.0 to 0.0), p=NS

Patient global assessment (1-5): 0.5 (95% CI 0.2 to 0.8) vs. 0.5 (95% CI 0.2 to 0.8), p=NS

Villafaina, 2019⁹⁹

6 months (Immediately postintervention)

Duration of symptoms: NR

Fair

[New trial]

A. Exercise via an exergame specifically designed for patients with FM (n=28): 2, 1-hour sessions per week for 24 weeks; exercises targeting aerobic fitness, strength, mobility, postural control, and coordination of the upper and lower limbs.

B. Usual Care (n=27): continued with their usual daily activities.

A vs. B

Age: 54 vs. 53 years

Female: 100% vs. 100%

Baseline VAS-pain: 62.14 vs. 60.37

A vs. B

6 months

VAS-pain: 58.88 vs. 68.20, effect size 0.076, p=0.04

A vs. B

6 months

VAS-EQ 5D health perception (0-100): 52.30 vs. 45.88, effect size 0.113, p=0.01

EQ-5D-5L utility (0-1): 0.56 vs. 0.52, effect size 0.04, p=0.12

Wigers, 1996⁹⁸

48 months

Duration of symptoms: 10 years

Fair

A. Aerobic exercise (n=20): sessions consisted of training to music (further details not given) and aerobic games; 45 minute group sessions 3 times a week for 14 weeks

B. Treatment as usual (n=20)

A vs. B

Age: 43 vs. 46 years

Female: 90% vs. 95%

Duration of symptoms: 9 vs. 11 years

Baseline pain VAS (0-100): 72 vs. 65

A vs. B

48 months:

Pain VAS: 68 vs. 69, difference −1.0 (95% CI −16.3 to 14.4)

A vs. B

48 months

Depression VAS (0-100): 32 vs. 30, difference 2.0 (95% CI −18.8 to 22.8)

Global subjective improvement: 75% vs. 12%, RR 5.9 (95% CI 1.5 to 22.2)

Abbreviations: AIMS = Arthritis Impact Measurement Scale; BDI = Beck Depression Inventory; CES-D = Center for Epidemiologic Studies Depression Scale Revised; CI = confidence interval; EQ-5D = EuroQoL 5 Dimensions; ES = effect size; FIQ = Fibromyalgia Impact Questionnaire; FM = fibromyalgia; GP = general practitioner; GSI = Global Severity Index; HADS = Hospital Anxiety and Depression Scale; NHP = Nottingham Health Profile; NR = not reported; NS = not statistically significant; NSAID = nonsteroidal anti-inflammatory drug; SCL-90-R = Symptom Checklist-90-Revised; SF-36 = Short-Form 36 Questionnaire; SIP = Sickness Impact Profile; STAI = State-Trait Anxiety Inventory; VAS = visual analog scale
a: Unless otherwise noted, followup time is calculated from the end of the treatment period
b: Total number of consultations over a period of 2 months prior to measurement
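
The report treats an estimate as statistically significant when its confidence interval excludes the null value (0 for mean differences, 1 for risk ratios). The following is a minimal Python sketch of that reading rule, together with a simple plausibility check that a printed point estimate lies inside its own interval. The function names and the row layout are illustrative assumptions, not part of the review's methods; the numbers are copied from the table above as printed, and a flagged row only indicates an entry worth checking against the original publication.

```python
def excludes_null(lower, upper, null_value):
    """Return True when the interval [lower, upper] does not contain the null value."""
    return not (lower <= null_value <= upper)


def estimate_inside_interval(estimate, lower, upper):
    """Return True when the bounds are ordered and the point estimate lies between them."""
    return lower <= upper and lower <= estimate <= upper


# (label, point estimate, lower bound, upper bound, null value), as printed in the table
rows = [
    ("Clarke-Jenssen 2014, A vs. C, pain VAS, 3 months (MD)", -1.2, -2.2, -0.1, 0.0),
    ("Clarke-Jenssen 2014, B vs. C, pain VAS, 3 months (MD)", -0.9, -1.9, 0.2, 0.0),
    ("Wigers 1996, global subjective improvement (RR)", 5.9, 1.5, 22.2, 1.0),
    ("van Santen 2002, AIMS change, usual care arm", 0.8, -1.8, -0.2, 0.0),
]

results = {}
for label, estimate, lower, upper, null_value in rows:
    results[label] = {
        "excludes_null": excludes_null(lower, upper, null_value),
        "consistent": estimate_inside_interval(estimate, lower, upper),
    }

for label, flags in results.items():
    print(f"{label}: excludes null = {flags['excludes_null']}, "
          f"estimate inside interval = {flags['consistent']}")
```

A row whose estimate falls outside its interval, such as the last one here, should be read with caution, since the interval-based significance call is not meaningful until the entry has been verified.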

Table 37. Fibromyalgia: psychological therapies

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Alda, 2011¹¹³

6 months

Years since diagnosis: 12.9 vs. 11.2 vs. 11.7

Fair

A. CBT (n=57): 10-12 week program; 10 weekly 90-minute group sessions of cognitive restructuring and training in cognitive and behavioral coping strategies.

B. Recommended pharmacological treatment (n=56): pregabalin (300-600 mg/day); duloxetine (60-120 mg/day) for patients with major depressive disorder.

C. Usual care (n=56): standard care offered by general practitioners at subjects’ health centers who received a guide for the treatment of FM in primary care.

A vs. B vs. C

Age: 46 vs. 47 vs. 47 years

Females: 95% vs. 93% vs. 96%

Race NR

Baseline FIQ (mean, 0-100): 65.9 vs. 66.4 vs. 64.5

Baseline Pain VAS (mean, 0-100): 64.2 vs. 68.1 vs. 64.7

A vs. B

6 months:

FIQ: 48.8 vs. 52.8; difference −4.0 (95% CI −7.730 to −0.270)

Pain VAS: 40.7 vs. 40.5; difference 0.2 (95% CI −3.996 to 4.396)

A vs. C

6 months:

FIQ: 48.8 vs. 53.3, difference −4.5 (95% CI −7.91 to −1.09)

Pain VAS: 40.70 vs. 44.3, difference −3.6 (95% CI −7.617 to 0.417)

A vs. B

6 months:

HAM-D (0-50): 7.9 vs. 8.2; difference −0.3 (95% CI −1.226 to 0.626)

HAM-A (0-50): 7.3 vs. 7.4; difference −0.1 (95% CI −1.247 to 1.047)

A vs. C

6 months:

HAM-D: 7.9 vs. 8.6, difference −0.7 (95% CI −1.719 to 0.319)

HAM-A: 7.3 vs. 7.6, difference −0.3 (95% CI −1.361 to 0.761)

Ang, 2010¹¹⁴

1.5 months

Duration of fibromyalgia, years: 11.8 vs. 12.3

Poor

A. CBT (n=17): 6 weekly 30-40 minute sessions of telephone-delivered CBT (activity pacing, pleasant activity scheduling, relaxation, automatic thoughts and pain, cognitive restructuring, and stress management)

B. Usual care (n=15): customary care from subject’s treating physician

A vs. B

Age: 51 vs. 47 years

Female: 100% vs. 100%

White: 81% vs. 80%

Baseline FIQ total (mean, 0-100): 62.2 vs. 67.8

Baseline FIQ Physical Impairment (PI) (0-10): 5.6 vs. 5.4

Baseline FIQ Pain (0-10): 7.6 vs. 7.8

A vs. B

1.5 months:

Proportion of patients with clinically meaningful improvement from baseline FIQ total (14%): 33% vs. 15%, RR 2.2 (95% CI 0.5 to 9.3)

Mean change from baseline:

FIQ PI: −0.6 vs. 0.5, adjusted p=0.13

FIQ Pain: −0.6 (1.6) vs. −0.3 (1.7), adjusted p=0.60

A vs. B

1.5 months:

PHQ-8 (0-24): mean change from baseline −0.9 (5.2) vs. 0.0 (4.1), adjusted p=0.80; overall effect size=0.60

Baumueller, 2017¹²⁵

3 months

Duration of symptoms: 12.4 vs. 16.4 years

Fair

[New trial]

A. Electromyogram-Biofeedback (n=18): 14 sessions over 8 weeks

B. Attention control (n=18): 2 encounters with a therapist over 8 weeks

A vs. B

Age: 55 vs. 56 years

Smoking: 6% vs. 22%

Baseline FIQ-total: 42.59 vs. 40.44

A vs. B

3 months

FIQ: 37.87 vs. 38.28, p=0.52

A vs. B

3 months

SF-36 Physical function: 51.64 vs. 50.9, p=0.35

SF-36 Role-physical: 15.62 vs. 20.83, p=0.57

SF-36 Bodily pain: 36.88 vs. 36.17, p=0.81

SF-36 General health: 43.50 vs. 44.44, p=0.44

SF-36 vitality: 28.63 vs. 38.80, p=0.59

SF-36 social functioning: 53.68 vs. 61.11, p=0.65

SF-36 Role-emotional: 35.42 vs. 59.26, p=0.83

SF-36 Mental health: 51.06 vs. 57.50, p=0.75

BDI: 16.91 vs. 12.30, p=0.31

SCL-90-R Global Severity Index: 66.11 vs. 63.22, p=0.27

Buckelew, 1998⁷⁸

3, 12, and 24 months

Duration of symptoms, years: 11.6 vs. 10.0 vs. 11.6

Poor

A. Electromyographic biofeedback and relaxation training (n=29): 1 session for 1.5-3 hours per week for 6 weeks and instructions to train 2 times independently per week; taught cognitive and muscular relaxation strategies; 6-week individual training was followed by 2-year group maintenance phase of 1-hour groups once per month.

B. Attention control (n=30): 1 session for 1.5-3 hours per week for 6 weeks; educational information on diagnosis and treatment of FM and general health topics information; followed by one hour groups once per month for 2 years.

C. Combination Exercise (n=30): 1 session for 1.5 hours per week for 6 weeks and instructions to train 2 times independently per week. Sessions consisted of active range of motion exercises, strengthening exercises, low to moderate intensity aerobic exercise, proper posture and body mechanics instruction, and instructions on the use of heat, cold, and massage. 6-week individual training was followed by 2-year group maintenance phase of 1-hour groups once per month.

A vs. B vs. C

Age: 44 vs. 44 vs. 46 years

Female: 97% vs. 90% vs. 93%

Race NR

Baseline AIMS physical activity subscale (median, 0-10): 6.0 vs. 6.0 vs. 4.0

Baseline pain VAS (median, 0-10): 5.8 vs. 5.9 vs. 6.3

A vs. B

3 months:

AIMS physical activity subscale, median (median change from baseline): 6.0 (0) vs. 6.0 (0), NS

Pain VAS, median (median change from baseline): 5.2 (−0.2) vs. 5.8 (−0.5), NS

24 months:

AIMS physical activity subscale, median (median change from baseline): 6.0 (0) vs. 6.0 (0), NS

Pain VAS, median (median change from baseline): 5.2 (−1.1) vs. 5.4 (−0.6), NS

A vs. C

3 months:

AIMS physical activity subscale, median (median change from baseline): 6.0 (0) vs. 4.0 (0), p≤0.05

Pain VAS, median (median change from baseline): 5.2 (−0.2) vs. 5.4 (−0.8), NS

24 months:

AIMS physical activity subscale, median (median change from baseline): 6.0 (0) vs. 4.0 (0), p≤0.05

Pain VAS, median (median change from baseline): 5.2 (−1.1) vs. 5.5 (−1.2), NS

A vs. B

3 months:

SCL-90-R Global Severity Index, median (median change from baseline): 65.0 (−2) vs. 65.0 (0), NS

CES-D, median (median change from baseline): 10.0 (−2) vs. 13.0 (3), NS

Sleep scale, median (median change from baseline): 7.0 (0) vs. 5.0 (0), NS

24 months:

SCL-90-R Global Severity Index, median (median change from baseline): 64.0 (−1) vs. 67.0 (−1), NS

CES-D, median (median change from baseline): 10.0 (−2) vs. 12.0 (−2), NS

Sleep scale, median (median change from baseline): 6.0 (−2) vs. 6.0 (0), NS

A vs. C

3 months:

SCL-90-R Global Severity Index, median (median change from baseline): 65.0 (−2) vs. 65.5 (−3), NS

CES-D, median (median change from baseline): 10.0 (−2) vs. 13.5 (−2.5), NS

Sleep scale, median (median change from baseline): 7.0 (0) vs. 8.0 (0), NS

24 months:

SCL-90-R Global Severity Index, median (median change from baseline): 64.0 (−1) vs. 65.5 (−2.5), NS

CES-D, median (median change from baseline): 10.0 (−2) vs. 11.5 (−3.5), NS

Sleep scale, median (median change from baseline): 6.0 (−2) vs. 7.5 (0), NS

Castel, 2012¹¹⁵

3 and 6 months

A vs. B

Pain duration, years: 13.6 vs. 11.6

Poor

A. CBT plus usual pharmacological care (n=34): CBT conducted in groups (except for one individual session); 14 weekly 2 hour sessions. CBT included education about FM and pain, autogenic training, cognitive restructuring, CBT for insomnia, assertiveness training, activity pacing, pleasant activity scheduling, goal setting, and relapse prevention.

B. Usual care (n=30): usual pharmacological care, including analgesics, antidepressants, anticonvulsants, and myorelaxants

A vs. B

Age: 50 vs. 49 years

Female: 94% vs. 100%

White: 100% vs. 100%

Baseline FIQ (scale NR): 62.7 vs. 66.1

Baseline pain NRS (0-10): 6.1 vs. 6.9

A vs. B

3 months:

Proportion of patients with MCSD (≥14% improvement from baseline):

FIQ: 55.9% vs. 20%; OR 5.1 (95% CI 1.7 to 15.6); RR 2.8 (95% CI 1.3 to 6.1)

Pain (≥30% improvement from baseline): 14.6% vs. 10%; RR 1.5 (95% CI 0.4 to 5.7)

FIQ: 52.8 vs. 66.3; difference −13.5 (95% CI −15.5 to −11.5)

Pain NRS: 5.9 vs. 6.8; difference −0.9 (95% CI −1.1 to −0.7)

6 months:

Proportion of patients with MCSD:

FIQ: 58.8% vs. 20%; OR 5.7 (95% CI 1.9 to 17.8); RR 2.9 (95% CI 1.4 to 6.3)

Pain: 17.6% vs. 13.3%; RR 1.3 (95% CI 0.4 to 4.2)

FIQ: 50.5 vs. 68.5; difference −18.0 (95% CI −20.095 to −15.905)

Pain NRS: 5.7 vs. 6.8; difference −1.1 (95% CI −1.333 to −0.867)

A vs. B

3 months:

HADS (scale NR): 15.4 (1.3) vs. 22.3 (1.4); difference −6.9 (95% CI −7.685 to −6.115)

MOS Sleep quantity (scale NR): 6.9 (0.2) vs. 5.5 (0.3); difference 1.4 (95% CI 1.254 to 1.546), p <0.0001

MOS Sleep index problems (scale NR): 40.1 (1.6) vs. 28.8 (1.7); difference 11.3 (95% CI 10.340 to 12.260)

6 months:

HADS: 15.7 (1.3) vs. 23.7 (1.4); difference −8.0 (95% CI −8.785 to −7.215)

MOS Sleep quantity: 6.7 (0.2) vs. 5.6 (0.3); difference 1.1 (95% CI 0.954 to 1.25)

MOS Sleep index problems: 39.9 (1.5) vs. 28.0 (1.6); difference 11.9 (95% CI 10.998 to 12.802)

Falcão, 2008¹³⁰

3 months

Disease duration, years: 3.5 vs. 3.7

Fair

A. CBT plus Amitriptyline (n=30): 1 group CBT session per week for 10 weeks, consisting of progressive relaxation training with electromyographic biofeedback, cognitive restructuring, and stress management; also received amitriptyline as in control group

B. Amitriptyline only (control) (n=30): amitriptyline 12.5 mg per day during the first week, then increased to 25 mg/day; those with intolerance or side effects to amitriptyline were given cyclobenzaprine 5 mg/day in the first week and then 10 mg/day. Routine medical visits once a week for 10 weeks

A vs. B

Age: 45 vs. 46 years

Female: 100% vs. 100%

Caucasian: 80% vs. 77%

Baseline FIQ (0-100): 64.9 vs. 69.6

Baseline pain VAS (0-10): 6.9 vs. 7.0

A vs. B

3 months:

FIQ: 38.7 vs. 42.8; difference −4.1 (95% CI −18.765 to 10.565)

Pain VAS: 4.4 vs. 5.1; difference −0.7 (95% CI −2.841 to 1.441)

A vs. B

3 months:

BDI (0-63): 10.6 vs. 15.6; difference −5.0 (95% CI −11.122 to 1.122)

STAI-State scale (20-80): 45.8 (2.5) vs. 46.8 (2.3); difference −1.0 (95% CI −2.351 to 0.351)

SF-36 Physical Capacity (0-100): 59.6 vs. 54.0; difference 5.6 (95% CI −11.905 to 23.105)

SF-36 Pain (0-100): 48.4 vs. 45.5; difference 2.9 (95% CI −10.783 to 16.583)

SF-36 Mental Health (0-100): 69.9 vs. 56.2; difference 13.7 (95% CI 0.070 to 27.330)

Jensen, 2012¹¹⁶

Wicksell, 2013¹¹⁹

3-4 months

Time since FM onset, years: 10.5 vs. 11.8

Fair

A. ACT (n=25): 12 weekly 90-minute group sessions: exposure to personally important situations and activities previously avoided due to pain and distress, training to distance self from pain and distress.

B. Waiting list control (n=18)

A vs. B

Age: 45 vs. 47 years

Female: 100% vs. 100%

Baseline FIQ (0-100): 49.3 vs. 48.7

Baseline PDI (scale NR): 40.0 vs. 39.0

Baseline pain VAS (0-100): 61 vs. 65.0

Baseline pain NRS (0-10): 4.2 vs. 4.3

A vs. B

3-4 months

FIQ: 37.4 vs. 45.7, Cohen’s d=0.66 (95% CI −0.06 to 1.37); difference −8.3 (95% CI −17.056 to 0.456)

PDI: 28.1 vs. 38.1, Cohen’s d=0.73 (95% CI −0.00 to 1.44); difference −10.0 (95% CI −19.740 to −0.260)

Pain VAS: means NR but group X time interaction p=0.26

Pain NRS: 3.9 vs. 4.8, Cohen’s d=0.82 (95% CI 0.08 to 1.54); difference −0.90 (95% CI −1.674 to −0.126)

A vs. B

3-4 months

BDI (0-63): 10.7 vs. 16.4, Cohen’s d=0.64 (95% CI −0.08 to 1.35); difference −5.7 (95% CI −12.044 to 0.644)

STAI-State: 39.8 vs. 45.4; Cohen’s d=0.55 (95% CI −0.17 to 1.26); difference −5.6 (95% CI −12.751 to 1.551)

SF-36 Mental: 46.0 vs. 34.7, Cohen’s d=1.06 (95% CI 0.28 to 1.82); difference 11.3 (95% CI 3.761 to 18.839)

SF-36 Physical (0-100): 28.4 vs. 31.1, Cohen’s d=0.28 (95% CI −0.45 to 1.00); difference −2.7 (95% CI −9.401 to 4.001)

Karlsson, 2015¹²⁶

6 months (end of treatment)

Duration of pain: 10.7 vs. 12 years

Duration of FM diagnosis: 5.3 vs. 5 years

Fair

[New trial]

A. CBT stress management program (n=24): 20, 3 hour group sessions over 6 months

Median attendance rate: 93%

B. Waitlist (n=24)

A vs. B

Age: 48 vs. 49 years

Female: NR

Baseline MPI pain severity: 3.85 vs. 3.38

Baseline MPI pain interference: 4.04 vs. 3.37

Baseline MADRS-S: 17.38 vs. 13.04

A vs. B

6 months

MPI pain severity: 4.20 vs. 3.37

MPI pain interference: 4.07 vs. 3.45

MADRS-S: 13.09 vs. 16.45

A vs. B

6 months

MADRS-S: 13.09 vs. 16.45

Kayiran, 2010¹³¹

4 to 5 months

Duration of symptoms: 5 years

Poor

A. EEG Biofeedback (Neurofeedback) (n=20): 5 sessions based on sensorimotor rhythm training protocol per week for 4 weeks. Each session consisted of 10 sensorimotor rhythm training periods lasting for 3 minutes for a total of 30 minutes

B. Escitalopram (n=20): 10 mg/day for 8 weeks (control group)

A vs. B

Age: 32 vs. 32 years

Female: 100% vs. 100%

Baseline FIQ (mean, 0-100): 70 vs. 74*

Baseline pain VAS (mean, 0-10): 8.9 vs. 9.1

A vs. B

4-5 months:

FIQ: 19 vs. 48*, p=NR

Pain VAS: 2.6 vs. 5.3; difference −2.7 (95% CI −3.7 to −1.7)

A vs. B

4-5 months:

HAM-D (0-50): 6.3 vs. 13.4; difference −7.1 (95% CI −9.1 to −5.1)

BDI (0-63): 4.7 vs. 12.3; difference −7.6 (95% CI −9.7 to −5.5)

HAM-A (0-56): 7.1 vs. 15.2; difference −8.1 (95% CI −11.0 to −5.2)

BAI (0-63): 7.2 vs. 16.7; difference −9.5 (95% CI −13.9 to −5.1)

SF-36*:

Physical functioning (0-100): 77 vs. 65, p<0.05

Bodily pain: 70 vs. 45, p<0.05

Role-physical (0-100): 90 vs. 43, p<0.05

Role-emotional (0-100): 95 vs. 51, p<0.05

Social functioning (0-100): 76 vs. 65, p<0.05

General mental health (0-100): 74 vs. 59, p<0.05

General health (0-100): 72 vs. 28, p<0.05

Vitality (0-100): 70 vs. 50, p<0.05

Lami, 2017¹²¹

3 months

Duration of symptoms: Mean 8.5 to 11.6 years

Poor

[New trial]

A. Cognitive Behavioral Therapy for Pain (n=24): weekly, 90-minute group sessions for 9 weeks

B. Cognitive Behavioral Therapy for Pain and Insomnia (n=22): weekly, 90-minute group sessions for 9 weeks

C. Usual Care (n=26)

A vs. B vs. C

Age: 49 vs. 50 vs. 51 years

Female: 100% vs. 100% vs. 100%

Baseline FIQ (0-100): 65.5 vs. 62.0 vs. 55.6

Baseline pain VAS (0-10): 7.6 vs. 7.4 vs. 7.2; p<0.05

A vs. C

3 months

FIQ: 53.3 vs. 53.2, difference 0.1 (95% CI −8.9 to 9.1)

VAS: 7.2 vs. 7.2, difference 0.0 (95% CI −0.9 to 1.0)

B vs. C

3 months

FIQ: 56.5 vs. 53.2, difference 3.3 (95% CI −5.7 to 12.3)

VAS: 6.6 vs. 7.2, difference −0.6 (95% CI −1.5 to 0.3)

A vs. C

3 months

PSQI Total Sleep Quality (0-21): 13.8 vs. 11.9, difference 1.91 (95% CI −0.6 to 4.5)

MFI (0-5): 4.4 vs. 4.0, difference 0.32 (95% CI −0.1 to 0.7)

SCL-90-R Depression (0-4): 2.1 vs. 1.5, difference 0.6 (95% CI 0.2 to 1.1)

SCL-90-R Anxiety (0-4): 1.6 vs. 1.2, difference 0.42 (95% CI −0.1 to 0.9)

PCS (0-52): 22.8 vs. 24.2, difference −1.4 (95% CI −8.7 to 6.0)

CPAQ (0-120): 53.5 vs. 57.5, difference −4.1 (95% CI −15.8 to 7.6)

B vs. C

3 months

PSQI Total Sleep Quality: 13.6 vs. 11.9, difference 1.7 (95% CI −0.8 to 4.2)

MFI: 4.1 vs. 4.0, difference 0.0 (95% CI −0.4 to 0.4)

SCL-90-R Depression: 2.0 vs. 1.5, p<0.05

SCL-90-R Anxiety: 1.6 vs. 1.18, difference 0.44 (95% CI −0.05 to 0.9)

PCS: 24.1 vs. 24.2, difference −0.2 (95% CI −7.7 to 7.4)

CPAQ: 53.7 vs. 57.5, difference −3.9 (95% CI −15.1 to 7.4)

Larsson, 2015¹³⁵

13 to 18 months

Duration of symptoms: 10 years

Poor

A. Relaxation therapy (n=63): Two group sessions of 5-8 subjects per week for 15 weeks. The intervention was preceded by an individual meeting covering instructions and allowing for adjustments to the intervention. The sessions lasted 25 minutes and consisted of autogenic training guided by a physiotherapist and were followed by stretching.

B. Resistance exercise (Strength) (n=67): Two group sessions of 5-7 subjects per week for 15 weeks. The intervention was preceded by an individual meeting going over instructions on the intervention, testing, and modifications of specific exercises. Sessions were based on a resistance exercise program aiming to improve muscle strength, focusing on large muscle groups in the lower extremity.

A vs. B

Age: 52 vs. 51

Female: 100% vs. 100%

Baseline FIQ (0-100): 61.1 vs. 60.5

Baseline pain VAS (0-100): 52.4 vs. 49.3

Baseline PDI (0-70): 35.0 vs. 35.3

A vs. B

13-18 months

FIQ: 55.4 vs. 57.1, (difference −1.7, 95% CI −9.3 to 5.9)

Pain VAS: 52.1 vs. 49.2, (difference 2.9, 95% CI −5.5 to 11.3)

PDI: 33.7 vs. 33.0, (difference 0.7, 95% CI −4.0 to 5.4)

A vs. B

13-18 months

SF-36 PCS (0-100): 32.0 vs. 32.2, (difference −0.2, 95% CI −3.8 to 3.4)

SF-36 MCS (0-100): 40.0 vs. 39.2, (difference 0.8, 95% CI −4.6 to 6.2)

Patient global impression of change (mean, 1-7): Values NR but difference was NS

Luciano, 2014/2017¹²²,¹²³

6 months

Duration of Disease: Mean 11.4 to 14.1 years

Fair

[New trial]

A. ACT (n=51): Eight 2.5 hour group sessions. Additional daily 15-30 minute homework sessions.

-Received eight sessions: 43.1% (22/51)

-Received seven sessions: 31.4% (16/51)

-Received six sessions: 15.7% (8/51)

-Received three sessions: 2% (1/51)

-Received two sessions: 7.8% (4/51)

B. Pharmacological Treatment (n=52): Treatment with pregabalin (300–600 mg/d) was administered. In addition, patients who fulfilled the criteria for major depression also received duloxetine (60–120 mg/d).

C. Waitlist (n=53)

A vs. B vs. C

Age: 49 vs. 48 vs. 48

Female: 96.1% vs. 98.1% vs. 94.3%

Baseline FIQ (0-100): 68.2 vs. 69.9 vs. 65.9

Baseline pain VAS (0-100): 65.4 vs. 63.0 vs. 64.0

A vs. C

6 months

FIQ: 49.5 vs. 67.5, difference −18.0 (95% CI −21.4 to −14.5)

VAS: 49.6 vs. 64.4, difference −14.8 (95% CI −20.0 to −9.6)

A vs. B

6 months

FIQ: 49.49 (8.77) vs. 65.11 (8.87), difference −15.6 (95% CI −19.1 to −12.2)

VAS: 49.6 vs. 56.3, difference −6.7 (95% CI −11.0 to −2.3)

A vs. C

6 months

CPAQ (0-120): 58.6 vs. 39.5, difference 19.1 (95% CI 13.9 to 24.4)

PCS (0-52): 23.1 vs. 30.3, difference −7.2 (95% CI −10.5 to −3.9)

EQ5D (0-100): 63.3 vs. 51.2, difference 12.2 (95% CI 7.9 to 16.5)

HADS-A (0-21): 8.7 vs. 12.2, difference −3.4 (95% CI −4.7 to −2.1)

HADS-D (0-21): 5.8 vs. 9.3, difference −3.5 (95% CI −4.4 to −2.5)

Total Cost for Treatment (in 2014 Euro): 2,267.3 vs. 4,163.6, difference −1896.3 (95% CI −3018 to −775)

A vs. B

6 months

CPAQ: 58.6 vs. 42.5, difference 16.1 (95% CI 10.8 to 21.5)

PCS: 23.1 vs. 28.0, difference −4.9 (95% CI −7.9 to −1.8)

EQ5D: 63.3 vs. 53.8, difference 9.6 (95% CI 5.2 to 14.0)

HADS-A: 8.7 vs. 9.7, difference −1.0 (95% CI −1.8 to −0.06)

HADS-D: 5.8 vs. 7.5, difference −1.7 (95% CI −2.6 to −0.8)

Total Cost for Treatment (in 2014 Euro): 2,267.3 vs. 2,654.6, difference −387.3 (95% CI −1205 to 430)

Lumley, 2017¹²⁴

6 months

Duration of diagnosis: Mean 13.5 to 13.8 years

Fair

[New trial]

A. CBT (n=75): 8, 90-minute, weekly group sessions.

B. Emotional Awareness and Expression Therapy (n=79): 8, 90-minute, weekly group sessions.

C. Fibromyalgia Education (attention control) (n=76): 8, 90-minute, weekly group sessions.

All patients continued their usual care.

A vs. B vs. C

Age: 48 vs. 49 vs. 50 years

Female: 91% vs. 92% vs. 99%

Baseline BPI (0-10): 5.4 vs. 5.3 vs. 5.5

Baseline WPI (1-12): 9.9 vs. 11.2 vs. 10.7

Baseline SF-12 Physical component score: 35.5 vs. 35.2 vs. 34.9

6 months

A vs. C

BPI: 4.8 vs. 4.9, difference −0.12 (95% CI −0.7 to 0.5)

Proportion of patients reporting at least 50% pain reduction: 8.3% vs. 12%

Proportion of patients reporting much/very much pain improvement: 22.9% vs. 15.4%

WPI: 8.40 vs. 9.14, difference −0.74 (95% CI −2.2 to 0.7)

B vs. C

BPI: 4.4 vs. 4.9, difference −0.54 (95% CI −1.2 to 0.1)

Proportion of patients reporting at least 50% pain reduction: 22.5% vs. 12%, p=0.07

Proportion of patients reporting much/very much pain improvement: 34.8% vs. 15.4%, p=0.015

6 months

A vs. C

FSS (0-31): 15.0 vs. 16.0, difference −1.1 (95% CI −3.1 to 0.9)

SF-12 Physical (0-100): 39.1 vs. 36.9, difference 2.2 (95% CI −0.9 to 5.3)

SWLS (0-35): 19.6 vs. 18.6, difference 1.1 (95% CI −1.4 to 3.6)

PSQI (0-21): 10.1 vs. 10.7, difference −0.61 (95% CI −2.0 to 0.8)

PANAS-positive score (10-50): 30.1 vs. 27.6, difference 2.5 (95% CI −0.2 to 5.3)

PANAS-negative score (10-50): 18.6 vs. 19.4, difference −0.8 (95% CI −3.2 to 1.7)

MASQ (0-190): 92.6 vs. 96.9, difference −4.3 (95% CI −10.5 to 1.9)

CES-D (0-60): 17.3 vs. 18.5, difference −1.1 (95% CI −5.0 to 2.7)

GAD-7 (0-21): 5.8 vs. 7.1, difference −1.3 (95% CI −2.9 to 0.3)

PROMIS-SF-F (0-100): 58.4 vs. 59.0, difference −0.62 (95% CI −2.4 to 1.2)

Healthcare utilization: 3.4 vs. 4.8, difference −1.4 (95% CI −3.1 to 0.3)

B vs. C

WPI: 7.2 vs. 9.1, difference −1.9 (95% CI −3.4 to −0.4)

FSS: 13.2 vs. 16.0, difference −2.9 (95% CI −4.9 to −0.8)

SF-12 Physical: 39.4 vs. 36.9, difference 2.5 (95% CI −0.6 to 5.5)

SWLS: 18.9 vs. 18.6, difference 0.3 (95% CI −2.3 to 2.9)

PANAS-negative score: 20.0 vs. 19.4, difference 0.62 (95% CI −1.7 to 2.9)

PANAS-positive score: 28.5 vs. 27.6, difference 0.97 (95% CI −1.8 to 3.7)

MASQ: 94.5 vs. 96.9, difference −2.37 (95% CI −8.8 to 4.0)

CES-D: 19.3 vs. 18.5, difference 0.79 (95% CI −2.9 to 4.5)

GAD-7: 7.2 vs. 7.1, difference 0.12 (95% CI −1.4843 to 1.7243)

PROMIS-SF-F: 58.2 vs. 59.0, difference −0.84 (95% CI −2.9 to 1.2)

Healthcare utilization: 4.1 vs. 4.8, difference −0.70 (95% CI −2.6 to 1.2)

McCrae, 2019¹²⁷

6 months

Duration of FM diagnosis: 9.5 vs. 7.9 vs. 9.1 years

Fair

[New trial]

A. CBT for Insomnia (n=39): 8, 50-minute sessions over 8 weeks

B. CBT for Pain (n=37): 8, 50-minute sessions over 8 weeks

C. Waitlist (n=37)

A vs. B vs. C

Age: 54 vs. 52 vs. 52 years

% Female: 100% vs. 92% vs. 100%, p=0.04

Baseline McGill Pain: 25.85 vs. 29.95 vs. 28.53

Baseline morning pain: 53.49 vs. 54.04 vs. 54.72

Baseline evening pain: 47.26 vs. 54.26 vs. 54.18

Baseline pain disability index: 34.14 vs. 37.27 vs. 37.59

A vs. C

6 months

McGill Pain: 23.62 vs. 23.30

Morning VAS: 43.29 vs. 50.60

Evening VAS: 41.99 vs. 49.26

Pain Disability Index: 27.76 vs. 34.87

B vs. C

6 months

McGill Pain: 28.99 vs. 23.30

Morning VAS: 47.78 vs. 50.60

Evening VAS: 49.77 vs. 49.26

Pain Disability Index: 36.37 vs. 34.87

A vs. C

6 months

Sleep Quality Rating (1-5): 3.27 vs. 2.65

BDI (0-63): 8.22 vs. 15.01

State-Trait Anxiety Inventory (20-80): 38.07 vs. 43.87

B vs. C

6 months

Sleep Quality Rating (1-5): 3.14 vs. 2.65

BDI (0-63): 14.38 vs. 15.01

State-Trait Anxiety Inventory (20-80): 43.86 vs. 43.87

Redondo, 2004¹³⁶

6 and 12 months

Pain duration NR

Poor

A. CBT (n=21): 1, 2.5 hour session per week for 8 weeks. Sessions included information about chronic pain and FM, relaxation techniques, and pain coping strategies training.

B. Combination Exercise (n=19): 5, 45-minute sessions per week for 8 weeks. Each week included 1 session of aquatic exercises, 2 sessions of flexibility and endurance exercises, and 2 sessions of cardiovascular exercises.

All subjects: Offered ibuprofen or diclofenac, 25 mg of amitriptyline a day, and acetaminophen.

A vs. B

Age NR

Female: 100% vs. 100%

Baseline FIQ total (mean, 0-80): 52.0 vs. 52.0

Baseline FIQ pain (mean, 0-10): 7.3 vs. 6.8

A vs. B

6 months:

FIQ total: 47.4 vs. 48.0, (difference −0.6, 95% CI −12.6 to 11.4)

FIQ pain: 5.9 vs. 6.9, (difference −1.0, 95% CI −2.8 to 0.8)

12 months:

FIQ: 47.8 vs. 47.7; (difference 0.1, 95% CI −10.5 to 10.7)

FIQ pain: 6.3 vs. 6.6; (difference −0.3, 95% CI −2.0 to 1.3)

A vs. B

6 months:

FIQ depression (0-10): 5.2 vs. 5.3, (difference −0.1, 95% CI −2.6 to 2.4)

FIQ anxiety (0-10): 6.0 vs. 5.8, (difference 0.2, 95% CI −2.2 to 2.6)

BAI: 25.2 vs. 22.1, (difference 3.1, 95% CI −5.1 to 11.3)

BDI (0-63): 17.1 vs. 15.0, (difference 2.1, 95% CI −6.6 to 10.8)

SF-36 physical functioning (0-100): 52.2 vs. 43.9, (difference 8.3, 95% CI −6.4 to 23.0)

SF-36 physical role (0-100): 22.4 vs. 18.3, (difference 4.1, 95% CI −21.2 to 29.4)

SF-36 bodily pain (0-100): 31.4 vs. 32.9, (difference −1.5, 95% CI −16.1 to 13.1)

SF-36 social functioning (0-100): 66.4 vs. 66.9, (difference −0.5, 95% CI −21.6 to 20.6)

SF-36 emotional role (0-100): 68.4 vs. 66.0, (difference 2.4, 95% CI −28.2 to 33.0)

SF-36 mental health (0-100): 48.9 vs. 51.8, (difference −2.9, 95% CI −19.3 to 13.5)

12 months:

FIQ depression: 5.4 vs. 4.9; (difference 0.5, 95% CI −2.0 to 3.0)

FIQ anxiety: 6.0 vs. 5.8; (difference 0.2, 95% CI −2.1 to 2.5)

BAI: 20.0 vs. 20.0; (difference 0.0, 95% CI −7.4 to 7.4)

BDI: 13.0 vs. 13.6; (difference −0.6, 95% CI −7.9 to 6.7)

SF-36 physical functioning: 38.9 vs. 41.6; (difference −2.7, 95% CI −19.5 to 14.1)

SF-36 physical role: 26.1 vs. 31.0; (difference −4.9, 95% CI −27.9 to 18.1)

SF-36 bodily pain: 33.8 vs. 34.3; (difference −0.5, 95% CI −20.9 to 19.9)

SF-36 social functioning: 60.7 vs. 57.2; (difference 3.5, 95% CI −17.2 to 24.2)

SF-36 emotional role: 66.7 vs. 58.7; (difference 8.0, 95% CI −19.2 to 35.2)

SF-36 mental health: 56.5 vs. 53.8; (difference 2.7, 95% CI −19.1 to 24.5)

Thieme, 2006¹¹⁷

6 and 12 months

Duration of symptoms, years: 9.1 vs. 8.7

Poor

A. CBT (n=42): 2-hour group sessions weekly for 15 weeks. Sessions focused on changing patients’ thinking and problem-solving, stress and pain coping strategies, and relaxation exercises performed during and between sessions.

B. Attention control (n=40): 2-hour group sessions weekly for 15 weeks: general discussions about medical and psychosocial problems of fibromyalgia.

A vs. B

Age: 49 vs. 47 years

Female: 100% vs. 100%

Baseline FIQ physical impairment (mean, 0-10): 4.4 vs. 4.2

Baseline WHYMPI pain intensity (mean, 0-6): 4.2 vs. 3.8

A vs. B

6 months

FIQ physical impairment: 3.0 vs. 4.8; difference −1.8 (95% CI −2.899 to −0.701)

WHYMPI pain intensity: 3.7 vs. 4.1; difference −0.4 (95% CI −0.841 to 0.041)

12 months

FIQ physical impairment: 3.4 vs. 5.2; difference −1.8 (95% CI −2.855 to −0.745)

WHYMPI pain intensity: 3.2 vs. 4.1; difference −0.9 (95% CI −1.537 to −0.263)

A vs. B

6 months

WHYMPI affective distress: 2.6 vs. 4.0; difference −1.4 (95% CI −1.952 to −0.848)

12 months

WHYMPI affective distress: 2.6 vs. 4.2; difference −1.6 (95% CI −2.172 to −1.028)

Van Santen, 2002⁹⁷

Post 6-month intervention

Duration of symptoms, years: 10.1 vs. 15.4 vs. 15.4

Poor

A. Electromyographic biofeedback (n=56): Progressive muscle relaxation and frontalis EMG biofeedback; 30-minute individual sessions 2 times per week for 8 weeks; subjects encouraged to practice at home twice daily for the 8 weeks then for 16 more weeks. Subjects randomized to education aimed at compliance with biofeedback training (six 90-minute sessions over 24 weeks).

B. Usual care (n=29): General physicians informed not to prescribe or encourage aerobic exercises and relaxation. Intervention duration: 6 months

C. Combination Exercise (n=58): 60-minute group sessions twice a week for 24 weeks; aerobic exercises, postural strengthening, general flexibility and balance exercises, and isometric muscle strengthening; subjects encouraged to attend a third, unsupervised, 60-minute session and to use sauna or swimming pool after sessions.

A vs. B vs. C

Age: 44 vs. 43 vs. 46 years

Female: 100% vs. 100% vs. 100%

Race NR

Baseline SIP Physical score (0-100): 11.4 vs. 9.8 vs. 11.3

Baseline SIP Total score (0-100): 14.0 vs. 11.4 vs. 14.4

Baseline AIMS (0-10): 3.1 vs. 5.4 vs. 1.9

Baseline pain VAS (0-100): 59.1 vs. 62.4 vs. 66.8

A vs. B

6 months:

SIP physical score, mean change: −1.6 (95% CI −3.4 to 0.2) vs. −0.6 (95% CI −2.9 to 1.7)

SIP total score, mean change: −2.3 (95% CI −4.3 to −0.3) vs. −1.4 (95% CI −3.4 to 0.6)

AIMS, mean change: 0.4 (95% CI −0.1 to 0.9) vs. 0.8 (95% CI −1.8 to −0.2)

Pain VAS, mean change: −0.6 (95% CI −6.5 to 5.3) vs. 1.3 (95% CI −4.5 to 7.1)

A vs. C

6 months:

SIP physical score, mean change: −1.6 (95% CI −3.4 to 0.2) vs. −1.7 (95% CI −3.7 to 0.3), NS

SIP total score, mean change: −2.3 (95% CI −4.3 to −0.3) vs. −1.9 (95% CI −3.9 to 0.1)

AIMS, mean change: 0.4 (95% CI −0.1 to 0.9) vs. 0.1 (95% CI −0.6 to 0.8)

Pain VAS, mean change: −0.6 (95% CI −6.5 to 5.3) vs. −5.5 (95% CI −10.9 to −0.1), NS

A vs. B

6 months:

SIP psychosocial score (0-100), mean change: −3.7 (95% CI −4.9 to −2.5) vs. −3.5 (95% CI −7.0 to 0.0)

Patient global assessment of well-being, mean change: 0.3 (95% CI 0.0 to 0.6) vs. 0.5 (95% CI 0.2 to 0.8)

A vs. C

6 months:

SIP psychosocial score, mean change: −3.7 (95% CI −4.9 to −2.5) vs. −3.2 (95% CI −6.2 to 0.2)

Patient global assessment of well-being, mean change: 0.3 (95% CI 0.0 to 0.6) vs. 0.5 (95% CI 0.2 to 0.8)

Verkaik, 2014¹¹⁸

1.5 months

Duration of symptoms, NR

Poor

A. Guided imagery (n=33): Two 1.5 hour group sessions of 6-12 subjects. The first session consisted of group discussion, the theoretical background of guided imagery, and instructions to practice at least one exercise daily for 4 weeks. Each exercise was provided on a CD and contained relaxation techniques, music, positive imagery, and pain management techniques. The second group session took place after the 4 weeks and consisted of a group discussion.

B. Attention control (n=37): Two 1.5 hour group sessions of 6-12 subjects held 4 weeks apart. Group sessions were a group discussion and did not contain any information or training on guided imagery.

A vs. B

Age: 47 vs. 48

Female: 100% vs. 97%

Baseline FIQ (0-100): 53.7 vs. 56.4

Baseline pain VAS (0-10): 5.9 vs. 5.8

A vs. B

1.5 months

FIQ: 54.2 vs. 53.0, difference 1.2 (95% CI −0.2 to 2.6)

Pain VAS: NR

Wigers, 1996⁹⁸

48 months

Fibromyalgia duration

A vs. B vs. C

Mean: 11 vs. 9 years

Poor

A. Stress management (n=20): 90-minute group sessions of 10 patients done 2 times a week for 6 weeks followed by 1 session per week for the next 8 weeks. Sessions consisted of equal portions of presentations on stress mechanisms and strategies for improving quality of life, group discussions on patients’ experiences of stress and coping with pain, and relaxation training aimed at helping cope with stress and pain.

B. Usual care (n=20): Subjects continued treatments they had been using at baseline.

C. Aerobic exercise (n=20): 45 minute group sessions of 10 patients done 3 times a week for 14 weeks. The exercise program involved the whole body and aimed to minimize eccentric muscle strain. Sessions consisted of training to music (further details not given) and aerobic games.

A vs. B vs. C

Age: 44 vs. 46 vs. 43 years

Female: 90% vs. 95% vs. 90%

Baseline pain VAS (0-100): 72 vs. 65 vs. 72

A vs. B

48 months

Pain VAS: 70 vs. 69, (difference 1, 95% CI −12.6 to 14.6)

A vs. C

48 months

Pain VAS: 70 vs. 68, (difference 2, 95% CI −11.6 to 15.6)

A vs. B

48 months

Depression VAS (0-100): 40 vs. 30, (difference 10, 95% CI −8.9 to 28.9)

Global subjective improvement: 47% (6/13) vs. 12% (2/16), (RR 3.7, 95% CI 0.9 to 15.3)

A vs. C

48 months

Depression VAS: 40 vs. 32, (difference 8, 95% CI −11.9 to 27.9)

Global subjective improvement: 47% (6/13) vs. 75% (11/15), (RR 0.6, 95% CI 0.3 to 1.2)

Williams, 2002¹²⁰

12 months

Fibromyalgia duration, 8.6 years

Poor

A. Group CBT plus Usual Care (n=76): Six 1-hour group sessions over a 4-week period: progressive muscle relaxation, imagery, activity pacing, pleasant activity scheduling, communication skills and assertiveness training, cognitive restructuring, stress management and problem-solving.

B. Usual Care (n=69): Standard pharmacological management (typically low-dose tricyclic antidepressant medication, analgesics, and/or antidepressants) plus suggestions to engage in aerobic fitness.

A + B

Age, mean, years: 47.7

Females: 90%

Race: White non-Hispanic 88%, black non-Hispanic 9%, Hispanic 2%, Asian American 1%

Baseline MPQ-Sensory (scale NR): 14.8

Baseline MPQ-Affective pain score (scale NR): 4.6

A vs. B

12 months

Mean (SD): NR

Proportion of subjects who improved more than 12 points from baseline on MPQ-Sensory scale: 3.9% vs. 7.2%; RR 0.54 (95% CI 0.14 to 2.2)

A vs. B

12 months

Mean (SD) NR

Proportion of subjects who improved more than 6.5 points from baseline on SF-36 PCS Score: 25% vs. 11.6%, OR 2.9; RR 2.2 (95% CI 0.98 to 4.99)

Proportion of subjects who improved more than 5 points from baseline on MPQ-Affective scale: 9.2% vs. 8.7%, RR 1.1 (95% CI 0.37 to 3.0)

: ACT = acceptance and commitment therapy; AIMS = Arthritis Impact Measurement Scales; BAI = Beck Anxiety Inventory; BDI = Beck Depression Inventory; BPI = Brief Pain Inventory; CBT = cognitive-behavioral therapy; CES-D = Center for Epidemiologic Studies Depression Scale; CI = confidence interval; CPAQ = Chronic Pain Acceptance Questionnaire; EEG = electroencephalogram; EMG = electromyography; FIQ = Fibromyalgia Impact Questionnaire; FM = fibromyalgia; FSS = Fatigue Severity Scale; GAD-7 = Generalized Anxiety Disorder 7-item scale; HAM-D = Hamilton Rating Scale for Depression; HAM-A = Hamilton Anxiety Rating Scale; HADS-A = Hospital Anxiety and Depression Scale, Anxiety; HADS-D = Hospital Anxiety and Depression Scale, Depression; MADRS = Montgomery-Åsberg Depression Rating Scale; MASQ = Mood and Anxiety Symptom Questionnaire; MCS = Mental Component Summary Score; MCSD = Minimal Clinically Significant Difference; MFI = Modified Fatigue Impact Scale; mg = milligrams; MOS = Medical Outcomes Study; MPI = West Haven-Yale Multidimensional Pain Inventory; MPQ = McGill Pain Questionnaire; NR = not reported; NRS = numerical rating scale; NS = not statistically significant; OR = odds ratio; PANAS = Positive and Negative Affect Schedule; PCS = Physical Component Summary Score; PDI = Pain Disability Index; PHQ = Patient Health Questionnaire; PI = Physical Impairment; PSQI = Pittsburgh Sleep Quality Index; RR = risk ratio; SCL-90-R = Symptoms Checklist 90-Revised; SD = standard deviation; SIP = Sickness Impact Profile; SF-12 = Short-Form 12 questionnaire; SF-36 = Short-Form 36 questionnaire; SWLS = Satisfaction With Life Scale; STAI = State-Trait Anxiety Inventory; VAS = visual analog scale; WHYMPI = West Haven-Yale Multidimensional Pain Inventory; WPI = Widespread Pain Index
a: Unless otherwise noted, followup time is calculated from the end of the treatment period
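
Many entries in the table above pair a between-group difference or risk ratio with a 95% confidence interval. The following is a minimal sketch of how such intervals can be approximated from summary data, not the review's actual computation: it assumes a large-sample normal approximation (the report's own software may have used a t distribution or another method), and the function names are hypothetical. The inputs are taken from the table: the Luciano A vs. B FIQ means and SDs at 6 months (using the randomized group sizes, which may differ from the analyzed sizes), and the Wigers A vs. B global improvement counts at 48 months.

```python
import math
from statistics import NormalDist

# Two-sided 95% critical value under a normal approximation (about 1.96).
Z95 = NormalDist().inv_cdf(0.975)


def mean_difference_ci(mean_a, sd_a, n_a, mean_b, sd_b, n_b, z=Z95):
    """Difference in means (A minus B) with an approximate 95% CI."""
    diff = mean_a - mean_b
    se = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return diff, diff - z * se, diff + z * se


def risk_ratio_ci(events_a, n_a, events_b, n_b, z=Z95):
    """Risk ratio (A over B) with an approximate 95% CI on the log scale."""
    rr = (events_a / n_a) / (events_b / n_b)
    se_log = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


def excludes_null(lower, upper, null_value):
    """True if the interval does not contain the null (0 for MD, 1 for RR)."""
    return not (lower <= null_value <= upper)


# Luciano, A vs. B, FIQ at 6 months: 49.49 (8.77), n=51 vs. 65.11 (8.87), n=52
md = mean_difference_ci(49.49, 8.77, 51, 65.11, 8.87, 52)

# Wigers, A vs. B, global subjective improvement at 48 months: 6/13 vs. 2/16
rr = risk_ratio_ci(6, 13, 2, 16)

print("MD %.1f (95%% CI %.1f to %.1f)" % md)
print("RR %.1f (95%% CI %.1f to %.1f)" % rr)
```

Under these assumptions the risk ratio interval agrees with the tabulated 0.9 to 15.3, and the mean difference interval lands within about 0.1 of the tabulated −19.1 to −12.2; the small gap is what a t-based interval would be expected to close. The `excludes_null` helper mirrors the rule stated in the Introduction: an interval that excludes 0 (mean difference) or 1 (risk ratio) is statistically significant.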

Table 38. Fibromyalgia: physical modalities

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Alfano, 2001¹⁶⁷

6 months

Duration of pain: >3 months (mean NR)

Fair

A. Magnetic mattress pad designed to expose body to a uniform magnetic field of negative polarity (n=37)

B. Magnetic mattress pad exposing body to magnetic field that varied spatially and in polarity (n=33)

C. Sham magnetic field (n=32): combined group of 2 sham magnetic mattress pads; identical in appearance to real magnetic pads but contained demagnetized magnets.

D. Usual care (n=17): maintain current treatment under PCP, refrain from new treatments

Treatment period was 6 months for all groups.

A vs. B vs. C vs. D

Age: 44 vs. 47 vs. 46 vs. 45 years

Female: 92% vs. 87% vs. 96% vs. 100%

Baseline FIQ (0-80): 51.6 vs. 55.5 vs. 51.5 vs. 53.9

Baseline pain intensity FIQ NRS (0-10): 7.1 vs. 7.0 vs. 6.7 vs. 7.0

A + B vs. C

Post 6-month intervention

FIQ: 42.9 vs. 47.9, difference −5.0 (95% CI −14.1 to 4.1)

Pain intensity NRS: 5.6 vs. 6.2, difference −0.6 (95% CI −1.9 to 0.7)

A + B vs. D

Post 6-month intervention

FIQ: 42.9 vs. 48.4, difference −5.5 (95% CI −14.4 to 3.4)

Pain intensity NRS: 5.6 vs. 6.6, difference −1.0 (95% CI −2.2 to 0.2)

A vs. C

Post 6-month intervention

FIQ: 38.3 vs. 47.9, difference −9.6 (95% CI −20.0 to 0.8)

Pain intensity NRS: 4.8 vs. 6.2, difference −1.4 (95% CI −2.8 to 0.05)

B vs. C

Post 6-month intervention

FIQ: 47.4 vs. 47.9, difference −0.5 (95% CI −11.2 to 10.2)

Pain intensity NRS: 6.3 vs. 6.2, difference 0.1 (95% CI −1.4 to 1.6)

A vs. D

Post 6-month intervention

FIQ: 38.3 vs. 48.4, difference −10.1 (95% CI −21.9 to 1.7)

Pain intensity NRS: 4.8 vs. 6.6, difference −1.8 (95% CI −3.4 to −0.2)

B vs. D

Post 6-month intervention

FIQ: 47.4 vs. 48.4, difference −1.0 (95% CI −13.0 to 11.0)

Pain intensity NRS: 6.3 vs. 6.6, difference −0.3 (95% CI −2.0 to 1.4)

Paolucci, 2016¹⁶⁸

1 month

Duration of pain:

Poor

A. Extremely low-frequency magnetic field first (n=16): three 30-minute sessions per week for 4 weeks (12 sessions total). Patients lay on a bed with a multi-low-frequency mattress that delivered a magnetic field at an intensity of 100 uT and a multifrequency of 1 to 80 Hz.

B. Sham extremely low-frequency magnetic field first (n=17): three 30-minute sessions per week for 4 weeks (12 sessions total). Patients lay on a bed with a multi-low-frequency mattress but no magnetic field was delivered.

Washout period: 1 month

A vs. B

Age, years: 50 vs. 51

Female: 100% vs. 100%

Fibromyalgia duration, years: 7 vs. 5

Baseline FIQ: 58.7 (11.3) vs. 57.2 (12.3)

Baseline FIQ pain: NR

Baseline pain VAS: 4.9 (1.4) vs. 4.8 (1.2)

Baseline FAS (0−10): 6.1 (1.7) vs. 6.4 (1.4)

A vs. B, mean

1 month

FIQ: 19.2 vs. 57.9, p<0.001

Percent change from baseline in FIQ: −67.3% vs. 2.9%, p<0.001

FIQ pain: values NR, p<0.001

Pain VAS: 2.2 vs. 5.3, p<0.001

Percent change from baseline in pain VAS: −54.1% vs. 6.3%, p<0.001

FAS: 3.2 vs. 6.1, p<0.001

Percent change from baseline in FAS: −46.5% vs. −4.5%, p<0.001

B vs. A (after cross-over)

1 month

FIQ: 25.1 vs. 53.9, p<0.001

Percent change from baseline in FIQ: −56.0% vs. −8.1%, p<0.001

Pain VAS: 3.1 vs. 4.6, p=0.02

Percent change from baseline in pain VAS: −39.7% vs. −9.1%, p=0.006

FAS: 3.5 vs. 6.2, p=0.002

Percent change from baseline in FAS: −46.9% vs. −1.2%, p<0.001

A vs. B

1 month

HAQ (0-3): 0.3 vs. 1.1, p=0.03

Percent change from baseline in HAQ: NR

B vs. A (after cross-over)

1 month

HAQ: 0.7 vs. 0.8, p=0.41

Percent change from baseline in HAQ: NR

: CI = confidence interval; FAS = Fibromyalgia Assessment Status; FIQ = Fibromyalgia Impact Questionnaire; HAQ = Health Assessment Questionnaire; Hz = Hertz; NR = not reported; NRS = numeric rating scale; PCP = primary care physician; uT = microtesla; VAS = Visual Analog Scale
a: Unless otherwise noted, followup time is calculated from the end of the treatment period
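
The Paolucci rows above report percent change from baseline alongside group means. As a rough illustration only, the sketch below computes percent change from two group means; the function name is hypothetical. Applied to the active-treatment FIQ means (58.7 at baseline, 19.2 at 1 month) it gives about −67.3%, matching the tabulated value. It does not reproduce every tabulated percentage (for the sham group it gives about 1.2% rather than 2.9%), possibly because the trial averaged per-patient percent changes rather than taking the change in group means; that explanation is an assumption, not something stated in the table.

```python
def percent_change_from_means(baseline_mean, followup_mean):
    """Percent change in a group mean relative to its baseline mean."""
    return (followup_mean - baseline_mean) / baseline_mean * 100.0


# Paolucci, FIQ, first treatment period (group means from the table)
active_fiq = percent_change_from_means(58.7, 19.2)
sham_fiq = percent_change_from_means(57.2, 57.9)

print(f"Active: {active_fiq:.1f}%")
print(f"Sham: {sham_fiq:.1f}%")
```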

Table 39. Fibromyalgia: manual therapies

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Castro-Sanchez, 2011a¹⁸⁵

6 and 12 months

Duration of pain, NR

Fair

A. Myofascial Release (n=47): myofascial release (across 10 pain regions) administered by a physiotherapist; 60-minute sessions twice weekly for 20 weeks

B. Sham short-wave and ultrasound electrotherapy (n=47): both applied to the cervical, dorsal and lumbar regions using disconnected equipment; 30-minute sessions (10 minutes each region), twice weekly for 20 weeks

A vs. B

Age: 55 vs. 54 years

Female: NR

Race: NR

Mean duration of pain: NR

FIQ total (0-100): 65.0 vs. 63.9

Pain (FIQ, 0-10): 9.2 vs. 8.9

Pain (VAS, 0-10): 9.1 vs. 8.9

MPQ sensory dimension (0-33): 19.3 vs. 19.9

MPQ affective dimension (0-12): 5.6 vs. 4.9

MPQ evaluative (sensory + affective) dimension (0-45): 24.9 vs. 25.3

A vs. B

6 months

FIQ Total: 58.6 vs. 64.1, p=0.048

FIQ pain: 8.5 vs. 8.0, p=0.042

VAS pain: 8.25 vs. 8.94, p=0.043

MPQ sensory: 17.3 vs. 20.7, p=0.042

MPQ affective: 4.5 vs. 5.2, p=0.042

MPQ evaluative: 21.9 vs. 26.2, p=0.022

12 months

FIQ Total: 62.8 vs. 65.0, p=0.329

FIQ pain: 8.8 vs. 8.7, p=0.519

VAS pain: 8.74 vs. 8.92, p=0.306

MPQ sensory: 18.2 vs. 21.2, p=0.038

MPQ affective: 4.8 vs. 5.1, p=0.232

MPQ evaluative: 23.2 vs. 26.7, p=0.036

p-values are from authors’ ANOVA^b

A vs. B

6 months

Clinical Global Impression Severity Scale (Likert, 1-7): 5.3 vs. 6.0, p=0.048

Clinical Global Impression Improvement Scale (Likert, 1-7): 5.6 vs. 6.3, p=0.046

12 months

Clinical Global Impression Severity Scale: 5.5 vs. 6.2, p=0.147

Clinical Global Impression Improvement Scale: 5.8 vs. 6.5, p=0.049

p-values are from authors’ ANOVA^b

Castro-Sanchez, 2011b¹⁸⁶

1 and 6 months

Duration of pain, NR

Poor

A. Massage-Myofascial Release (n=32): Massage-Myofascial release therapy (across 18 pain regions) administered by a physiotherapist; weekly 90-minute session for 20 weeks.

B. Sham magnotherapy (n=32): weekly 30-minute session of disconnected magnotherapy (applied on cervical and lumbar area for 15 minutes each) for 20 weeks.

A vs. B

Age: 49 vs. 46 years

Female: 94% vs. 96%

Race: NR

Mean duration of pain: NR

Pain Intensity (VAS, 0-10)^c: 9.1 vs. 9.6

A vs. B

1 month

VAS pain^c: 8.4 vs. 9.4, p<0.043

6 months

VAS pain^c: 8.8 vs. 9.7, p=NS

p-values are from authors’ ANOVA^b

A vs. B

1 month

STAI state anxiety (20-80)^c: 21.5 vs. 22, p=NS

STAI trait anxiety (20-80)^c: 25.1 vs. 26.3, p=NS

BDI (0-63)^c: 2.1 vs. 2.5, p=NS

SF-36 physical function (0-100): 46.8 vs. 49.6, p=0.049

SF-36 physical role (0-100): 24.6 vs. 29.0, p=0.047

SF-36 bodily pain (0-100): 75.1 vs. 89.9, p=0.046

SF-36 general health (0-100): 66.8 vs. 68.4, p=0.093

SF-36 vitality (0-100): 61.6 vs. 59.2, p=0.055

SF-36 social function (0-100): 60.6 vs. 63.6, p=0.081

SF-36 emotional role (0-100): 50.5 vs. 47.0, p=0.057

SF-36 mental health (0-100): 75.0 vs. 78.3, p=0.082

PSQI, sleep duration, p=0.041^d:

patients with severe problems, 60% vs. 83%; moderate problems, 37% vs. 10%; and no problems, 3% vs. 7%

6 months

BDI^c: 2.3 vs. 2.5, p=NS

STAI state anxiety^c: 22.0 vs. 23.0, p=NS

STAI trait anxiety^c: 25.8 vs. 26.2, p=NS

SF-36 physical function: 48.2 vs. 51.2, p=0.281

SF-36 physical role: 25.5 vs. 27.5, p=0.213

SF-36 body pain: 75.6 vs. 77.8, p=0.293

SF-36 general health: 67.5 vs. 68.1, p=0.401

SF-36 vitality: 62.2 vs. 58.9, p=0.312

SF-36 social function: 61.3 vs. 63.9, p=0.088

SF-36 emotional role: 49.1 vs. 46.9, p=0.219

SF-36 mental health: 76.5 vs. 80.0, p=0.126

PSQI, sleep duration, p=0.047^d:

patients with severe problems, 57% vs. 93%; moderate problems, 37% vs. 0%; and no problems, 7% vs. 7%

p-values are from authors’ ANOVA^b

: ANOVA = repeated-measures analysis of variance; BDI = Beck Depression Inventory; FIQ = Fibromyalgia Impact Questionnaire; MPQ = McGill Pain Questionnaire; NR = not reported; NS = not statistically significant; PSQI = Pittsburgh sleep quality index; SF-36 = Short-Form 36 health questionnaire; STAI = State-Trait Anxiety Inventory; VAS = visual analog scale
a: Unless otherwise noted, followup time is calculated from the end of the treatment period
b: Changes in scores were analyzed by using a 2 (groups: experimental and placebo) X 4 (time points: baseline, immediately postintervention, at 1 and 6 months) repeated-measures analysis of variance
c: Values estimated from figures in the article.
d: For all other dimensions of the PSQI (subjective sleep quality, sleep latency, habitual sleep efficiency, sleep disturbance, daily dysfunction), there were no statistically significant differences between groups in the proportion of patients experiencing severe, moderate or no problems in the authors’ analysis of variance (ANOVA).

Table 40. Fibromyalgia: mindfulness practices

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Cash, 2015,²⁰⁰ Sephton, 2007²⁰²,^b

2 months

Duration of pain NR

Poor

A. Mindfulness-based Stress Reduction (n=51): 8-week group-based program with one 2.5 hour session/week including instruction in techniques, meditation, and simple yoga positions to encourage relaxation. Participants were asked to complete daily practices with workbook and audiotapes for 45 min a day for 6 days a week.

B. Waitlist (n=39)

A vs. B

Age: 48 vs. 48 years

Female: 100% vs. 100%

Caucasian: 94% vs. 93%

Baseline FIQ Physical Functioning (0-10): 1.3 vs. 1.2

Baseline pain VAS (0-100): 68.1 vs. 69.2

Baseline FIQ Severity (0-100)^c: 67.5 vs. 62.5

A vs. B

2 months:

FIQ Physical Functioning: 1.2 vs. 1.2; difference 0.0 (95% CI −0.32 to 0.32)

Pain VAS: 65.2 vs. 65.1; difference 0.1 (95% CI −9.96 to 10.16)

FIQ Severity^c: 62.0 vs. 66.7; difference −4.7 (95% CI −12.24 to 2.84)

A vs. B

2 months

BDI Total^b: 13.3 vs. 14.8; difference −1.5 (95% CI −4.76 to 1.76)

BDI Cognitive Subscale^b: 5.3 vs. 6.4; difference −1.1 (95% CI −2.98 to 0.78)

BDI Somatic Subscale^b: 7.4 vs. 7.7; difference −0.3 (95% CI −1.73 to 1.13)

PSS: 20.2 vs. 20.8; difference −0.60 (95% CI −3.37 to 2.17)

SDQ: 8.4 vs. 9.5; difference −1.10 (95% CI −2.58 to 0.38)

FSI: 5.5 vs. 6.0; difference −0.50 (95% CI −1.28 to 0.28)

Schmidt, 2011²⁰¹

2 months

Duration of fibromyalgia, years: 14 years

Fair

A. Mindfulness-based Stress Reduction (n=53): 8-week group-based program; 1, 2.5 hour session/week and one 7 hour all-day session covering training in specific exercises and topics of mindfulness practices. Participants were asked to complete daily practices of 45-60 minutes each

B. Active-control Intervention (n=56): Controlled for nonspecific aspects of the MBSR program with similar meeting structure and format to MBSR treatment arm. Equivalent levels of social support and weekly topical education were provided along with Jacobson Progressive Muscle Relaxation training and fibromyalgia-specific gentle stretching exercises. Participants were asked to complete daily homework assignments with the same duration as MBSR group.

C. Waitlist (n=59)

A vs. B vs. C

Age: 53 vs. 52 years

Female: 100% (all female study)

Race: NR

A vs. C

Baseline FIQ Total (0-10): 5.8 vs. 5.7

Baseline PPS Affective (scale unclear): 35.5 vs. 34.8

Baseline PPS Sensory (scale unclear): 22.4 vs. 22.6

A vs. B

2 months

Proportion of patients with >14% improvement in FIQ scores (MCID): 30% vs. 25%; RR 1.21 (95% CI 0.79 to 1.82)

FIQ: 5.23 vs. 5.33; difference −0.10 (95% CI −0.84 to 0.64)

PPS Affective: 30.79 vs. 32.17; difference −1.38 (95% CI −4.79 to 2.03)

PPS Sensory: 21.16 vs. 21.87; difference −0.71 (95% CI −2.77 to 1.34)

A vs. C

2 months

Proportion of patients with >14% improvement in FIQ scores (MCID): 30% vs. 22%; RR 1.37 (95% CI 0.83 to 1.94)

FIQ: 5.23 vs. 5.29; difference −0.06 (95% CI −0.75 to 0.63)

PPS Affective: 30.79 vs. 32.38; difference −1.59 (95% CI −5.01 to 1.83)

PPS Sensory: 21.16 vs. 21.44; difference −0.28 (95% CI −2.30 to 1.74)

A vs. B

2 months

Proportion of Patients who saw Clinically Relevant Improvement (score of <23) in CES-D scores: 28% vs. 23%; RR 0.53 (95% CI 0.54 to 1.12)

CES-D: 21.70 vs. 22.55; difference −0.85 (95% CI −4.66 to 2.96)

STAI Trait Subscale: 47.86 vs. 48.44; difference −0.58 (95% CI −4.42 to 3.26)

Proportion of Patients with PSQI score <5 (indicates good sleep): 17% vs. 7%; RR 2.38 (95% CI 0.85 to 2.34)

PSQI: 10.01 vs. 10.25; difference −0.24 (95% CI −1.71 to 1.23)

FMI: 37.66 vs. 35.14; difference 2.52 (95% CI 0.04 to 5.00)

GCQ: 42.63 vs. 43.91; difference −1.28 (95% CI −6.51 to 3.95)

PLC: 12.83 vs. 12.16; difference 0.67 (95% CI −0.60 to 1.94)

A vs. C

2 months

Proportion of Patients who saw Clinically Relevant Improvement (score of <23) in CES-D scores: 28% vs. 19%; RR 1.52 (95% CI 0.85 to 2.04)

CES-D: 21.7 vs. 24.0; difference −2.3 (95% CI −5.96 to 1.36)

STAI Trait Subscale: 47.9 vs. 49.2; difference −1.32 (95% CI −5.02 to 2.38)

Proportion of Patients with PSQI score <5 (indicates good sleep): 17% vs. 10%; RR 1.67 (95% CI 0.80 to 2.14)

PSQI: 10.0 vs. 10.4; difference −0.36 (95% CI −1.8 to 1.1)

FMI: 37.7 vs. 36.1; difference 1.5 (95% CI −0.9 to 3.91)

GCQ: 42.6 vs. 45.3; difference −2.7 (95% CI −7.8 to 2.5)

PLC: 12.8 vs. 12.3; difference 0.5 (95% CI −0.7 to 1.7)

Van Gordon, 2017²⁰³

6 months

Duration of pain: NR

Fair

[New trial]

A. Meditation Awareness Training (MAT) (n=74): MAT is a second-generation mindfulness-based intervention (SG-MBI); 1, 2-hour session per week for 8 weeks in addition to receiving a CD of guided meditations to facilitate daily self-practice

B. “Cognitive Behavior Therapy for Groups” (CBTG) (attention control) (n=74): designed to be educational only and an attention control condition.

A vs. B

Age (mean): 46 vs. 47 years

Female: 82% vs. 84%

Baseline FIQ-R (0-100): 55.2 vs. 54.0

A vs. B

6 months

FIQ-R: 45.7 vs. 52.4, adjusted difference −7.9 (95% CI −8.2 to −4.3), p<0.001

A vs. B

6 months

PSQI (0-21): 11.4 vs. 13.6, adjusted difference −2.3 (95% CI −2.9 to −1.6), p<0.001

SF-MPQ (0-45): 23.8 vs. 26.4, adjusted difference −3.0 (95% CI −4.1 to −1.9), p<0.001

DASS (0-100): 20.7 vs. 25.2, adjusted difference −4.9 (95% CI −6.3 to −3.4), p<0.001

NAS (0-42): 22.8 (5.4) vs. 19.1, adjusted difference 3.6 (95% CI 2.5 to 4.6), p<0.001

BDI = Beck Depression Inventory; CES-D = Center for Epidemiological Studies Depression Scale; CI = confidence interval; DASS = Depression Anxiety Stress Scale; FIQ = Fibromyalgia Impact Questionnaire; FMI = Freiburg Mindfulness Inventory; FSI = Fatigue Symptom Inventory; GCQ = Giessen Complaint Questionnaire; MCID = minimal clinically important difference; PLC = Profile for the Chronically Ill; PPS = Pain Perception Scale; PSQI = Pittsburgh Sleep Quality Index; PSS = Perceived Stress Scale; RR = risk ratio; SF-MPQ = Short-Form McGill Pain Questionnaire; SDQ = Stanford Sleep Disorders Questionnaire; STAI = State-Trait-Anxiety-Inventory; VAS = visual analog scale
a: Unless otherwise noted, followup time is calculated from the end of the treatment period
b: Sephton is the same population as Cash 2015 but the focus of the study was on depression (Beck Depression Inventory).
c: FIQ symptom severity is comprised of visual analog ratings of pain, fatigue, morning sleepiness, stiffness, anxiety, and depression

Table 41. Fibromyalgia: mind-body therapies

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Lynch, 2012²¹⁷

(N=100)

4 months

Duration of fibromyalgia, mean: 9.6 years

Fair

A. Qigong (n=53): Chaoyi Fanhuan Qigong; 3 consecutive half-day training sessions then weekly practice sessions for 8 weeks plus daily at-home practice for 45 to 60 minutes.

B. Waitlist (n=47): continued with usual care; offered qigong after the trial ended

A vs. B

Age: 53 vs. 52 years

Female: 94% vs. 98%

Previous opioid therapy: 42% vs. 30%

Current opioid therapy: 36% vs. 23%

Current NSAID therapy: 49% vs. 57%

FIQ (0-100): 65.5 vs. 61.8

NRS pain (0-10): 6.5 vs. 6.6

SF-36 PCS (0-100): 30.0 vs. 32.6

SF-36 MCS (0-100): 38.1 vs. 40.4

PSQI (0-21): 13.8 vs. 13.1

A vs. B

4 months

Mean change from baseline:

FIQ: −16.1 vs. −4.8; difference −11.3 (95% CI −19.3 to −3.3)

NRS pain: −1.21 vs. −0.27; difference −0.9 (95% CI −1.7 to −0.1)

A vs. B

4 months

Mean change from baseline:

SF-36 PCS: 4.6 vs. 0.2; difference 4.4 (95% CI 1.5 to 7.3)

SF-36 MCS: 4.4 vs. 0.7; difference 3.7 (95% CI −0.3 to 7.7)

PSQI: −3.3 vs. −1.1; difference −2.2 (95% CI −3.6 to −0.8)

Wang, 2010²¹⁸

(N=66)

3 months

Duration of fibromyalgia pain: 11 years

Fair

A. Tai chi (n=33): Classic Yang style tai chi; at-home practice for at least 20 minutes a day; encouraged to maintain tai chi practice using an instructional video.

B. Attention control (n=33): 40 minutes of education then 20 minutes of supervised stretching (upper body, trunk, and lower body); plus 20 minutes of daily at-home stretching

Both groups had 60-minute sessions twice a week for 12 weeks and continued regular medications and routine activities.

A vs. B

Age: 50 vs. 51 years

Female: 85% vs. 88%

Analgesic use: 88% vs. 73%

FIQ (0-100): 62.9 vs. 68.0

VAS pain (0-10): 5.8 vs. 6.3

CES-D (0-60): 22.6 vs. 27.8

SF-36 PCS (0-100): 28.5 vs. 28.0

SF-36 MCS (0-100): 42.6 vs. 37.8

PSQI (0-21): 13.9 vs. 13.5

A vs. B

3 months

Proportion with clinically meaningful improvement:

FIQ^b: 81.8% vs. 51.5%; RR 1.6 (95% CI 1.1 to 2.3)

VAS pain^c: 54.5% vs. 27.3%; RR 2.0 (95% CI 1.1 to 3.8)

Mean change from baseline:

FIQ: −28.6 vs. −10.2; difference −18.3 (95% CI −27.1 to −9.6)

VAS pain: −2.4 vs. −0.7; difference −1.7 (95% CI −2.7 to −0.8)

A vs. B

3 months

Proportion with clinically meaningful improvement:

CES-D^d: 69.7% vs. 39.4%; RR 1.8 (95% CI 1.1 to 2.9)

SF-36 PCS^e: 51.5% vs. 15.2%; RR 3.4 (95% CI 1.4 to 8.1)

SF-36 MCS^f: 48.5% vs. 24.2%; RR 2.0 (95% CI 1.0 to 4.0)

PSQI^g: 45.5% vs. 18.2%; RR 2.5 (95% CI 1.1 to 5.6)

Mean change from baseline:

CES-D: −6.5 vs. −2.4; difference −4.1 (95% CI −8.2 to 0.1)

SF-36 PCS: 8.4 vs. 1.5; difference 7.0 (95% CI 2.9 to 11.0)

SF-36 MCS: 8.5 vs. 1.2; difference 7.3 (95% CI 1.9 to 12.8)

PSQI: −4.2 vs. −1.2; difference −3.0 (95% CI −5.2 to −0.9)

Wang, 2018²²³

All groups were assessed at 12, 24, and 52 weeks from the start of treatment

Duration of pain: Mean 11.1 to 13.8 years

Fair

[New trial]

A. Yang style tai chi (n=39): one 60-minute session/week for 12 weeks. Mean adherence rate (SD): 66.7% (28.7%)

B. Yang style tai chi (n=37): two 60-minute sessions/week for 12 weeks. Mean adherence rate (SD): 65.1% (26%)

C. Yang style tai chi (n=39): one 60-minute session/week for 24 weeks. Mean adherence rate (SD): 57.2% (27.9%)

D. Yang style tai chi (n=36): two 60-minute sessions/week for 24 weeks. Mean adherence rate (SD): 57.8% (33.3%)

E. Aerobic exercise (n=75): two 60-minute sessions/week for 24 weeks.

All groups received educational information about the importance of physical activity and home practice; encouraged to integrate at least 30 minutes of tai chi or aerobic exercise into their daily routine; asked to continue exercise after completing their 12 week or 24 week sessions, as well as throughout 52 weeks of followup.

A vs. B vs. C vs. D vs. E

Age: 53 vs. 52 vs. 51 vs. 52 vs. 51 years

Female: 85% vs. 81% vs. 97% vs. 100% vs. 96%

Baseline FIQ-R (0-100): 52.4 vs. 53.8 vs. 56.5 vs. 60.4 vs. 57.3

All results reported as mean change from baseline (95% CI)

C vs. E

6 months

FIQ-R: −16.7 (−23.4 to −10.1) vs. −9.2 (−14.3 to −4.1)

12 months

FIQ-R: −13.6 (−20.4 to −6.8) vs. −11.7 (−16.7 to −6.6)

D vs. E

6 months

FIQ-R: −25.4 (−32.3 to −18.4) vs. −9.2 (−14.3 to −4.1); difference 16.2 (8.7 to 23.6), p<0.001

12 months

FIQ-R: −22.7 (−30.0 to −15.4) vs. −11.7 (−16.7 to −6.6); difference 11.1 (2.7 to 19.6), p=0.01

Any tai chi vs. E

3-6 months

FIQ-R: difference 5.5 (0.6 to 10.4), p=0.03

6-12 months

FIQ-R: difference 2.7 (95% CI −2.3 to 7.7)

All results reported as mean change from baseline (95% CI)

C vs. E

6 months

SS (0-12): −1.8 (−2.6 to −1.0) vs. −0.8 (−1.4 to −0.2)

PGAS (0-10): −1.6 (−2.4 to −0.8) vs. −0.4 (−1.0 to 0.2)

HAQ (0-100): −3.9 (−8.6 to 0.9) vs. −4.1 (−7.8 to −0.5)

BDI (0-63): −7.5 (−10.8 to −4.1) vs. −5.2 (−7.7 to −2.7)

HADS-D (0-21): −1.4 (−2.6 to 0.3) vs. −0.6 (−1.5 to 0.4)

HADS-A (0-21): −1.4 (−2.5 to −0.2) vs. 0.0 (−0.9 to 0.9)

SF-36 MCS (0-100): 5.3 (1.9 to 8.7) vs. 0.9 (−1.8 to 3.6)

SF-36 PCS (0-100): 5.0 (2.5 to 7.6) vs. 4.0 (2.0 to 6.0)

PSQI (0-21): −1.9 (−3.2 to −0.6) vs. −1.1 (−2.1 to −0.1)

12 months

SS: −1.4 (−2.3 to −0.6) vs. −1.1 (−1.8 to −0.4)

PGAS: −1.4 (−2.2 to −0.5) vs. −0.3 (−0.9 to 0.3)

HAQ: −3.5 (−8.8 to 1.8) vs. −3.9 (−7.8 to 0.0)

BDI: −5.5 (−9.4 to −1.6) vs. −6.4 (−9.3 to −3.5)

HADS-D: −0.9 (−2.2 to 0.5) vs. −0.6 (−1.6 to 0.4)

HADS-A: −1.3 (−2.7 to 0.0) vs. −0.4 (−1.4 to 0.6)

SF-36 MCS: 3.8 (−0.5 to 8.0) vs. 3.0 (−0.1 to 6.0)

SF-36 PCS: 6.9 (3.9 to 9.9) vs. 2.6 (0.4 to 4.7)

PSQI: −1.1 (−2.6 to 0.4) vs. −1.2 (−2.3 to −0.1)

D vs. E

6 months

SS: −1.7 (−2.5 to −0.8) vs. −0.8 (−1.4 to −0.2); difference 0.9 (−0.1 to 1.9), p=0.09

PGAS: −2.0 (−2.8 to −1.2) vs. −0.4 (−1.0 to 0.2); difference 1.6 (0.7 to 2.5), p=0.0006

HAQ: −6.7 (−12.0 to −1.3) vs. −4.1 (−7.8 to −0.5); difference 2.4 (−4.3 to 9.0), p=0.48

BDI: −9.5 (−13.0 to −6.0) vs. −5.2 (−7.7 to −2.7); difference 4.3 (0.0 to 8.5), p=0.049

HADS-D: −2.7 (−4.1 to 1.4) vs. −0.6 (−1.5 to 0.4); difference 2.1 (0.5 to 3.7), p=0.01

HADS-A: −2.1 (−3.4 to −0.8) vs. 0.0 (−0.9 to 0.9); difference 2.1 (0.6 to 3.6), p=0.008

SF-36 MCS: 7.4 (3.6 to 11.2) vs. 0.9 (−1.8 to 3.6); difference 6.2 (1.9 to 10.6), p=0.006

SF-36 PCS: 5.9 (3.1 to 8.8) vs. 4.0 (2.0 to 6.0); difference 2.0 (−1.3 to 5.3), p=0.24

PSQI: −2.1 (−3.5 to −0.7) vs. −1.1 (−2.1 to −0.1); difference 1.0 (−0.6 to 2.5), p=0.22

12 months

SS: −1.8 (−2.8 to −0.9) vs. −1.1 (−1.8 to −0.4); difference 0.7 (−0.3 to 1.8), p=0.18

PGAS: −1.7 (−2.7 to −0.8) vs. −0.3 (−0.9 to 0.3); difference 1.5 (0.4 to 2.5), p=0.008

HAQ: −5.0 (−10.8 to 0.7) vs. −3.9 (−7.8 to 0.0); difference 1.8 (−5.9 to 9.4), p=0.65

BDI: −11.1 (−15.2 to −6.9) vs. −6.4 (−9.3 to −3.5); difference 4.6 (−0.5 to 9.7), p=0.08

HADS-D: −2.2 (−3.7 to 0.8) vs. −0.6 (−1.6 to 0.4); difference 1.6 (0.0 to 3.2), p=0.05

HADS-A: −2.1 (−3.6 to −0.7) vs. −0.4 (−1.4 to 0.6); difference 1.6 (0.1 to 3.1), p=0.04

SF-36 MCS: 5.4 (0.8 to 9.9) vs. 3.0 (−0.1 to 6.0); difference 2.2 (−2.7 to 7.1), p=0.38

SF-36 PCS: 5.4 (2.2 to 8.6) vs. 2.6 (0.4 to 4.7); difference 3.0 (−0.7 to 6.8), p=0.11

PSQI: −2.0 (−3.6 to −0.4) vs. −1.2 (−2.3 to −0.1); difference 0.9 (−0.7 to 2.5), p=0.26

Any tai chi vs. E:

Change in narcotics use:

24 weeks: OR 0.89 (0.28, 2.80)

52 weeks: OR 1.08 (0.33, 3.51)

BDI = Beck Depression Inventory; CES-D = Center for Epidemiologic Studies Depression index; CI = confidence interval; FIQ = Fibromyalgia Impact Questionnaire; HADS = Hospital Anxiety and Depression Score; MCS = Mental Component Summary; NRS = numeric rating scale; NSAIDs = nonsteroidal anti-inflammatory drugs; PCS = Physical Component Summary; PSQI = Pittsburgh Sleep Quality Index; RR = risk ratio; SF-36 = Short-Form-36 Questionnaire; SS = Symptom Severity; VAS = visual analog scale
a: Unless otherwise noted, followup time is calculated from the end of the treatment period
b: A reduction of ≥8.1 points from baseline on the FIQ was considered a clinically meaningful improvement
c: A reduction of ≥2 points from baseline on the VAS was considered a clinically meaningful improvement
d: A reduction of ≥6 points from baseline on the CES-D was considered a clinically meaningful improvement
e: An increase of ≥6.5 points from baseline on the SF-36 PCS was considered a clinically meaningful improvement
f: An increase of ≥7.9 points from baseline on the SF-36 MCS was considered a clinically meaningful improvement
g: A reduction of >5 points from baseline on the PSQI was considered a clinically meaningful improvement
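
The responder proportions and risk ratios reported for Wang, 2010 can be illustrated with a minimal Python sketch. It assumes the common log-scale (Wald) interval for a risk ratio; the source does not state which method was used. The responder counts (27 of 33 and 17 of 33) are assumptions back-calculated from the reported percentages and group sizes, and the function name `risk_ratio_ci` is hypothetical.

```python
import math


def risk_ratio_ci(events_a, n_a, events_b, n_b, z=1.96):
    """Risk ratio (A vs. B) with a Wald confidence interval on the log scale."""
    risk_a = events_a / n_a
    risk_b = events_b / n_b
    rr = risk_a / risk_b
    # Standard error of ln(RR) for two independent proportions
    se_log_rr = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Assumed counts: 27/33 = 81.8% (tai chi) and 17/33 = 51.5% (attention control)
# meeting the FIQ threshold in footnote b (reduction of >=8.1 points).
rr, lower, upper = risk_ratio_ci(27, 33, 17, 33)
print(f"RR {rr:.1f} (95% CI {lower:.1f} to {upper:.1f})")
```

Under these assumed counts the sketch gives RR 1.6 (95% CI 1.1 to 2.3), which agrees with the FIQ responder row in the table. Because the lower bound is above 1, the difference would be read as statistically significant.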

Table 42. Fibromyalgia: acupuncture

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Assefi, 2005²⁴⁶

3 and 6 months

Mean duration of pain: 9 to 12 years

Good

A. Acupuncture (n=25): in accordance with Traditional Chinese Medicine

B. Sham Acupuncture (n=24): Needling for Unrelated Condition

C. Sham Acupuncture (n=24): Sham Needling

D. Sham Acupuncture (n=23): Simulated Acupuncture

Treatment protocol: 24 sessions (2/week for 12 weeks)

A vs. B vs. C vs. D

Mean age: 46 vs. 46 vs. 49 vs. 48 years

Female: 88% vs. 96% vs. 100% vs. 96%

Race (white): 96% vs. 88% vs. 96% vs. 92%

Mean duration of pain: 12 vs. 9 vs. 9 vs. 10 years

Baseline pain Intensity VAS (0-10): 7.0 vs. 6.9 vs. 6.8 vs. 7.3

A vs. B vs. C vs. D

3 months

Pain Intensity VAS^b: 6.0 vs. 5.4 vs. 5.4 vs. 4.5

6 months

Pain Intensity VAS^b: 5.7 vs. 6.0 vs. 5.2 vs. 5.2

A vs. B+C+D

Across all timepoints^c

Pain intensity VAS: adjusted difference 0.5 (95% CI −0.3 to 1.2)

A vs. B vs. C vs. D

3 months

SF-36 PCS (0-100)^b: 31 vs. 39 vs. 31.5 vs. 40

SF-36 MCS (0-100)^b: 46 vs. 46.5 vs. 48.5 vs. 47

Sleep Quality VAS (0-10)^b: 4.3 vs. 4.1 vs. 5.2 vs. 5.5

Overall Well-Being VAS (0-10)^b: 4.9 vs. 4.9 vs. 5.0 vs. 6.3

6 months

SF-36 PCS^b: 31 vs. 36 vs. 31. vs. 39

SF-36 MCS^b: 43 vs. 45 vs. 50 vs. 46.5

Sleep Quality VAS^b: 4.3 vs. 3.4 vs. 5.4 vs. 5.5

Overall Well-Being VAS^b: 4.6 vs. 4.6 vs. 5.7 vs. 5.7

A vs. B+C+D

Across all time-points^c

SF-36 PCS: adjusted difference −0.4 (95% CI −2.3 to 1.5)

SF-36 MCS: adjusted difference −1.5 (95% CI −4.0 to 1.0)

Sleep Quality VAS: adjusted difference −0.5 (95% CI −1.3 to 0.2)

Overall Well-Being VAS: adjusted difference −0.3 (95% CI −1.0 to 0.3)

Karatay, 2018²⁴⁹

1 and 3 months

Duration of pain: Mean 3.9 to 5.0 years

Fair

[New trial]

A. Acupuncture (n=24): 18 acupoints using 0.25x25 mm stainless steel needles; 2, 30 minute sessions per week for 4 weeks (8 total)

B. Sham acupuncture (n=25): 2, 30 minute sessions per week for 4 weeks (8 total)

C. Simulated acupuncture (n=23): 2, 30 minute sessions per week for 4 weeks (8 total)

A vs. B vs. C

Age: 35 vs. 34 vs. 35 years

Duration of disease: 4.4 vs. 3.9 vs. 5 years

Baseline FIQ (0-100): 70.8 vs. 65.9 vs. 57.4

Baseline pain VAS (0-10): 8.1 vs. 7.7 vs. 8.7

Baseline NHP pain (0-100): 82.6 vs. 65.2 vs. 67.9

A vs. B

3 months

FIQ (0-100): 43.6 vs. 58.4, difference −14.8 (95% CI −26.5 to −3.0)

VAS (0-10): 4.5 vs. 7.0, difference −2.5 (95% CI −4.1 to −1.0)

NHP pain (0-100): 18.6 vs. 57.9, difference −39.3 (95% CI −59.4 to −19.1)

A vs. C

3 months

FIQ: 43.6 vs. 55.6, difference −11.94 (95% CI −23.1 to −0.8)

VAS: 4.5 vs. 8.2, difference −3.7 (95% CI −5.1 to −2.4)

NHP pain: 18.6 vs. 72.3, difference −53.6 (95% CI −72.3 to −34.9)

A vs. B

3 months

NHP physical mobility: 15.4 vs. 33.1, difference −17.7 (95% CI −31.4 to −4.0)

NHP energy: 29.3 vs. 69.7, difference −40.4 (95% CI −65.8 to −15.0)

NHP sleep: 9.7 vs. 47.9, difference −38.2 (95% CI −55.9 to −20.6)

NHP social isolation: 8.1 vs. 29.0, difference −20.9 (95% CI −38.2 to −3.6)

NHP emotional reactions: 20.6 vs. 56.4, difference −35.9 (95% CI −56.8 to −14.9)

BDI: 10.1 vs. 31.4, difference −21.2 (95% CI −29.5 to −13.0)

A vs. C

3 months

NHP physical mobility: 15.4 vs. 52.8, difference −37.4 (95% CI −53.1 to −21.7)

NHP energy: 29.3 vs. 71.4, difference −42.1 (95% CI −66.9 to −17.4)

NHP sleep: 9.7 vs. 63.3, difference −53.6 (95% CI −71.6 to −35.7)

NHP social isolation: 8.1 vs. 48.8, difference −40.7 (95% CI −57.9 to −23.5)

NHP emotional reactions: 20.6 vs. 59.32, difference −38.74 (95% CI −59.4 to −18.1)

BDI: 10.1 vs. 35.4, difference −25.2 (95% CI −32.4 to −18.1)

Martin, 2006²⁴⁷

1 and 7 months

Duration of pain: NR

Good

A. Acupuncture (n=25): 6 treatments over 2 to 3 weeks

B. Sham Acupuncture (n=25): sham needling; 6 treatments over 2 to 3 weeks

A vs. B

Age: 48 vs. 52 years

Female: 100% vs. 96%

Race: 96% vs. 100% white

Baseline FIQ total (0-80): 42.4 vs. 44.0

Baseline FIQ Physical Function (0-10): 4.1 vs. 3.6

Baseline MPI Interference (scale NR): 42.6 vs. 36.9

Baseline MPI General Activity Level (scale NR): 55.7 vs. 56.6

Baseline MPI Pain Severity (scale NR): 40.4 vs. 43.0

Baseline FIQ Pain (0-10): 6.2 vs. 6.5

A vs. B

1 month

FIQ Total: 34.8 vs. 42.2, difference −4.9 (95% CI −8.7 to −1.2)

FIQ Physical Function: 3.7 vs. 3.3, difference –0.4 (95% CI –1.1 to 0.3)

MPI Interference: 38.3 vs. 34.9, difference 0.1 (95% CI –3.4 to 3.6)

MPI General Activity Level: 55.4 vs. 58.3, difference –1.2 (95% CI –3.8 to 1.4)

MPI Pain Severity: 34.2 vs. 41.6, difference –4.6 (95% CI –8.7 to –0.5)

FIQ Pain: 4.7 vs. 5.9, difference –0.8 (95% CI –1.8 to 0.2)

7 months

FIQ Total: 38.1 vs. 42.7, difference –4.3 (95% CI –7.7 to –0.9)

FIQ Physical Function: 3.5 vs. 3.3, difference –0.3 (95% CI –0.9 to 0.3)

MPI Interference: 37.7 vs. 35.5, difference 0.1 (95% CI –3.2 to 3.4)

MPI General Activity Level: 58.1 vs. 59.5, difference –0.6 (95% CI –3.1 to 1.8)

MPI Pain Severity: 37.3 vs. 41.4, difference –3.8 (95% CI –7.5 to –0.2)

FIQ Pain: 5.5 vs. 6.4, difference –0.7 (95% CI –1.5 to 0.3)

A vs. B

1 month

FIQ Anxiety (0-10): 2.6 vs. 5.1, difference –1.1 (95% CI –2.0 to –0.2)

FIQ Depression (0-10): 2.0 vs. 3.7, difference –0.7 (95% CI –1.6 to 0.3)

FIQ Sleep (0-10): 5.9 vs. 6.8, difference –0.7 (95% CI –1.8 to 0.5)

FIQ Well-Being (0-10): 4.6 vs. 3.1, difference 0.8 (95% CI –0.4 to 2.0)

7 months

FIQ Anxiety: 3.3 vs. 4.8, difference –1.1 (95% CI –1.9 to –0.2)

FIQ Depression: 2.2 vs. 3.6, difference –0.7 (95% CI –1.6 to 0.2)

FIQ Sleep: 6.1 vs. 6.3, difference –0.3 (95% CI –1.3 to 0.6)

FIQ Well-Being: 3.8 vs. 3.6, difference 0.4 (95% CI –0.6 to 1.4)

Mist, 2018²⁵⁰

1 month

Duration of symptoms: NR

Fair

[New trial]

A. Group acupuncture (n=16): 20, 45-minute long treatments over 10 weeks

B. Education attention control (n=14)

A vs. B

Age: 52 vs. 56 years

BMI: 33 vs. 33 kg/m^2

Baseline VAS-pain (from FIQR): 6.2 vs. 6.3

A vs. B

1 month

VAS: 4.0 vs. 6.2, p<0.001

Vas, 2016²⁴⁸

3.75 and 9.75 months

Duration of pain: NR

Good

A. Acupuncture (n=82): 1, 20 minute session per week for 9 weeks

B. Sham Acupuncture (n=82): simulated acupuncture; 1, 20 minute session per week for 9 weeks

All patients received pharmacological treatment as prescribed by GP.

A vs. B

Age: 52.3 vs. 53.2 years

Female: 100% vs. 100%

Baseline FIQ (0-100): 71.7 vs. 70.1

Baseline Pain Intensity VAS (0-100): 79.3 vs. 75.8

A vs. B

3.75 months

FIQ % mean relative change: −25.0 vs. −11.2, Cohen’s d=0.58

Pain Intensity VAS % mean relative change: −23.6 vs. −16.6, Cohen’s d=0.28

9.75 months

FIQ % mean relative change: −22.2 vs. −4.9, Cohen’s d=0.80

Pain intensity VAS % mean relative change: −19.9 vs. −6.2, Cohen’s d=0.62

A vs. B

3.75 months

HDRS % mean relative change: NR

SF-12 MCS % mean relative change: 30.6 vs. 13.9, Cohen’s d=0.38

SF-12 PCS % mean relative change: 37.0 vs. 15.5, Cohen’s d=0.56

9.75 months

HDRS % mean relative change: −19.1 vs. −5.9, Cohen’s d=0.22

SF-12 PCS % mean relative change: 37.2 vs. 11.4, Cohen’s d=0.58

SF-12 MCS % mean relative change: 23.0 vs. 9.4, Cohen’s d=0.36

BDI = Beck Depression Inventory; CI = confidence interval; FIQ = Fibromyalgia Impact Questionnaire; GP = general practitioner; HDRS = Hamilton Depression Rating Scale; MCS = Mental Component Score; MPI = Multidimensional Pain Inventory; NHP = Nottingham Health Profile; NR = not reported; PCS = Physical Component Score; SF-12 = Short-Form-12 questionnaire; SF-36 = Short-Form 36 questionnaire; VAS = visual analog scale
a: Unless otherwise noted, followup time is calculated from the end of the treatment period
b: Outcome values were estimated from graphs.
c: Authors combined the three sham control groups and calculated the adjusted least-square mean difference between the acupuncture group and the combined control groups. Treatment-by-time interaction was not included in the models; therefore, the data reflect results across all time-points.

Table 43. Fibromyalgia: multidisciplinary rehabilitation

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Amris, 2014²⁶²

5.5 months

Duration of pain: median 10 to 11 years

Fair

A. Multidisciplinary treatment (n=84): 3-5 hours of education, sleep hygiene, group discussions, and physical therapy per day over 2 weeks

B. Wait list (n=86)

A vs. B

Age: 44 vs. 44 years

Female: 100% vs. 100%

Baseline Fibromyalgia Impact Questionnaire Total (FIQ, 0-100): 64.0 vs. 65.7

Baseline FIQ pain VAS (0-10): 7.1 vs. 7.4

A vs. B

5.5 months

Change in FIQ total from baseline: −1.3 vs. −1.4, difference 0.1 (95% CI −3.6 to 3.8)

Change in FIQ pain VAS from baseline: 0.1 vs. −0.1, difference 0.2 (95% CI −0.3 to 0.7)

A vs. B

5.5 months

Change in Generalized Anxiety Disorder-10 from baseline (scale NR): −0.8 vs. −0.5, difference −0.2 (95% CI −2.0 to 1.5)

Change in Major Depression Inventory from baseline (0-50): −1.7 vs. −0.5, difference −1.3 (95% CI −3.3 to 0.8)

Change in SF-36 physical component score from baseline (0-100): 1.4 vs. 0.8, difference 0.6 (95% CI −1.0 to 2.1)

Percent responders in SF-36 physical component score: 27% vs. 23%

Change in SF-36 mental component score from baseline (0-100): 2.3 vs. 1.2, difference 1.1 (95% CI −1.5 to 3.8)

Percent responders in SF-36 mental component score: 27% vs. 27%

Change in SF-36 physical functioning from baseline (0-100): 1.1 vs. 1.6, difference −0.5 (95% CI −3.9 to 3.0)

Castel, 2013²⁶³

Salvat, 2017²⁶⁷

3, 6 and 12 months

Duration of pain: Mean 10.8 to 12.5 years

Poor

A. Multidisciplinary treatment (n=53): conventional pharmacological treatment, 24 sessions of group CBT and physical therapy over 12 weeks.

B. Usual care (n=35): conventional pharmacological treatment including analgesics, antidepressants, benzodiazepines, and nonbenzodiazepine hypnotics

A vs. B

Age: 49 vs. 49 years

Female: 100% vs. 100%

Baseline FIQ (0-100): 64.6 vs. 66.6

Baseline pain NRS (0-10): 6.8 vs. 7.1

A vs. B

3 months

FIQ: 55.5 vs. 64.6, difference −9.1 (95% CI −14.9 to −3.3)

Proportion with clinically significant FIQ improvement (≥14% change): 48% vs. 23%, OR 3.1 (95% CI 1.6 to 6.2)

Pain NRS: 6.4 vs. 6.8, difference −0.40 (95% CI −0.98 to 0.18)

Proportion with clinically significant NRS pain improvement (≥30% change): 14% vs. 11%

6 months

FIQ: 55.8 vs. 67.8, difference −12.0 (95% CI −18.2 to −5.8)

Proportion with clinically significant FIQ improvement (≥14% change): 42% vs. 19%, OR 3.1 (95% CI 1.5 to 6.4)

Pain NRS: 6.4 vs. 7.0, difference −0.60 (95% CI −1.2 to 0)

Proportion with clinically significant NRS pain improvement (≥30% change): 16% vs. 5%, OR 3.3 (95% CI 1.0 to 10.8)

12 months

FIQ: 58.8 vs. 69.6, difference −10.8 (95% CI −16.8 to −4.8)

Proportion with clinically significant FIQ improvement (≥14% change): 27% vs. 4%, OR 8.8 (95% CI 2.5 to 30.9)

Pain NRS: 6.7 vs. 7.1, difference −0.40 (95% CI −0.94 to 0.14)

Proportion with clinically significant NRS pain improvement (≥30% change): 8.6% vs. 0%, OR 0.5 (95% CI 0.4 to 0.6)

A vs. B

3 months

HADS (0-42): 15.2 vs. 20.6, difference −5.4 (95% CI −8.2 to −2.6)

MOS sleep scale (scale NR): 40.5 vs. 31.2, difference 9.3 (95% CI 6.1 to 12.5)

WONCA, mean (95% CI):

total score: 23.7 (22.5 to 25.0) vs. 26.5 (25.1 to 27.9), p<0.005;

physical function: 2.71 (2.51 to 2.95) vs. 3.20 (2.95 to 3.41), p=NR;

daily activities: 2.88 (2.70 to 3.05) vs. 3.20 (3.00 to 3.39), p=NR

6 months

HADS: 16.2 vs. 21.5, difference −5.3 (95% CI −8.1 to −2.5)

MOS sleep scale: 38.7 vs. 29.0, difference 9.7 (95% CI 6.6 to 12.8)

WONCA, mean (95% CI):

total score: 23.6 (22.4 to 24.9) vs. 27.3 (25.9 to 28.6), p<0.005;

physical function: 2.69 (2.48 to 2.90) vs. 3.38 (3.12 to 3.60), p=NR;

daily activities: 2.97 (2.80 to 3.15) vs. 3.28 (3.10 to 3.47), p=NR

12 months

HADS: 17.1 vs. 22.8, difference −5.7 (95% CI −8.7 to −2.7)

MOS sleep scale: 36.3 vs. 28.8, difference 7.5 (95% CI 4.3 to 10.7)

WONCA, mean (95% CI):

total score: 23.5 (22.1 to 24.8) vs. 26.4 (24.9 to 27.9), p<0.005;

physical function: 2.72 (2.49 to 2.96) vs. 3.33 (3.05 to 3.62), p=NR

daily activities: 2.87 (2.69 to 3.06) vs. 3.32 (3.10 to 3.55), p=NR

Cedraschi, 2004²⁶⁵

6 months

Duration of pain: Mean 8.4 to 9.5 years

Poor

A. Multidisciplinary treatment (n=84): 12 group pool sessions of physiotherapy, relaxation exercises, and exercise over 6 weeks

B. Usual care (n=80): Regular care, including physical therapy, drug treatment and, in some cases, psychotherapy.

A vs. B

Age: 49 vs. 50 years

Female: 93% vs. 93%

Baseline FIQ total (0-10): 5.5 vs. 5.6

FIQ physical function (0-10): 4.2 vs. 4.5

Baseline FIQ pain (0-10): 6.3 vs. 6.0

A vs. B

6 months

FIQ total: 4.9 vs. 5.5, difference −0.6 (95% CI −1.1 to −0.09)

FIQ physical function: 4.3 vs. 4.8, difference −0.5 (95% CI −1.3 to 0.3)

FIQ pain: 6.1 vs. 6.6, difference −0.5 (95% CI −1.2 to 0.2)

Regional Pain Score: 62.6 vs. 68.4, difference −5.8 (95% CI −12.1 to 0.5)

A vs. B

6 months

Psychological General Wellbeing Index total (0-110): 51.1 vs. 43.8, difference 7.3 (95% CI 0.2 to 14.3)

Psychological General Wellbeing Index anxiety (0-25): 13.0 vs. 10.3, difference 2.7 (95% CI 0.6 to 4.8)

Psychological General Wellbeing Index depression (0-15): 9.0 vs. 7.7, difference 1.3 (95% CI −0.1 to 2.7)

SF-36 physical function (0-100): 42.2 vs. 43.9, difference −1.7 (95% CI −8.6 to 5.2)

FIQ depression (0-10): 4.6 vs. 6.1

FIQ anxiety (0-10): 5.1 vs. 6.7, difference −1.6 (95% CI −2.6 to −0.6)

Martin, 2012²⁶⁶

6 months

Duration of pain: Mean 14 to 15 years

Poor

A. Multidisciplinary treatment (n=54): conventional pharmacological treatment, 12 sessions of CBT, education, and physiotherapy over 6 weeks

B. Usual care (n=56): conventional pharmacological treatment including amitriptyline, paracetamol, and tramadol

A vs. B

Age: 49 vs. 52 years

Female: 91% vs. 91%

Baseline FIQ total (0-100): 76.3 vs. 76.2

Baseline FIQ physical functioning (0-10): 5.5 vs. 5.4

Baseline FIQ pain (0-10): 7.5 vs. 7.5

A vs. B

6 months

FIQ total: 70.3 vs. 76.8, difference −6.5 (95% CI −12.3 to −0.7)

FIQ physical function: 5.2 vs. 5.9, difference −0.7 (95% CI −1.4 to −0.04)

FIQ pain: 7.2 vs. 8.2, difference −1.0 (95% CI −1.7 to −0.3)

A vs. B

6 months

Hospital Anxiety and Depression Scale anxiety (HADS, 0-21): 13.4 vs. 12.8, difference 0.66 (95% CI −1.02 to 2.34)

HADS depression (0-21): 9.8 vs. 10.2, difference −0.43 (95% CI −2.00 to 1.14)

Saral, 2016²⁶⁸

6 months; 4 months based on intervention group^b

Duration of pain: 7.5 years

Fair

A. Long term interdisciplinary group (n=22): educational program (1 full day), exercise program (1 full day), and CBT (1, 3-hour session per week for 10 weeks); plus home strengthening and stretching exercises and relaxation

B. Short term interdisciplinary group (n=22): education, exercise, and CBT over 2 full days; plus home strengthening and stretching exercises and relaxation

C. Usual care (n=22): Patients continued current medical treatments, normal daily living, and current physical activity levels

A vs. B vs. C

Age, years: 38 vs. 43 vs. 44

Female: 100% vs. 100% vs. 100%

Symptom duration, months: 69 vs. 113 vs. 88

Baseline FIQ (0-100): 71.6 vs. 67.7 vs. 65.5

Baseline pain VAS (0-10): 8.2 vs. 7.6 vs. 7.5

A vs. C

4 months^b

FIQ: 53.9 vs. 65.5, difference −11.6 (95% CI −21.9 to −1.29)

Percent change from baseline in FIQ: −22.1% vs. 3.2%

Pain VAS: 5.1 vs. 7.6, difference −2.5 (95% CI −3.78 to −1.22)

Percent change from baseline in VAS pain: −38.3% vs. 1.5%

B vs. C

4 months^b

FIQ: 54.5 vs. 65.5, difference −11.0 (95% CI −19.5 to −2.5)

Percent change from baseline in FIQ: −18.9% vs. 3.2%

Pain VAS: 5.8 vs. 7.6, difference −1.8 (95% CI −2.6 to −1.0)

Percent change from baseline in VAS pain: −22.8% vs. 1.5%

A vs. C

4 months^b

BDI: 16.6 vs. 18.7, difference −2.1 (95% CI −8.2 to 4.0)

SF-36 PCS: 39.9 vs. 34.3, difference 5.6 (95% CI 0.61 to 10.6)

SF-36 MCS: 40.7 vs. 37.6, difference 3.1 (95% CI −4.1 to 10.3)

Sleep VAS: 3.0 vs. 4.9, difference −1.9 (95% CI −3.8 to −0.04)

B vs. C

4 months^b

BDI: 15.0 vs. 18.7 (9.5), difference −3.7 (95% CI −10.2 to 2.8)

SF-36 PCS: 39.6 vs. 34.3, difference 5.3 (95% CI −0.03 to 10.6)

SF-36 MCS: 40.2 vs. 37.6, difference 2.6 (95% CI −4.0 to 9.2)

Sleep VAS: 3.1 vs. 4.9, difference −1.8 (95% CI −3.6 to 0.02)

Van Eijk-Hustings, 2013⁹⁶

18 months

Duration of pain: Mean of 6.1 to 7.1 years

Fair

A. Multidisciplinary intervention (n=108): 36 days of sessions of sociotherapy, physiotherapy, psychotherapy, and creative arts therapy over 12 weeks

B. Aerobic exercise (n=47): 24 sessions over 12 weeks

C. Usual care (n=48): education and lifestyle advice in addition to usual care

A vs. B vs. C

Age: 41 vs. 39 vs. 43 years

Female: 93% vs. 100% vs. 98%

Baseline FIQ physical function (0-10): 4.2 vs. 3.6 vs. 3.4

Baseline FIQ total (0-100): 64.5 vs. 60.0 vs. 55.4

Baseline FIQ pain (0-10): 6.3 vs. 6.2 vs. 5.5

A vs. B^c

18 months

FIQ physical function: 3.6 vs. 3.6, difference 0 (95% CI −0.79 to 0.79)

FIQ total: 50.9 vs. 52.0, difference −1.10 (95% CI −8.40 to 6.20)

FIQ pain: 5.3 vs. 5.2, difference 0.10 (95% CI −0.67 to 0.87)

A vs. C

18 months

FIQ physical function: 3.6 vs. 3.9, ES 0.12 (95% CI −0.22 to 0.46)

FIQ total: 50.9 vs. 56.2, ES 0.25 (95% CI −0.09 to 0.59)

FIQ pain: 5.3 vs. 5.3, ES −0.01 (95% CI −0.35 to 0.34)

A vs. B^c

18 months

FIQ Depression: 3.9 vs. 5.0, difference −1.1 (95% CI −2.2 to 0.01)

FIQ Anxiety: 4.7 vs. 5.0, difference −0.30 (95% CI −1.41 to 0.81)

EQ-5D (−0.59 to 1): 0.6 vs. 0.5, difference 0.01 (95% CI −0.10 to 0.12)

GP consultations^d: 0.9 vs. 1.0, difference −0.10 (95% CI −0.89 to 0.69)

Medical specialist consultations^d: 0.3 vs. 0.4, difference −0.10 (95% CI −0.43 to 0.23)

Physiotherapist consultations^d: 2.6 vs. 0.4, difference 2.20 (95% CI 0.69 to 3.71)

Other paramedical professional consultations^d: 1.0 vs. 2.1, difference −1.10 (95% CI −2.21 to 0.01)

A vs. C

18 months

FIQ depression: 3.9 vs. 4.2, ES 0.10 (95% CI −0.24 to 0.44)

FIQ anxiety: 4.7 vs. 4.8, ES 0.03 (95% CI −0.31 to 0.37)

EQ-5D: 0.55 vs. 0.51, ES 0.12 (95% CI −0.22 to 0.46)

GP consultations^d: 0.9 vs. 0.7, ES=−0.11 (95% CI −0.45 to 0.23)

Medical specialist consultations^d: 0.3 vs. 0.2, ES=−0.14 (95% CI −0.48 to 0.20)

Physiotherapist consultations^d: 2.6 vs. 2.8, ES=0.04 (95% CI −0.30 to 0.38)

Other paramedical professional consultations^d: 1.0 vs. 0.2, ES=−0.28 (95% CI −0.62 to 0.06)

BDI = Beck Depression Inventory; CBT = cognitive behavioral therapy; CI = confidence interval; ES = effect size; EQ-5D = EuroQol-5D; FIQ = Fibromyalgia Impact Questionnaire; GP = general practitioner; HADS = Hospital Anxiety and Depression Scale; MOS = Medical Outcomes Study; NR = not reported; NRS = Numeric Rating Scale; OR = odds ratio; SF-36 MCS = Short-Form 36 Mental Component Scale; SF-36 PCS = Short-Form 36 Physical Component Scale; VAS = visual analog scale; WONCA = World Organization of Family Doctors
a: Unless otherwise noted, followup time is calculated from the end of the treatment period
b: The long-term multidisciplinary group was followed up at 4 months from the end of the intervention; the short-term multidisciplinary and control groups were followed up at 6 months from the end of the intervention
c: Authors did not provide effect estimates for the comparison of multidisciplinary rehabilitation versus exercise; mean differences were calculated by the EPC
d: Total number of consultations over a period of 2 months prior to measurement
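
Footnote c states that some mean differences were calculated by the EPC rather than reported by the study authors. The exact method is not given in this table, so the following is only a minimal sketch of one common approach: an unadjusted difference in means with a normal-approximation 95% CI for two independent groups. The standard deviations and group sizes in the example call are hypothetical placeholders, not values from the study.

```python
import math


def mean_difference_ci(mean_a, sd_a, n_a, mean_b, sd_b, n_b, z=1.96):
    """Return (difference, lower, upper) for mean_a - mean_b.

    Assumes independent groups and a normal approximation; this is an
    illustrative method, not necessarily the one used in the report.
    """
    diff = mean_a - mean_b
    # Standard error of a difference between two independent means
    se = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return diff, diff - z * se, diff + z * se


# Group means taken from the FIQ Depression row (3.9 vs. 5.0);
# the SDs (2.5) and group sizes (40) are HYPOTHETICAL placeholders.
diff, lower, upper = mean_difference_ci(3.9, 2.5, 40, 5.0, 2.5, 40)
print(f"difference {diff:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

With real data, the study-reported standard deviations and analyzed sample sizes would replace the placeholders.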

Table 44. Chronic tension headache: psychological therapies

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Blanchard, 1990¹²⁸

1 month

Duration of pain: mean 14.2 years

Poor

A. Cognitive Stress Coping Training + PMR (n=17): 11, 45-90 minute sessions once or twice per week for 8 weeks

B. PMR alone (n=22): 10, 30-70 minute sessions twice weekly for 3 weeks followed by once weekly for 3 weeks with a final session at week 8

C. Pseudomeditation (attention control) (n=19): body awareness and mental control training; 11 sessions over 8 weeks, 40-45 minutes each

D. Waitlist (n=19): monitoring via phone, clinical visits and patient diaries.

A vs. B vs. C vs. D

Age: 38 vs. 43 vs. 39 vs. 37 years

Female: 56% vs. 58% vs. 45% vs. 66%

Mean duration of chronicity: 13.0 vs. 13.9 vs. 15.3 vs. 14.3 years

Baseline Headache Index Scores: mean 5.82 vs. 5.63 vs. 5.23 vs. 5.05

Baseline Medication Index Scores: mean 39.8 vs. 16.9 vs. 12.1 vs. 24.0

A vs. C

1 month

≥50% improvement (i.e., reduction) in headache frequency: 62.5% vs. 43.7%; RR 1.43 (95% CI 0.81 to 1.97)

Headache Index Scores: 3.2 vs. 4.6; difference −1.4 (95% CI −4.3 to 1.5)

A vs. D

1 month

≥50% improvement (i.e., reduction) in headache frequency: 62.5% vs. 20.0%; RR 3.13 (95% CI 0.91 to 2.45)

Headache Index Scores: 3.2 vs. 4.5; difference −1.3 (95% CI −3.9 to 1.4)

B vs. C

1 month

≥50% improvement (i.e., reduction) in headache frequency: 31.6% vs. 43.7%; RR 0.72 (95% CI 0.65 to 1.69)

Headache Index Scores: 3.8 vs. 4.6; difference −0.8 (95% CI −3.2 to 1.6)

B vs. D

1 month

≥50% improvement (i.e., reduction) in headache frequency: 31.6% vs. 20%; RR 1.58 (95% CI 0.75 to 2.11)

Headache Index Scores: 3.8 vs. 4.5; difference −0.6 (95% CI −2.7 to 1.5)

A vs. C

1 month

Medication Index Scores: 20.7 vs. 8.3; difference 12.4 (95% CI −6.8 to 31.6)

A vs. D

1 month

Medication Index Scores: 20.7 vs. 22.5; difference −1.8 (95% CI −23.8 to 20.2)

B vs. C

1 month

Medication Index Scores: 9.8 vs. 8.3; difference 1.5 (95% CI −6.8 to 9.8)

B vs. D

1 month

Medication Index Scores: 9.8 vs. 22.5; difference −12.7 (95% CI −25.6 to 0.21)

Holroyd, 1991¹³²

1 month

Duration of pain: mean 10.7 years

Poor

A. CBT (n=19): 3, 1 hour sessions over 8 weeks

B. Amitriptyline therapy (n=17): Individualized dosage at 25, 50, or 75 mg/day for 8 weeks

A + B

Age: 32.3 years

Female: 80%

A vs. B

Baseline % of Headache-free days: 18.0 vs. 18.5

Baseline Headache Index scores (0−10): 2.17 vs. 2.04

Baseline Headache Pain Peak scores (0−10): 6.41 vs. 6.36

A vs. B

1 month

Proportion with >66% reduction in headaches (substantial improvement): 37% vs. 18%; RR 2.09 (95% CI 0.79 to 2.23)

Proportion with 33-66% reduction in headaches (moderate improvement): 53% vs. 35%; RR 1.49 (95% CI 0.80 to 2.03)

% of Headache-free days: 54.7 vs. 42.3; difference 12.4 (95% CI −8.06 to 32.86)

Headache Index scores: 0.96 vs. 1.49; difference −0.53 (95% CI −1.14 to 0.08)

Headache Peak scores: 4.33 vs. 4.55; difference −0.22 (95% CI −1.70 to 1.26)

A vs. B

1 month

BDI (0-63): 5.16 vs. 5.56; difference −0.4 (95% CI −3.96 to 3.16)

STPI Anxiety (20-80): 18.37 vs. 19.06; difference −0.69 (95% CI −3.99 to 2.62)

STPI Anger (20-80): 19.47 vs. 17.44; difference 2.03 (95% CI −1.98 to 6.04)

WPSI (scale NR): 16.05 vs. 20.50; difference −4.45 (95% CI −9.78 to 0.87)

Analgesic Tablets: 0.26 vs. 0.82; difference −0.56 (95% CI −1.16 to 0.04)

Holroyd, 2001¹²⁹

1 and 6 months

Duration of pain: mean 11.8 years

Poor

A. Stress Management Therapy + Placebo (n=34): 3, 1 hour sessions

B. Placebo (n=26)

Treatment Protocol: identical to group C

C. Antidepressant Medications (n=44):

Low starting dose (12.5 mg/day increased to 25 mg, then 50 mg) with the possibility of switching to nortriptyline

A vs. B vs. C

Age: 37 vs. 38 vs. 36 years

Female: 80% vs. 79% vs. 66%

Caucasian: 91% vs. 98% vs. 98%

Duration of pain: 12.3 vs. 11.1 vs. 11.9 years

Headache frequency, days/month: 26.5 vs. 26.1 vs. 25.1

Baseline Headache Index (0−10): 2.8 vs. 2.7 vs. 2.8

Baseline Days/month with at least moderately severe headache (≥5 on 0−10 scale): 13.5 vs. 13.5 vs. 14.1

A vs. B

1 month

Days/month with at least moderately severe headache: difference 2.5 (95% CI −0.1 to 5.2)

Headache Disability Inventory (0−100): difference 7.3 (95% CI 1.6 to 13.0)

Headache Index: difference 0.46 (95% CI 0.02 to 0.89)

6 months

Patients who experienced ≥50% reductions in Headache Index Scores: 35% vs. 29%; RR 1.18 (95% CI 0.79 to 1.79)

Days/month with at least moderately severe headache: difference 5.1 (95% CI 2.3 to 8.0)

Headache Disability Inventory: difference 9.3 (95% CI 3.5 to 15.1)

Headache Index: difference 0.79 (95% CI 0.30 to 1.28)

A vs. C

1 month

Days/month with at least moderately severe headache: difference −3.5 (95% CI −6.1 to −0.9)

Headache Disability Inventory: difference 0.1 (95% CI −5.6 to 5.7)

Mean Headache Index: difference −0.54 (95% CI −0.97 to −0.012)

6 months

Patients who experienced >50% reductions in Headache Index Scores: 35% vs. 38%; RR 0.92 (95% CI 0.71 to 1.54)

Days/month with at least moderately severe headache: difference 0.1 (95% CI −2.7 to 2.9)

Headache Disability Inventory: difference 2.4 (95% CI −3.3 to 8.0)

Headache Index: difference −0.13 (95% CI −0.61 to 0.35)

A vs. B

1 month

Weighted analgesic use: difference −1.7 (95% CI −12.0 to 8.6)

6 months

Weighted analgesic use: difference 11.8 (95% CI 1.5 to 22.1)

A vs. C

1 month

Weighted analgesic use: difference −19.4 (95% CI −29.5 to −9.3)

6 months

Weighted analgesic use: difference −6.2 (95% CI −16.2 to 3.8)

Abbreviations: BDI = Beck Depression Inventory; CBT = cognitive-behavioral therapy; CI = confidence interval; NR = not reported; PMR = Progressive Muscle Relaxation; RR = risk ratio; STPI = State-Trait Personality Inventory; VAS = visual analog scale; WPSI = Wahler Physical Symptom Inventory
a: Unless otherwise noted, followup time is calculated from the end of the treatment period

Table 45. Chronic tension headache: physical modalities

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Bono, 2015¹⁶⁹

1 month, 2 months

Duration of pain: >2 years (mean NR)

Poor

A. Occipital TES (n=54): Electro-stimulator generated biphasic impulses via electrodes placed on occipital region bilaterally; pulse width: 250 µs; frequency: 40 Hz; intensity 20 mA.

B. Sham (n=29): Same device and procedure, but no current was delivered.

Treatment protocol: 30 minute sessions 3 times daily for two consecutive weeks (42 sessions total)

A vs. B

Age: 42 vs. 40 years

Female: 81% vs. 66%

Race: NR

Headache frequency: mean 29.0 days/month

Medication overuse: 43% vs. 52%

Baseline MIDAS (0-21+): 63 vs. 50

Baseline VAS pain (0−10): 8 vs. 8

A vs. B

1 month

Patients who achieved >50% reduction in headache days: 85% vs. 7%; RR 12.4 (95% CI 3.2 to 47.3)

2 months

MIDAS: 16 vs. 51; difference −35.0 (95% CI −42.6 to −27.4)

VAS pain (0−10): 3 vs. 8; difference −5.0 (95% CI −5.8 to −4.2)

Proportion of patients still overusing medications: 7% vs. 48%; RR 0.15 (95% CI 0.06 to 0.42)

A vs. B

2 months

BDI-II: 7 vs. 8; difference −1.0 (95% CI −2.2 to 0.2)

HAM-A: 6 vs. 7; difference −1.0 (95% CI −1.9 to −0.1)

Abbreviations: BDI-II = Beck Depression Inventory-II; CI = confidence interval; HAM-A = Hamilton Anxiety Rating Scale; Hz = Hertz; mA = milliamps; MIDAS = Migraine Disability Assessment Questionnaire; NR = not reported; RR = risk ratio; SD = standard deviation; TES = transcutaneous electrical stimulation; VAS = visual analog scale; µs = microsecond
a: Unless otherwise noted, followup time is calculated from the end of the treatment period
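
The risk ratios in this table can be checked with the standard log-scale (Katz) confidence interval. The sketch below is an illustration, not the report's documented procedure. The event counts (46 of 54 and 2 of 29) are assumptions back-calculated from the reported percentages (85% vs. 7%) and group sizes for the Bono, 2015 responder outcome.

```python
import math


def risk_ratio_ci(events_a, n_a, events_b, n_b, z=1.96):
    """Return (RR, lower, upper) using the log-scale normal approximation."""
    rr = (events_a / n_a) / (events_b / n_b)
    # Standard error of ln(RR) for two independent proportions
    se_log = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# ASSUMED counts inferred from 85% of n=54 and 7% of n=29
rr, lower, upper = risk_ratio_ci(46, 54, 2, 29)
print(f"RR {rr:.1f} (95% CI {lower:.1f} to {upper:.1f})")
```

Under these assumed counts the result agrees closely with the tabulated RR of 12.4 (95% CI 3.2 to 47.3). The wide interval reflects the very small number of responders in the sham group.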

Table 46. Chronic tension headache: manual therapies

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Boline, 1995¹⁸⁸

1 month

Duration of pain: 13.5 years

Poor

A. Spinal Manipulative Therapy (n=70): short-lever, low-amplitude, high-velocity thrust techniques on cervical, thoracic or lumbar spinal segments. Moist heat and light massage preceded manipulation; 12, 20 minute sessions (2 per week for 6 weeks)

B. Amitriptyline (n=56): dose titration of amitriptyline for 6 weeks. Nighttime, daily doses began at 10 mg/day for the first week, then increased to 20 mg/day in the second week and 30 mg/day from the third week onward; continued use of OTC medications as needed.

A vs. B

Age: 41 vs. 42 years

Female: 54% vs. 70%

Race: NR

Baseline Daily headache intensity (0-20)^b: 5.6 vs. 5.0

Baseline Weekly headache frequency (0-28)^c: 12.4 vs. 10.8

A vs. B

1 month

Daily headache intensity^b: adjusted means 3.8 vs. 5.2; difference 1.4 (95% CI 0.3, 2.3)

Weekly headache frequency^c: adjusted means 7.6 vs. 11.8; difference 4.2 (95% CI 1.9, 6.5)

A vs. B

1 month

SF-36 Function Health Status Global Score (% points): adjusted means 78.8 vs. 73.9; difference 4.9 (95% CI 0.4, 9.4)

OTC medication usage: adjusted means 1.3 vs. 2.2; difference 0.9 (95% CI 0.3, 1.5)

Castien, 2011¹⁸⁷

4.5 months

Duration of pain: 13 years

Fair

A. Spinal Manipulation (n=38): combination of 3 approaches at the therapist discretion: mobilizations of the cervical and thoracic spine, craniocervical muscle exercises and posture correction; maximum of 9, 30-minute sessions over 2 months

B. Usual Care (n=37): 2-3 general practitioner visits over 2 months

A vs. B

Age: 40 vs. 40 years

Female: 78% vs. 78%

Race: NR

Mean frequency of headache (days/month): 24 vs. 24

NSAID use: 29% (mean 3 pills/week); Analgesic use: 59% (mean 1.5 pills/week)

Baseline HIT-6 (36-78): 62.6 vs. 61.2

Baseline HDI (0-100): 39.6 vs. 44.2

Baseline Pain intensity, NRS (0-10): 6.3 vs. 5.7

A vs. B

4.5 months

Proportion of patients with ≥50% reduction in headache frequency: 81.6% vs. 40.5%; RR 2.01 (95% CI 1.32 to 3.05)

HIT-6, mean change from baseline: −10.6 vs. −5.5; difference −5.0 (95% CI −9.02 to −1.16)

HDI, mean change from baseline: −20.0 vs. −9.9; difference −10.1 (95% CI −19.5 to −0.64)

Headache frequency (days/14 days), mean change from baseline: −9.1 vs. −4.1; difference −4.9 (95% CI −6.95 to −2.98)

Pain intensity mean change from baseline: −3.1 vs. −1.7; difference −1.4 (95% CI −2.69 to −0.16)

Headache duration (hrs./day), mean change from baseline: −7.0 vs. −3.5; difference −3.5 (95% CI −7.71 to −0.63)

A vs. B

4.5 months

Resource use, proportion who used:

≥1 sick leave day: 7.9% vs. 32.4%; RR 0.23 (95% CI 0.07 to 0.79)

Any additional healthcare: 13.2% vs. 59.4%; RR 0.22 (95% CI 0.09 to 0.52)

Additional physical therapy: 2.6% vs. 40.5%; RR 0.06 (95% CI 0.01 to 0.47)

Additional medical specialist care: 2.6% vs. 16.2%; RR 0.16 (95% CI 0.02 to 1.28)

Additional “other” healthcare: 7.8% vs. 2.7%; RR 2.9 (95% CI 0.3 to 26.8)

Abbreviations: CI = confidence interval; HDI = Headache Disability Index; HIT-6 = Headache Impact Test-6; mg = milligram; NR = not reported; NRS = numeric rating scale; NSAID = nonsteroidal anti-inflammatory drugs; OTC = over-the-counter; RR = risk ratio; SF-36 = Short-Form-36 Questionnaire
a: Unless otherwise noted, followup time is calculated from the end of the treatment period
b: Headache intensity was calculated as the total ratings per period and divided by the number of days per period
c: Headache frequency was calculated by summing all headache ratings 2 and above for the month
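
Footnotes b and c describe how the diary-based outcomes were derived. The sketch below shows one possible reading of those definitions. It assumes several ratings per day, and it interprets footnote c as a count of ratings of 2 or higher, which is an assumption because the footnote wording is ambiguous. The diary values are hypothetical.

```python
def daily_headache_intensity(ratings_by_day):
    """Footnote b: total of all ratings in the period divided by the number of days."""
    total = sum(sum(day) for day in ratings_by_day)
    return total / len(ratings_by_day)


def headache_frequency(ratings_by_day, threshold=2):
    """Footnote c (ASSUMED reading): number of ratings at or above the threshold."""
    return sum(1 for day in ratings_by_day for rating in day if rating >= threshold)


# HYPOTHETICAL 3-day diary with four ratings per day
diary = [
    [0, 1, 2, 3],
    [0, 0, 0, 0],
    [2, 2, 1, 0],
]
intensity = daily_headache_intensity(diary)
frequency = headache_frequency(diary)
print(f"daily intensity {intensity:.2f}, frequency {frequency}")
```

If the original study instead summed the ratings of 2 and above, `headache_frequency` would return the sum of those ratings rather than their count.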

Table 47. Chronic tension headache: acupuncture

Author, Year, Followup,^a Pain Duration, Study Quality

Intervention

Population

Function and Pain Outcomes

Other Outcomes

Ebneshahidi, 2005²⁵¹

3 months

Duration of pain: NR

Fair

A. Low-Energy Laser Acupuncture (n=25): 4 acupoints (two local and two distal), bilaterally (8 total): intensity 1.3J, output 100%, continuous mode, using vertical contact with pressure and a duration of 43 seconds.

B. Sham Laser Acupuncture (n=25): Identical procedure to real laser acupuncture except power output set to 0

Treatment Protocol: 3 sessions per week for a total of 10 sessions (session length: NR)

A vs. B

Age: 33 vs. 39 years

Female: 80% vs. 80%

Race: NR

Baseline Number of headache days per month (0-28), median: 20 vs. 18

Baseline Pain intensity on VAS (0-10), median: 10 vs. 10

Baseline Duration of attacks, (hours), median: 10 vs. 8

A vs. B

3 months

Headache Days/Month, median change from baseline: −8 vs. 0, p<0.001

Headache Intensity (VAS), median change from baseline: −2 vs. 0, p<0.001

Duration of attacks (hours), median change from baseline: −4 vs. 0, p<0.001

Karst, 2000²⁵²

1.5 months

Duration of pain: NR

Poor

A. Acupuncture (n=21)

Traditional Chinese acupuncture; maximum of 15 needles, 10 acupoints

B. Sham Acupuncture (n=18): blunt placebo needles and elastic foam were used to simulate puncturing and shield needle type.

Treatment Protocol: 30-minute sessions twice weekly for 5 weeks (10 sessions total)

A vs. B

Age: 50 vs. 47 years

Female: 38% vs. 61%

Race: NR

Headache frequency: 27 vs. 27 days/month

VAS (0-10): 6.2 vs. 6.3

Analgesic Intake/Month: 8.3 vs. 10.2

A vs. B

1.5 months

Frequency of headache attacks/month: 22.1 vs. 22.0; difference 0.1 (95% CI −6.6 to 6.8)

Headache Severity, VAS: 4.0 vs. 3.9; difference 0.1 (95% CI −11.9 to 12.1)

A vs. B

1.5 months

Analgesic Intake/Month: 13.7 vs. 21.2; difference −7.5 (95% CI −22.2 to 7.2)

Tavola, 1992²⁵³

1, 6, 12 months

Duration of pain: 8 years

Poor

A. Acupuncture (n=15):

Traditional Chinese acupuncture; 6-10 acupoints chosen on an individual basis; insertion depth 10-20 mm; needles were left in place without the use of any manual or electrical stimulation

B. Sham Acupuncture (n=15): same number of needles, inserted more superficially (depth 2-4 mm), in the same region used in real acupuncture group but in areas without acupuncture points

Treatment Protocol: 20-minute sessions once per week for 8 weeks (8 sessions total)

A vs. B

Age: 33 vs. 33 years

Female: 87% vs. 87%

Mean frequency of headache attacks per month: 18 vs. 17

Mean analgesic use: 12 vs. 12 units/month

Mean HI (intensity X duration X frequency/30): 4.3 vs. 4.5

Mean duration of attacks (sum of the hours of headache in a month/number of attacks): 3.3 vs. 4.4

A vs. B

1 month

Responders, ≥33% improvement in HI: 86.7% vs. 60.0%; RR 1.44 (95% CI 0.91 to 2.28)

Responders, ≥50% improvement in HI: 53.3% vs. 46.7%; RR 1.14 (95% CI 0.56 to 2.35)

HI, mean^b: 2.4 vs. 3.0; difference −0.60 (95% CI −6.12 to 4.92)

Mean decrease in HI from baseline: 58.3% vs. 27.8%

Mean decrease in headache attack frequency from baseline: 44.3% vs. 21.4%

6 months

HI, mean^b: 2.2 vs. 3.1; difference −0.90 (95% CI −7.15 to 5.35)

12 months

Responders, ≥33% improvement in HI: 53.3% vs. 46.7%; RR 1.14 (95% CI 0.56 to 2.35)

Responders, ≥50% improvement in HI: 40.0% vs. 26.7%; RR 1.50 (95% CI 0.53 to 4.26)

HI, mean^b: 3.2 (2.1) vs. 3.7 (2.2); difference −0.50 (95% CI −6.73 to 5.73)

A vs. B

1 month

Mean decrease in analgesic consumption from baseline: 57.7% vs. 21.7%

Abbreviations: CI = confidence interval; HI = headache index; J = joule; NR = not reported; RR = risk ratio; VAS = visual analog scale
a: Unless otherwise noted, followup time is calculated from the end of the treatment period
b: Means and standard error of the means (not shown) estimated from graphs.
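
The Tavola, 1992 rows define the headache index (HI) as intensity × duration × frequency/30 and classify responders by percentage improvement in HI (≥33% or ≥50%). A minimal sketch of that arithmetic follows. The patient values are hypothetical, and the intensity scale is not stated in this table, so the inputs illustrate the formula only.

```python
def headache_index(intensity, duration_hours, frequency, days=30):
    """HI = intensity x duration x frequency / 30, as defined in the table."""
    return intensity * duration_hours * frequency / days


def mean_attack_duration(total_headache_hours, n_attacks):
    """Sum of the hours of headache in a month divided by the number of attacks."""
    return total_headache_hours / n_attacks


def percent_improvement(baseline, followup):
    """Percentage reduction from baseline (positive values mean improvement)."""
    return 100.0 * (baseline - followup) / baseline


def is_responder(baseline, followup, threshold=33.0):
    """True if the improvement in HI meets the responder threshold."""
    return percent_improvement(baseline, followup) >= threshold


# HYPOTHETICAL patient: intensity 2, attacks lasting 3.5 hours, 18 attacks per month
baseline_hi = headache_index(2, 3.5, 18)
followup_hi = 2.4  # hypothetical followup HI
print(f"baseline HI {baseline_hi:.1f}")
print("responder (>=33%):", is_responder(baseline_hi, followup_hi, 33.0))
print("responder (>=50%):", is_responder(baseline_hi, followup_hi, 50.0))
```

Because the two thresholds are applied to the same change score, a patient can be a responder at ≥33% without reaching ≥50%, which is why the table reports both proportions.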

Bookshelf ID: NBK556231
