HPV Self-Sampling for Primary Cervical Cancer Screening: A Review of Diagnostic Test Accuracy and Clinical Evidence – An Update

Yi-Sheng Chao; Suzanne McCormack

NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

CADTH Rapid Response Report: Summary with Critical Appraisal

Ottawa (ON): Canadian Agency for Drugs and Technologies in Health; 2019 May 30.

Abbreviations

CADTH: Canadian Agency for Drugs and Technologies in Health
HPV: human papillomavirus
PCR: polymerase chain reaction
RCT: randomized controlled trial
SR: systematic review

Context and Policy Issues

The introduction of cervical cancer screening and timely intervention is associated with the recent decrease in cervical cancer incidence.¹ There are several options to screen for cervical cancer. Two of the methods commonly used in Canada are cytology and human papillomavirus (HPV) tests.² Cytology requires clinicians to obtain samples from the cervix for further examination.² HPV tests, which detect HPV infection, also require samples from the cervix.² The HPV tests that detect certain types of carcinogenic HPV genotypes, especially genotypes 16 and 18, are called high-risk HPV tests.³ The samples can be obtained via brushes, swabs, or other devices not only by clinicians, but also by screening participants.³ Clinician-sampled HPV tests are used in screening programs in several countries, such as Italy⁴ and Denmark.⁵ Self-sampled HPV tests have been tested in the capital region in Denmark but have not replaced clinician-sampled tests.⁵

With feasibility to conduct at home and potentially better acceptability to participants, self-sampled HPV tests have been used to reach individuals that are unscreened or under-screened for cervical cancer.⁴ In a previous CADTH report, there was some evidence to show similar diagnostic test accuracy between self- and clinician-sampled HPV tests.⁶ For example, the diagnostic test accuracy of GP5+/6+ polymerase chain reaction (PCR) HPV tests using samples taken with brushes is similar for self- and clinician-collected samples.⁷ In several primary studies, fair to high agreement between self- and clinician-sampled HPV tests has been found.⁶

Since the previous CADTH review, there have been primary studies comparing self- and clinician-sampled HPV tests published⁸^,⁹ and a systematic review has been updated.³ This report updates the previous review on the difference in the diagnostic test accuracy of self-sampled HPV tests and the agreement between self- and clinician-sampled HPV tests.

Research Questions

What is the diagnostic test accuracy of self-sampled HPV tests compared with clinician-sampled HPV tests or cytology for asymptomatic cervical cancer screening?
What is the clinical evidence regarding the agreement or concordance of self-sampled HPV tests and clinician-sampled HPV tests or cytology for asymptomatic cervical cancer screening?

Key Findings

Similar to the meta-analysis reviewed in the previous CADTH report, Arbyn et al. confirmed that the difference in diagnostic test accuracy between self- and clinician-collected samples remained significant for signal amplification-based HPV tests. Since the publication of the previous CADTH report, new studies have examined self- and clinician-sampled HPV tests.

Two systematic reviews, two randomized controlled trials (RCTs), and ten non-randomized studies were identified. In the updated meta-analysis by Arbyn et al., human papillomavirus (HPV) tests were categorized into polymerase chain reaction (PCR) and signal amplification-based tests. Self-sampled HPV tests based on PCR for the detection of cervical intraepithelial neoplasia (CIN) grade 2 or more severe were shown to not have statistically different sensitivity or specificity compared with clinician-sampled tests. However, self-sampled HPV tests based on signal amplification were not as accurate for the detection of CIN2+.

Moderate to excellent agreement between self- and clinician-sampled HPV tests was reported in primary studies. Various HPV tests were tested in different healthcare settings.

However, it was unclear whether the differences in the agreement were associated with the types of HPV tests. There was heterogeneity between the included studies and the impact on the diagnostic accuracy or agreement of self- and clinician-sampled HPV tests was unclear.

Further research on the diagnostic test accuracy and agreement between self- and clinician-sampled HPV tests in target populations could reduce the uncertainties in the application of self-sampled HPV tests.

Methods

Literature Search Methods

This report makes use of a literature search developed for a previous CADTH report of April 2018.⁶

For the current report, a limited literature search was conducted by an information specialist on key resources including Ovid Medline, Embase, the Cochrane Library, University of York Centre for Reviews and Dissemination (CRD) databases, Canadian and major international health technology agencies, as well as a focused Internet search. The search strategy comprised both controlled vocabulary, such as the National Library of Medicine’s MeSH (Medical Subject Headings), and keywords. The main search concepts were self-sampling and the human papillomavirus (HPV). No filters were applied to limit retrieval by study type. The search was limited to English language documents published between January 1, 2017 and April 26, 2019.

Selection Criteria and Methods

One reviewer screened citations and selected studies. In the first level of screening, titles and abstracts were reviewed and potentially relevant articles were retrieved and assessed for inclusion. The final selection of full-text articles was based on the inclusion criteria presented in Table 1.

Exclusion Criteria

Articles were excluded if they did not meet the selection criteria outlined in Table 1, they were duplicate publications, or were published prior to 2017. If patients with known HPV status or cytology results were recruited, primary studies that did not specify the screening programs were not eligible for this report. Guidelines with unclear methodology were also excluded. Studies included in selected systematic reviews were also excluded.

Critical Appraisal of Individual Studies

The included systematic reviews were critically appraised by one reviewer using the AMSTAR II checklist;¹⁰ randomized and non-randomized studies were critically appraised using the Downs and Black checklist.¹¹ Diagnostic test accuracy studies were also assessed with the QUADAS-2 checklist.¹² Summary scores were not calculated for the included studies; rather, the strengths and limitations of each included study were described narratively.

Summary of Evidence

Quantity of Research Available

A total of 233 citations were identified in the literature search. Following screening of titles and abstracts, 210 citations were excluded and 23 potentially relevant reports from the electronic search were retrieved for full-text review. One potentially relevant publication was retrieved from the grey literature search for full text review. Of these potentially relevant articles, ten publications were excluded for various reasons, and 14 publications met the inclusion criteria and were included in this report. These comprised two systematic reviews, two RCTs, and ten non-randomized studies. Appendix 1 presents the PRISMA¹³ flowchart of the study selection.
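The selection counts above can be cross-checked with simple arithmetic. The following is a minimal sketch that uses only the numbers reported in this section; the variable names are illustrative and are not taken from the report.

```python
# Counts reported in the text for the study selection flow.
identified = 233             # citations from the literature search
excluded_at_screening = 210  # removed after title and abstract screening
grey_literature = 1          # additional report from the grey literature search
excluded_at_full_text = 10   # removed after full-text review

full_text_from_search = identified - excluded_at_screening
full_text_total = full_text_from_search + grey_literature
included = full_text_total - excluded_at_full_text

# Breakdown of included publications by design, as reported.
by_design = {"systematic reviews": 2, "RCTs": 2, "non-randomized studies": 10}

print(full_text_from_search, full_text_total, included)

# The design breakdown should account for every included publication.
assert sum(by_design.values()) == included
```

The reported figures are internally consistent: 23 reports from the electronic search plus one from the grey literature give 24 full-text reviews, of which 14 were included.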

Additional references of potential interest are provided in Appendix 6.

Summary of Study Characteristics

Study Design

In the updated systematic review (SR) by Arbyn et al. in 2018, articles published up to April 2018 were searched and diagnostic studies that conducted HPV tests on self- and clinician-collected samples from the same individuals were included.³ For a study to be included, the presence or absence of cervical intraepithelial neoplasia (CIN) grade 2 or more severe needed to be verified with colposcopy.³ In the 2017 SR by Kelly et al., articles published between January 2004 and February 2017 were searched and studies that evaluated the performance of point-of-care HPV tests for the detection of CIN2+ or CIN3+ were eligible.¹⁴ The review included seven studies (study design not specified) using careHPV for the detection of CIN2+ and four studies using careHPV for the detection of CIN3+.¹⁴ The overlap in the included studies is shown in Appendix 5. There were 76 accuracy studies included in Arbyn et al.³ and eight included in Kelly et al.¹⁴ Three studies were included in both SRs.¹⁴

Polman et al. conducted one non-inferiority RCT to compare the sensitivity of self- and clinician-collected samples used for HPV tests to detect CIN2+ or CIN3+.¹⁵ Ajenifuja et al. randomized participants into two groups: provider sampling before self-sampling and self-sampling before provider sampling to compare the agreement of HPV DNA tests with self- or provider-collected samples.¹⁶

Thay et al., Des Marais et al., Lam et al., Senkomago et al., Wong et al., Zhang et al., Cremer et al., and Obiri-Yeboah et al. conducted cohort studies.⁵^,⁸^,⁹^,¹⁷^–²¹ Zhang et al. followed up the participants for 15 years.⁹ The follow-up lengths in the remaining non-randomized studies ranged from 6 to 20 months, if reported.⁵^,¹⁵^,²⁰

Toliman et al. and Phoolcharoen et al. reported cross-sectional findings in eligible populations.²²^,²³

Country of Origin

The first authors of the SRs were based in Belgium³ and the UK.¹⁴

The first authors of the RCTs were based in the Netherlands¹⁵ and Nigeria.¹⁶ The first authors of the non-randomized studies were based in Cambodia (1 study),¹⁷ Papua New Guinea and Australia (1 study),²² the US (2 studies),¹⁸^,¹⁹ Denmark (1 study),⁵ Thailand (1 study),²³ China (2 studies),⁸^,⁹ the US and El Salvador (1 study),²⁰ and Ghana (1 study).²¹

Patient Population

In the SR by Arbyn et al., individuals participating in diagnostic studies that used self- and clinician-collected samples in HPV tests for the detection of CIN2+ were eligible.³ In the SR by Kelly et al., individuals recruited in cross-sectional or cohort studies that evaluated point-of-care HPV tests for the detection of CIN2+ or CIN3+ were included.¹⁴

In the RCT by Polman et al., individuals aged 29 to 61 years in a regular cervical cancer screening program were randomized.¹⁵ In the RCT by Ajenifuja et al., individuals presenting for cervical cancer screening were recruited.¹⁶

In the non-randomized studies, populations of different characteristics were recruited. Thay et al. enrolled human immunodeficiency virus (HIV) positive or negative participants in a cohort study or from a hospital in Cambodia.¹⁷ Toliman et al. recruited those attending cervical cancer screening at clinics.²² Because self-sampled HPV tests had the potential to reach infrequently screened individuals, Des Marais et al. focused on low-income and infrequently screened individuals according to national guidelines in the US (no screening in the past four years).¹⁸ Lam et al. recruited non-attenders (not screened for at least four or six years depending on the ages) in the Copenhagen Self-sampling Initiative (CSi) and used participants in the Horizon study (cytology and clinician-sampled HPV tests) as comparison.⁵ Phoolcharoen et al. obtained samples from a colposcopy clinic.²³ Senkomago et al. and Wong et al. recruited female sex workers.⁸^,¹⁹ Zhang et al. followed up participants without a history of cervical cancer or hysterectomy aged 35 to 45 years beginning in 1999 for 15 years.⁹ Cremer et al. analyzed participants aged 30 to 49 years in a cervical cancer screening program.²⁰ Obiri-Yeboah et al. recruited those attending HIV and outpatient clinics in a cervical cancer screening study.²¹

Interventions and Comparators

In the SR by Arbyn et al., the HPV tests were categorized into signal amplification and PCR-based tests and used for self- and clinician-collected samples.³ Kelly et al. only included studies that evaluated the diagnostic test accuracy of two HPV tests, careHPV and OncoE6, using self- or clinician-collected samples.¹⁴

In the RCTs by Polman et al. and Ajenifuja et al., HPV tests on self- and clinician-collected samples were compared (GP5+/6+ PCR enzyme immunoassay and Hybribio GenoArray, respectively).¹⁵^,¹⁶

In the non-randomized studies by Thay et al., Toliman et al., Des Marais et al., Phoolcharoen et al., Senkomago et al., Wong et al., Zhang et al., Cremer et al., and Obiri-Yeboah et al., HPV tests on self- and clinician-collected samples were compared.⁸^,⁹^,¹⁷^,¹⁹^–²³

Lam et al. compared self-sampled HPV tests and routine screening that included cytology or co-testing (HPV tests and cytology).⁵

In addition, Thay et al. also studied the effectiveness of visualization with acetic acid and digital colposcopy.¹⁷ Zhang et al. also tested visual inspection with acetic acid.⁹

Outcomes

The diagnostic test accuracy in this report was based on the detection of colposcopy-confirmed cases (CIN2+ or CIN3+). Sensitivity was the number of identified cases (positive on both HPV and colposcopy) divided by the total number of colposcopy confirmed cases.⁶ Specificity was the number of non-cases (negative on both HPV and colposcopy) divided by the total number of colposcopy negative cases.⁶ Positive predictive values were the number of identified cases divided by the total number of individuals with positive HPV test results.⁶ Negative predictive values were the number of identified non-cases divided by the total number of participants with negative HPV test results.⁶
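The four definitions above map directly onto the cells of a two-by-two table of HPV test result against colposcopy result. The sketch below is illustrative only: the class name `TwoByTwo` and the counts in `example` are hypothetical and do not come from any included study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Cross-tabulation of an HPV test against the colposcopy reference."""
    tp: int  # HPV positive, colposcopy-confirmed case
    fp: int  # HPV positive, colposcopy negative
    fn: int  # HPV negative, colposcopy-confirmed case
    tn: int  # HPV negative, colposcopy negative

    def sensitivity(self) -> float:
        # Identified cases / all colposcopy-confirmed cases
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Identified non-cases / all colposcopy-negative individuals
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        # Identified cases / all HPV-positive individuals
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        # Identified non-cases / all HPV-negative individuals
        return self.tn / (self.tn + self.fn)


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example = TwoByTwo(tp=90, fp=180, fn=10, tn=720)

print(f"sensitivity = {example.sensitivity():.3f}")
print(f"specificity = {example.specificity():.3f}")
print(f"PPV         = {example.ppv():.3f}")
print(f"NPV         = {example.npv():.3f}")
```

Sensitivity and specificity are computed within the colposcopy-defined columns, whereas the predictive values are computed within the HPV-defined rows, so the predictive values also depend on how common the condition is in the screened population.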

In the SR by Arbyn et al., the outcomes in the study selection criteria were diagnostic test accuracy and response rates to screening (i.e. proportions of the invited population who participated in screening).³ Kelly et al. reported diagnostic test accuracy including sensitivity and specificity.¹⁴

In the RCT by Polman et al., the outcome was the detection of CIN2+ or CIN3+ and diagnostic test accuracy (sensitivity and specificity) was derived.¹⁵ Ajenifuja et al. reported the degree of agreement between self- and provider-sampled HPV tests.¹⁶

In the non-randomized studies, Thay et al. estimated HPV prevalence and the proportions detected with self- and clinician-collected samples.¹⁷ Toliman et al., Des Marais et al., Phoolcharoen et al., Senkomago et al., Wong et al., Cremer et al., and Obiri-Yeboah et al. reported the agreement of HPV detection with self- and clinician-collected samples.⁸^,¹⁸^–²³ Lam et al. reported the positive predictive values for the detection of CIN2+.⁵ Zhang et al. reported the cumulative diagnostic test accuracy for the detection of CIN2+ at baseline, 6-, 11-, and 15-year follow-up.⁹

Additional details regarding the characteristics of included publications are provided in Appendix 2.

Summary of Critical Appraisal

Systematic reviews

In the SRs by Arbyn et al. and Kelly et al., the population, intervention, comparator, and outcome components were described.³^,¹⁴ The selection of study design was explained.³^,¹⁴ Comprehensive literature search strategies, details in the included studies, critical appraisal with published tools, and review authors’ competing interests were described.³^,¹⁴ The risk of bias in the included studies was considered when interpreting the results and the heterogeneity across the included studies was explained.³^,¹⁴ There were no lists of excluded studies for either SR.³^,¹⁴ Arbyn et al. and Kelly et al. conducted meta-analyses and adopted appropriate statistical methods (bivariate models in both SRs).³^,¹⁴ Arbyn et al. adjusted for inter-study heterogeneity by conducting sensitivity analysis.³ However, Kelly et al. did not conduct sensitivity analysis due to an insufficient number of primary studies.¹⁴ In both SRs, the potential impact of the risk of bias in the included studies was assessed.³^,¹⁴ Arbyn et al. did not find evidence of publication bias.³ Kelly et al. did not examine publication bias due to a small number of included studies.¹⁴

Arbyn et al. updated a meta-analysis that was previously published.³ Kelly et al. did not publish the protocol a priori.¹⁴

Arbyn et al. selected studies and extracted data in duplicate³ and Kelly et al. did not.¹⁴

RCTs

In the RCTs by Polman et al. and Ajenifuja et al., the hypotheses, main outcomes, patient characteristics, interventions, distributions of principal confounders, main findings, and the random variability for the main outcomes were described.¹⁵^,¹⁶ The participants and outcomes assessors were not blinded.¹⁵^,¹⁶ The lengths of follow-up were similar between groups.¹⁵^,¹⁶ The statistical tests to assess the outcomes were appropriate.¹⁵^,¹⁶ The compliance (attending screening) and the outcome measures (HPV tests) were reliable.¹⁵^,¹⁶ Participants of different groups were recruited from the same populations within the same periods of time.¹⁵^,¹⁶ Participants were randomized to different groups.¹⁵^,¹⁶ Allocation to interventions was not concealed.¹⁵^,¹⁶ The lengths of follow-up were adequate.¹⁵^,¹⁶ Confounders were adjusted in the analysis.¹⁵^,¹⁶

Polman et al. considered important adverse effects and described patients lost to follow-up.¹⁵ It was not reported whether there was differential attrition between groups.¹⁵

Participants lost to follow-up were not considered in the analysis.¹⁵ Patients lost to follow-up were not reported in the RCT by Ajenifuja et al.¹⁶

The population asked to participate in the RCT by Polman et al. was representative of the population of interest.¹⁵ The representativeness of the sample in Ajenifuja et al. was not assessed.¹⁶

Polman et al. reported the 95% confidence intervals.¹⁵ Ajenifuja et al. did not report the actual probability values (P values).¹⁶

The sample sizes were not estimated (power analysis) before the studies began.¹⁵^,¹⁶

In addition, the RCT by Polman et al. was assessed with the QUADAS-2 tool.¹² Consecutive samples were enrolled and a case-control design was not adopted.¹⁵ Inappropriate exclusion criteria that were associated with disease prevalence or other confounding factors were not used.¹⁵ The HPV tests were interpreted without the knowledge of colposcopy results.¹⁵ The diagnosis thresholds for the HPV tests were provided.¹⁵ The reference standard, colposcopy, was likely to correctly classify the target condition.¹⁵ There were appropriate intervals between index tests and the reference standard.¹⁵ All patients were eligible for the same reference standard, colposcopy, if positive on the HPV screening test.¹⁵ Participants with negative results in cytology or HPV tests were referred to routine screening.¹⁵ The cases identified in routine screening could be used to calculate diagnostic test accuracy.¹⁵ However, the reference standard results were not interpreted without the results of the index and comparator tests.¹⁵ Participants lost to follow-up were not included in the analysis.¹⁵

Non-randomized studies

In the non-randomized studies, the hypotheses, main outcomes, patient characteristics, interventions, distributions of principal confounders, main findings, and the random variability for the main outcomes were described.⁵^,⁸^,⁹^,¹⁷^–²³ The participants and outcomes assessors were not blinded.⁵^,⁸^,⁹^,¹⁷^–²³ The lengths of follow-up were similar between groups in each study.⁵^,⁸^,⁹^,¹⁷^–²³ The statistical tests to assess the outcomes were appropriate.⁵^,⁸^,⁹^,¹⁷^–²³ The compliance (attending screening) and the outcome measures (HPV test results) were reliable.⁵^,⁸^,⁹^,¹⁷^–²³ Participants of different groups were recruited from the same populations within the same periods of time.⁵^,⁸^,⁹^,¹⁷^–²³ Participants were not randomized to different groups.⁵^,⁸^,⁹^,¹⁷^–²³ Allocation to interventions was not concealed.⁵^,⁸^,⁹^,¹⁷^–²³ The lengths of follow-up were adequate.⁵^,⁸^,⁹^,¹⁷^–²³

The actual probability values (P values) were reported,⁵^,⁸^,⁹^,¹⁷^–²² except for the non-randomized study by Phoolcharoen et al. that did not report probability values.²³

Cremer et al., Des Marais et al., Obiri-Yeboah et al., Thay et al., Toliman et al., Wong et al., and Zhang et al. considered confounding in the analysis.⁸^,⁹^,¹⁷^,¹⁸^,²⁰^–²²

Obiri-Yeboah et al., Phoolcharoen et al., Thay et al., and Toliman et al. did not report patients lost to follow-up.¹⁷^,¹⁸^,²¹^,²³ Cremer et al., Des Marais et al., Lam et al., Senkomago et al., Wong et al., and Zhang et al. described patients lost to follow-up, but did not determine whether losses differed significantly between groups.⁵^,⁸^,⁹^,¹⁸^–²⁰

The population asked to participate in the non-randomized study by Lam et al. was representative of the population of interest.⁵ However, the representativeness of the participants in other studies was not assessed.⁵^,⁸^,⁹^,¹⁷^–²³

Wong et al. reported important adverse effects,⁸ while others did not.⁵^,⁹^,¹⁷^–²³

In addition, the non-randomized studies by Lam et al. and Zhang et al. were assessed with the QUADAS-2 checklist.¹² Lam et al. recruited consecutive samples⁵ and Zhang et al. enrolled random samples.⁹ Both avoided a case-control design.⁵^,⁹ Inappropriate exclusion criteria that were associated with disease prevalence and other factors were not used.⁵^,⁹ The index or comparator tests were not interpreted with the knowledge of the results of the reference standard (colposcopy).⁵^,⁹ The thresholds for the index and comparator tests were provided.⁵^,⁹ The reference standard, colposcopy, was likely to correctly classify the target condition.⁵^,⁹ There were appropriate intervals between the index or comparator tests and the reference standard.⁵^,⁹ All patients were eligible for the same reference standard if test positive.⁵^,⁹ Those with negative test results were followed up in routine screening⁵ or the subsequent rounds of screening.⁹ However, in both studies, the reference standard results were not interpreted without the knowledge of the results of the index or comparator tests.⁵^,⁹ Participants lost to follow-up were not included in the analysis.⁵^,⁹

Additional details regarding the strengths and limitations of included publications are provided in Appendix 3.

Summary of Findings

Diagnostic test accuracy of self-sampled high-risk HPV tests

Systematic Reviews

In the SR by Arbyn et al., the pooled sensitivity and specificity of high-risk HPV tests based on PCR (GP5+/6+ PCR-EIA, Abbott RT PCR hrHPV, Anyplex II HR, cobas 4800 HPV test, GP5+/6+-LMNX, Linear Array, HPV Risk assay, Xpert HPV) for the detection of CIN2+ using self-samples were not significantly different from those using clinician samples (96% and 79% respectively for both self and clinician samples).³ There was a significant difference in the sensitivity and specificity of high-risk HPV tests based on signal amplification (Hybrid Capture and Cervista).³ Self-sampled HPV tests based on signal amplification were significantly less sensitive (77% versus 93%) and significantly less specific (84% versus 86%) than clinician-sampled HPV tests.³

Kelly et al. reported the sensitivity and specificity of self- and clinician-sampled HPV tests, but did not determine the statistical significance of the difference in the diagnostic test accuracy between these two sampling methods.¹⁴ The pooled sensitivity and specificity of careHPV tests using clinician-collected samples were 88.1% and 83.7% respectively for the detection of CIN2+ and 90.3% and 85.3% respectively for the detection of CIN3+.¹⁴ The pooled sensitivity and specificity of careHPV tests using self-collected samples were 73.6% and 88.0% respectively for the detection of CIN2+ and 75.2% and 90.6% respectively for the detection of CIN3+.¹⁴ Kelly et al. concluded the diagnostic test accuracy was good using careHPV and considered the sensitivity using self-collected samples lower.¹⁴
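To give a sense of scale for the pooled estimates above, the sketch below converts each sensitivity and specificity pair into expected counts in a hypothetical screening cohort. The cohort size and the 2% CIN2+ prevalence are assumptions made purely for illustration and are not reported by either SR; the function name `expected_counts` is likewise hypothetical. Because pooled estimates combine studies from different settings, the output is a rough illustration rather than a prediction for any real program.

```python
N = 10_000          # hypothetical cohort size (assumption)
PREVALENCE = 0.02   # hypothetical CIN2+ prevalence (assumption)

# Pooled (sensitivity, specificity) for CIN2+ as quoted in the text.
estimates = {
    "Arbyn, signal amplification, self": (0.77, 0.84),
    "Arbyn, signal amplification, clinician": (0.93, 0.86),
    "Kelly, careHPV, self": (0.736, 0.880),
    "Kelly, careHPV, clinician": (0.881, 0.837),
}


def expected_counts(sensitivity, specificity, prevalence=PREVALENCE, n=N):
    """Expected detected cases, missed cases, and false positives."""
    cases = n * prevalence
    non_cases = n - cases
    return {
        "detected": cases * sensitivity,
        "missed": cases * (1 - sensitivity),
        "false_positives": non_cases * (1 - specificity),
    }


results = {name: expected_counts(se, sp) for name, (se, sp) in estimates.items()}

for name, counts in results.items():
    print(
        f"{name}: detected {counts['detected']:.0f}, "
        f"missed {counts['missed']:.0f}, "
        f"false positives {counts['false_positives']:.0f}"
    )
```

Under these assumptions, the lower sensitivity of signal amplification-based tests on self-collected samples translates into more missed cases per 10,000 screened, while the small differences in specificity shift the number of false positives in either direction depending on the test.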

RCTs

In the RCT by Polman et al., the sensitivity and specificity of self-sampled HPV tests were 92.9% and 93.9% respectively for the detection of CIN2+ and 95.1% and 93.4% for the detection of CIN3+ (GP5+/6+ PCR enzyme immunoassay).¹⁵ The sensitivity and specificity of clinician-sampled HPV tests were 96.4% and 94.2% respectively for the detection of CIN2+ and 95.8% and 93.5% for the detection of CIN3+.¹⁵ There was no significant difference in the sensitivity or specificity between self- and clinician-sampled HPV tests.¹⁵

Non-randomized studies

Lam et al. reported that the positive predictive value was higher among the CSi attenders (self-sampled HPV tests, 36.5%) than among those in the Horizon study (cotesting with cytology and HPV tests, 25.6%).⁵ The HPV tests included Hybrid Capture 2, CLART, and Aptima.⁵ Lam et al. concluded that self-sampling was associated with higher detection rates than cytology and cotesting (HPV tests and cytology) in a group of screening non-attendees.⁵

Zhang et al. reported the baseline, 6-, 11- and 15-year cumulative sensitivity and specificity for the detection of CIN2+.⁹ Zhang et al. concluded that single self-sampled HPV tests were less sensitive than clinician-sampled HPV tests.⁹ However, both tests were equal in colposcopy referral rates and the detection rates of CIN2+ on cumulative cases.⁹

Agreement of self-sampled high-risk HPV tests and clinician-sampled high-risk HPV tests or cytology

RCTs

Ajenifuja et al. reported statistically significant moderate agreement, with a κ value of 0.47 (95% CI, 0.213 to 0.723), between self- and clinician-sampled HPV tests (Hybribio GenoArray).¹⁶

Non-randomized studies

Thay et al. defined HPV infections as those detected by either self- or clinician-collected samples.¹⁷ Self-sampled HPV tests identified 89% of HPV infections (50 out of 56) and clinician-sampled HPV tests identified 80% (45 out of 56) (careHPV).¹⁷ Thirty-nine HPV infections were identified by both self- and clinician-sampled HPV tests.¹⁷ There was no significant difference in the detection rates between self- and clinician-collected samples.¹⁷ Toliman et al. reported that the agreement in high-risk HPV detection between self- and clinician-collected samples was substantial (κ > 0.6 in 32 pair-wise comparisons of HPV tests: Xpert, Cobas, and Aptima).²² Des Marais et al. reported moderate to good agreement between self- and clinician-sampled HPV tests (Aptima, κ = 0.56 to 0.66).¹⁸ Phoolcharoen et al. reported moderate agreement between HPV tests on self- and clinician-collected samples (74.5% agreement, κ = 0.46; Cobas).²³ Senkomago et al. reported moderate agreement between self- and clinician-sampled HPV tests that increased over time (Aptima, κ = 0.55 and 0.83 at baseline and 24 months respectively).¹⁹ Senkomago et al. suggested that participants' growing proficiency in obtaining samples might explain the increased agreement over time.¹⁹ Wong et al. identified substantial agreement between self- and clinician-sampled HPV tests (unspecified HPV test, κ = 0.69).⁸ Cremer et al. reported agreement with a κ value of 0.70 (Hybrid Capture 2).²⁰ Obiri-Yeboah et al. identified excellent agreement (careHPV, κ = 0.88).²¹

Phoolcharoen et al. reported the lowest agreement, κ = 0.46,²³ and Obiri-Yeboah et al. reported the highest agreement, κ = 0.88.²¹ There were several differences in health care settings and HPV tests. Phoolcharoen et al. recruited participants from a colposcopy clinic in Thailand.²³ Obiri-Yeboah et al. recruited individuals from a teaching hospital in Ghana.²¹ The HPV tests used were Cobas and careHPV.²¹^,²³ Without reference standards in either study or other tests to understand the diagnostic test accuracy of self- and clinician-sampled HPV tests,²¹^,²³ the exact reasons for the difference in the agreement were unclear.
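The κ values reported in this section are chance-corrected agreement statistics for paired self- and clinician-collected results. The sketch below shows how Cohen's κ is computed from a paired two-by-two table; the function name and both sets of counts are hypothetical and are not taken from any included study.

```python
def cohens_kappa(both_pos, self_only, clinician_only, both_neg):
    """Return (observed agreement, Cohen's kappa) for paired binary results."""
    n = both_pos + self_only + clinician_only + both_neg
    observed = (both_pos + both_neg) / n

    # Marginal proportions of HPV-positive results for each sampling method.
    self_pos = (both_pos + self_only) / n
    clin_pos = (both_pos + clinician_only) / n

    # Agreement expected by chance if the two results were independent.
    expected = self_pos * clin_pos + (1 - self_pos) * (1 - clin_pos)
    return observed, (observed - expected) / (1 - expected)


# Two hypothetical samples with the same raw agreement (90%) but
# different proportions of HPV-positive results.
higher_prevalence = cohens_kappa(both_pos=40, self_only=10, clinician_only=10, both_neg=140)
lower_prevalence = cohens_kappa(both_pos=5, self_only=10, clinician_only=10, both_neg=175)

print(f"higher prevalence: agreement {higher_prevalence[0]:.2f}, kappa {higher_prevalence[1]:.2f}")
print(f"lower prevalence:  agreement {lower_prevalence[0]:.2f}, kappa {lower_prevalence[1]:.2f}")
```

In this illustration the same percentage agreement yields quite different κ values because chance agreement rises when most results are negative. This is one reason κ values from populations with different HPV prevalence may not be directly comparable.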

Appendix 4 presents a table of the main study findings and authors’ conclusions.

Limitations

There were different degrees of heterogeneity in the primary studies included in the SRs.³^,¹⁴ The differences in health care settings, HPV tests, screening strategies, and participants’ technical proficiency might influence the comparability of the study results of the primary studies and the generalizability to populations that were not considered in the SRs. The population characteristics also varied between the primary studies.⁵^,⁸^,⁹^,¹⁷^,²¹^,²³ It was unclear whether the differences in HPV prevalence contributed to the variations in the agreement between self- and clinician-sampled HPV tests. The diagnostic test accuracy was not uniform for all HPV tests and the sample sizes were insufficient to determine the differences between devices. The types of HPV tests were not described in several studies and the impact on the results was not clear.⁸^,²⁰

Conclusions and Implications for Decision or Policy Making

Two SRs,³^,¹⁴ two RCTs,¹⁵^,¹⁶ and ten non-randomized studies⁵^,⁸^,⁹^,¹⁷^–²³ were included.

Diagnostic test accuracy of self-sampled high-risk HPV tests

Evidence on the diagnostic test accuracy of self- and clinician-sampled HPV tests for the detection of CIN2+ was available in one low-quality SR,³ one critically low-quality SR,¹⁴ one fair-quality RCT,¹⁵ and two fair-quality non-randomized studies.⁵^,⁹

In the updated meta-analysis by Arbyn et al., self-sampled HPV tests based on PCR for the detection of CIN2+ did not have statistically different sensitivity or specificity compared with clinician-sampled tests.³ However, self-sampled HPV tests based on signal amplification were statistically significantly less sensitive and specific for the detection of CIN2+.³ Kelly et al. meta-analyzed the diagnostic test accuracy of careHPV (signal amplification-based)³ and also concluded that careHPV tests using self-collected samples were less sensitive than the same test using clinician-collected samples for the detection of CIN2+ or CIN3+.¹⁴ In the RCT by Polman et al., self- and clinician-sampled PCR-based HPV tests were similarly accurate for the detection of CIN2+ or CIN3+.¹⁵

Agreement of self- and clinician-sampled high-risk HPV tests

The agreement of self- and clinician-sampled HPV tests was available in one fair-quality RCT,¹⁶ seven fair-quality non-randomized studies,⁸^,¹⁷^–²² and one poor-quality non-randomized study.²³

In the RCT, moderate agreement between self- and clinician-sampled HPV tests was reported using Hybribio GenoArray.¹⁶ In non-randomized studies, moderate to excellent agreement between self- and clinician-sampled HPV tests was found.⁸^,¹⁸^,²⁰^–²² Similarly, Thay et al. used careHPV and did not find significant differences in the HPV detection rates between self- and clinician-collected samples.¹⁷

Unlike the earlier meta-analysis⁷ reviewed in the previous CADTH report,⁶ Arbyn et al. categorized the HPV tests into PCR- and signal amplification-based tests and did not focus on the types of storage medium and sampling devices, such as brushes or swabs.⁷ The difference in diagnostic test accuracy between self- and clinician-collected samples remained significant for signal amplification-based HPV tests.⁶^,⁷ Since the publication of the previous CADTH report,⁶ new studies have continued to show moderate to excellent agreement between self- and clinician-sampled HPV tests.⁸^,¹⁷^,¹⁸^,²¹^–²³

However, it was unclear whether the differences in the agreement were associated with the types of HPV tests being studied. There was heterogeneity between the included studies and the impact on the diagnostic accuracy or agreement of self- and clinician-sampled HPV tests was unclear.

Based on available evidence, self-sampled HPV tests could provide similar accuracy to clinician-sampled tests, particularly for PCR-based HPV tests. Moderate to excellent agreement between self- and clinician-sampled HPV tests was observed in primary studies conducted in various healthcare settings. Further research on the diagnostic test accuracy and agreement between self- and clinician-sampled HPV tests in target populations could reduce the uncertainties in the application of self-sampled HPV tests.

References

1.: Canadian cancer statistics 2017 special topic: pancreatic cancer. Toronto (ON): Canadian Cancer Society; 2017: http://www.cancer.ca/~/media/cancer.ca/CW/publications/Canadian%20Cancer%20Statistics/Canadian-Cancer-Statistics-2017-EN.pdf. Accessed 2019 May 29.
2.: HPV Testing for Primary Cervical Cancer Screening: A Health Technology Assessment. (CADTH optimal use report vol.7, no.1b). Ottawa (ON): Canadian Agency for Drugs and Technologies in Health; 2019: https://www.cadth.ca/sites/default/files/ou-tr/op0530-hpv-testing-for-pcc-report.pdf. Accessed 2019 May 29. [PubMed: 31246380]
3.: Arbyn M, Smith SB, Temin S, et al. Detecting cervical precancer and reaching underscreened women by using HPV testing on self samples: updated meta-analyses. BMJ. 2018;363:k4823. [PMC free article: PMC6278587] [PubMed: 30518635]
4.: Melnikow J, Henderson JT, Burda BU, Senger CA, Durbin S, Weyrich MS. Screening for cervical cancer with high-risk human papillomavirus testing: updated evidence report and systematic review for the US Preventive Services Task Force. JAMA. 2018;320(7):687–705. [PubMed: 30140883]
5.: Lam JUH, Elfstrom KM, Ejegod DM, et al. High-grade cervical intraepithelial neoplasia in human papillomavirus self-sampling of screening non-attenders. Br J Cancer. 2018;118(1):138–144. [PMC free article: PMC5765223] [PubMed: 29136403]
6.: Chao Y-S, Clark M, Ford C. HPV self-sampling for primary cervical cancer screening: a review of diagnostic test accuracy and clinical evidence. (CADTH Rapid response report: summary with critical appraisal). Ottawa (ON): Canadian Agency for Drugs and Technologies in Health; 2018: https://www.cadth.ca/hpv-self-sampling-primary-cervical-cancer-screening-review-diagnostic-test-accuracy-and-clinical. Accessed 2019 May 29. [PubMed: 30329248]
7.: Arbyn M, Verdoodt F, Snijders PJF, et al. Accuracy of human papillomavirus testing on self-collected versus clinician-collected samples: a meta-analysis. The Lancet Oncology. 2014;15(2):172–183. [PubMed: 24433684]
8.: Wong ELY, Cheung AWL, Huang F, Chor JSY. Can Human Papillomavirus DNA Self-sampling be an Acceptable and Reliable Option for Cervical Cancer Screening in Female Sex Workers? Cancer Nurs. 2018;41(1):45–52. [PubMed: 28114260]
9.: Zhang L, Xu XQ, Hu SY, et al. Durability of clinical performance afforded by self-collected HPV testing: A 15-year cohort study in China. Gynecol Oncol. 2018;151(2):221–228. [PubMed: 30269870]
10.: Shea BJ, Reeves BC, Wells G, et al. AMSTAR 2: a critical appraisal tool for systematic reviews that include randomised or non-randomised studies of healthcare interventions, or both. BMJ. 2017;358:j4008. [PMC free article: PMC5833365] [PubMed: 28935701]
11.: Downs SH, Black N. The feasibility of creating a checklist for the assessment of the methodological quality both of randomised and non-randomised studies of health care interventions. J Epidemiol Community Health. 1998;52(6):377–384. [PMC free article: PMC1756728] [PubMed: 9764259]
12.: Whiting PF, Rutjes AW, Westwood ME, et al. QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med. 2011;155(8):529–536. [PubMed: 22007046]
13.: Liberati A, Altman DG, Tetzlaff J, et al. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: explanation and elaboration. J Clin Epidemiol. 2009;62(10):e1–e34. [PubMed: 19631507]
14.: Kelly H, Mayaud P, Segondy M, Pant Pai N, Peeling RW. A systematic review and meta-analysis of studies evaluating the performance of point-of-care tests for human papillomavirus screening. Sex Transm Infect. 2017;93(S4):S36–S45. [PubMed: 29223961]
15.: Polman NJ, Ebisch RMF, Heideman DAM, et al. Performance of human papillomavirus testing on self-collected versus clinician-collected samples for the detection of cervical intraepithelial neoplasia of grade 2 or worse: a randomised, paired screen-positive, non-inferiority trial. Lancet Oncol. 2019;20(2):229–238. [PubMed: 30658933]
16.: Ajenifuja OK, Ikeri NZ, Adeteye OV, Banjo AA. Comparison between self sampling and provider collected samples for Human Papillomavirus (HPV) Deoxyribonucleic acid (DNA) testing in a Nigerian facility. Pan Afr Med J. 2018;30:110. [PMC free article: PMC6195243] [PubMed: 30364362]
17.: Thay S, Goldstein A, Goldstein LS, Govind V, Lim K, Seang C. Prospective cohort study examining cervical cancer screening methods in HIV-positive and HIV-negative Cambodian Women: a comparison of human papilloma virus testing, visualization with acetic acid and digital colposcopy. BMJ Open. 2019;9(2):e026887. [PMC free article: PMC6443060] [PubMed: 30804036]
18.: Des Marais AC, Zhao Y, Hobbs MM, et al. Home Self-Collection by Mail to Test for Human Papillomavirus and Sexually Transmitted Infections. Obstet Gynecol. 2018;132(6):1412–1420. [PMC free article: PMC6249061] [PubMed: 30399091]
19.: Senkomago V, Ting J, Kwatampora J, et al. High-risk HPV-RNA screening of physician- and self-collected specimens for detection of cervical lesions among female sex workers in Nairobi, Kenya. Int J Gynaecol Obstet. 2018;143(2):217–224. [PubMed: 30047987]
20.: Cremer M, Maza M, Alfaro K, et al. Scale-Up of an Human Papillomavirus Testing Implementation Program in El Salvador. J Low Genit Tract Dis. 2017;21(1):26–32. [PMC free article: PMC5201413] [PubMed: 27922905]
21.: Obiri-Yeboah D, Adu-Sarkodie Y, Djigma F, et al. Self-collected vaginal sampling for the detection of genital human papillomavirus (HPV) using careHPV among Ghanaian women. BMC Womens Health. 2017;17(1):86. [PMC free article: PMC5615631] [PubMed: 28950841]
22.: Toliman PJ, Kaldor JM, Badman SG, et al. Evaluation of self-collected vaginal specimens for the detection of high-risk human papillomavirus infection and the prediction of high-grade cervical intraepithelial lesions in a high-burden, low-resource setting. Clin Microbiol Infect. 2019;25(4):496–503. [PubMed: 29906593]
23.: Phoolcharoen N, Kantathavorn N, Krisorakun W, et al. Agreement of self- and physician-collected samples for detection of high-risk human papillomavirus infections in women attending a colposcopy clinic in Thailand. BMC Res Notes. 2018;11(1):136. [PMC free article: PMC5819229] [PubMed: 29458440]

Appendix 1. Selection of Included Studies

Appendix 2. Characteristics of Included Publications

Table 2. Characteristics of Included Systematic Reviews and Meta-Analyses

First Author, Publication Year, Country

Study Designs and Numbers of Primary Studies Included

Population Characteristics

Intervention and Comparator(s)

Clinical Outcomes, Length of Follow-Up

Arbyn et al. 2018,³ Belgium

Diagnostic test accuracy; RCTs for response outcomes

56 accuracy studies and 25 participation trials

Databases searched: Medline (PubMed), Embase, and CENTRAL

Study selection criteria -- Diagnostic studies: “a vaginal sample was collected by a woman herself (self sample) followed by a cervical sample collected by a clinician (clinician sample); the same hrHPV assay was performed on both samples; and the presence or absence of CIN2+ was verified by colposcopy and biopsy in all enrolled women, or in women with one or more positive tests. Studies with cytological follow-up for women with negative colposcopy results at baseline assessment were accepted as well, but were indexed for sensitivity analyses” (p. 2)

Individuals participating in diagnostic test accuracy studies or RCTs

Self-sampling arm (intervention arm): invited to collect a self sample for hrHPV testing

versus

Control arm: invited or reminded to undergo a screening test on a clinician sample

HPV tests identified: signal amplification (including Hybrid Capture and Cervista) and polymerase chain reaction (PCR, including GP5+/6+ PCR-EIA, Abbott RT PCR hrHPV, Anyplex II HR, cobas 4800 HPV test, GP5+/6+-LMNX, Linear Array, HPV Risk assay, Xpert HPV)

Diagnostic test accuracy

Response rates to screening; test positivity rates, adherence to follow-up in women who were screened, and detection of CIN2+.

Follow-up lengths not reported

Kelly et al. 2017,¹⁴ UK

Study selection criteria: Cross-sectional or cohort studies evaluating HPV-POC tests against histological endpoint of CIN2+ or CIN3+

Databases searched: Medline, Embase, Global Health and CINAHL

English language only, human subjects only, 1 January 2004 to 25 February 2017

29,657 women in 7 studies using careHPV for the detection of CIN2+

27,845 women in 4 studies using careHPV for the detection of CIN3+

Population inclusion criteria: Any sexually active populations consistent with the WHO screening guidelines in any geographical location, female only, include HIV-positive patients

Self-collected samples

versus

Physician-collected samples

HPV tests

careHPV or OncoE6

Diagnostic test accuracy including sensitivity, specificity, Positive Predictive Value (PPV) and Negative Predictive Value (NPV; including 95% CIs).

Follow-up lengths not reported

CENTRAL = Cochrane Central Register of Controlled Trials; CIN = cervical intraepithelial neoplasia; EIA = enzyme immunoassay; HPV = human papillomavirus; hrHPV = high-risk human papillomavirus; PCR = polymerase chain reaction; RCT = randomized controlled trial; WHO = World Health Organization
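
The accuracy outcomes listed in Table 2 (sensitivity, specificity, PPV and NPV) all derive from a 2×2 cross-classification of the HPV test result against the histological endpoint (for example, CIN2+). A minimal Python sketch follows; the class and function names and the counts are hypothetical illustrations, not data from the included reviews, and confidence intervals are omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts of HPV test results against the reference standard (e.g., CIN2+)."""
    tp: int  # test positive, disease present
    fp: int  # test positive, disease absent
    fn: int  # test negative, disease present
    tn: int  # test negative, disease absent


def _ratio(numerator, denominator):
    # Return NaN rather than raising when a margin of the table is empty.
    return numerator / denominator if denominator else float("nan")


def accuracy_measures(t):
    """Sensitivity, specificity, PPV and NPV from a 2x2 table."""
    return {
        "sensitivity": _ratio(t.tp, t.tp + t.fn),
        "specificity": _ratio(t.tn, t.tn + t.fp),
        "ppv": _ratio(t.tp, t.tp + t.fp),
        "npv": _ratio(t.tn, t.tn + t.fn),
    }


# Hypothetical screening population of 1,000 women (not study data).
example = TwoByTwo(tp=80, fp=90, fn=20, tn=810)
for name, value in accuracy_measures(example).items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are properties of the test in the studied population, whereas PPV and NPV also depend on disease prevalence, so predictive values from one setting may not transfer to another.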

Table 3. Characteristics of Included Primary Clinical Studies

First Author, Publication Year, Country

Study Design

Population Characteristics

Intervention and Comparator(s)

Clinical Outcomes, Length of Follow-Up

Randomized controlled trials

Polman et al. 2019,¹⁵ the Netherlands

RCT, non-inferiority, part of a regular screening program, IMPROVE (full terms not specified) study

Dutch Trial register (NTR5078)

13,925 women analyzed

7,643 women were included in the self-sampling group and 6,282 in the clinician-based sampling group

Aged 29 to 61 years

Self-sampling group: women requested to collect their own cervicovaginal sample using an Evalyn Brush (Rovers Medical Devices BV, Oss, Netherlands)

versus

Clinician-based sampling group: samples collected by a general practitioner with a Cervex-Brush (Rovers Medical Devices BV)

HPV tests: validated GP5+/6+ PCR enzyme immunoassay (Labo Biomedical Products BV, Rijswijk, Netherlands).

Primary endpoints: detection of cervical intraepithelial neoplasia (CIN) of grade 2 or worse (CIN2+) and grade 3 or worse (CIN3+)

Diagnostic test accuracy (sensitivity and specificity)

Non-inferiority of HPV testing on self-collected versus clinician-collected samples: evaluated against a margin of 90% for the relative sensitivity and 98% for the relative specificity

Median follow-up duration for HPV-positive women: 20 months

Ajenifuja et al. 2018,¹⁶ Nigeria

RCT, two-arm, multiple visits, single centre

194 women presenting for cervical cancer screening underwent both self- and provider sampling for HPV DNA testing using Hybribio GenoArray.

Self-sampling

versus

Provider sampling

Group A: provider sampling before self sampling

Group B: self sampling before undergoing provider sampling

HPV tests: HPV DNA testing using Hybribio GenoArray

Degree of agreement between self and provider sampling for HPV DNA tests

Follow-up time: not reported

Non-randomized studies

Thay et al. 2019,¹⁷ Cambodia

Cohort, prospective

250 Cambodian women between 30 and 49 years of age

129 HIV-positive and 121 HIV-negative

Recruited from the National Center for HIV/AIDS Dermatology and sexually transmitted disease cohort, the Sihanouk Hospital Center of Hope’s Rural Outreach Teams and the Pochentong Medical Center.

(1) self-sampled human papilloma virus (HPV) testing (careHPV system)

versus

(2) clinician-collected HPV testing

versus

(3) visualization with acetic acid

versus

(4) digital colposcopy with the Enhanced Visual Assessment System

HPV tests: careHPV system for 14 genotypes of hrHPV (16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58, 59, 66, 68)

HPV prevalence and cervical intraepithelial neoplasia (CIN) status

Agreement of self- and clinician-sampled HPV tests

HPV infection detected with self- or clinician-collected samples

Follow-up duration: not reported

Toliman et al. 2019,²² Papua New Guinea and Australia

Cross-sectional

1,005 women attending for cervical cancer screening at 2 clinics

Aged 30 to 59 years

Self-collected vaginal specimens

versus

Clinician-collected cervical specimens

HPV tests: Cepheid Xpert HPV, Roche Cobas 4800 HPV and Hologic Aptima HPV assays

Agreement in HPV detection

Sensitivity and specificity of self-sampled HPV tests for the detection of HPV infection identified with clinician-collected samples

High-grade squamous intraepithelial lesions (HSIL) detection for cytology

Follow-up not mentioned

Des Marais et al. 2018,¹⁸ US

Observational, 2nd phase of the My Body, My Test observational study

193 women overdue for cervical cancer screening by national guidelines

Low-income, infrequently screened women

Inclusion criteria: 30–64 years of age; no history of Pap testing in the past 4 years (overdue for screening by national guidelines at the start of the study); household income below 250% of the poverty level; not pregnant; not had a hysterectomy; and uninsured, underinsured, or had Medicaid insurance.

1) a cervicovaginal sample self-collected by brush at home and returned by mail (self-home sample), 2) a cervicovaginal sample self-collected by brush in a clinic and handed to a nurse (self-clinic sample),

versus

3) a cervical sample collected by brush by a clinician during a pelvic examination (clinician sample)

HPV tests: Aptima HPV assay (E6/E7 mRNA of 14 high-risk HPV genotypes: 16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58, 59, 66, and 68)

High risk HPV, Chlamydia trachomatis, Neisseria gonorrhoeae, Trichomonas vaginalis, and Mycoplasma genitalium infection

Agreement between diagnostic test accuracy

Follow-up time not reported

Lam et al. 2018,⁵ Denmark

Cohort, including part of the Copenhagen Self-sampling Initiative (CSi) implementation Study and Horizon study

4,865 non-attenders who participated in self-sampling and 4,291 women attending routine screening

Analyzed with 3,347 samples collected in the Horizon study (cytology and HPV tests with physician-collected samples)

Non-attenders: women who had not been screened for at least 4 years (if aged 27–49 years) or 6 years (if aged 50–65 years).

Self-sampling in non-attenders

versus

Routine screening in women attending routine screening (cytology or co-testing with HPV and cytology)

CSi: opt-in pilot project, self-sampling, HPV-positive women followed up by HPV and cytology co-testing, some screened by general practitioner-collected cytology

CSi non-responders: possibly symptom-based diagnosis

Horizon study: routine screening, physician-collected cytology, additionally tested for HPV

HPV tests: HC2, cobas, CLART, and APTIMA

Cervical intraepithelial neoplasia grade 2 or worse (≥CIN2) detection rate

Positive predictive values for the detection of CIN2+

HPV positivity defined by CLART (Genomica, Madrid, Spain) and Onclarity (BD, Sparks, MD, USA) assays

Follow-up: 18 months in the CSi study

Phoolcharoen et al. 2018,²³ Thailand

Cross-sectional, in a colposcopy clinic

247 pairs of samples

Inclusion criteria: attending the colposcopy clinic, aged 30–70 years, no history of cervical cancer, no hysterectomy, and currently not pregnant.

Self-sampling with a dry brush

versus

Physician-collected endocervical samples from the same individuals

HPV tests: Cobas4800 HPV test (Roche Molecular Diagnostics, Pleasanton, California, USA)

Concordance between vaginal self- and endocervical physician-collected high-risk HPV testing

Follow-up not reported

Senkomago et al. 2018,¹⁹ US

Cohort, prospective

350 female sex workers

Aged 18 to 50 years

Self-collected cervico-vaginal specimens for hrHPV RNA testing

versus

Physician collected cervical specimens for hrHPV-RNA testing and conventional cytology

hrHPV-RNA testing every 3 months

Conventional cytology every 6 months

HPV tests: Aptima (qualitatively detecting E6/E7 mRNA of 14 hrHPV types: 16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58, 59, 66, and 68)

HPV prevalence

Agreement between physician- and self-collected hrHPV-RNA results

Diagnostic test accuracy of HPV tests for the detection of high-grade squamous intraepithelial cervical lesions or worse

Follow-up between December 2, 2009, and February 15, 2013

Wong et al. 2018,⁸ China

Cohort

68 female sex workers

Inclusion criteria: aged 18 years or older, not currently pregnant, no known abnormal Papanicolaou test results and no symptoms of cervical cancer, genital cancer, cervical surgery, or immune treatment of the cervix during the 6 months before recruitment into the study

Self-sampling for HPV testing

versus

Clinician-obtained sample for HPV testing

HPV DNA testing in the university laboratory

Agreement in HPV detection rates between clinician and HPV DNA self-sampling

Agreement definition:

k < 0: poor

k = 0 to 0.20: slight

k = 0.21 to 0.40: fair

k = 0.41 to 0.60: moderate

k = 0.61 to 0.80: substantial

k = 0.81 to 1.00: perfect

Follow-up lengths not reported

Zhang et al. 2018,⁹ China

Cohort, prospective, the Shanxi Province Cervical Cancer Screening Study I (SPOCCS I)

1,997 women

Inclusion criteria: aged 35 to 45 with no history of cervical cancer or hysterectomy in 1999

HPV testing on self-collected and physician-collected samples, cytology and visual inspection with acetic acid (VIA)

HPV tests: HC2 assays (Hybrid Capture II, Qiagen Inc.)

Cumulative diagnostic test accuracy for the detection of CIN2+ at 6-year, 11-year and 15-year follow-up

Follow-up in 1999 (baseline), 2005, 2010 and 2014

Cremer et al. 2017,²⁰ US and El Salvador

Cohort, phase 2 of Cervical Cancer Prevention in El Salvador, 3 phases in total

N = 8,050 in phase 2, aged 30 to 49 years

self- and provider-collected specimens

HPV tests not reported

Agreement of diagnostic test accuracy

Follow-up: 6 months after screening

Obiri-Yeboah et al. 2017,²¹ Ghana

Cohort, comparative frequency-matched study (1:5), part of a larger HPV and cervical cancer study conducted in the Cape Coast Teaching Hospital (CCTH)

194 women attending HIV and outpatient clinics in the Cape Coast Teaching Hospital, Ghana

Mean age 44.1 years (SD ± 11.3)

191 paired results

Self-collection of vaginal samples using the careHPV brush

versus

Clinician-collected cervical sample

HPV DNA (14 high-risk types) tests: careHPV assay (Qiagen) and HPV genotyping (Anyplex II, Seegene)

HPV detection concordance

Follow-up available if cytology required

CCTH = Cape Coast Teaching Hospital; CIN = cervical intraepithelial neoplasia; CSi = Copenhagen Self-sampling Initiative; DNA = deoxyribonucleic acid; HC2 = Hybrid Capture 2; HPV = human papillomavirus; hrHPV = high-risk human papillomavirus; HSIL = high-grade squamous intraepithelial lesions; mRNA = messenger ribonucleic acid; PCR = polymerase chain reaction; RCT = randomized controlled trial; RNA = ribonucleic acid; SD = standard deviation; SPOCCS = Shanxi Province Cervical Cancer Screening Study
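
The non-inferiority criterion described for Polman et al. in Table 3 compares the relative sensitivity (self-collected divided by clinician-collected) with a margin of 90%, and the relative specificity with a margin of 98%. The sketch below shows one way such a check can be expressed in Python. It assumes two independent groups, a log-ratio (Katz) confidence interval, and a two-sided 95% level; these choices, the function names, and the counts are illustrative assumptions and may differ from the statistical method used in the trial.

```python
import math


def ratio_with_lower_bound(x_self, n_self, x_clin, n_clin, z=1.96):
    """Ratio of two independent proportions and its lower confidence bound.

    x_* = number correctly classified (e.g., detected CIN2+ cases),
    n_* = number assessed in each group. Uses the log-ratio method (assumption).
    """
    if min(x_self, x_clin) <= 0 or x_self > n_self or x_clin > n_clin:
        raise ValueError("counts must satisfy 0 < x <= n")
    ratio = (x_self / n_self) / (x_clin / n_clin)
    # Standard error of log(ratio) for two independent binomial proportions.
    se_log = math.sqrt(1 / x_self - 1 / n_self + 1 / x_clin - 1 / n_clin)
    lower = math.exp(math.log(ratio) - z * se_log)
    return ratio, lower


def is_non_inferior(x_self, n_self, x_clin, n_clin, margin, z=1.96):
    """Non-inferiority holds when the lower bound of the ratio exceeds the margin."""
    _, lower = ratio_with_lower_bound(x_self, n_self, x_clin, n_clin, z)
    return lower > margin


# Hypothetical counts (not trial data): 465/500 vs 475/500 cases detected.
ratio, lower = ratio_with_lower_bound(465, 500, 475, 500)
print(f"relative sensitivity {ratio:.3f}, lower bound {lower:.3f}")
print("non-inferior at 90% margin:", is_non_inferior(465, 500, 475, 500, margin=0.90))
```

The same function can be applied to specificity by passing the numbers of correctly classified disease-free women and a margin of 0.98. A paired design, in which each woman provides both samples, would need a different interval that accounts for the correlation between samples.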

Appendix 3. Critical Appraisal of Included Publications

Table 4. Strengths and Limitations of Systematic Reviews and Meta-Analyses using AMSTAR 2 checklist¹⁰

Strengths	Limitations
Arbyn et al., 2018³
- PICO components included in the research questions and inclusion criteria - Review protocol published a priori (update to a previous meta-analysis) - Selection of study designs explained - Comprehensive literature searches - Study selection in duplicate - Data extraction in duplicate - Included studies described - Risk of bias of the included studies appraised with published tools - Appropriate statistical methods used for synthesis - Risk of bias of the primary studies considered in meta-analysis - Risk of bias of the primary studies considered when discussing the results - Heterogeneity discussed - Publication bias investigated - Review authors’ conflict of interest reported	- Excluded studies not provided - Sources of funding for the included studies not provided
Kelly et al., 2017¹⁴
- PICO components included in the research questions and inclusion criteria - Selection of study designs explained - Comprehensive literature searches - Included studies described - Risk of bias of the included studies appraised with published tools - Appropriate statistical methods used for synthesis - Risk of bias of the primary studies considered in meta-analysis - Risk of bias of the primary studies considered when discussing the results - Heterogeneity discussed - Publication bias investigated - Review authors’ conflict of interest reported	- Excluded studies not provided - Sources of funding for the included studies not provided - Review protocol not published a priori - Study selection not in duplicate - Data extraction not in duplicate

PICO = population, intervention, comparator, and outcome

Table 5. Strengths and Limitations of Clinical Studies using the Downs and Black checklist¹¹

Strengths	Limitations
Randomized controlled trials
Polman et al., 2019¹⁵
- Hypothesis described - Outcomes to be measured described - Patient characteristics described - Intervention described - Distributions of principal confounders described - Main findings described - Estimates of random variability provided - Important adverse effects explicitly described - Patients lost to follow-up described - Individuals asked to participate representative of the population from which they were recruited - No significant changes to health care that the participants received, compared to that the majority received - Similar lengths of follow-up for participants in different groups - Appropriate statistical tests to assess the outcomes - Measures of compliance reliable - Accurate outcome measures - Participants recruited from the same period of time - Participants recruited from the same population - Participants randomized to different groups - Confounding adjusted in the analysis	- Participants not blinded - Outcome assessors not blinded - Intervention allocation not concealed - Participants lost to follow-up not taken into account - Power analysis for sample sizes not conducted
Ajenifuja et al., 2018¹⁶
- Hypothesis described - Outcomes to be measured described - Patient characteristics described - Intervention described - Distributions of principal confounders described - Main findings described - Estimates of random variability provided - No patients lost to follow-up reported - No significant changes to health care that the participants received, compared to that the majority received - Similar lengths of follow-up for participants in different groups - Appropriate statistical tests to assess the outcomes - Measures of compliance reliable - Accurate outcome measures - Participants recruited from the same period of time - Participants recruited from the same population - Participants randomized to different groups - Confounding adjusted in the analysis	- Participants not blinded - Outcome assessors not blinded - Intervention allocation not concealed - Power analysis for sample sizes not conducted - Important adverse effects not explicitly described - Actual probability values (P values) not reported
Non-randomized studies
Thay et al., 2019¹⁷
- Hypothesis described - Outcomes to be measured described - Patient characteristics described - Intervention described - Distributions of principal confounders described - Main findings described - No patients lost to follow-up reported - No significant changes to health care that the participants received, compared to that the majority received - Similar lengths of follow-up for participants in different groups - Appropriate statistical tests to assess the outcomes - Measures of compliance reliable - Accurate outcome measures - Participants recruited from the same period of time - Participants recruited from the same population - Confounding adjusted in the analysis - Actual probability values (P values) reported	- Participants not blinded - Outcome assessors not blinded - Intervention allocation not concealed - Power analysis for sample sizes not conducted - Important adverse effects not explicitly described - Participants not randomized to different groups
Toliman et al., 2019²²
- Hypothesis described - Outcomes to be measured described - Patient characteristics described - Intervention described - Distributions of principal confounders described - Main findings described - Estimates of random variability provided - No patients lost to follow-up reported - No significant changes to health care that the participants received, compared to that the majority received - Similar lengths of follow-up for participants in different groups - Appropriate statistical tests to assess the outcomes - Measures of compliance reliable - Accurate outcome measures - Participants recruited from the same period of time - Participants recruited from the same population - Confounding adjusted in the analysis - Actual probability values (P values) reported	- Participants not blinded - Outcome assessors not blinded - Intervention allocation not concealed - Power analysis for sample sizes not conducted - Important adverse effects not explicitly described - Participants not randomized to different groups
Des Marais et al., 2018¹⁸
- Hypothesis described - Outcomes to be measured described - Patient characteristics described - Intervention described - Distributions of principal confounders described - Main findings described - Estimates of random variability provided - Patients lost to follow-up described - No significant changes to health care that the participants received, compared to that the majority received - Similar lengths of follow-up for participants in different groups - Appropriate statistical tests to assess the outcomes - Measures of compliance reliable - Accurate outcome measures - Participants recruited from the same period of time - Participants recruited from the same population - Confounding adjusted in the analysis - Actual probability values (P values) reported	- Participants not blinded - Outcome assessors not blinded - Intervention allocation not concealed - Power analysis for sample sizes not conducted - Important adverse effects not explicitly described - Participants not randomized to different groups - Participants lost to follow-up not taken into account
Lam et al., 2018⁵
- Hypothesis described - Outcomes to be measured described - Patient characteristics described - Intervention described - Distributions of principal confounders described - Main findings described - Estimates of random variability provided - Patients lost to follow-up reported - Participants asked to participate representative of the population which they were recruited from - No significant changes to health care that the participants received, compared to that the majority received - Similar lengths of follow-up for participants in different groups - Appropriate statistical tests to assess the outcomes - Measures of compliance reliable - Accurate outcome measures - Participants recruited from the same period of time - Participants recruited from the same population - Actual probability values (P values) reported	- Participants not blinded - Outcome assessors not blinded - Intervention allocation not concealed - Power analysis for sample sizes not conducted - Important adverse effects not explicitly described - Participants not randomized to different groups - Confounding not adjusted in the analysis - Participants lost to follow-up not taken into account
Phoolcharoen et al., 2018²³
- Hypothesis described - Outcomes to be measured described - Patient characteristics described - Intervention described - Distributions of principal confounders described - Main findings described - Estimates of random variability provided - No patients lost to follow-up reported - No significant changes to health care that the participants received, compared to that the majority received - Similar lengths of follow-up for participants in different groups - Appropriate statistical tests to assess the outcomes - Measures of compliance reliable - Accurate outcome measures - Participants recruited from the same period of time - Participants recruited from the same population	- Participants not blinded - Outcome assessors not blinded - Intervention allocation not concealed - Power analysis for sample sizes not conducted - Important adverse effects not explicitly described - Participants not randomized to different groups - Confounding not adjusted in the analysis
Senkomago et al., 2018¹⁹
- Hypothesis described - Outcomes to be measured described - Patient characteristics described - Intervention described - Distributions of principal confounders described - Main findings described - Estimates of random variability provided - Patients lost to follow-up reported - No significant changes to health care that the participants received, compared to that the majority received - Similar lengths of follow-up for participants in different groups - Appropriate statistical tests to assess the outcomes - Measures of compliance reliable - Accurate outcome measures - Participants recruited from the same period of time - Participants recruited from the same population - Actual probability values (P values) reported - Participants lost to follow-up taken into account in the analysis	- Participants not blinded - Outcome assessors not blinded - Intervention allocation not concealed - Power analysis for sample sizes not conducted - Important adverse effects not explicitly described - Participants not randomized to different groups - Confounding not adjusted in the analysis
Wong et al., 2018⁸
- Hypothesis described - Outcomes to be measured described - Patient characteristics described - Intervention described - Distributions of principal confounders described - Main findings described - Estimates of random variability provided - Patients lost to follow-up reported - No significant changes to health care that the participants received, compared to that the majority received - Similar lengths of follow-up for participants in different groups - Appropriate statistical tests to assess the outcomes - Measures of compliance reliable - Accurate outcome measures - Participants recruited from the same period of time - Participants recruited from the same population - Confounding adjusted in the analysis - Actual probability values (P values) reported - Important adverse effects explicitly described	- Participants not blinded - Outcome assessors not blinded - Intervention allocation not concealed - Power analysis for sample sizes not conducted - Participants not randomized to different groups - Participants lost to follow-up not taken into account in the analysis
Zhang et al., 2018⁹
- Hypothesis described - Outcomes to be measured described - Patient characteristics described - Intervention described - Distributions of principal confounders described - Main findings described - Estimates of random variability provided - Patients lost to follow-up reported - No significant changes to health care that the participants received, compared to that the majority received - Similar lengths of follow-up for participants in different groups - Appropriate statistical tests to assess the outcomes - Measures of compliance reliable - Accurate outcome measures - Participants recruited from the same period of time - Participants recruited from the same population - Confounding adjusted in the analysis - Actual probability values (P values) reported	- Participants not blinded - Outcome assessors not blinded - Intervention allocation not concealed - Power analysis for sample sizes not conducted - Important adverse effects not explicitly described - Participants not randomized to different groups - Participants lost to follow-up not taken into account in the analysis
Cremer et al., 2017²⁰
- Hypothesis described - Outcomes to be measured described - Patient characteristics described - Intervention described - Distributions of principal confounders described - Main findings described - Estimates of random variability provided - Patients lost to follow-up reported - No significant changes to health care that the participants received, compared to that the majority received - Similar lengths of follow-up for participants in different groups - Appropriate statistical tests to assess the outcomes - Measures of compliance reliable - Accurate outcome measures - Participants recruited from the same period of time - Participants recruited from the same population - Confounding adjusted in the analysis - Actual probability values (P values) reported	- Participants not blinded - Outcome assessors not blinded - Intervention allocation not concealed - Power analysis for sample sizes not conducted - Important adverse effects not explicitly described - Participants not randomized to different groups - Participants lost to follow-up not taken into account in the analysis
Obiri-Yeboah et al., 2017²¹
- Hypothesis described - Outcomes to be measured described - Patient characteristics described - Intervention described - Distributions of principal confounders described - Main findings described - Estimates of random variability provided - No patients lost to follow-up reported - No significant changes to health care that the participants received, compared to that the majority received - Similar lengths of follow-up for participants in different groups - Appropriate statistical tests to assess the outcomes - Measures of compliance reliable - Accurate outcome measures - Participants recruited from the same period of time - Participants recruited from the same population - Confounding adjusted in the analysis - Actual probability values (P values) reported	- Participants not blinded - Outcome assessors not blinded - Intervention allocation not concealed - Power analysis for sample sizes not conducted - Important adverse effects not explicitly described - Participants not randomized to different groups

Table 6. Strengths and Limitations of Diagnostic Test Accuracy Studies using the QUADAS-2 checklist¹¹

Strengths	Limitations
Randomized controlled trials
Polman et al., 2019¹⁵
- Consecutive or random samples enrolled - Case-control design avoided - Inappropriate exclusion criteria avoided - Selection of patients not likely to introduce bias - Index test results interpreted without the knowledge of the results of the reference standard (colposcopy) - Index test threshold provided - Comparator test results interpreted without the knowledge of the results of the reference standard (colposcopy) - Comparator test threshold provided - Reference standard likely to correctly classify the target condition - Appropriate intervals between index tests and reference standards - All patients eligible for the reference standard if test positive - The same reference standard for all patients	- Reference standard results not interpreted without the knowledge of the results of index or comparator tests - Participants lost to follow-up not included in the analysis
Non-randomized controlled trials
Lam et al., 2018⁵
- Consecutive or random samples enrolled - Case-control design avoided - Inappropriate exclusion criteria avoided - Selection of patients not likely to introduce bias - Index test results interpreted without the knowledge of the results of the reference standard (colposcopy) - Index test threshold provided - Comparator test results interpreted without the knowledge of the results of the reference standard (colposcopy) - Comparator test threshold provided - Reference standard likely to correctly classify the target condition - Appropriate intervals between index tests and reference standards - All patients eligible for the reference standard if test positive - The same reference standard for all patients	- Reference standard results not interpreted without the knowledge of the results of index or comparator tests - Participants lost to follow-up not included in the analysis
Zhang et al., 2018⁹
- Consecutive or random samples enrolled - Case-control design avoided - Inappropriate exclusion criteria avoided - Selection of patients not likely to introduce bias - Index test results interpreted without the knowledge of the results of the reference standard (colposcopy) - Index test threshold provided - Comparator test results interpreted without the knowledge of the results of the reference standard (colposcopy) - Comparator test threshold provided - Reference standard likely to correctly classify the target condition - Appropriate intervals between index tests and reference standards - All patients eligible for the reference standard if test positive - The same reference standard for all patients - Long-term (15 years) follow-up of the participants	- Reference standard results not interpreted without the knowledge of the results of index or comparator tests - Participants lost to follow-up not included in the analysis

Appendix 4. Main Study Findings and Authors’ Conclusions

Table 7. Summary of Findings of Included Systematic Reviews and Meta-Analyses

Main Study Findings	Authors’ Conclusion
Arbyn et al., 2018³
Self- versus clinician-collected samples hrHPV assays based on polymerase chain reaction for the detection of CIN2+ - Pooled sensitivity and specificity using self samples: 96% and 79% - Pooled sensitivity and specificity using clinician samples: 96% and 79% - No significant difference in sensitivity between self samples and clinician samples to detect CIN2+ or CIN3+ (pooled ratio 0.99, 95% CI, 0.97 to 1.02) - Self samples with significantly lower specificity to exclude CIN2+ (2%) than clinician samples hrHPV assays based on signal amplification for the detection of CIN2+ - Pooled sensitivity and specificity using self samples: 77% (95% CI, 69% to 82%) and 84% (95% CI, 77% to 88%) - Pooled sensitivity and specificity using clinician samples: 93% (95% CI, 89% to 96%) and 86% (95% CI, 81% to 90%) - Self samples significantly less sensitive (pooled ratio 0.85, 95% CI, 0.80 to 0.89) than clinician samples - Self samples with significantly lower specificity to exclude CIN2+ (4%) than clinician samples Response rates to screening invitation - Mailing self-sampling kits to the woman’s home address yielded higher response rates than the other two options: 1) invitation to have a sample taken by a clinician or 2) reminder letters (pooled relative participation in intention-to-treat analysis of 2.33, 95% CI, 1.86 to 2.91) - Opt-in strategies (had to request a self-sampling kit) generally not more effective than invitation letters (relative participation of 1.22, 95% CI, 0.93 to 1.61) - Direct offer of self-sampling devices to women in communities that were under-screened with high participation rates (>75%) - Substantial inter-study heterogeneity (I²>95%).	- “When used with hrHPV assays based on polymerase chain reaction, testing on self samples was similarly accurate as on clinician samples” (p. 1) - “Offering self sampling kits generally is more effective in reaching underscreened women than sending invitations” (p. 1)
Kelly et al., 2017¹⁴
Self- versus clinician-collected samples - Pooled prevalence for CIN2+ and CIN3+: 2.3% and 1.1% respectively careHPV tests using clinician-collected cervical specimen - Sensitivity for CIN2+: 88.1% (95% CI, 81.4% to 92.7%) - Specificity for CIN2+: 83.7% (95% CI, 74.9% to 89.8%) - Sensitivity for CIN3+: 90.3% (95% CI, 83.4% to 94.5%) - Specificity for CIN3+: 85.3% (95% CI, 73.1% to 92.5%) careHPV tests using self-collected vaginal swabs - Sensitivity for CIN2+: 73.6% (95% CI, 64.9% to 80.8%) - Specificity for CIN2+: 88.0% (95% CI, 79.1% to 93.5%) - Sensitivity for CIN3+: 75.2% (95% CI, 66.8% to 82.0%) - Specificity for CIN3+: 90.6% (95% CI, 83.4% to 94.9%) OncoE6 tests (n = 2) - Sensitivity for CIN2+: 31.3% to 42.4% - Specificity for CIN2+: 99.1% to 99.4% - Sensitivity for CIN3+: 53.5% - Specificity for CIN3+: 98.9%	- “CareHPV has good sensitivity and specificity for the detection of CIN2+ and CIN3+, but sensitivity was lower using self-collected vaginal samples” (p. S36) - “The specificity is lower in high HPV prevalence populations such as women living with HIV” (p. S36)

CI = confidence interval; CIN = cervical intraepithelial neoplasia; HIV = human immunodeficiency virus; HPV = human papillomavirus; hrHPV = high-risk human papillomavirus

Table 8. Summary of Findings of Included Primary Clinical Studies

Main Study Findings	Authors’ Conclusion
Randomized controlled trials
Polman et al., 2019¹⁵
Self- versus clinician-collected samples using a PCR-based HPV test HPV prevalence - Self-collected samples: 569 (7.4%) - Clinician-collected samples: 451 (7.2%) - Relative risk = 1.04 (95% CI, 0.92 to 1.17) CIN2+ sensitivity and specificity of HPV testing (unadjusted) - Sensitivity using self-samples: 92.9% (95% CI, 87.3% to 98.4%) - Specificity using self-samples: 93.9% (95% CI, 93.4% to 94.5%) - Sensitivity using clinician-samples: 96.4% (95% CI, 92.9% to 99.9%) - Specificity using clinician-samples: 94.2% (95% CI, 93.6% to 94.8%) - No statistically significant difference between self-sampling and clinician-based sampling - Relative accuracy of sensitivity = 0.96 (95% CI, 0.90 to 1.03) - Relative accuracy of specificity = 1.00 (95% CI, 0.99 to 1.01) CIN3+ sensitivity and specificity of HPV testing (unadjusted) - Sensitivity using self-samples: 95.1% (95% CI, 88.5% to 100%) - Specificity using self-samples: 93.4% (95% CI, 92.9% to 94.0%) - Sensitivity using clinician-samples: 95.8% (95% CI, 91.2% to 100%) - Specificity using clinician-samples: 93.5% (95% CI, 92.9% to 94.1%) - Relative accuracy of sensitivity = 0.99 (95% CI, 0.91 to 1.08) - Relative accuracy of specificity = 1.00 (95% CI, 0.99 to 1.01)	“HPV testing done with a clinically validated PCR-based assay had similar accuracy on self-collected and clinician-collected samples in terms of the detection of CIN2+ or CIN3+ lesions” (p. 229)
Ajenifuja et al., 2018¹⁶
Self- versus provider-collected samples using Hybribio GenoArray HPV prevalence - Self sampling: 12 (6.2%) - Provider collected samples: 19 (9.8%) The most common HPV type - Both: HPV 58 (2.6%) Prevalence of multiple HPV genotypes - Self-collected samples: 5 cases (2.6%) - Provider-collected samples: 1 (0.5%) High risk-HPV detection rate - Self sampled: 7.2% - Provider sampled: 6.8% Agreement between self- and provider-collected samples - Moderate correlation - κ = 0.47 (95% CI, 21.3% to 72.3%, P < 0.05)	“moderate correlation between both sampling techniques” (p. 1)
Non-randomized studies
Thay et al., 2019¹⁷
Self-sampled samples versus clinician-sampled samples (careHPV) versus visualization with acetic acid versus digital colposcopy hrHPV+ (16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58, 59, 66, 68) prevalence - 56 (22.4%) overall - 37 (28.6%) in HIV+ women (P = 0.0154, compared to HIV-) - 19 (15.7%) in HIV- women - 50 (89%) in self-sampling HPV specimens (P =0.174, compared to physician-collected) - 45 (80%) in physician-collected specimens Comfort in obtaining self-samples - 95.2% VIA+ - 37/250 - 30 underwent confirmatory biopsies for cervical intraepithelial neoplasia (CIN) (26 CIN1, 4 CIN2+) Confirmed dysplasia - 20 (15.5%) in the HIV+ group - 10 (8.26%) in HIV- women (P = 0.0291) - Accurately differentiated between CIN1 and CIN2+ lesions by contemporaneous physician impressions of the DC images	- “potential modifications of the current cervical screening strategy that is currently being employed in Cambodia” (p. 1) - “The first step in this new strategy would be self-swabbing for hrHPV. Subsequently, hrHPV+ patients would have DC and immediate treatment based on colposcopic findings: cryotherapy for suspected CIN1 and loop electrosurgical excision procedure (LEEP) for suspected CIN2+” (p. 1)
Toliman et al., 2019²²
Self-collected versus clinician-collected samples using Xpert, Cobas, and Aptima Agreement in hrHPV detection between self- and clinician-collected specimens across all three assays - substantial (k >0.6 in 32 pairwise comparisons of HPV tests) Sensitivity, specificity, and positive and negative predictive values for the detection of HPV type 16 according to the constructed reference standard (HPV infection detected with clinician-collected samples) using self-collected specimens - Xpert HPV: 92.1%, 93.1%, 63.6% and 98.9% - Cobas 4800 HPV: 90.4%, 94.3%, 67.8% and 98.7% - Aptima HPV: 63.2%, 97.2%, 75.0% and 95.3% - Similar results observed for all hrHPV types (combined) and for HPV types 18/45, on all three assays Detection of any hrHPV using self-collected specimens on all assays for HSIL positivity - High sensitivity (86% to 92%) - High specificity (87% to 94%) - High negative predictive value (>98%)	“Xpert HPV, using self-collected vaginal specimens, has sufficient accuracy for use in point-of-care ‘test-and-treat’ cervical screening strategies in high-burden, low-resource settings” (p. 496)
Des Marais et al., 2018¹⁸
Self-home versus self-clinic versus clinician samples using Aptima Prevalence of high-risk HPV - Self-home samples: 12.4% - Clinical samples: 11.4% [not significantly different from self-home samples (P = 0.79)] - Self clinic samples: 15.5% (not significantly different from self-home samples, P = 0.21) Positivity for high-risk HPV - Increased with increasing grades of cervical abnormality in all sample types (P < 0.001) Positivity for high-risk HPV in all identified cases of high-grade squamous intraepithelial lesions and of cervical intraepithelial neoplasia 2 or worse - Self-home samples: detected all cases - Comparable across sample types for T vaginalis (range 10.2% to 10.8%), M genitalium (3.3% to 5.5%), C trachomatis (1.1% to 2.1%), and N gonorrhoeae (0% to 0.5%) Agreement measured by Kappa values between sample types - high-risk HPV: 0.56 to 0.66 (moderate to good) - T vaginalis: 0.86 to 0.91 - M genitalium: 0.65 to 0.83 Instruction understanding - No difficulty understanding self-collection instructions: 93.6% - Willing to use self collection in the future: 96.3%	“Mail-based, at-home self-collection for high-risk HPV and sexually transmitted infection detection was valid and well accepted among infrequently screened women in our study” (p. 1412)
Lam et al., 2018⁵
Self sampling (HC2, Cobas, CLART, and Aptima) versus routine screening (cytology) CIN2+ detection - Self-sampling: higher than routine cytology-based screening (OR = 1.83, 95% CI, 1.21 to 2.77) - Self-sampling: similar to routine screening with cytology and HPV testing (OR = 1.03, 95% CI, 0.75 to 1.40) Positive predictive value for CIN2+ - Screening non-attenders: higher than routinely HPV- and cytology-screened women (36.5% vs 25.6%, respectively) - Among the adequate biopsies - CSi-attenders (self-sampling) and GP-attenders (clinician-collected cytology) with higher PPV (36.5% and 32.7%, respectively) than non-responders (20.0%) and women included in the Horizon study (co-testing with cytology and HPV test 25.6%)	“Self-sampling offered to non-attenders showed higher detection rates for ≥CIN2 than routine cytology-based screening, and similar detection rates as HPV and cytology co-testing” (p. 138)
Phoolcharoen et al., 2018²³
Self- versus clinician-collected samples using Cobas hrHPV prevalence - Self-collected samples: 41.3% - Physician-collected samples: 36.0% Agreement between the methods - 74.5% with κ = 0.46 (P < 0.001)	“Our study revealed moderate agreement between self- and physician-collected methods for hrHPV testing” (p. 1)
Senkomago et al., 2018¹⁹
Self- versus clinician-collected samples using Aptima hrHPV-RNA prevalence from baseline to 24 months - Self-collected samples: decreased slightly from 28.5% (98/344) to 24.3% (53/218) - Clinician-collected samples: decreased slightly from 29.9% (103/344) to 24.3% (53/218) Agreement between the sampling methods - Increase over time - Baseline k = 0.55 (95% CI, 0.45 to 0.65) - 24 months k = 0.83 (95% CI, 0.74 to 0.91) Among 21 patients with HSIL+ over 24 months - Clinician-collected samples: 18 (86%) hrHPV-RNA-positive results at baseline - Self-collected samples: 17 (81%) hrHPV-RNA-positive results at baseline - hrHPV-RNA-positive results or cytology anomalies: 20 (95%) at baseline	- “Overall agreement between physician- and self-collected hrHPV-RNA results was moderate and appeared to increase over time” (p. 217) - “Baseline physician- and self-collected hrHPV-RNA tests were similarly strong indicators of cumulative HSIL+ over 24 months” (p. 217)
Wong et al., 2018⁸
Preference for self-sampling using unspecified tests - 65.6% preferred HPV DNA self-sampling in the future - 86.7% in those without previous experience of Papanicolaou tests (P = 0.055) Overall crude agreement in HPV detection rates - 85.3% (58/68) - k = 0.69 (95% CI, 0.51 to 0.87), substantial Sensitivity and specificity for the detection of ASCUS+ - Self-collected samples: 66.7% and 66.1% Positive and negative predictive values - Self-collected samples: 24.0% and 92.5% Prevalence of HPV - Slightly higher in self-collected samples (39.7%, 27/68) than in clinician-collected samples (36.8%, 25/68) Attitudes toward self-sampling - Positive, but less confident in their skills of self-sampling compared with clinicians (70.6% versus 91.2%)	“The findings showed that self-sampling could be incorporated into current cervical cancer screening approaches” (p. 46)
Zhang et al., 2018⁹
HPV testing on self-collected and physician-collected samples (HC2), cytology and visual inspection with acetic acid (VIA) compared with each other Sensitivity for cumulative CIN2+ using self-collected samples - Baseline: 83.1% (95% CI, 73.7% to 89.7%) - 6 years: 83.3% (95% CI, 74.9% to 89.3%) - 11 years: 70.3% (95% CI, 62.5% to 77.2%) - 15 years: 63.3% (95% CI, 55.7% to 70.2%) Specificity for cumulative CIN2+ using self-collected samples - Baseline: 85.9% (95% CI, 84.3% to 87.4%) - 6 years: 87.2% (95% CI, 85.5% to 88.7%) - 11 years: 87.9% (95% CI, 86.1% to 89.5%) - 15 years: 87.0% (95% CI, 85.0% to 88.8%) Sensitivity for cumulative CIN2+ using clinician-collected samples - Baseline: 97.6% (95% CI, 91.6% to 99.3%) - 6 years: 96.1% (95% CI, 90.4% to 98.5%) - 11 years: 82.1% (95% CI, 75.0% to 87.5%) - 15 years: 73.5% (95% CI, 66.3% to 79.6%) Specificity for cumulative CIN2+ using clinician-collected samples - Baseline: 84.7% (95% CI, 83.0% to 86.3%) - 6 years: 86.2% (95% CI, 84.4% to 87.7%) - 11 years: 86.7% (95% CI, 84.8% to 88.4%) - 15 years: 86.1% (95% CI, 84.0% to 87.9%) Prospective PPV of cumulative CIN2+ - Self-collected: 83.3% (95% CI, 74.9% to 89.3%), 70.3% (95% CI, 62.5% to 77.2%) and 63.3% (95% CI, 55.7% to 70.2%) at 6 years, 11 years and 15 years Relative cumulative sensitivity of clinician-collected versus self-collected HPV testing - Stable over 15 years at about 1.16. 
Cumulative sensitivity of self-collected HPV testing - Comparable to cytology - Significantly higher than VIA CIN2+ during 6-year follow-up and 15 years after baseline among women with positive HPV tests at baseline - Self-collected: 26.2% (95% CI, 21.5% to 30.9%) - Physician-collected: similar Protection against CIN2+ of negative self-collected HPV results - Greater than VIA - CIN2+ cumulative incidence rate: 1.1% at the 6-year follow-up	“Self-collected HPV testing demonstrates lower sensitivity than physician-collected HPV testing but performs comparably to cytology prospectively and provides satisfactory assurance against CIN2+” (p. 222)
Cremer et al., 2017²⁰
Response rates - CM (colposcopy management) cohort: 216 (44.2%) completed (203 treated, 11 diagnosed negative, 2 pregnant) - ST (screen and treat) cohort: 411 (88.4%) completed (407 treated, 2 diagnosed negative, 1 pregnant) Overall agreement between HPV test results from self-collected and provider-collected specimens (unspecified HPV test) - 93.7% - κ value = 0.70 (95% CI, 0.68 to 0.73)	- “Human papillomavirus testing with ST management resulted in an approximately twice completion rate compared with CM management” (p. 26) - “Agreement between self- and provider-based sampling was good and might be used to extend screening to women in areas that are more difficult to reach” (p. 26)
Obiri-Yeboah et al., 2017²¹
Overall HPV detection concordance using careHPV and Anyplex - 94.2% (95% CI, 89.9 to 97.1) - Kappa value = 0.88 (P < 0.0001), showing excellent agreement - Agreement similar between HIV positive (93.8%) and negative (94.7%) women Sensitivity and specificity of self-collected samples for the detection of HPV infection identified by clinician-collected samples - 92.6% (95% CI, 85.3 to 97.0) and 95.9% (95% CI, 89.8 to 98.8) - Highest sensitivity: HIV positive women (95.7%, 95% CI, 88.0 to 99.1) - Highest specificity: HIV negative women (98.6%, 95% CI, 92.4 to 100) User experience - 76.3% women found SC very easy/easy to obtain - 57.7% preferred SC to CC - 61.9% felt SC would increase their likelihood to access cervical cancer screening	“The feasibility, acceptability and performance of SC using careHPV support the use of this alternative form of HPV screening among Ghanaian women” (p. 1)

ASCUS = atypical squamous cells of undetermined significance; CC = clinician-collected; CI = confidence interval; CIN = cervical intraepithelial neoplasia; CM = colposcopy management; CSi = Copenhagen Self-sampling Initiative; DC = digital colposcopy; GP = general practitioner; HC2 = Hybrid Capture 2; HIV = human immunodeficiency virus; HPV = human papillomavirus; hrHPV = high-risk human papillomavirus; HSIL = high-grade squamous intraepithelial lesion; LEEP = loop electrosurgical excision procedure; OR = odds ratio; PCR = polymerase chain reaction; PPV = positive predictive value; RNA = ribonucleic acid; SC = self-collection; ST = screen-and-treat; VIA = visual inspection with acetic acid
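
Several studies in Table 8 report agreement between self- and clinician-collected samples as Cohen's kappa. As an illustrative sketch only, the Python snippet below shows how kappa could be computed from a 2×2 agreement table. The cell counts are not reported in the table; they are back-calculated here from the figures given for Wong et al. (68 paired samples, 58 concordant, 27 self-collected positives, 25 clinician-collected positives), and that reconstruction is an assumption. The function name `cohen_kappa_2x2` is hypothetical.

```python
def cohen_kappa_2x2(both_pos, self_only, clin_only, both_neg):
    """Cohen's kappa for two binary tests applied to the same participants."""
    n = both_pos + self_only + clin_only + both_neg
    # Observed agreement: proportion of pairs with the same result.
    observed = (both_pos + both_neg) / n
    # Marginal positives for each sampling method.
    self_pos = both_pos + self_only
    clin_pos = both_pos + clin_only
    # Agreement expected by chance, from the marginal totals.
    expected = (self_pos * clin_pos + (n - self_pos) * (n - clin_pos)) / n**2
    return (observed - expected) / (1 - expected)


# Assumed cell counts, derived from the marginals reported for Wong et al.:
# 21 both positive, 6 self-only positive, 4 clinician-only positive, 37 both negative.
kappa = cohen_kappa_2x2(both_pos=21, self_only=6, clin_only=4, both_neg=37)
print(round(kappa, 2))  # 0.69
```

Under these assumed counts the crude agreement is 58/68 (85.3%) and kappa rounds to 0.69, consistent with the values listed for that study. The confidence interval is not reproduced here because the method used by the study authors is not stated in the table.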

Appendix 5. Overlap between Included Systematic Reviews

Table 9. Primary Study Overlap between Included Systematic Reviews

Primary Study Citation	Systematic Review Citation (n = 81)
Primary Study Citation	Arbyn et al., 2018³ Accuracy studies (n = 76)	Kelly et al., 2017¹⁴ (n = 8)
Morrison 1992	X
Hillemanns 1999	X
Sellors 2000	X
Wright 2000	X
Belinson 2001	X
Lorenzato 2002	X
Nobbenhuis 2002	X
Garcia 2003	X
Salmerón 2003	X
Brink 2006	X
Daponte 2006	X
Girianelli 2006	X
Holanda 2006	X
Seo 2006	X
Szarewski 2007	X
Qiao YL 2008	X	X
Bhatla 2009	X
Balasubramanian 2010	X
Gustavsson 2011	X
Taylor 2011	X
Twu 2011	X
Belinson 2012	X
Dijkstra 2012	X
Longatto-Filho 2012	X
van Baars 2012	X
Zhao FH 2012	X
Darlin 2013a	X
Darlin 2013b	X
Geraets 2013	X
Guan 2013	X
Jentschke 2013a	X
Jentschke 2013b	X
Nieves 2013	X
Bais 2007	X
Gök 2010	X
Giorgi Rossi 2011	X
Lazcano-Ponce 2011	X
Piana 2011	X
Szarewski 2011	X
Virtanen 2011	X
Wikström 2011	X
Gök 2012	X
Sancho-Garnier 2013	X
Broberg 2014	X
Cadman 2015	X
Haguenoer 2014	X
Arrossi 2015	X
Giorgi Rossi 2015	X
Tranberg 2018	X
Zhao FH 2013	X	X
Chernesky 2014	X
Hesselink 2014	X
Jeronimo 2014	X	X
Wang 2014	X
Zhang S 2014	X
Boggan 2015	X
Porras 2014	X
Chen Q 2016	X
Chen K 2016	X
Jentschke 2016	X
Qin Y 2016	X
Stanczuk 2016	X
Aiko 2017	X
Asciutto 2017	X
Catarino 2017	X
Leeman 2017	X
Asciutto 2018	X
Leinonen 2018	X
Enerly 2016	X
Moses 2015	X
Racey 2016	X
Sultana 2016	X
Zehbe 2016	X
Kitchener 2018	X
Modibbo 2017	X
Kellen 2018	X
Segondy 2016		X
Tuerxun 2016		X
Bansil 2015		X
Gage 2012		X
Chibwesha 2016		X

Appendix 6. Additional References of Potential Interest

Primary studies using participants with known HPV or cytology status

Aiko KY, Yoko M, Saito OM, et al. Accuracy of self-collected human papillomavirus samples from Japanese women with abnormal cervical cytology. J Obstet Gynaecol Res. 2017;43(4):710–717. [PubMed: 28418208]
El-Zein M, Bouten S, Louvanto K, et al. Validation of a new HPV self-sampling device for cervical cancer screening: The Cervical and Self-Sample In Screening (CASSIS) study. Gynecol Oncol. 2018;149(3):491–497. [PubMed: 29678360]
Tranberg M, Jensen JS, Bech BH, Blaakaer J, Svanholm H, Andersen B. Good concordance of HPV detection between cervico-vaginal self-samples and general practitioner-collected samples using the Cobas 4800 HPV DNA test. BMC Infect Dis. 2018;18(1):348. [PMC free article: PMC6062874] [PubMed: 30053836]

Tables

Table 1. Selection Criteria

Population	Asymptomatic adults eligible for cervical cancer screening (≥ 21 years of age, or age at which screening starts in the jurisdiction)
Intervention	Q1-2: Self-sampled high-risk HPV tests for primary cervical cancer screening
Comparator	Q1-2: Clinician-sampled high-risk HPV tests for primary cervical cancer screening; cytology (conventional Pap smear or liquid-based cytology). Q1 only: Colposcopy with histologic examination of tissue specimens, when indicated
Outcomes	Q1: Diagnostic test accuracy: number and proportion of patients positive and negative on each test using colposcopy as reference standard; sensitivity, specificity, PPV, NPV, PLR, NLR, DOR to screen for high-grade cervical lesions (HSIL or CIN2+, AGC, AIS) and/or invasive cervical cancer (squamous cell carcinoma or adenocarcinoma). Q2: Agreement between self-sampled HPV tests and clinician-sampled HPV tests or cytology (i.e., % agreement of positive test results, % agreement of negative test results)
Study Designs	Health technology assessments; systematic reviews; meta-analyses; randomized controlled trials; non-randomized studies

AGC = atypical glandular cell; AIS = adenocarcinoma in situ; CIN = cervical intraepithelial neoplasia; DOR = diagnostic odds ratio; HPV = human papillomavirus; HSIL = high-grade squamous intraepithelial lesion; NLR = negative likelihood ratio; NPV = negative predictive value; PLR = positive likelihood ratio; PPV = positive predictive value
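
The diagnostic accuracy outcomes listed in Table 1 can all be derived from a 2×2 table of index test results against the reference standard. The following Python sketch illustrates the standard definitions. The function name `diagnostic_metrics` and the counts are hypothetical and are not taken from any included study. The sketch assumes every cell of the 2×2 table is non-zero.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Accuracy measures from a 2x2 table (index test vs. reference standard).

    tp/fn: reference-positive participants testing positive/negative.
    fp/tn: reference-negative participants testing positive/negative.
    Assumes all four counts are greater than zero.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "PPV": tp / (tp + fp),                  # positive predictive value
        "NPV": tn / (tn + fn),                  # negative predictive value
        "PLR": sensitivity / (1 - specificity),  # positive likelihood ratio
        "NLR": (1 - sensitivity) / specificity,  # negative likelihood ratio
        "DOR": (tp * tn) / (fp * fn),            # diagnostic odds ratio
    }


# Hypothetical counts for illustration only.
metrics = diagnostic_metrics(tp=90, fp=50, fn=10, tn=850)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that PPV and NPV depend on the prevalence of the target condition in the screened population, whereas sensitivity, specificity, and the likelihood ratios do not, which is one reason predictive values from different study populations are not directly comparable.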

About the Series

CADTH Rapid Response Report: Summary with Critical Appraisal

ISSN: 1922-8147

Version: 1.0

Funding: CADTH receives funding from Canada’s federal, provincial, and territorial governments, with the exception of Quebec.

Suggested citation:

HPV Self-Sampling for Primary Cervical Cancer Screening: A Review of Diagnostic Test Accuracy and Clinical Evidence – An Update. Ottawa: CADTH; 2019 May. (CADTH rapid response report: summary with critical appraisal).

Disclaimer: The information in this document is intended to help Canadian health care decision-makers, health care professionals, health systems leaders, and policy-makers make well-informed decisions and thereby improve the quality of health care services. While patients and others may access this document, the document is made available for informational purposes only and no representations or warranties are made with respect to its fitness for any particular purpose. The information in this document should not be used as a substitute for professional medical advice or as a substitute for the application of clinical judgment in respect of the care of a particular patient or other professional judgment in any decision-making process. The Canadian Agency for Drugs and Technologies in Health (CADTH) does not endorse any information, drugs, therapies, treatments, products, processes, or services.

While care has been taken to ensure that the information prepared by CADTH in this document is accurate, complete, and up-to-date as at the applicable date the material was first published by CADTH, CADTH does not make any guarantees to that effect. CADTH does not guarantee and is not responsible for the quality, currency, propriety, accuracy, or reasonableness of any statements, information, or conclusions contained in any third-party materials used in preparing this document. The views and opinions of third parties published in this document do not necessarily state or reflect those of CADTH.

CADTH is not responsible for any errors, omissions, injury, loss, or damage arising from or relating to the use (or misuse) of any information, statements, or conclusions contained in or implied by the contents of this document or any of the source materials.

This document may contain links to third-party websites. CADTH does not have control over the content of such sites. Use of third-party sites is governed by the third-party website owners’ own terms and conditions set out for such sites. CADTH does not make any guarantee with respect to any information contained on such third-party sites and CADTH is not responsible for any injury, loss, or damage suffered as a result of using such third-party sites. CADTH has no responsibility for the collection, use, and disclosure of personal information by third-party sites.

Subject to the aforementioned limitations, the views expressed herein are those of CADTH and do not necessarily represent the views of Canada’s federal, provincial, or territorial governments or any third party supplier of information.

This document is prepared and intended for use in the context of the Canadian health care system. The use of this document outside of Canada is done so at the user’s own risk.

This disclaimer and any questions or matters of any nature arising from or relating to the content or use (or misuse) of this document will be governed by and interpreted in accordance with the laws of the Province of Ontario and the laws of Canada applicable therein, and all proceedings shall be subject to the exclusive jurisdiction of the courts of the Province of Ontario, Canada.

The copyright and other intellectual property rights in this document are owned by CADTH and its licensors. These rights are protected by the Canadian Copyright Act and other national and international laws and agreements. Users are permitted to make copies of this document for non-commercial purposes only, provided it is not modified when reproduced and appropriate credit is given to CADTH and its licensors.

Except where otherwise noted, this work is distributed under the terms of a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International licence (CC BY-NC-ND), a copy of which is available at http://creativecommons.org/licenses/by-nc-nd/4.0/

Bookshelf ID: NBK545378
PMID: 31433604
