Rivero-Arias O, Png ME, White A, et al. Benefits and harms of antenatal and newborn screening programmes in health economic assessments: the VALENTIA systematic review and qualitative investigation. Southampton (UK): National Institute for Health and Care Research; 2024 Jun. (Health Technology Assessment, No. 28.25.)

Chapter 3 Work package 1: systematic review of health economic assessments evaluating antenatal and newborn screening

Sections of this chapter have been previously reported in Png et al. (2021).38

Introduction

In this chapter, we report our systematic review of health economic assessments evaluating antenatal and newborn screening programmes in developed countries. The systematic review had two distinct aims. The first was to identify all available evidence in the published and grey literature over the last two decades and to understand its main characteristics, the clinical areas and conditions covered, and the reporting quality of the contributing studies. The second was to extract detailed information about the benefits and harms incorporated into these health economic assessments. This chapter covers the first aim and provides a comprehensive overview of the evidence, whereas the second aim is addressed in Chapter 4.

Methods

We used the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2020 checklist39 when reporting the methods and results of the systematic review. The review protocol was registered with PROSPERO (CRD42020165236) and published on 13 January 2020. Because this review is based on data available from secondary sources and published materials, with no primary data collection, neither ethics committee approval nor written informed consent was required.

Eligibility criteria

The Population, Intervention, Comparator, Outcome and Study design (PICOS) framework was used to develop the study eligibility criteria (Table 3), which were then applied to the literature searches. Searches were limited to studies published after 1 January 2000. Studies reporting health economic assessments of antenatal or newborn screening programmes were included. These comprised economic evaluations and studies that use economic frameworks of cost-effectiveness evidence or economic notions of value (e.g. multi-criteria decision analyses, programme budgeting and marginal analyses). Non-English language studies were included, but studies were limited to those conducted in developed countries (defined, for the purposes of this review, as members of the OECD40).

TABLE 3

Inclusion and exclusion criteria for identification of relevant studies

Information sources

Systematic searches of both published and grey literature, including peer-reviewed journal articles controlled by commercial publishers and documents produced by all levels of government, academia, business and industry, were conducted. The following electronic bibliographic databases were searched: MEDLINE (OvidSP) (1946–present), EMBASE (OvidSP) (1974–present), NHS Economic Evaluation Database (via CRDWeb www.crd.york.ac.uk/CRDWeb/) (inception to 31 March 2015), EconLit (Proquest) (1969–present), Science Citation Index, Social Science Citation Index and Conference Proceedings Citation Index – Science (Web of Science Core Collection) (1945–present), CINAHL (EBSCOhost) (1982–present) and PsycINFO (OvidSP) (1806–present). SCOPUS (Elsevier) was used to run forward and backward citation searches once relevant studies were identified. The academic electronic database searches were supplemented by manual reference searching of bibliographies from studies that were included, contacts with experts in the field and author searching based on experts’ opinion. The first full search of published literature was conducted on 24 April 2020 with a top-up search conducted on 2 July 2020 to include the ‘perinatal’ search term while a refresh search was conducted on 22 January 2021.

The list of grey literature searched was derived from a pool of relevant websites that was informed by a recent systematic review of national policy recommendations on newborn screening that identified websites of national and regional screening organisations with documentation about antenatal and/or newborn screening recommendations.41 This was widened to cover websites reported by the Health Grey Matters checklist and those for national and regional screening organisations, HTA agencies, paediatrics organisations, and obstetrics and gynaecology societies in OECD countries, as well as international decision-making bodies, such as the World Health Organization, the European Council, European Commission and the European Observer.41,42 A customised web scraping tool that used the Google search engine was built using Python to directly query the stated websites (see Appendix 1) from 18 to 27 January 2021 using English search terms and from 14 to 17 February 2021 using translated search terms for non-English websites, as well as to automate the data extraction processes.
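The report does not describe how the scraping tool was implemented. As a minimal sketch of the general idea only, site-restricted search queries could be assembled as below. The domains, search terms and function names are hypothetical placeholders, and the snippet only builds query strings and URLs; it does not contact any search engine.

```python
from urllib.parse import quote_plus


def build_site_queries(domains, terms):
    """Return one site-restricted query string per (domain, term) pair."""
    return [f'site:{domain} "{term}"' for domain in domains for term in terms]


def to_search_url(query, base="https://www.google.com/search?q="):
    """URL-encode a query string and append it to a search base URL (assumed)."""
    return base + quote_plus(query)


# Hypothetical inputs; the websites actually searched are listed in Appendix 1.
domains = ["screening.example.org", "hta.example.gov"]
terms = ["newborn screening", "antenatal screening"]

for query in build_site_queries(domains, terms):
    print(to_search_url(query))
```

Keeping query construction separate from retrieval makes it straightforward to swap in translated terms for non-English websites.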

Search strategy

The search strategies applied to the published literature (see Appendix 2, Tables 21–26) were developed using a combination of medical subject headings (MeSH) and free-text keywords related to health economic assessments of antenatal and newborn screening programmes, in collaboration with an information specialist (NR) with expertise in conducting systematic literature reviews in the health sciences. A simplified search strategy derived from the Cochrane guidelines was applied to the grey literature search.43 Translation of the simplified search terms for non-English websites was performed by professional translators.

Data management

The results of the literature searches were uploaded into the EndNote software package X9 (Clarivate, Philadelphia, PA, USA, 2019), a reference management system specifically designed for managing bibliographies and citations, to remove duplicates. Unique records were subsequently imported into Covidence,44 an online software program that facilitates collaboration among reviewers during the screening and data extraction stages. Covidence allows references and files to be imported for screening and information to be entered into a pre-built data extraction form. Screening criteria based on the inclusion and exclusion criteria specified in Table 3 were developed and tested. A calibration exercise was undertaken to pilot and refine the screening criteria before the formal screening process started. For non-English language papers, Google Translate (Google, Mountain View, CA, USA) was used to translate relevant documents.
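De-duplication in the review was done inside the reference manager, whose matching rules are not described here. The following sketch is therefore only an illustration of the underlying idea, assuming each record is a dictionary with hypothetical `doi`, `title` and `year` fields.

```python
import re


def normalise_title(title):
    """Lower-case a title and collapse punctuation and whitespace."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def deduplicate(records):
    """Keep the first record seen for each DOI, or for each
    (normalised title, year) pair when no DOI is available."""
    seen = set()
    unique = []
    for rec in records:
        doi = (rec.get("doi") or "").strip().lower()
        if doi:
            key = ("doi", doi)
        else:
            key = ("title", normalise_title(rec["title"]), rec.get("year"))
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


# Hypothetical records for illustration only.
records = [
    {"doi": "10.1000/ABC", "title": "Screening study", "year": 2015},
    {"doi": "10.1000/abc", "title": "Screening Study.", "year": 2015},
    {"doi": None, "title": "Another  report", "year": 2018},
    {"doi": "", "title": "another report", "year": 2018},
]
print(len(deduplicate(records)))  # 2
```

A limitation of this simple keying is that a record with a DOI will not be matched to a copy of the same record that lacks one, so manual checking would still be needed.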

Selection process

For the published literature, two reviewers (MEP and MY) independently screened the titles and abstracts of all retrieved articles and documented the reasons for study exclusion according to the criteria specified in Table 3. Full texts of potentially relevant articles were reviewed independently by the same reviewers (MEP and MY), and study eligibility based on the inclusion and exclusion criteria was assessed. At each stage of the selection process, any disagreement was resolved by discussion and consensus between the two reviewers. When consensus could not be reached, input from the rest of the review team (ORA and SP) was obtained. For the grey literature, owing to staff changes and shortages and limited time, one reviewer (MEP) screened the titles/abstracts and full texts of all the retrieved articles, while a second reviewer (SR) screened the titles/abstracts of a random sample of the retrieved articles (13%, against a target of at least 10%).
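How the random sample for second-reviewer screening was drawn is not reported. One reproducible way to draw a sample of at least a given percentage is sketched below; the function name, the seed and the record identifiers are assumptions made for illustration.

```python
import random


def sample_for_second_review(record_ids, percent=10, seed=2021):
    """Draw a reproducible random sample covering at least `percent` % of records."""
    ids = list(record_ids)
    # Ceiling division in integer arithmetic, so the sample never falls below the target.
    k = (len(ids) * percent + 99) // 100
    return sorted(random.Random(seed).sample(ids, k))


# Hypothetical record identifiers.
ids = range(1, 251)
sample = sample_for_second_review(ids)
print(len(sample))  # 25
```

Fixing the seed means the same sample can be regenerated later for audit purposes.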

Data collection process

A data extraction form, which was piloted and refined using 10 randomly selected studies identified in the academic electronic databases, was created using Microsoft Excel following recommendations from the Cochrane Handbook for Systematic Reviews of Interventions.43 Because we anticipated a large number of articles for data extraction, and after consulting our Independent Oversight Committee members and information specialist (NR), data from a selection of 10% of the articles/reports were extracted independently by two health economists (MEP and MY), followed by a reconciliation process. During this reconciliation process, MEP and MY were required to extract the same key information from a random set of conference abstracts, journal articles and reports before proceeding to extract the remaining studies separately; this level of agreement was reached after assessing around 10% of the papers/reports. The rest of the published literature was subsequently divided between the two reviewers (MEP and MY), while data from the grey literature were extracted by one reviewer (MEP) only. Furthermore, any uncertainties related to data extracted by the two independent reviewers (MEP and MY) were discussed with the two senior investigators (ORA and SP) at weekly meetings. The list of variables extracted from each report included at the final stage of the review process was finalised following the piloting and refinement of the data extraction sheet.

The data extraction form consisted of two parts. The first part contained items from the Consolidated Health Economic Evaluation Reporting Standards (CHEERS) checklist,45 modified where applicable to align with our research focus (i.e. benefits and harms within economic assessments) (see Appendix 3). It covered bibliographic details; condition(s) screened; approaches for measuring and valuing health outcome measures; the journal impact factor quartile during the year that the article was published, obtained from Clarivate Analytics and SCImago; whether the authors made any policy recommendation based on their economic evaluation evidence; and whether the authors might have had any potential conflicts of interest in promoting their screening programme or mechanism (defined as a study that was funded by an industry sponsor, unless it was an unrestricted grant, and in which at least one of the authors was clearly employed by the industry sponsor). The second part was a bespoke form (see Appendix 4) created by the research team to extract benefits and harms adopted by economic assessments evaluating antenatal or newborn screening programmes. This form was created de novo as we could not find any previous examples in the published literature. The process used to create the bespoke form is described in detail in Chapter 4. The bespoke form contained consequences as reported by authors by screening test outcome (i.e. true positives, false positives, true negatives and false negatives) and type of data (i.e. probability, cost or outcome), which were captured and categorised as either a benefit or a harm. We also recorded the stage of the disease pathway at which the screening test was administered and the phase(s) of the screening programme using categorisations from recent guidance,1 whether the structure of decision-analytical models had been reported, and the consequences associated with treatment where applicable.

Data items

In order to reduce bias from including data from multiple reports of the same study, multiple articles published by the same authors with similar titles and abstracts were treated as linked companion studies (i.e. multiple reports from a single study) and only the most detailed publication was included in our final outputs. Similarly, if conference abstract(s) and a journal article by a similar group of authors had been published on the same topic, only the journal article was included at the full-text screening stage. Since we were interested in the methodological approaches to the measurement and valuation of benefits and harms and how the results were reported, if the article/report title suggested that an economic evaluation was conducted but neither the methods nor the results were presented in the abstract or full text, the article/report was excluded at the screening stage(s). Articles/reports that did not focus specifically on pregnant women or newborns but reported separate results of screening of pregnant women or newborns within broader populations were still included. In addition, authors were not contacted for missing data on individual data items included in our data extraction sheet, which were instead recorded as ‘not stated’.

Assessment of reporting quality of individual studies

Since only aggregated data and no effect sizes were sought, we did not assess the risk of bias or conduct a formal meta-analysis. Instead, the reporting quality of articles and reports (excluding conference abstracts) was assessed using the CHEERS reporting statement.45 The items were considered as ‘satisfied’ if reported in full or ‘not satisfied’ if not reported or partially reported. In line with the guidance in the CHEERS reporting statement, the items were not scored.45
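The classification rule above (an item counts as satisfied only when reported in full) can be illustrated with a small tally. This is a sketch under assumed data structures, not the spreadsheet used in the review: each study is represented as a dictionary mapping a checklist item to one of the hypothetical status labels `"full"`, `"partial"` or `"not reported"`.

```python
from collections import Counter


def tally_not_satisfied(assessments):
    """Count, per checklist item, the studies in which the item was not
    reported in full, and express each count as a percentage of all studies."""
    counts = Counter()
    for study in assessments:
        for item, status in study.items():
            if status != "full":  # partial reporting also counts as not satisfied
                counts[item] += 1
    n = len(assessments)
    return {item: (c, round(100 * c / n, 1)) for item, c in counts.most_common()}


# Hypothetical assessments for three studies.
assessments = [
    {"Abstract": "partial", "Time horizon": "full"},
    {"Abstract": "not reported", "Time horizon": "partial"},
    {"Abstract": "full", "Time horizon": "full"},
]
print(tally_not_satisfied(assessments))
# {'Abstract': (2, 66.7), 'Time horizon': (1, 33.3)}
```

Items that every study reported in full do not appear in the result, and no overall score is computed, consistent with items not being scored.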

Deviations from protocol

There were a few deviations from the protocol. First, we used EndNote and Covidence for different components of the systematic review. EndNote was used to record all the studies identified by our searches and to identify and remove duplicates. Covidence was used because it facilitates collaboration among reviewers during the screening and data extraction stages better than EndNote. Second, we did not use any risk of bias tools, such as the Cochrane ROBEQ tool or Risk Of Bias In Non-randomized Studies – of Interventions (ROBINS-I), for different study designs because the aim of the systematic review was to understand the benefits and harms of antenatal and newborn screening and not to extract any quantitative information from the papers. Therefore, we excluded any reference to risk of bias assessment from the final published protocol for the systematic review. For the same reason, we did not explore further the external validity of any of the cost-effectiveness results published in the studies. We used the CHEERS statement to evaluate the reporting quality of the studies because understanding reporting quality was a primary aim of our systematic review and a good indicator of whether a particular study would provide all the information required by our bespoke form. Last, full independent data extraction by the two reviewers was not feasible given our timelines, as we ended up including 336 articles and reports; instead, independent dual extraction was conducted for a 10% sample. All deviations were discussed and approved by our Independent Oversight Committee.

Results

Search results

We identified 52,244 articles and reports from the searches of the published and grey literature. Among the 16,052 records that were sought for retrieval based on identification of records via other methods (i.e. grey literature), 7464 records were non-English (46.5%). Thirty-nine of the non-English records were assessed for eligibility, of which five were subsequently included in the data extraction phase. A total of 336 records were included in the systematic review: 310 articles (1.4% of database records) and 26 reports (0.08% of website records). One HTA report included two separate economic evaluations that were separated into two different reports, resulting in 337 outputs. Study selection and reasons for exclusion as well as data extraction of the bespoke form are summarised in the modified PRISMA diagram (Figure 3). The list of studies excluded is summarised in Report Supplementary Material 1.
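The headline counts in this paragraph can be checked with simple arithmetic. The short sketch below reproduces only the figures whose numerators and denominators are both stated in the text; the helper function name is an assumption.

```python
def percentage(part, whole, ndigits=1):
    """Return part/whole as a percentage rounded to ndigits decimal places."""
    return round(100 * part / whole, ndigits)


# Non-English records among the grey literature records sought for retrieval.
non_english_share = percentage(7464, 16052)

# Articles plus reports, then one HTA report split into two evaluations.
total_included = 310 + 26
total_outputs = total_included + 1

print(non_english_share, total_included, total_outputs)  # 46.5 336 337
```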

FIGURE 3. Modified PRISMA flow diagram. a, One HTA report included two separate economic evaluations that were separated into two different reports, resulting in 242 outputs from the 241 records.

There was no trend in the publication year of the articles and reports (Figure 4). Characteristics of the included articles and reports are presented in Table 4. The majority of the articles and reports included were journal articles (228, 67.7%) and almost half of the studies were conducted in the USA (109, 32.2%) or the UK (43, 12.7%). For the majority of the articles and reports, further information was required to determine whether the authors had potential conflicts of interest (221, 65.6%). Furthermore, for the majority of the articles and reports, the authors did not make any recommendation about the adoption of the screening programme based on the economic evidence generated (273, 81.0%). Of the articles and reports, 129 (38.3%) were published in top-quartile medical journals (i.e. quartile one).

FIGURE 4. Number of articles and reports published from 2000 to January 2021. Note: Dotted lines were used to indicate that only January 2021 was included in this chart.

TABLE 4

Characteristics of articles and reports

Target population and setting

The characteristics of screening programmes and populations in the included articles and reports are summarised in Table 5. The setting of the screening was not stated in 173 (71.5%) studies on antenatal screening and 63 (66.3%) studies on newborn screening (236, 70.0% overall), and the women’s gestational stage at the time of screening was not stated in 168 (65.4%) of the antenatal screening studies. The majority of the studies were targeted at the general population of pregnant women (197, 57.1%) or infants (91, 26.4%). Many studies were investigations at the symptomless stage with pathologically definable change present (303, 89.9%) or involved all phases of the screening programmes (162, 48.1%).

TABLE 5

Characteristics of screening programmes and population in the articles and reports

Medical conditions investigated are summarised in Table 6. Genetic conditions and infectious diseases (153, 63.2%) were the main areas covered by the articles and reports assessing antenatal screening. Metabolic and structural conditions (57, 60.0%) were the main areas covered by health economic assessments evaluating newborn screening programmes.

TABLE 6

Medical conditions investigated

The key methodological characteristics of the health economic assessments from the CHEERS checklist are summarised in Table 7 and in the following subsections.

TABLE 7

Health economic assessment characteristics of the articles and reports

Choice and time horizon of model

The most common type of economic evaluation used was ‘cost–utility analysis’, which reports outcomes in terms of quality-adjusted life-years (QALYs) or disability-adjusted life-years (DALYs), for antenatal screening (129, 53.3%), and cost-effectiveness analysis for newborn screening (47, 50.0%). Decision-analytical models were employed in 272 (81.0%) of the articles and reports for the economic evaluations: 200 (82.6%) in antenatal screening and 72 (76.6%) in newborn screening. Among these studies, the majority either employed a lifetime horizon (82, 41.0% for antenatal screening and 37, 51.4% for newborn screening) or did not state the time horizon (75, 37.5% for antenatal screening and 14, 19.4% for newborn screening).

Cost perspective

The costing perspective adopted was not stated in 117 (33.7%) articles and reports. Among those that stated a costing perspective, the majority adopted a health system or payer perspective (107, 43.5% for antenatal screening and 53, 52.5% for newborn screening).

Main outcome measures used in the economic evaluations

The main outcome measures in the economic evaluations were informed predominantly by evidence synthesis of secondary data for both antenatal (167, 77.3%) and newborn (62, 67.4%) screening. Natural units, such as number of cases averted and number of cases detected, were the most commonly reported outcome measures in both antenatal (187, 59.2%) and newborn (73, 65.2%) screening studies. QALYs were used as the main outcome measure in 129 (39.9%) of antenatal screening and 36 (32.1%) of newborn screening studies. The DALY metric (an outcome measure that combines years of life lost due to premature mortality and years lived with a disability) was used in five studies across both types of screening programmes. Maternal preference-based outcomes (QALYs/DALYs) were reported in 94 (72.9%) of the antenatal screening evaluations, whereas infant preference-based outcomes were reported in 34 (89.5%) of the newborn screening evaluations.
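The verbal definition of the DALY given above corresponds to the standard additive formulation shown below. This is the general textbook form rather than the calculation used in any particular included study, which may additionally apply disability weights, age weighting or discounting within each term.

```latex
\mathrm{DALY} = \mathrm{YLL} + \mathrm{YLD}
```

Here YLL denotes years of life lost due to premature mortality and YLD denotes years lived with a disability.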

Preference-elicitation methods for valuation of outcomes

Thirty out of 162 studies generated QALYs based on preferences for relevant health states using direct valuation exercises or preference-based instruments based on individual patient-level data. Thirteen out of the 65 studies (20%) reported that they had used a standard gamble and/or time trade-off method to obtain preferences directly from individuals within their studies; of these, 10/47 (21.3%) were antenatal screening programme assessments and 3/18 (16.7%) were newborn screening programme evaluations. The use of preference-based instruments to describe health-related quality of life outcomes was limited: only 9/47 studies (19.1%) that investigated antenatal screening programmes and 7/18 (38.9%) that investigated newborn screening programmes clearly stated the instrument used. These instruments included the EQ-5D, Health Utilities Index 2 (HUI2), HUI3, 16-Dimension (16D) and the Quality of Well-Being Scale (QWB). There were two studies (one each for antenatal and newborn screening programmes) that used mapping of a non-preference-based survey [i.e. Edinburgh Postnatal Depression Scale and Adrenoleukodystrophy-Disability rating scale (ALD-DRS)] onto a generic preference-based measure.

Assessment of reporting quality

Reporting quality assessed using the CHEERS checklist was heterogeneous among the 264 full-length articles and reports (as summarised in Appendix 5). The top five items not satisfied among the studies for antenatal screening programmes were ‘Abstract’ (160, 88.4%), ‘Time horizon’ (153, 84.5%), ‘Choice of model’ (153, 84.5%), ‘Discount rate’ (130, 71.8%) and ‘Study funding, limitation, generalisability, and current knowledge’ (123, 68.0%). Similar results were found among studies assessing newborn screening programmes. The top five items not satisfied among these studies (six are listed because two tied for fifth place) were ‘Abstract’ (69, 83.1%), ‘Time horizon’ (67, 80.7%), ‘Study funding, limitation, generalisability, and current knowledge’ (59, 71.1%), ‘Choice of model’ (55, 66.3%), ‘Discount rate’ (53, 63.9%) and ‘Setting and location’ (53, 63.9%). The majority of these items were partially satisfied, as authors failed to justify the rationale of their methodology as required by the CHEERS checklist.

Discussion

This is the first systematic review to synthesise the evidence surrounding the benefits and harms adopted by health economic assessments evaluating antenatal and newborn screening programmes in OECD countries. Almost half of the articles were published in first-quartile journals, indicating interest in the topic by high-impact journals. Most of the economic evidence of antenatal screening programmes focused on screening for genetic conditions or infectious diseases, while that surrounding newborn screening programmes primarily focused on screening for metabolic or structural conditions.

We found clear evidence that decision-analytic models represent the main vehicle for the conduct of these studies, which is unsurprising given the nature of the evidence synthesis needed. Almost half of the articles and reports used standard health economic measures of QALYs or DALYs to measure the health benefits of the screening programmes. Only 30 of the studies using QALYs attempted to estimate preferences for relevant health states using valuation exercises or by employing a preference-based instrument or mapping exercise on participant-level data sets. Therefore, the main source of information to inform utility values used in QALY estimations was the published literature.

A key strength of this review is its focus on a comprehensive set of antenatal and newborn screening programmes across OECD countries. We did not restrict our search to English-only records, to avoid language bias, and did not restrict it to the published literature only, to avoid publication bias. However, this study has its limitations. We did not perform dual extraction of data, as currently recommended,43 due to the large amount of information to extract from the final included articles and reports and the timelines to complete the project. For practical purposes and quality assurance, dual data extraction was performed for 10% of the papers after consulting our Independent Oversight Committee and information specialist (NR), using a reconciliation process that ended in a high level of agreement between reviewers. Furthermore, reporting quality was assessed using the original CHEERS checklist and not the CHEERS 2022 checklist that was published after the completion of the systematic review.46 Arguably, application of the CHEERS 2022 checklist, which includes requirements to report on the use of health economic analysis plans, the contributions of patients and members of the public to study design and reporting, and trade-offs between efficiency and equity concerns, would have led to different assessments of reporting quality.

We found that many of these studies did not adhere to recent reporting guidance for health economic evaluations. Time horizon, choice of model and discount rates were poorly reported in general. With regard to time horizon, we observed that authors employed longer time horizons to estimate health benefits than to estimate the associated costs. It was common to observe studies that used a lifetime horizon for the estimation of QALYs but a shorter time frame (e.g. up to delivery or when a case was detected) for the costs included in the model. The current lack of long-term data to inform accurate costs of living with a condition over time partly explains this result,11 but it highlights a serious limitation of these studies. It also indicates that these studies did not adhere to recognised methods guidelines for the conduct of economic evaluations for the purposes of assessing the value for money of screening programmes.14 This suggests that policy-makers using cost-effectiveness information from these studies to inform local decision-making should read these reports with caution.

Copyright © 2024 Rivero-Arias et al.

This work was produced by Rivero-Arias et al. under the terms of a commissioning contract issued by the Secretary of State for Health and Social Care. This is an Open Access publication distributed under the terms of the Creative Commons Attribution CC BY 4.0 licence, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. See: https://creativecommons.org/licenses/by/4.0/. For attribution the title, original author(s), the publication source – Journals Library, and the DOI of the publication must be cited.

Bookshelf ID: NBK604594
