NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Bruening W, Schoelles K, Treadwell J, et al. Comparative Effectiveness of Core-Needle and Open Surgical Biopsy for the Diagnosis of Breast Lesions [Internet]. Rockville (MD): Agency for Healthcare Research and Quality (US); 2009 Dec. (Comparative Effectiveness Reviews, No. 19.)

4 Discussion

Open surgical biopsy is the “gold standard” method of evaluating a suspicious breast lesion. However, it is a surgical procedure that, like all surgeries, places the patient at risk of experiencing morbidities and, in rare cases, mortality. The majority of women who undergo breast biopsy procedures do not have cancer. Exposing large numbers of women to invasive surgical procedures when the majority of these women do not benefit from the procedure may be considered an unacceptable medical practice. A less invasive method would be preferable if it were sufficiently accurate.

Open surgical biopsy has been reported to miss 1 to 2% of breast cancers.40 Our analysis found that stereotactically-guided vacuum-assisted core-needle biopsy is almost as accurate as open surgical biopsy with a much lower complication rate. US-guided automated gun core-needle biopsy may be almost as accurate as stereotactically-guided vacuum-assisted biopsy, and may have a slightly lower complication rate than vacuum-assisted biopsy. Both US-guided automated gun biopsy and stereotactically-guided vacuum-assisted biopsy meet the criteria of being sufficiently accurate and safer than open surgical biopsy, and therefore under most clinical conditions are preferable to open surgical biopsy. It is possible that US-guided vacuum-assisted biopsy and stereotactically-guided automated gun biopsy also meet these criteria, but the confidence intervals around the point estimates of accuracy are too wide to be certain.

Diagnoses of “pure” DCIS determined on the basis of core-needle biopsy may be incorrect due to the inability of needle biopsy to sample all parts of the tumor. Rakha and Ellis reviewed the literature in 2007 and reported that 15 to 20% of cases diagnosed as “pure” DCIS by core-needle biopsy were subsequently found to contain associated invasive carcinoma upon excision.16 Our analyses found that DCIS underestimation rates ranged from 13% to 36%, justifying current clinical practice of referring all DCIS diagnoses for open surgery.

The management of “high risk” lesions such as ADH is somewhat controversial. Our analysis found that at least 20% of ADH diagnoses on core-needle biopsy are actually malignant, suggesting that some patients diagnosed with atypia on core needle may benefit from open surgery as well.
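The underestimation rates discussed above can be expressed as a simple proportion. The following is a minimal sketch, assuming the rate is the share of core-needle diagnoses (DCIS or ADH) that were upgraded to a more serious diagnosis on excision; the function name and the example counts are hypothetical and are not taken from the included studies.

```python
def underestimation_rate(n_upgraded_on_excision: int, n_core_diagnoses: int) -> float:
    """Fraction of core-needle diagnoses (e.g., "pure" DCIS or ADH) that were
    found on surgical excision to be a more serious lesion.

    Assumption: the denominator counts only core-needle diagnoses that
    were actually verified by excision.
    """
    if n_core_diagnoses <= 0:
        raise ValueError("n_core_diagnoses must be positive")
    if not 0 <= n_upgraded_on_excision <= n_core_diagnoses:
        raise ValueError("n_upgraded_on_excision must lie between 0 and n_core_diagnoses")
    return n_upgraded_on_excision / n_core_diagnoses


# Hypothetical counts for illustration only.
rate = underestimation_rate(n_upgraded_on_excision=20, n_core_diagnoses=100)
print(f"Underestimation rate: {rate:.1%}")
```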

In Figure 5E, in the Executive Summary, we present a simple model of what might happen if the same cohort of 1000 women underwent various types of breast biopsy. The cohort of women includes 300 women with malignant tumors, and 700 women with benign lesions. The model is based on the point estimates of accuracy from our analyses and does not incorporate estimates of uncertainty in the point estimates. Refer to Figure 1 A through Figure 4 D in the Executive Summary for a visual representation of the degree of uncertainty in the point estimates.
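The arithmetic behind a cohort model of this kind can be sketched as follows. This is an illustration only, not the model used in Figure 5E: the function name is hypothetical, and the only inputs taken from the text are the cohort composition (300 malignant, 700 benign) and the 1 to 2% miss rate reported for open surgical biopsy. Like the model described above, it uses point estimates and ignores their uncertainty.

```python
def expected_missed_cancers(n_malignant: int, false_negative_rate: float) -> float:
    """Expected number of cancers missed in a cohort, given a point
    estimate of the false-negative rate (no uncertainty modeled)."""
    if not 0.0 <= false_negative_rate <= 1.0:
        raise ValueError("false_negative_rate must be between 0 and 1")
    return n_malignant * false_negative_rate


N_MALIGNANT = 300  # cohort composition stated in the text
N_BENIGN = 700

# Lower and upper ends of the 1 to 2% range reported for open surgical biopsy.
for label, fn_rate in [("lower bound (1%)", 0.01), ("upper bound (2%)", 0.02)]:
    missed = expected_missed_cancers(N_MALIGNANT, fn_rate)
    print(f"Open surgical biopsy, {label}: about {missed:.0f} of {N_MALIGNANT} cancers missed")
```

The same function could be called with the false-negative point estimate for any core-needle method to compare expected missed cancers across biopsy types.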

Limitations of the Evidence Base

The evidence base is very large but of generally low quality. The majority of the available studies are poorly reported retrospective chart reviews. Most of the studies included all patients who underwent core-needle biopsy at a particular center or centers during a certain time period and had no other inclusion criteria for enrollment. Very few studies reported any characteristics of their patients; some did not even report how many patients were enrolled. Details of operator training and experience were often omitted, as were details about the training and experience of the pathologists reading the biopsy material. Many studies combined results for multiple core-needle biopsy methods. Others changed biopsy methodology in mid-study. Descriptions of biopsy methods were often inadequate. Characteristics of the breast lesions being biopsied were often omitted. Biopsy diagnoses were often collapsed into “benign” and “malignant” categories, instead of being presented in a more granular form by type of lesion. Sources of funding for the studies were usually not mentioned. Presentation of results was often haphazard and confusing. Many patients diagnosed as “benign” on core-needle biopsy had inadequate followup data. Poor reporting of biopsy methodology, patient characteristics, and details of lesions precluded answering the majority of the sub-questions about factors affecting the accuracy and harms of core-needle biopsy.

Applicability

We used inclusion criteria intended to restrict the evidence base to only those studies that included the population of interest: women of average risk undergoing breast biopsy after discovery of a suspicious lesion on routine screening. However, our analysis found that the prevalence of cancers in the study populations tended to be slightly higher than expected. The prevalence of cancers in the general population sent for breast biopsy (in the USA) has been reported to be around 23%.15 The studies in our analysis generally reported prevalence in the thirties to forties, and up to 55% for freehand biopsies. This may be due to the fact that many of the studies were conducted in non-USA locations, where the prevalence of cancers in populations sent for biopsy has been reported to be 60 to 70%.234 It may also be an artifact caused by attrition. Many of the studies had fairly high rates of attrition, and most of the lost patients had been diagnosed as benign on core-needle biopsy. The lost patients were of necessity removed from the analysis, and this may have artificially elevated the prevalence of disease. Interestingly, the studies of US-guided vacuum-assisted biopsy reported an overall prevalence of disease of only 15%, suggesting that lesions selected for this method may have a low probability of being malignant. Lesions selected for US-guided procedures generally do not contain microcalcifications and must be clearly visible on US.
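The attrition effect described above can be illustrated with a short calculation. This is a sketch under simplifying assumptions that are not taken from the included studies: only patients diagnosed as benign are lost, all of them are truly benign, and no malignant cases are lost. The function name and the loss fractions are hypothetical; the 23% starting prevalence is the figure cited in the text.

```python
def observed_prevalence(n_malignant: float, n_benign: float, benign_lost_fraction: float) -> float:
    """Prevalence of malignancy among lesions that remain in the analysis
    after a fraction of the benign lesions is lost to followup."""
    if not 0.0 <= benign_lost_fraction < 1.0:
        raise ValueError("benign_lost_fraction must be in [0, 1)")
    remaining_benign = n_benign * (1.0 - benign_lost_fraction)
    return n_malignant / (n_malignant + remaining_benign)


# Start from the 23% prevalence cited in the text: 23 malignant, 77 benign per 100 lesions.
for lost in (0.0, 0.2, 0.4):  # hypothetical attrition fractions
    prevalence = observed_prevalence(23, 77, lost)
    print(f"{lost:.0%} of benign lesions lost -> observed prevalence {prevalence:.1%}")
```

Under these assumptions, losing 40% of the benign lesions would raise the observed prevalence from 23% to roughly 33%, which shows how attrition alone could account for part of the gap.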

Possible Impact of Key Assumptions on the Conclusions

Several key assumptions were made: (1) the “reference standard”, a combination of open surgery and followup for at least six months, was 100% accurate; (2) the pathologists examining the open surgical biopsy results were 100% accurate; (3) core-needle diagnoses of malignancy (invasive or in situ) that could not be confirmed by open surgery were assumed to have been correct diagnoses where the lesion had been completely removed by the core-needle biopsy procedure; and (4) because the majority of studies reported data on a per-lesion rather than a per-patient basis, we analyzed the data on a per-lesion basis and assumed that doing so would not violate statistical assumptions of independence.

Key assumption #1, that the reference standard was 100% accurate, is almost certainly not true. Open surgical biopsy has been reported to have a false-negative rate of 1 to 2% when two years of patient followup was used as the reference standard.40 If a small percentage of the surgical biopsies were false-negatives then our estimates of the accuracy of core-needle biopsy are slightly lower than the actual “true” accuracy of core-needle biopsy. If a small percentage of the patients declared “benign” on six-month patient followup actually had cancers then our estimates of the accuracy of core-needle biopsy are higher than the actual “true” accuracy of core-needle biopsy. Logically one would expect short-term patient followup to be more prone to error than open surgical biopsy; thus it seems likely that our estimates of core-needle biopsy accuracy are slightly higher than the actual “true” accuracy. However, some of the studies did follow all patients for at least two years, and other studies did perform open biopsy on all patients. We performed meta-regressions and found no statistically significant impact of the type of reference standard used or length of followup on the reported accuracy of the core-needle biopsies.
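The direction of the bias from imperfect short-term followup can be shown with a small calculation. This is a sketch with hypothetical counts, not data from the review: it assumes that some cancers are missed by both the core-needle biopsy and the followup, so they are never counted as false-negatives. It does not model the opposite bias from false-negative surgical biopsies.

```python
def apparent_and_true_sensitivity(true_positives: int,
                                  detected_false_negatives: int,
                                  hidden_false_negatives: int) -> tuple[float, float]:
    """Return (apparent, true) sensitivity of core-needle biopsy.

    detected_false_negatives: cancers missed by core-needle biopsy but
        caught by the reference standard.
    hidden_false_negatives: cancers missed by core-needle biopsy AND by the
        reference standard (e.g., declared benign on short followup).
    """
    apparent = true_positives / (true_positives + detected_false_negatives)
    true = true_positives / (
        true_positives + detected_false_negatives + hidden_false_negatives
    )
    return apparent, true


# Hypothetical counts for illustration only.
apparent, true = apparent_and_true_sensitivity(294, 6, 3)
print(f"Apparent sensitivity: {apparent:.1%}, true sensitivity: {true:.1%}")
```

Any hidden false-negatives make the apparent sensitivity higher than the true value, which is the direction of bias the paragraph above considers most likely.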

Key assumptions #2 and #3 are inter-related and both depend on pathologists being 100% accurate in reading open surgical biopsy material. The errors that pathologists make when examining core-needle biopsy specimens are incorporated into our conclusions about the accuracy of core-needle biopsy: causes of misdiagnosis include errors of sampling as well as errors of pathologists examining the core-needle specimens. The literature reports pathology errors in general as being rare, affecting 0.08 to 1.2% of specimens examined.237 The fact that open surgical biopsy has a false-negative rate of less than 2% also suggests that open surgical biopsy pathology errors are quite rare; this low false-negative rate includes errors of surgery as well as errors of pathologists. A 2006 review of medical malpractice suits filed against pathologists for breast biopsy misdiagnoses reported that about half the suits involved false-negative errors and about half involved false-positive errors.237 Even if a very small percentage of patients declared “true positive” in our analysis were actually false-positives and a very small percentage of patients declared “true negatives” were actually false-negatives, it seems unlikely that our estimates of core-needle biopsy accuracy can be significantly different from the actual true accuracy. The clinical impact of pathology errors, however, is not insignificant, since such errors can lead to over- and under-treatment.

Key assumption #4, that analyzing the data on a per-lesion rather than a per-patient basis would not violate statistical assumptions of independence, was unavoidable. Very few of the studies reported data on a per-patient basis. The percentage of patients with more than one lesion was, in most studies, quite low. Each lesion was subjected to an independent core-needle biopsy. A patient diagnosed with multiple benign lesions would have all lesions managed by followup, but a patient with one malignant lesion and a benign lesion may have had the benign lesion surgically biopsied at the same time as the malignant lesion was biopsied. Thus the independence of data at the per-lesion level is not quite complete. The impact of this minor lack of independence on the results of our analyses is most likely insignificant.

Correlation With Findings From Prior Systematic Reviews

As discussed previously, two prior systematic reviews of core-needle biopsy have been published.234,235 Both prior reviews and our review calculated very similar false-negative rates for stereotactically-guided automated gun core-needle biopsy: 2.2%, 3.0%, and 2.0%. Both prior reviews and our review calculated very similar rates of ADH underestimation for stereotactically-guided automated gun core-needle biopsy: 40%, 43.5%, and 47.4%. The DCIS underestimation rate reported by Verkooijen et al. for stereotactically-guided core-needle biopsy was much lower (only 15.0%) than the DCIS underestimation rates reported by Fahrbach et al. and our review (24.4% and 27.1%, respectively). This difference may be related to the fact that our review and Fahrbach et al. included both palpable and non-palpable lesions in the analysis whereas Verkooijen et al. restricted their analysis to non-palpable lesions.

Verkooijen et al. did not study stereotactically-guided vacuum-assisted core-needle biopsy. Our review and Fahrbach et al. found very similar accuracy figures for stereotactically-guided vacuum-assisted core-needle biopsy: false negative rate, 1.2% and 0.8%; ADH underestimation rate, 29.2% and 21.9%; DCIS underestimation rate, 13.7% and 13.0%.

Fahrbach et al. found that study location was a significant predictor of the false-negative rate, but type of reference standard and patient position had no significant impact on the results. We also found that the type of reference standard had no impact on the results, but we found no impact of study location on the results. The reason for this apparent discrepancy may be that we included studies conducted worldwide, whereas Fahrbach et al. included only studies conducted in North America, Europe, Australia, or New Zealand.

Future Research Needed

For many interventions, randomized controlled trials that measure patient-oriented outcomes are necessary in order to justify the routine use of the intervention. However, it is generally believed that early diagnosis and treatment of breast tumors leads to improved survival rates and quality of life. Women found to have benign lesions on biopsy are able to avoid unnecessary treatment and receive reassurance that they do not have breast cancer. There is no need to conduct randomized controlled trials reporting patient-oriented outcomes of breast biopsy procedures. Establishing that a type of breast biopsy is safer than open surgical biopsy while being as or almost as accurate as open surgical biopsy is sufficient to justify its routine use. Our systematic review has found that both stereotactically guided vacuum-assisted and US-guided automated gun core-needle biopsy are safer than open surgical biopsy and are almost as accurate as open surgical biopsy, justifying their routine use.

However, well-reported retrospective chart reviews, retrospective database analyses, or prospective diagnostic accuracy studies are needed to address the as-yet-unanswered questions as to what factors affect the accuracy and harms of core-needle breast biopsy. We have listed the most important as-yet unanswered questions in Table 16. Answers to such questions are important for both patients and clinicians when faced with the decision of what type of breast biopsy is best for each individual patient. The unanswered questions can be addressed by a prospective or retrospective diagnostic cohort study that reports relevant information in a format that allows each unanswered question to be directly addressed. It is possible that many of the studies included in the current systematic review collected information that addressed some of the unanswered questions but did not report it.

Table 16. Unanswered questions.

In addition, our conclusions are often rated as being supported by a low strength of evidence. The low rating is almost entirely due to the fact that the evidence base, while large, consists of universally poorly reported studies. The studies omitted important details about patients, methods, and results. The studies presented results in an often confusing and haphazard manner. The poor reporting made it difficult to determine whether the studies were likely to be unaffected by bias, and therefore we rated the evidence base as being of low quality. Publication of better-reported diagnostic accuracy studies would permit verification that our conclusions are accurate and not influenced by biases in the studies included in this technology assessment.
