Sandra M. Sulik, MD, MS; Karin Kroeger, MD; Jennifer K. Schultz, MSEd; Jennie L. Brown, MD; Lorne A. Becker, MD; William D. Grant, EdD. Syracuse, Jordan-Elbridge, and Fayetteville, New York. Submitted, revised, July 13, 2001. From the Department of Family Medicine (S.M.S., J.K.S., L.A.B., W.D.G.) and the Center for Evidence-Based Practice (J.K.S., L.A.B., W.D.G.), State University of New York (SUNY) Upstate Medical University, Syracuse; St. Joseph’s Family Practice Residency, Syracuse (S.M.S.); Family Care Medical Group, Jordan-Elbridge (K.K.); and St. Joseph’s Family Medicine, Fayetteville (J.L.B.). Reprint requests should be addressed to Sandra M. Sulik, MD, MS, SUNY Upstate Medical University, Department of Family Medicine and St. Joseph’s Family Practice Residency, 5820 Heritage Landing Drive, East Syracuse, NY 13057. Email: smsulik@aol.com.
Study Quality Assessment
For the studies addressing accuracy, we developed a form for extracting study quality criteria.8 Two points were assigned and summed to create an overall study quality score (maximum score = 13). Two reviewers independently assessed the quality of each article, with differences resolved by consensus (Table 1).
Data Extraction
Three reviewers independently extracted data using a structured form. Differences were resolved by consensus. Specimen adequacy was classified as satisfactory, satisfactory but limited by absence of endocervical cells (SBLB-absence), satisfactory but limited by other factors (SBLB-o), or unsatisfactory. SBLB-o included obscuring inflammatory exudate, blood, thick smears, scant cellularity, and air-drying artifact.
Data Synthesis and Analysis
Using a DerSimonian and Laird random effects model, we made summary estimates of sensitivity and specificity from the studies that used an appropriate reference standard. Sensitivity and specificity were pooled independently and weighted by the inverse of the variance using MetaTest software (version 0.6, Joseph Lau, MD, with permission). The MetaTest program was also used to calculate the area under the receiver operating characteristic (AuROC) curves, and the difference between AuROCs was calculated using ROCKIT 0.9B software for ROC analysis (Charles E. Metz, Department of Radiology, University of Chicago, March 1998). The AuROC is a measure of overall diagnostic accuracy, where 1.0 is a perfect test and 0.5 is a test that is no better than chance at distinguishing normal from abnormal specimens. For analysis of specimen adequacy, we used RevMan 4.1 software (Cochrane Collaboration, Update Software, Oxford, England) to calculate rate differences. Study homogeneity analyses were performed, and our analysis plan called for the use of a random effects model if significant heterogeneity (P < .05) was found.
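The inverse-variance pooling described above can be illustrated with a minimal Python sketch. It is not the MetaTest or RevMan implementation: it assumes pooling on the raw proportion scale with a simple binomial variance, and the function names and study counts are hypothetical. It shows how Cochran's Q (the usual homogeneity statistic) feeds the between-study variance that a DerSimonian and Laird model adds to each study's weight.

```python
import math


def proportion_with_variance(events, total):
    """Return a proportion and its binomial variance (assumed scale: raw proportion).

    Note: the variance is 0 when the proportion is exactly 0 or 1, which would
    break inverse-variance weighting; real analyses need a correction there.
    """
    p = events / total
    return p, p * (1.0 - p) / total


def dersimonian_laird(estimates, variances):
    """Pool study-level estimates with a DerSimonian-Laird random effects model."""
    k = len(estimates)
    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w
    # Cochran's Q measures how far studies scatter around the fixed estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    # Method-of-moments estimate of between-study variance, truncated at 0.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return {"pooled": pooled, "se": se, "tau2": tau2, "q": q, "df": k - 1}


if __name__ == "__main__":
    # Hypothetical (true positives, diseased) counts, not data from this review.
    studies = [(45, 60), (80, 95), (30, 50)]
    pairs = [proportion_with_variance(tp, n) for tp, n in studies]
    result = dersimonian_laird([p for p, _ in pairs], [v for _, v in pairs])
    low = result["pooled"] - 1.96 * result["se"]
    high = result["pooled"] + 1.96 * result["se"]
    print(f"pooled sensitivity: {result['pooled']:.3f} (95% CI {low:.3f} to {high:.3f})")
    print(f"Q = {result['q']:.2f} on {result['df']} df, tau^2 = {result['tau2']:.4f}")
```

When the studies agree closely, Q falls below its degrees of freedom, the between-study variance is truncated to zero, and the result coincides with a fixed effect estimate. Specificity would be pooled the same way from true-negative counts.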
Results
Search Strategy and Study Inclusion
We identified 62 articles for critical appraisal. Because some authors published more than one article from a single study9-23 and one author combined 2 studies into one article,24 the 62 articles represented 47 actual studies. Fifty-two articles met the initial inclusion criteria.6,9-59 Ten articles were excluded because they did not contain a reference standard or specimen adequacy data, or because they restricted their reports to a subset of Pap results, such as atypical glandular cells of uncertain significance (AGUS) or atypical squamous cells of uncertain significance (ASCUS).24,39,61-68
Study Characteristics and Qualitative Synthesis
Most articles provided no systematic comparison with any reference standard and could therefore only be used to evaluate specimen adequacy. In some cases, histologic results were reported for some patients and compared with Pap and FBC reports; however, these appeared to be haphazard samples of patients with a positive result on one or both tests. Most articles compared the results obtained from FBC and Pap with the assumption that the better test was the one with the higher proportion of positives, ignoring the possibility of false-positive tests.
Five studies included a comparison with a reference standard.18,27,35,40,50 In all 5, both tests were performed at the same time in all patients. After the cervix was scraped in the usual fashion, the sampling device was wiped across a slide for the conventional Pap and then rinsed in a vial containing the appropriate solution for the FBC method.
Three studies systematically compared FBC and Pap results with colposcopy and biopsy. One12,35 involved women referred to a colposcopy clinic because of a previous abnormal test result; Pap and FBC test results were obtained, and colposcopic examinations were done on all women referred. A second16,23,55 studied 782 patients referred for colposcopy after an abnormal Pap test result; colposcopy was performed, and biopsies were taken from 445 of these patients. In the third study,40 a total of 8636 randomly selected Costa Rican women were each screened with Pap, FBC, and cervicography. All women with a physical examination suspicious for cancer or with any abnormality on any of the 3 tests were referred for colposcopy, along with a random sample of 150 women with no abnormalities.
The remaining 2 studies27,50 used consensus between independent reviewers of the Pap and FBC test results as the reference standard, with biopsy for at least 50% of the women with significant abnormalities on either or both tests. In the first study,50 a total of 2778 split samples were obtained and evaluated in both Germany and the United States. In Germany, masked slides were reviewed by cytotechnologists, and pathologists reviewed all abnormal and discrepant slides. Masked review was repeated in the United States, and senior cytotechnologists and cytopathologists rescreened abnormal slides. The cases were then unmasked, and discrepant cases were reviewed. A subset of histologic data (1235 samples) was analyzed, and a final reference diagnosis was made. In the second study,27 a total of 2009 sample pairs from a multicenter trial were evaluated under masked conditions by 2 cytotechnologists, with all abnormal and 10% of normal slides reviewed by 1 of 6 pathologists. All sample pairs containing an abnormal result were then sent for a second masked opinion by cytotechnologists and pathologists. Consensus data were summarized and reported.