Identifying Diagnostic Studies in MEDLINE: Reducing the Number Needed to Read


Bachmann, L M (2002). Identifying Diagnostic Studies in MEDLINE: Reducing the Number Needed to Read. Journal of the American Medical Informatics Association (JAMIA), 9(6):653-658.

Abstract

Objectives: The search filters in PubMed have become a cornerstone of information retrieval in evidence-based practice. However, the filter for diagnostic studies is not fully satisfactory, because sensitive searches have low precision. The objective of this study was to construct and validate better search strategies for identifying diagnostic articles recorded in MEDLINE, with special emphasis on precision.

Design: A comparative, retrospective analysis was conducted. Four medical journals were hand-searched for diagnostic studies published in 1989 and 1994. Four other journals were hand-searched for 1999. The three sets of studies identified were used as gold standards. A new search strategy was constructed and tested using the 1989 subset of studies and validated in both the 1994 and 1999 subsets. We identified candidate text words for search strategies using a word frequency analysis of the abstracts. According to the frequency of identified terms, searches were run for each term independently. The sensitivity, precision, and number needed to read (1/precision) of every candidate term were calculated. Terms with the highest sensitivity × precision product were used as free-text terms in combination with the MeSH term "SENSITIVITY AND SPECIFICITY" using the Boolean operator OR. In the 1994 and 1999 subsets, we performed head-to-head comparisons of the currently available PubMed filter with the one we developed.

Measurements: The sensitivity, precision, and number needed to read (1/precision) were measured for different search filters.

Results: The three most frequently occurring truncated terms (diagnos*, predict*, and accura*) in combination with the MeSH term "SENSITIVITY AND SPECIFICITY" produced a sensitivity of 98.1% (95% confidence interval: 89.9-99.9%) and a number needed to read of 8.3 (95% confidence interval: 6.7-11.3). In direct comparisons of the new filter with the currently available one in PubMed using the 1994 and 1999 subsets, the new filter achieved better precision (12.0% versus 8.2% in 1994 and 5.0% versus 4.3% in 1999). The 95% confidence intervals for the differences range from 0.05% to 7.5% (p = 0.041) and from -1.0% to 2.3% (p = 0.45), respectively. The new filter achieved slightly better sensitivities than the currently available one in both subsets, namely 98.1 and 96.1% (p = 0.32) versus 95.1 and 88.8% (p = 0.125).

Conclusions: The quoted performance of the currently available filter for diagnostic studies in PubMed may be overstated. It appears that even a single external validation may lead to overoptimistic views of a filter's performance. Precision appears to be more unstable than sensitivity. In terms of sensitivity, our filter for diagnostic studies performed slightly better than the currently available one, and it performed better with regard to precision in the 1994 subset. Additional research is required to determine whether these improvements are beneficial to searches in practice.
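
The measures named in the abstract (sensitivity, precision, number needed to read as 1/precision, and the sensitivity × precision product used to rank candidate terms) can be illustrated with a minimal Python sketch. This is not the authors' code: the class name, field names, term labels, and all counts are hypothetical and chosen only to make the arithmetic visible.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchResult:
    """Counts for one search run against a hand-searched gold standard.

    All names here are illustrative assumptions, not taken from the paper.
    """
    term: str
    relevant_retrieved: int  # gold-standard articles the search returned
    total_retrieved: int     # all articles the search returned
    total_relevant: int      # size of the gold standard

    @property
    def sensitivity(self) -> float:
        # Share of gold-standard articles that the search found.
        return self.relevant_retrieved / self.total_relevant

    @property
    def precision(self) -> float:
        # Share of retrieved articles that are in the gold standard.
        if self.total_retrieved == 0:
            return 0.0
        return self.relevant_retrieved / self.total_retrieved

    @property
    def number_needed_to_read(self) -> float:
        # 1/precision: articles to read per relevant article found.
        return math.inf if self.precision == 0 else 1.0 / self.precision

    @property
    def score(self) -> float:
        # Sensitivity x precision product used to rank candidate terms.
        return self.sensitivity * self.precision


def rank_terms(results):
    """Return results ordered by descending sensitivity x precision."""
    return sorted(results, key=lambda r: r.score, reverse=True)


# Hypothetical counts for three made-up candidate terms.
results = [
    SearchResult("term_a", relevant_retrieved=40, total_retrieved=400, total_relevant=50),
    SearchResult("term_b", relevant_retrieved=30, total_retrieved=150, total_relevant=50),
    SearchResult("term_c", relevant_retrieved=10, total_retrieved=200, total_relevant=50),
]

for r in rank_terms(results):
    print(f"{r.term}: sensitivity={r.sensitivity:.2f} "
          f"precision={r.precision:.2f} NNR={r.number_needed_to_read:.1f} "
          f"score={r.score:.3f}")
```

As a check on the definition, a precision of 12.0% gives a number needed to read of 1/0.12, or roughly 8.3 articles per relevant article.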

Additional indexing

Item Type: Journal Article, refereed, original work
Communities & Collections: National licences > 142-005
Dewey Decimal Classification: Unspecified
Language: English
Date: 1 November 2002
Deposited On: 25 Sep 2018 13:01
Last Modified: 29 Apr 2019 13:50
Publisher: BMJ Publishing Group
ISSN: 1067-5027
OA Status: Green
Free access at: Publisher DOI. An embargo period may apply.
Publisher DOI: https://doi.org/10.1197/jamia.m1124
Related URLs: https://www.swissbib.ch/Search/Results?lookfor=nationallicenceoxford101197jamiaM1124 (Library Catalogue)
