Abstract

We compared the performance of expert-crafted rules, a Bayesian network, and a decision tree at automatically identifying chest X-ray reports that support acute bacterial pneumonia. We randomly selected 292 chest X-ray reports, 75 (25%) of which were from patients with a hospital discharge diagnosis of bacterial pneumonia. The reports were encoded by our natural language processor and then manually corrected for mistakes. The encoded observations were analyzed by three expert systems to determine whether the reports supported pneumonia. The reference standard for radiologic support of pneumonia was the majority vote of three physicians. We compared (a) the performance of the expert systems against each other and (b) the performance of the expert systems against that of four physicians who were not part of the gold standard. Output from the expert systems and the physicians was transformed so that comparisons could be made with both binary and probabilistic output. Metrics of comparison for binary output were sensitivity (sens), precision (prec), and specificity (spec). The metric of comparison for probabilistic output was the area under the receiver operator characteristic (ROC) curve. We used McNemar's test to determine statistical significance for binary output and univariate z-tests for probabilistic output. Measures of performance of the expert systems for binary (probabilistic) output were as follows: Rules—sens, 0.92; prec, 0.80; spec, 0.86 (Az, 0.960); Bayesian network—sens, 0.90; prec, 0.72; spec, 0.78 (Az, 0.945); decision tree—sens, 0.86; prec, 0.85; spec, 0.91 (Az, 0.940). Comparisons of the expert systems against each other using binary output showed a significant difference between the rules and the Bayesian network and between the decision tree and the Bayesian network. Comparisons of expert systems using probabilistic output showed no significant differences. Comparisons of binary output against physicians showed differences between the Bayesian network and two physicians. Comparisons of probabilistic output against physicians showed a difference between the decision tree and one physician. The expert systems performed similarly for the probabilistic output but differed in measures of sensitivity, precision, and specificity produced by the binary output. All three expert systems performed similarly to physicians.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.