PubMed-supported clinical term weighting approach for improving inter-patient similarity measure in diagnosis prediction.

Lawrence Wc Chan, Ky Kwok, Tao Chan, Andy Ph Yeung, Helen Kw Law, Sc Cesar Wong, Chi-Ren Shyu, Thomas Yh Lau, Kf Lo, Ying Liu, Sw Yeung, William Yl Chan

doi:10.1186/s12911-015-0166-2

Abstract

Background: Similarity-based retrieval of Electronic Health Records (EHRs) from large clinical information systems gives physicians supporting evidence when making diagnoses or referring suspected cases for examination. Clinical terms in EHRs represent high-level conceptual information, and a similarity measure built on these terms reflects the chance of inter-patient disease co-occurrence. The assumption that all clinical terms are equally relevant to a disease is unrealistic and reduces prediction accuracy. Here we propose a term weighting approach supported by the PubMed search engine to address this issue.

Methods: We collected and studied 112 abdominal computed tomography imaging examination reports from four hospitals in Hong Kong. Clinical terms, which are the image findings related to hepatocellular carcinoma (HCC), were extracted from the reports. Through two systematic PubMed search methods, the generic and specific term weightings were established by estimating the conditional probabilities of clinical terms given HCC. Each report was characterized by an ontological feature vector, giving 6216 vector pairs in total. We optimized the modified direction cosine (mDC) with respect to a regularization constant embedded into the feature vector. Equal, generic and specific term weighting approaches were applied to measure the similarity of each pair, and their performances in predicting inter-patient co-occurrence of HCC diagnoses were compared using Receiver Operating Characteristic (ROC) analysis.

Results: The areas under the ROC curves (AUROCs) of similarity scores based on the equal, generic and specific term weighting approaches were 0.735, 0.728 and 0.743 respectively (p < 0.01). In comparison with equal term weighting, the performance was significantly improved by specific term weighting (p < 0.01) but not by generic term weighting. The clinical terms “Dysplastic nodule”, “nodule of liver” and “equal density (isodense) lesion” were found to be the top three image findings associated with HCC in PubMed.

Conclusions: Our findings suggest that the optimized similarity measure with specific term weighting applied to EHRs can significantly improve the accuracy of predicting the inter-patient co-occurrence of diagnosis when compared with equal and generic term weighting approaches.

Electronic supplementary material: The online version of this article (doi:10.1186/s12911-015-0166-2) contains supplementary material, which is available to authorized users.
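The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the hit counts are invented rather than real PubMed results, the plain weighted cosine stands in for the modified direction cosine (whose exact form and regularization constant are not given here), and labelling a pair as positive when both reports carry an HCC diagnosis is an assumption about how co-occurrence is defined. The function names `term_weights`, `weighted_cosine` and `auroc` are hypothetical.

```python
from itertools import combinations
from math import sqrt


def term_weights(joint_hits, disease_hits):
    """Estimate P(term | disease) as hits(term AND disease) / hits(disease)."""
    if disease_hits <= 0:
        raise ValueError("disease_hits must be positive")
    return {term: hits / disease_hits for term, hits in joint_hits.items()}


def weighted_cosine(u, v, w):
    """Cosine similarity with per-feature weights w_i.

    Equivalent to scaling every feature by sqrt(w_i) before taking
    the ordinary cosine. Returns 0.0 if either vector has zero norm.
    """
    num = sum(wi * ui * vi for wi, ui, vi in zip(w, u, v))
    norm_u = sqrt(sum(wi * ui * ui for wi, ui in zip(w, u)))
    norm_v = sqrt(sum(wi * vi * vi for wi, vi in zip(w, v)))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return num / (norm_u * norm_v)


def auroc(scores, labels):
    """Chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative pair")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical hit counts for illustration only (not real PubMed results).
joint_hits = {"dysplastic nodule": 300, "liver nodule": 200, "ascites": 50}
weights_by_term = term_weights(joint_hits, disease_hits=1000)
terms = list(joint_hits)
w = [weights_by_term[t] for t in terms]

# Toy reports: (feature vector over `terms`, report has an HCC diagnosis).
reports = [
    ([1.0, 1.0, 0.0], True),
    ([1.0, 0.5, 0.0], True),
    ([0.0, 0.0, 1.0], False),
    ([0.0, 1.0, 1.0], False),
]

pairs = list(combinations(reports, 2))
scores = [weighted_cosine(a[0], b[0], w) for a, b in pairs]
labels = [a[1] and b[1] for a, b in pairs]  # assumed definition of co-occurrence
print(auroc(scores, labels))
```

Setting every entry of `w` to the same value reproduces the equal-weighting baseline, so the three weighting schemes can be compared by swapping only the weight vector.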

Highlights

Similarity-based retrieval of Electronic Health Records (EHRs) from large clinical information systems provides physicians the evidence support in making diagnoses or referring examinations for the suspected cases
The huge amount of clinical data managed by electronic health record (EHR) systems enables case-based decision support, in which reference cases are retrieved based on their similarity to the current case of interest [1, 2]
Feature extraction and report pair formation: We extracted 38 image finding terms from 112 examination reports (59 hepatocellular carcinoma (HCC) and 53 no abnormality detected (NAD) cases). These terms are uniquely defined by 38 concepts in the Unified Medical Language System (UMLS) and were projected to 36 feature concepts at level 4 of the Systematized Nomenclature of Medicine (SNOMED) Clinical Terms (CT) “is-a” hierarchy
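As a quick check on the pair formation, the 6216 vector pairs reported in the abstract match the number of unordered pairs among 112 reports. The breakdown by diagnosis below is plain arithmetic on the 59 HCC and 53 NAD counts. Which of these groups the paper treats as "co-occurrence" is not stated in this excerpt, so the grouping names are only illustrative.

```python
from math import comb

n_hcc, n_nad = 59, 53
n_reports = n_hcc + n_nad          # 112 reports in total

total_pairs = comb(n_reports, 2)   # unordered report pairs
both_hcc = comb(n_hcc, 2)          # both reports are HCC cases
both_nad = comb(n_nad, 2)          # both reports are NAD cases
mixed = n_hcc * n_nad              # one HCC case and one NAD case

# The three groups partition the full set of pairs.
assert both_hcc + both_nad + mixed == total_pairs

print(total_pairs, both_hcc, both_nad, mixed)  # 6216 1711 1378 3127
```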

Summary

Introduction

Similarity-based retrieval of Electronic Health Records (EHRs) from large clinical information systems gives physicians supporting evidence when making diagnoses or referring suspected cases for examination. Clinical terms in EHRs represent high-level conceptual information, and a similarity measure built on these terms reflects the chance of inter-patient disease co-occurrence. To measure inter-patient similarity consistently, the feature vector model was established by systematically transforming the clinical information in EHRs, including laboratory test findings, medical images and diagnostic reports, into vector elements [3,4,5,6]. Transforming textual information, such as image findings, into a feature vector requires the support of a medical ontology [5, 6]. The ontological feature vector contains numerical elements, each of which is inferred by integrating the semantic distances from all the EHR terms to a feature concept. The ontological vector model has been shown to significantly outperform simple string matching in predicting inter-patient co-occurrence of subclinical disorder [12].
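One way to picture an ontological feature vector is sketched below. Everything in it is an assumption made for illustration: the tiny "is-a" hierarchy is invented and is not SNOMED CT content, semantic distance is taken as the number of "is-a" steps through the nearest common ancestor, and the contributions of the terms are integrated by summing `1 / (1 + distance)`. The source does not specify the distance or the integration rule, so the actual method may differ on both counts.

```python
# Hypothetical "is-a" hierarchy (child -> parent); names are illustrative only.
PARENT = {
    "dysplastic nodule": "liver nodule",
    "liver nodule": "liver lesion",
    "isodense lesion": "liver lesion",
    "liver lesion": "finding",
    "ascites": "finding",
}


def ancestors(concept):
    """Map each ancestor (and the concept itself) to its number of is-a steps."""
    steps_to = {}
    steps = 0
    while concept is not None:
        steps_to[concept] = steps
        concept = PARENT.get(concept)
        steps += 1
    return steps_to


def path_distance(a, b):
    """Shortest is-a path between a and b via a common ancestor, or None."""
    up_a, up_b = ancestors(a), ancestors(b)
    lengths = [up_a[c] + up_b[c] for c in up_a if c in up_b]
    return min(lengths) if lengths else None


def feature_vector(report_terms, feature_concepts):
    """One element per feature concept, summing 1 / (1 + distance) over terms."""
    vector = []
    for concept in feature_concepts:
        total = 0.0
        for term in report_terms:
            d = path_distance(term, concept)
            if d is not None:
                total += 1.0 / (1.0 + d)
        vector.append(total)
    return vector


print(feature_vector(["dysplastic nodule"], ["liver nodule", "ascites"]))
```

A report term that sits close to a feature concept in the hierarchy contributes more to that element than a distant one, which is what lets two reports with different but related wording still score as similar.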

Journal: BMC Medical Informatics and Decision Making
Publication Date: Jun 2, 2015
Citations: 33
License type: CC BY 4.0

Similar Papers

Magnetic resonance imaging of hepatocellular carcinoma
Bachir Taouli ... Glenn Krinsky
Gastroenterology | VOL. 127
01 Nov 2004

Contrast-enhanced computed tomography and ultrasound-guided liver biopsy to diagnose dysplastic liver nodules in cirrhosis
Massimo Iavarone ... Massimo Colombo
Digestive and Liver Disease | VOL. 45
26 Sep 2012

Detection of hepatocellular carcinomas and dysplastic nodules in cirrhotic livers: accuracy of helical CT in transplant patients.
Jae Hoon Lim ... Kwang Cheol Koh
American Journal of Roentgenology | VOL. 175
01 Sep 2000

Expression and clinicopathologic significance of GPC3 and other antibodies in well-differentiated hepatocellular carcinoma
Yu-Lan Wang ... Li-Xin Wei
Chinese Journal of Pathology | VOL. 40
01 Jan 2010
