Abstract

The extensive growth and use of electronic health records (EHRs) and extending medical literature have led to huge opportunities to automate the extraction of relevant clinical information that helps in concise and effective clinical decision support. However, processing such information has traditionally been dependent on labor-intensive processes with human errors such as fatigue, oversight, and interobserver variability. Hence, this study aims at the processing of EHRs and performing multilevel and multiclass classification by fetching dominant characteristic features that are sufficient to detect and differentiate various types of breast lesions. In this study, unstructured EHRs on breast lesions obtained through fine-needle aspiration cytology technique are considered. The raw text was normalized into structured tabular form and converted to scores by performing sentiment analysis that helps to decide the total polarity or class label of the EHR. Supervised machine learning approaches, namely random forest and feed-forward neural network trained using Levenberg-Marquardt training function, are used for classification of the collected EHR data set containing 2,879 records that are split in the ratio of 80:20 as training and testing data sets, respectively. Random forest and feed-forward neural network classifiers gave the best performance with an accuracy of 99.36%, an overall receiver operating characteristic-area under the curve of 99.2%, a correlation with ground truth of 98.3%, and a histopathologic correlation of 98.6%. Natural language processing has huge potential to automate the extraction of clinical features from breast lesions. The proposed multilevel and multiclass classification approach is used to classify 13 different types of breast lesions with 20 different labels into five classes to decide the type of treatment that should be given to patients by a physician or oncologist.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call