With rapid advances in healthcare, there has been exponential growth in the healthcare records stored in large databases to support researchers, clinicians, and medical practitioners in patient care, research, and trials. Because reading these studies and records is lengthy and time-consuming for clinicians and medical practitioners, there is a demand for new, fast, and intelligent medical information retrieval methods. The present study is part of a project that aims to design an intelligent medical information retrieval and summarization system. The system comprises three main modules: adverse drug event classification (ADEC), medical named entity recognition (MNER), and multi-model text summarization (MMTS). In this study, we present the design of the ADEC module for classification tasks, which employs basic machine learning (ML) and deep learning (DL) techniques: logistic regression (LR), decision tree (DT), and a text-based convolutional neural network (TextCNN). TF-IDF and Word2Vec models are used to extract features from the text data. To achieve the best performance of the overall system for efficient information retrieval and summarization, an ensemble strategy is employed, in which the predictions of the selected base models are integrated to improve robustness over any single model. All models show promising performance. TextCNN, with an accuracy of 89%, outperforms the conventional machine learning approaches, LR and DT, which reach accuracies of 85% and 77%, respectively. Furthermore, the proposed TextCNN outperforms existing adverse drug event classification approaches, achieving a precision, recall, and F1 score of 87%, 91%, and 89%, respectively.
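As a rough illustration of two ideas mentioned above, the sketch below checks that the reported F1 score follows from the reported precision and recall, and shows one possible way to integrate base-model predictions. The abstract does not state how the ensemble combines predictions, so hard majority voting is an assumption here, and the function names and toy prediction lists are hypothetical.

```python
from collections import Counter


def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def majority_vote(predictions):
    """Combine per-model label lists by hard majority voting.

    Assumption: the abstract does not specify the combination rule;
    majority voting is used here only as an example.
    """
    combined = []
    for labels in zip(*predictions):
        # Pick the label predicted by the most base models for this sample.
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined


# Precision and recall reported for TextCNN in the abstract.
reported_f1 = f1_score(0.87, 0.91)
print(f"F1 from reported precision/recall: {reported_f1:.2f}")

# Hypothetical predictions from three base models (1 = ADE, 0 = no ADE).
lr_preds = [1, 0, 1, 0]
dt_preds = [1, 1, 0, 0]
cnn_preds = [1, 0, 1, 1]
ensemble_preds = majority_vote([lr_preds, dt_preds, cnn_preds])
print("Ensemble predictions:", ensemble_preds)
```

With an odd number of base models and binary labels, a majority vote cannot tie; a weighted or probability-averaging scheme would be an equally plausible reading of "ensemble strategy".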