Abstract

Big data technologies have improved the analysis of clinical information, supporting a better understanding of diseases and more efficient diagnoses. Online healthcare systems generate huge volumes of data through record maintenance that must satisfy applicable requirements and support patient care. These clinical records are stored as files, which makes it challenging to process the data and find relevant documents. In this work, we use a method that combines statistical topic models, language models, and natural language processing to retrieve clinical records. To analyze large collections of clinical records in document form, topic models are used to find related clusters of disease patterns. We explore the decomposition of clinical record summaries into topics, which enables effective clustering of relevant documents based on the topic under study. Clinical documents selected with a topic-based approach give users proper information for better understanding and help them derive insights from the related data. The proposed method uses clustering-based semantic similarity topic modeling, built on latent Dirichlet allocation (LDA) in a MapReduce framework, to summarize clinical reports. Automated unsupervised analysis of the LDA models is used to identify different disease patterns and to rank topic significance. Topic and keyword re-ranking methods then help physicians obtain improved information from the LDA-obtained topics. The experimental assessment confirmed the value of these methods for clinical document summarization.
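As an illustration of the two ideas named in the abstract, topic-based clustering of documents and keyword re-ranking within LDA topics, the following is a minimal sketch rather than the authors' implementation. It assumes an LDA model has already been fitted and starts from its outputs. The re-ranking score used here is a commonly used term score (a word's topic probability weighted by the log-ratio to its geometric mean across topics), swapped in for illustration because the abstract does not specify the re-ranking method. The function names, the toy probabilities, and the record identifiers are all hypothetical, and the MapReduce layer is omitted.

```python
import math

def rerank_keywords(topic_word, top_n=2):
    """Re-rank the words of each topic so that words shared by all topics drop down.

    topic_word: list of dicts mapping word -> p(word | topic); every dict
    covers the same vocabulary with strictly positive probabilities.
    Score (an illustrative choice, not taken from the paper):
        p(w | k) * (log p(w | k) - mean over topics of log p(w | j))
    """
    num_topics = len(topic_word)
    vocab = list(topic_word[0])
    # Mean log-probability per word, i.e. the log of its geometric mean across topics.
    mean_log = {
        w: sum(math.log(t[w]) for t in topic_word) / num_topics for w in vocab
    }
    ranked = []
    for t in topic_word:
        scores = {w: t[w] * (math.log(t[w]) - mean_log[w]) for w in vocab}
        ranked.append(sorted(vocab, key=scores.get, reverse=True)[:top_n])
    return ranked

def cluster_by_dominant_topic(doc_topic):
    """Group document ids by the index of their highest-weight topic."""
    clusters = {}
    for doc_id, weights in doc_topic.items():
        dominant = max(range(len(weights)), key=weights.__getitem__)
        clusters.setdefault(dominant, []).append(doc_id)
    return clusters

# Hypothetical LDA outputs for two topics and three clinical records.
topic_word = [
    {"patient": 0.40, "insulin": 0.30, "glucose": 0.25, "stent": 0.05},
    {"patient": 0.40, "insulin": 0.05, "glucose": 0.05, "stent": 0.50},
]
doc_topic = {
    "rec1": [0.9, 0.1],
    "rec2": [0.2, 0.8],
    "rec3": [0.7, 0.3],
}

ranked = rerank_keywords(topic_word)
clusters = cluster_by_dominant_topic(doc_topic)
print(ranked)
print(clusters)
```

In this toy data the word "patient" has the highest raw probability in the first topic, but it is equally probable in both topics, so the re-ranking places the more distinctive terms ahead of it. Grouping by dominant topic is the simplest possible clustering rule; a similarity measure over the full topic vectors would be a natural alternative.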
