Chapter 9 - Machine learning analysis of topic modeling re-ranking of clinical records

Vijayalakshmi Kakulapati, Sheri Mahender Reddy, B Sri Sai Deepthi, João Manuel R.S. Tavares

doi:10.1016/b978-0-12-820781-9.00009-7

Open Access

https://doi.org/10.1016/b978-0-12-820781-9.00009-7

Abstract

Big data technologies have improved the analysis of clinical information, supporting a better understanding of diseases and more efficient diagnoses. Online healthcare systems generate large volumes of data through record keeping that accounts for the relevant requirements and patient care. These clinical records are stored as files, which makes data processing and the retrieval of relevant documents challenging. In this work, we use a method that combines statistical topic models, language models, and natural language processing to retrieve clinical records. For analyzing large collections of clinical records in document form, topic models are used to find related clusters of disease patterns. We explore the decomposition of clinical record summaries into topics, which enables effective clustering of relevant documents based on the topic under study. Clinical documents selected with a topic-based approach give users the information they need to better understand the related data and derive insights from it. In the proposed method, clustering-based semantic similarity topic modeling is used to summarize clinical reports based on latent Dirichlet allocation (LDA) in a MapReduce framework. Automated unsupervised analysis of the LDA models is used to identify different disease patterns and to rank topic significance. Topic and keyword re-ranking methods then help physicians obtain improved information from the LDA-derived topics. The experimental assessment confirmed the value of these methods for clinical document summarization.
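To make the idea of keyword re-ranking concrete, the following is a minimal sketch and not the chapter's implementation. The abstract does not state which re-ranking formula is used, so this example assumes a TF-IDF-like term score that down-weights words that are probable in every topic. The function name `rerank_keywords`, the smoothing constant, and the toy topic-word probabilities are all hypothetical and chosen only for illustration.

```python
import math


def rerank_keywords(topic_word, top_n=3, eps=1e-12):
    """Re-rank each topic's keywords with a TF-IDF-like term score.

    topic_word: list of dicts, one per topic, mapping word -> probability.
    The score p * (log p - mean log p across topics) is an assumed
    weighting, not a formula taken from the chapter.
    """
    num_topics = len(topic_word)
    vocab = set().union(*topic_word)
    # Log of the geometric mean of each word's probability across topics.
    # eps stands in for words missing from a topic (hypothetical smoothing).
    mean_log = {
        w: sum(math.log(t.get(w, 0.0) + eps) for t in topic_word) / num_topics
        for w in vocab
    }
    ranked = []
    for t in topic_word:
        scores = {w: p * (math.log(p + eps) - mean_log[w]) for w, p in t.items()}
        ranked.append(sorted(scores, key=scores.get, reverse=True)[:top_n])
    return ranked


# Toy topic-word distributions (invented values, not experimental data).
topics = [
    {"patient": 0.30, "insulin": 0.25, "glucose": 0.25, "diabetes": 0.20},
    {"patient": 0.30, "cardiac": 0.25, "stent": 0.25, "artery": 0.20},
]
print(rerank_keywords(topics, top_n=4))
```

In this toy input, "patient" has the highest raw probability in both topics, but because it is equally probable everywhere its score is zero and it drops to the bottom of each list, leaving the disease-specific terms on top. With a fitted LDA model every word has a nonzero probability in every topic, so the smoothing constant would matter far less than it does here.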

Full Text