Abstract

Healthcare providers generate huge amount of data every day through registration, lab results, prescriptions, and others. This is stored in the form of Electronic Medical Records (EMR) in a central repository. A medical record data is very huge, difficult to read and understand. To give an insight to the professionals in analyzing the different domains a patient belongs to, it is necessary to get pointers to a file before classifying it to a particular department for further analysis. This study provides a EMR processing system to automatically classify EMRs based on the important medical terms using TF-IDF and topic modeling. Automatic Classification of EMRs help the healthcare professionals in taking accurate decisions, providing efficient service, and improves the time taken for processing huge amount of data and in better organizing of patient. The data stored on the cloud may contain duplicate copies of EMR on several storage systems at file level thus increasing the network bandwidth, cost, and consuming storage space. Hence, a deduplication mechanism is required to avoid or reduce the data redundancy. Adapting cloud computing for healthcare systems necessitates sharing patient data with cloud service providers, which creates security concerns as the data may contain diagnosis, medication, laboratory results and medical claims. The main aim of this work is to classify the EMRs as per the specialization using KNN algorithm, optimize storage using deduplication and protect the data using DNA encryption algorithm before uploading to Hadoop. Data redundancy is taken care by implementing deduplication techniques using MD5 hashing. Proposed methodology shows an accuracy of 90% for EMR record classification and handles duplication and security aspects. This in-turn proves the state of the art approach for health care data management.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.