Multi-Modal Stacked Denoising Autoencoder for Handling Missing Data in Healthcare Big Data

Joo-Chang Kim, Kyungyong Chung

doi:10.1109/access.2020.2997255

Abstract

Supply and demand in healthcare are increasing in response to current trends. Moreover, personal health records (PHRs) are now managed by individuals. Such records are collected through different channels and vary considerably in type and scope depending on the circumstances. As a result, some data may be missing, which has a negative effect on data analysis, and such data should therefore be replaced with appropriate values. In this study, a method for estimating missing data using a multi-modal autoencoder applied to the field of healthcare big data is proposed. The proposed method uses a stacked denoising autoencoder to estimate the missing data that occur during the data collection and processing stages. Autoencoders are neural networks that output a value x̂ similar to an input value x. In the present study, data from the Korean National Health Nutrition Examination Survey (KNHNES), conducted by the Korea Centers for Disease Control and Prevention (KCDC), are used. As representative healthcare data from South Korea, they contain a large number of parameters identical to those used in PHRs. Based on this, models can be generated to estimate missing data occurring in PHRs. Furthermore, PHRs are multi-modal: data about a single subject can be collected from multiple sources. Therefore, the stacked denoising autoencoder is configured under a multi-modal setting. Through pre-processing, a data set without missing values is constructed from the KNHNES data. During training, the original data serve as the label, and the autoencoder input is a noised copy in which a proportion of values set by the noise factor is randomly replaced with zeros. In this way, the autoencoder learns to restore the zeroed values to values close to the original labels.
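The training setup described above can be sketched as follows. This is a minimal illustration in Python, not the authors' implementation: the function name `add_masking_noise`, the choice to zero a fixed number of entries per record, and the toy values are all assumptions made for the example.

```python
import random


def add_masking_noise(rows, noise_factor, seed=0):
    """Return (noisy_rows, masks) for a list of complete records.

    Assumption: in each record, round(noise_factor * n_features) randomly
    chosen entries are replaced with 0.0. The source only says that the
    noised input contains random zeros in proportion to the noise factor.
    """
    rng = random.Random(seed)
    noisy_rows, masks = [], []
    for row in rows:
        n_zero = round(noise_factor * len(row))
        zeroed = set(rng.sample(range(len(row)), n_zero))
        noisy_rows.append([0.0 if j in zeroed else v for j, v in enumerate(row)])
        masks.append([j in zeroed for j in range(len(row))])
    return noisy_rows, masks


# Toy complete records with made-up values scaled to [0, 1] (hypothetical data).
complete = [
    [0.52, 0.31, 0.77, 0.10, 0.64, 0.48, 0.90, 0.25],
    [0.45, 0.29, 0.81, 0.15, 0.70, 0.39, 0.88, 0.33],
    [0.60, 0.35, 0.72, 0.08, 0.59, 0.51, 0.93, 0.21],
]

noisy, masks = add_masking_noise(complete, noise_factor=0.25)

# Each training pair is (noised input, original label), as in a denoising autoencoder.
training_pairs = list(zip(noisy, complete))
```

With eight features and a noise factor of 0.25, two entries per record are zeroed. A denoising autoencoder trained on `training_pairs` would be asked to reproduce the untouched label from the corrupted input.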
When the amount of missing data in a dataset reaches approximately 25%, the accuracy of the proposed method using a multi-modal stacked denoising autoencoder is 0.9217, which is higher than that achieved by other conventional methods. For a single-modal denoising autoencoder, the accuracy is 0.932; the difference of approximately 0.01 falls within the allowable limits in data analysis. In terms of computational cost, the single-modal autoencoder has 10,384 parameters, which is 5,594 more than the multi-modal stacked autoencoder. The number of parameters affects the speed of the model. The two models differ substantially in the number of parameters but only slightly in accuracy, suggesting that the proposed multi-modal stacked denoising autoencoder is advantageous over a single-modal model when used on a personal device. Moreover, a multi-modal model can save additional time when processing large amounts of data in settings such as hospitals and institutions.
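The trade-off reported above can be made concrete with a little arithmetic. The sketch below uses only the figures from the abstract for the reported values. The layer layouts are hypothetical: the abstract does not state either architecture, and the helper names `dense_params` and `autoencoder_params` are introduced here for illustration.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


def autoencoder_params(layer_sizes):
    """Total parameters of a chain of fully connected layers."""
    return sum(dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:]))


# Figures reported in the abstract.
single_modal_params = 10_384
reported_difference = 5_594
multi_modal_params = single_modal_params - reported_difference  # 4790
param_ratio = multi_modal_params / single_modal_params          # roughly 0.46
accuracy_gap = 0.932 - 0.9217                                   # roughly 0.0103

# Hypothetical layout: one 80-64-80 autoencoder over all 80 parameters.
# It happens to give 10384, but the source does not confirm this layout.
hypothetical_single = autoencoder_params([80, 64, 80])

# Hypothetical multi-modal split of the 80 parameters into three blocks.
# The block sizes and hidden widths are invented; the total (3664) is NOT
# the reported 4790 and only shows why block-wise models are smaller.
hypothetical_multi = (
    autoencoder_params([30, 24, 30])
    + autoencoder_params([30, 24, 30])
    + autoencoder_params([20, 16, 20])
)
```

Under these assumptions, the multi-modal model keeps about 46% of the single-modal parameter count for an accuracy cost of about 0.01. The saving comes from dropping the weights that would connect parameters in different modalities.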

Highlights

Healthcare big data involve complex relationships among the different parameters and are adaptable to changes in the surroundings
A total of 80 parameters are selected from the preprocessed Korean National Health Nutrition Examination Survey (KNHNES) data
The results show that the accuracy of the proposed method is 0.9321 when a noise factor of 0.25 is applied

Summary

Introduction

Healthcare big data involve complex relationships among the different parameters and are adaptable to changes in the surroundings. In the section "Handling of Missing Data Using Multi-Modal Stacked Denoising Autoencoder in Healthcare Big Data", KNHNES [16] data are classified into health, health examination, and nutritional survey data. A method for estimating missing data using a multi-modal stacked denoising autoencoder in the field of healthcare big data is proposed. The associate editor coordinating the review of this manuscript and approving it for publication was Shuihua Wang.
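One way to picture how a trained model would be used for estimation is sketched below. It rests on assumptions not stated in the source: missing entries are marked with `None`, they are fed to the model as zeros (mirroring the zero-based noise used in training), and only the missing positions are overwritten with the model output. The `reconstruct` argument stands in for a trained autoencoder, and `fake_reconstruct` is a placeholder, not a real model.

```python
def impute(record, reconstruct):
    """Fill the None entries of `record` with values estimated by `reconstruct`.

    `reconstruct` is any callable mapping a complete numeric list to a list
    of the same length, for example the forward pass of a trained
    denoising autoencoder (hypothetical here).
    """
    zero_filled = [0.0 if v is None else v for v in record]
    estimate = reconstruct(zero_filled)
    # Observed values are kept; only missing positions take the estimate.
    return [e if v is None else v for v, e in zip(record, estimate)]


def fake_reconstruct(values):
    """Placeholder for a trained model: returns fixed, made-up estimates."""
    return [0.50, 0.30, 0.75, 0.12]


record_with_gaps = [0.52, None, 0.77, None]
completed = impute(record_with_gaps, fake_reconstruct)
# completed == [0.52, 0.30, 0.77, 0.12]
```

Keeping the observed values unchanged is a design choice of this sketch. A pipeline could instead take the full reconstruction, but that would also alter values that were never missing.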

Journal: IEEE Access: Practical Innovations, Open Solutions
Publication Date: Jan 1, 2020
Citations: 32
License type: CC BY 4.0

Similar Papers

LDWPSO based Bi-LSTM Model for Predicting the Missing Data in PHRs
Piyush Kumar Pareek ... Anoop N Prasad
24 Feb 2023

Legal issues pertaining to the collection of sociodemographic data in emergency departments.
Haley Hrymak ... Carmen Hrymak
Academic Emergency Medicine | VOL. 30
22 Mar 2023

Mining health-risk factors using PHR similarity in a hybrid P2P network
Joo-Chang Kim ... Kyungyong Chung
Peer-to-peer networking and applications | VOL. 11
05 Feb 2018

A Personal Electronic Health Record: Study Protocol of a Feasibility Study on Implementation in a Real-World Health Care Setting
Dominik Ose ... Aline Kunz
JMIR Research Protocols | VOL. 6
02 Mar 2017
