Abstract

Sharing data across hospitals for disease modeling is challenging because of patient-privacy concerns and the lack of an efficient privacy-preserving data mining framework. Contextual embedding models, which encode medical events as vectors while preserving the contextual dependencies between events, have shown promise for privacy-preserving data mining because they do not require disclosure of the original data. However, medical event representations learned from different data sources lie in different embedding spaces and cannot be integrated directly. Existing embedding harmonization algorithms are supervised: they require a list of medical events common to the data sources and use these as corresponding pairs to learn the transformation between spaces. In clinical practice, such common medical events can be difficult to collect. To promote data mining across hospitals, we developed a novel embedding harmonization framework built on an unsupervised algorithm that aligns contextual embeddings without the need for corresponding pairs. To explore the robustness of the unsupervised algorithm, the framework supports different contextual embedding techniques, including Word2Vec and Med2Vec. We evaluated the framework using medical events extracted from the Medical Information Mart for Intensive Care III (MIMIC-III) database. By integrating embeddings from multiple sources, the framework achieves better disease prediction accuracy and medical event clustering than models built on a single data source. The unsupervised harmonization method performs similarly to the supervised harmonization model across the contextual embedding techniques tested and holds great promise for predictive modeling and event clustering in healthcare.
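To make the supervised baseline concrete, the following is a minimal sketch of harmonizing two embedding spaces from corresponding pairs. It assumes the transformation is an orthogonal map fitted by orthogonal Procrustes, which is one common choice and is not stated in the abstract; the function name `fit_orthogonal_map` and the synthetic "hospital" embeddings are hypothetical and used only for illustration. It does not implement the unsupervised algorithm proposed here.

```python
import numpy as np

def fit_orthogonal_map(source_anchors, target_anchors):
    """Return the orthogonal W minimizing ||source_anchors @ W - target_anchors||_F.

    Assumption: an orthogonal Procrustes fit stands in for the supervised
    harmonization step; the abstract does not name the transformation.
    """
    u, _, vt = np.linalg.svd(source_anchors.T @ target_anchors)
    return u @ vt

# Synthetic stand-ins for two hospitals' event embeddings (hypothetical data).
rng = np.random.default_rng(0)
dim = 8
hospital_b = rng.normal(size=(50, dim))

# Simulate hospital A's space as a rotated copy of hospital B's space.
rotation, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
hospital_a = hospital_b @ rotation.T

# The first 20 events play the role of the common medical events.
anchor_idx = np.arange(20)
W = fit_orthogonal_map(hospital_a[anchor_idx], hospital_b[anchor_idx])

# Map every hospital A embedding into hospital B's space.
aligned_a = hospital_a @ W
alignment_error = float(np.abs(aligned_a - hospital_b).max())
```

In this toy setting the two spaces differ only by a rotation, so the map fitted on the 20 anchor events also aligns the remaining 30 events. Real embeddings trained independently on different data would align only approximately.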
