Abstract

Data, information, and knowledge processing systems in the healthcare domain are currently plagued by heterogeneity at various levels. Current solutions have focused on developing standards-based, manual intervention mechanisms, which require a large number of human resources and necessitate the realignment of existing systems. State-of-the-art methodologies in natural language processing and machine learning can help to partially automate this process, reducing the resource requirements and providing a relatively good multi-class classification algorithm. We present a novel methodology for bridging the gap between various healthcare data management solutions by leveraging the strength of transformer-based machine learning models to create mappings between data elements. Additionally, the annotated data, collected against five medical schemas and labeled by four annotators, is made available to help future researchers. Our results indicate that, for biased, dependent multi-class text classification, transformer-based models provide better results than linguistic and other classical models. In particular, the Robustly Optimized BERT Pretraining Approach (RoBERTa) provides the best schema matching performance, achieving a Cohen's kappa score of 0.47 and a Matthews Correlation Coefficient (MCC) score of 0.48 on human-annotated data.
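For readers unfamiliar with the two agreement metrics reported above, the following is a minimal sketch of how Cohen's kappa and the multi-class MCC can be computed from paired label lists. It is not the authors' evaluation code; the function names and the label values ("match", "partial", "none") are hypothetical and chosen only for illustration, and returning 0.0 for degenerate inputs is a convention assumed for this sketch.

```python
from collections import Counter
from math import sqrt


def _counts(y_true, y_pred):
    """Return (correct, total, true-label counts, predicted-label counts)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct, len(y_true), Counter(y_true), Counter(y_pred)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    correct, total, t_cnt, p_cnt = _counts(y_true, y_pred)
    p_observed = correct / total
    # Chance agreement from the marginal label frequencies of both raters.
    p_expected = sum(t_cnt[k] * p_cnt[k] for k in t_cnt) / (total * total)
    if p_expected == 1.0:
        return 0.0  # degenerate case; convention assumed for this sketch
    return (p_observed - p_expected) / (1.0 - p_expected)


def multiclass_mcc(y_true, y_pred):
    """Multi-class Matthews correlation coefficient (the R_K statistic)."""
    correct, total, t_cnt, p_cnt = _counts(y_true, y_pred)
    cross = sum(t_cnt[k] * p_cnt[k] for k in t_cnt)
    numerator = correct * total - cross
    denom_sq = (total**2 - sum(v * v for v in p_cnt.values())) * (
        total**2 - sum(v * v for v in t_cnt.values())
    )
    if denom_sq == 0:
        return 0.0  # degenerate case; convention assumed for this sketch
    return numerator / sqrt(denom_sq)


if __name__ == "__main__":
    # Hypothetical schema-matching labels, for illustration only.
    gold = ["match", "match", "partial", "partial", "none", "none"]
    pred = ["match", "match", "partial", "none", "none", "none"]
    print(f"kappa = {cohen_kappa(gold, pred):.3f}")
    print(f"MCC   = {multiclass_mcc(gold, pred):.3f}")
```

Both metrics equal 1 for perfect agreement and are close to 0 when agreement is no better than chance, which is why they are often preferred over raw accuracy on imbalanced label distributions.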

Highlights

  • Data and information modeling in the healthcare domain has witnessed significant improvements in the last decade owing to advances in the development of state-of-the-art Information and Communication Technologies (ICT) and formalization of storage and messaging standards

  • Healthcare interoperability, which aims to provide a solution to this problem, can be compartmentalized into data interoperability, process interoperability, and knowledge interoperability

  • In [8], we presented the Ubiquitous Health Platform (UHP), which provides semantic reconciliation-on-read based data curation for resolving data interoperability between various schemas

Summary

Introduction

Data and information modeling in the healthcare domain has witnessed significant improvements in the last decade owing to advances in the development of state-of-the-art Information and Communication Technologies (ICT) and formalization of storage and messaging standards. One of the major reasons behind this limitation is the numerous heterogeneities in healthcare at the data, knowledge, and process levels. Healthcare interoperability, which aims to provide a solution to this problem, can be compartmentalized into data interoperability, process interoperability, and knowledge interoperability. Data interoperability resolves the heterogeneity between data artifacts to enable seamless and interpretable communication among source and target organizations, while preserving the data’s original intention during storage, communication, and usage (as defined by IEEE 610.12 [1], Health Level Seven International (HL7), and the Healthcare Information and Management Systems Society (HIMSS) [2]). Knowledge interoperability provides a sharing mechanism for reusing interpretable medical knowledge, acquired through expert intervention and other mechanisms, across decision support systems [4].
