Abstract

Algorithmic exploitation of medical data for diagnostic purposes has become state of the art in modern medicine. Applying artificial intelligence algorithms is gaining importance, and electrocardiogram (ECG) recordings have successfully been used as input to deep learning models that produce viable diagnoses. Such algorithmic approaches are noninvasive, relatively low-cost, and promise high diagnostic leverage. However, supervised learning algorithms such as deep learning models require a considerable amount of high-quality training data labelled with correct diagnoses. In this paper, we present a pipeline that processes raw electrocardiogram recordings, preparing them for use in the training and validation of neural network models. Although the electrocardiogram is widely used, appropriately labelled training data is scarce and is provided in different formats and from technically different sources. Our end-to-end pipeline therefore not only processes data from modern digital ECG devices, e.g. in XML file format, but can also extract all necessary information from PDF files (both scanned hard copies and digitally generated PDFs). We present a use case in which data from XML and PDF sources is read, cleaned and combined into a unified dataset used by a model predicting myocardial scar. Our pipeline will become a cornerstone of our environment for building AI-based diagnostic instruments.
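To make the XML/PDF ingestion step described above more concrete, the following Python sketch illustrates how raw ECG sources might be read, cleaned and merged into one unified dataset. It is not the authors' implementation: the XML tag names ("sample", "diagnosis"), the pdfminer-based text extraction and the placeholder cleaning step are illustrative assumptions only.

```python
# Minimal sketch of XML/PDF ECG ingestion (assumed schema, not the paper's code).
import xml.etree.ElementTree as ET

import pandas as pd
from pdfminer.high_level import extract_text  # digital PDFs; scanned copies would need OCR


def read_xml_ecg(path: str) -> dict:
    """Parse one digital-ECG XML export into a flat record (assumed tag names)."""
    root = ET.parse(path).getroot()
    samples = [float(v.text) for v in root.iter("sample")]
    return {
        "source": "xml",
        "file": path,
        "signal": samples,
        "label": root.findtext("diagnosis", default="unknown"),
    }


def read_pdf_ecg(path: str) -> dict:
    """Pull the printed report text from a digitally generated PDF ECG report."""
    text = extract_text(path)
    # A real pipeline would follow this with regex/OCR-based extraction of
    # leads and measurements; here we only keep the raw report text.
    return {"source": "pdf", "file": path, "report_text": text, "label": "unknown"}


def build_dataset(xml_files, pdf_files) -> pd.DataFrame:
    """Read, clean and combine both sources into one unified dataset."""
    records = [read_xml_ecg(p) for p in xml_files] + [read_pdf_ecg(p) for p in pdf_files]
    df = pd.DataFrame.from_records(records)
    return df.dropna(subset=["label"])  # simple cleaning step as a placeholder
```

In practice the unified DataFrame returned by build_dataset would then be split into training and validation sets for the downstream neural network model.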
