Abstract

Automatic ICD-10 coding remains an unresolved challenge among Machine Learning tasks. Although hospitals generate an enormous number of clinical documents, the data is considerably sparse and is associated with a very skewed and unbalanced code distribution, which entails reduced interoperability. In addition, in some languages the availability of coded documents is very limited. This paper proposes a cross-lingual approach based on Machine Translation methods to code death certificates with ICD-10 using supervised learning. The aim of this approach is to increase the availability of coded documents by combining collections in different languages, which may also help reduce possible bias in their ICD code distribution, i.e. avoid the promotion of a subset of codes due to service or environmental factors. A significant improvement in system performance is achieved for labels with few occurrences.
