Abstract

BackgroundA considerable amount of meaningful information is routinely recorded in Chinese clinical data in text format, referred to as Chinese clinical terms. The lack of coding is a major difficulty hindering the application of clinical terms. SNOMED CT is a widely used and comprehensive clinical health care terminology collection because of its coverage, granularity, clinical orientation, and logical underpinning. It is useful and efficient for automatically assigning SNOMED CT codes to Chinese clinical terms, but it still faces several problems. Current cross-language clinical term matching studies rely on external resources, such as machine translation and rule-based methods. Semantic matching methods have achieved strong performance on text matching, but few studies have been done on cross-language clinical term matching. We present an effective attention-based semantic matching algorithm to automatically cross-language code Chinese clinical terms with SNOMED CT. MethodFirstly, BERT was used to turn the input into word embedding. Then, the word embeddings were encoded through a BiLSTM with self-attention to focus on capturing distant relationships among words with different weights depending on their contribution to semantic matching. Then, decomposable attention was used to make semantic matching trivially parallelizable to speed up calculation. Finally, fully connected layers and a sigmoid were utilized to output matching results. ResultsThe 29,960 manually coded Chinese clinical terms, 30,040 unmatched Chinese clinical terms and SNOMED CT codes were collected to evaluate the proposed method. Compared with the existing semantic matching method, the proposed approach achieves state-of-the-art results demonstrating the effectiveness of the method with an accuracy of 0.905, a precision of 0.856, a recall of 0.518, and an F-measure of 0.645. The proposed Chinese-English bilingual term mapping, Chinese character-level and word-level encoder, English word-level encoder, BERT model, and attention mechanism performed better than other methods. ConclusionThe proposed automatic SNOMED CT coding approach of Chinese clinical terms via attention-based semantic matching can improve the performance of automated SNOMED CT code assignment for Chinese clinical terms and improve the efficiency of the code assignment.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.