Boosting biomedical document classification through the use of domain entity recognizers and semantic ontologies for document representation: The case of gluten bibliome

Martín Pérez-Pérez, Tânia Ferreira, Anália Lourenço, Gilberto Igrejas, Florentino Fdez-Riverola

doi:10.1016/j.neucom.2021.10.100

Abstract

The number of published scientific research documents keeps growing at an unprecedented rate, making it increasingly difficult to access practical information within a target domain. This situation is motivating growing interest in text mining techniques that automatically process text resources and structure their content, helping researchers find information of interest and infer knowledge of practical use. However, developing robust and accurate text mining methods and machine learning models requires large, manually annotated text corpora. In this context, semi-automatic extraction techniques based on structured data and state-of-the-art biomedical tools appear to have significant potential to enhance curator productivity and reduce the costs of document curation. Along these lines, this work proposes a semi-automatic machine learning workflow and a NER + Ontology boosting technique for the automatic classification of biomedical literature. The practical relevance of the proposed approach was demonstrated in the curation of 4,115 gluten-related documents extracted from PubMed, where it was compared against a word embedding alternative. The experimental results indicate that the proposed NER + Ontology technique is an effective alternative to other state-of-the-art document representation techniques for processing the existing biomedical literature.
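The abstract does not spell out how a NER + Ontology document representation is built, so the following is only a minimal illustrative sketch of the general idea, not the authors' implementation. It assumes a toy dictionary lookup in place of a real domain entity recognizer and a hand-written child-to-parent map in place of a real semantic ontology. The names `ENTITY_LEXICON`, `ONTOLOGY_PARENT`, `recognize_entities`, `ancestors`, and `represent` are all hypothetical.

```python
from collections import Counter

# Hypothetical toy lexicon standing in for a domain entity recognizer:
# surface term -> entity type.
ENTITY_LEXICON = {
    "gluten": "Protein",
    "gliadin": "Protein",
    "celiac disease": "Disease",
    "wheat": "Food",
}

# Hypothetical toy ontology: term -> parent concept.
ONTOLOGY_PARENT = {
    "gliadin": "gluten",
    "gluten": "cereal protein",
    "celiac disease": "autoimmune disease",
    "wheat": "cereal",
}


def recognize_entities(text):
    """Return the lexicon terms that occur in the text (simple substring match)."""
    lowered = text.lower()
    return [term for term in ENTITY_LEXICON if term in lowered]


def ancestors(term):
    """Walk the toy ontology upwards and collect every ancestor concept."""
    chain = []
    while term in ONTOLOGY_PARENT:
        term = ONTOLOGY_PARENT[term]
        chain.append(term)
    return chain


def represent(text):
    """Build a sparse feature bag from entities, their types, and ontology ancestors."""
    features = Counter()
    for term in recognize_entities(text):
        features["entity:" + term] += 1
        features["type:" + ENTITY_LEXICON[term]] += 1
        for concept in ancestors(term):
            features["concept:" + concept] += 1
    return features


doc = "Gliadin peptides trigger celiac disease in susceptible patients."
features = represent(doc)
for name, count in sorted(features.items()):
    print(name, count)
```

In a sketch like this, a document that mentions only "gliadin" still receives a `concept:gluten` feature through the ontology, which is one plausible way such a representation could help a downstream classifier generalize across related terms.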


Journal: Neurocomputing
Publication Date: Nov 11, 2021
Citations: 3
License type: cc-by


