Recognizing Scientific Artifacts in Biomedical Literature

Tudor Groza, Jane Hunter, Hamed Hassanzadeh

doi:10.4137/bii.s11572

Open Access

https://doi.org/10.4137/bii.s11572

Journal: Biomedical Informatics Insights	Publication Date: Jan 1, 2013
Citations: 6	License type: NO-CC CODE

Affiliation: University of Queensland

Abstract

Today’s search engines and digital libraries offer little or no support for discovering the scientific artifacts (hypotheses, supporting/contradicting statements, or findings) that form the core of scientific written communication. Consequently, we currently have no means of identifying central themes within a domain or of detecting gaps between accepted knowledge and newly emerging knowledge as a way of tracking the evolution of hypotheses from incipient phases to maturity or decline. We present a hybrid Machine Learning approach, using an ensemble of four classifiers, for recognizing scientific artifacts (ie, hypotheses, background, motivation, objectives, and findings) within biomedical research publications, as a precursory step toward the general goal of automatically creating argumentative discourse networks that span multiple publications. The performance achieved by the classifiers ranges from 15.30% to 78.39%, depending on the target class. The set of features used for classification has led to promising results. Furthermore, because the features are computed strictly within the local scope of a publication, ie, without aggregating corpus-wide statistics, the ensemble of classifiers is more versatile and can be applied directly without re-training.

Full Text