A structural SVM approach for reference parsing

Xiaoli Zhang, George R Thoma, Jie Zou, Daniel X Le

doi:10.1186/1471-2105-12-s3-s7


Open Access

https://doi.org/10.1186/1471-2105-12-s3-s7


Abstract

Background: Automated extraction of bibliographic data, such as article titles, author names, abstracts, and references, is essential to the affordable creation of large citation databases. References, typically appearing at the end of journal articles, can also provide valuable information for extracting other bibliographic data. Therefore, parsing individual references to extract author, title, journal, year, etc. is sometimes a necessary preprocessing step in building citation-indexing systems. The regular structure of references enables us to treat reference parsing as a sequence learning problem and to study the structural Support Vector Machine (structural SVM), a newly developed structured learning algorithm, for parsing references.

Results: In this study, we implemented structural SVM and used two types of contextual features to compare structural SVM with conventional SVM. Both methods achieve above 98% token classification accuracy and above 95% overall chunk-level accuracy for reference parsing. We also compared SVM and structural SVM to Conditional Random Field (CRF). The experimental results show that structural SVM and CRF achieve similar accuracies at the token and chunk levels.

Conclusions: When only basic observation features are used for each token, structural SVM achieves higher performance than SVM because it utilizes the contextual label features. However, when the contextual observation features from neighboring tokens are combined, SVM performance improves greatly, and is close to that of structural SVM after adding the second-order contextual observation features. The comparison of these two methods with CRF using the same set of binary features shows that both structural SVM and CRF perform better than SVM, indicating their stronger sequence learning ability in reference parsing.
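The contextual observation features mentioned in the abstract can be illustrated with a minimal sketch. It assumes that "second order" means a window of two tokens on either side of the current token, which the abstract does not spell out, and the specific binary features (`is_capitalized`, `is_digit`, `is_year_like`) and the function names are hypothetical choices for illustration, not the authors' feature set.

```python
from typing import List, Set


def basic_features(token: str) -> Set[str]:
    """Active binary observation features for one token (illustrative only)."""
    feats = {f"word={token.lower()}"}
    if token[:1].isupper():
        feats.add("is_capitalized")
    if token.isdigit():
        feats.add("is_digit")
    if len(token) == 4 and token.isdigit():
        feats.add("is_year_like")
    return feats


def contextual_features(tokens: List[str], i: int, order: int = 1) -> Set[str]:
    """Basic features of token i plus those of neighbors up to `order` away.

    Neighbor features are prefixed with their relative offset so that the
    same observation at different positions yields distinct features.
    Positions outside the reference get a padding feature instead.
    """
    feats = set(basic_features(tokens[i]))
    for offset in range(1, order + 1):
        for pos, tag in ((i - offset, f"-{offset}"), (i + offset, f"+{offset}")):
            if 0 <= pos < len(tokens):
                feats.update(f"{tag}:{name}" for name in basic_features(tokens[pos]))
            else:
                feats.add(f"{tag}:PAD")
    return feats


# Hypothetical tokenized reference fragment.
tokens = ["Zhang", "X", ".", "2011"]
first_order = contextual_features(tokens, 1, order=1)
second_order = contextual_features(tokens, 1, order=2)
```

With such features, a conventional SVM classifies each token independently but still sees evidence from its neighbors, which is one plausible reading of why its accuracy approaches that of structural SVM once second-order context is added.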

Highlights

Automated extraction of bibliographic data, such as article titles, author names, abstracts, and references is essential to the affordable creation of large citation databases
Given a training sample of input-output pairs (x1, y1), ... ∈ X × Y drawn from an unknown distribution, structural Support Vector Machine (SVM) addresses the general problem of learning a mapping f : X → Y from input patterns x ∈ X to discrete outputs y ∈ Y that has low prediction error
We have compared SVM and structural SVM as methods for parsing references that appear in medical journal articles
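For sequence labeling, the mapping f : X → Y described in the highlights is commonly realized by picking the label sequence with the highest total score. The sketch below assumes a linear-chain decomposition in which the score is a sum of per-token (emission) scores and label-to-label (transition) scores, and it finds the best sequence with Viterbi decoding. The page does not give the authors' joint feature map or inference procedure, so the decomposition, the function name `viterbi`, the labels, and all numeric scores here are assumptions for illustration.

```python
from typing import Dict, List


def viterbi(emission: List[Dict[str, float]],
            transition: Dict[str, Dict[str, float]],
            labels: List[str]) -> List[str]:
    """Return the label sequence maximizing summed emission + transition scores."""
    if not emission:
        return []
    # best[t][y]: score of the best path that ends in label y at position t.
    best = [{y: emission[0].get(y, 0.0) for y in labels}]
    back: List[Dict[str, str]] = []
    for t in range(1, len(emission)):
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for y in labels:
            # Best previous label for reaching y at position t.
            prev = max(
                labels,
                key=lambda p: best[-1][p] + transition.get(p, {}).get(y, 0.0),
            )
            scores[y] = (best[-1][prev]
                         + transition.get(prev, {}).get(y, 0.0)
                         + emission[t].get(y, 0.0))
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final label.
    path = [max(labels, key=lambda y: best[-1][y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Hypothetical scores: the middle token slightly prefers TITLE on its own,
# but the transition scores favor staying within the AUTHOR field.
labels = ["AUTHOR", "TITLE"]
emission = [
    {"AUTHOR": 2.0, "TITLE": 0.0},
    {"AUTHOR": 0.5, "TITLE": 0.6},
    {"AUTHOR": 2.0, "TITLE": 0.0},
]
transition = {
    "AUTHOR": {"AUTHOR": 1.0, "TITLE": 0.0},
    "TITLE": {"AUTHOR": -1.0, "TITLE": 1.0},
}
path = viterbi(emission, transition, labels)
# path == ["AUTHOR", "AUTHOR", "AUTHOR"]
```

Labeling each token independently by its highest emission score would give AUTHOR, TITLE, AUTHOR here; the transition scores play the role of the contextual label features and correct the middle token.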

Summary

Introduction

Automated extraction of bibliographic data, such as article titles, author names, abstracts, and references, is essential to the affordable creation of large citation databases. References, typically appearing at the end of journal articles, can provide valuable information for extracting other bibliographic data. Parsing individual references to extract author, title, journal, year, etc. is an essential step for building citation-indexing systems. With the rapid increase of journal literature indexed by MEDLINE every year, it is essential to have automated methods to extract bibliographic data, including article titles, author names, affiliations, abstracts, and many others.



Journal: BMC Bioinformatics
Publication Date: Jun 9, 2011
Citations: 34
License type: CC BY 2.0
