Building a generic debugger for information extraction pipelines

Anish Das Sarma, Philip Bohannon, Alpa Jain

doi:10.1145/2063576.2063933

Open Access

Publication Date: Oct 24, 2011

Citations: 13

Affiliation: Yahoo (United States)

#Information Extraction Pipelines #Extraction Pipelines

Abstract

Complex information extraction (IE) pipelines are becoming an integral component of most text processing frameworks. We introduce a first system to help IE users analyze extraction pipeline semantics and operator transformations interactively while debugging. This interactivity keeps the debugging effort proportional to the need and lets users focus on the portions of the pipeline under the greatest suspicion. We present a generic debugger for running post-execution analysis of any IE pipeline consisting of arbitrary types of operators. To support this, we propose an effective provenance model for IE pipelines that captures a variety of operator types, ranging from operators with full specifications to those with none. We have evaluated our proposed algorithms and provenance model on large-scale real-world extraction pipelines.
