Hybrid in-database inference for declarative information extraction

Daisy Zhe Wang, Michael L Wick, Minos Garofalakis, Michael J Franklin, Joseph M Hellerstein

doi:10.1145/1989323.1989378

Abstract

In the database community, work on information extraction (IE) has centered on two themes: how to manage IE tasks effectively, and how to manage the uncertainties that arise in the IE process in a scalable manner. Recent work has proposed a declarative IE system based on a probabilistic database (PDB) that supports a leading statistical IE model, together with an inference algorithm for answering top-k-style queries over the probabilistic IE outcome. Still, the broader problem of effectively supporting general probabilistic inference inside a PDB-based declarative IE system remains open. In this paper, we explore in-database implementations of a wide variety of inference algorithms suited to IE, including two Markov chain Monte Carlo algorithms, the Viterbi algorithm, and the sum-product algorithm. We describe rules for choosing an appropriate inference algorithm based on the model, the query, and the text, considering the trade-off between accuracy and runtime. Based on these rules, we describe a hybrid approach that optimizes the execution of a single probabilistic IE query by applying different inference algorithms to different records, as appropriate. We show that our techniques can achieve up to 10-fold speedups compared to the non-hybrid solutions proposed in the literature.
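As a rough illustration of one of the algorithms the abstract names, the sketch below runs Viterbi decoding over a linear-chain sequence model in plain Python. It is not the in-database implementation the paper describes; the function name `viterbi`, the dictionary-based score tables, and the toy probabilities are all assumptions made for this example.

```python
import math

MISSING = -1e9  # log-score used for entries absent from the tables (assumption)

def viterbi(tokens, labels, start, trans, emit):
    """Return the highest-scoring label sequence and its log-score.

    start[label], trans[(prev, cur)] and emit[(label, token)] hold log-scores.
    """
    if not tokens:
        return [], 0.0
    # best[i][l]: best log-score of any labeling of tokens[:i+1] that ends in l
    best = [{l: start.get(l, MISSING) + emit.get((l, tokens[0]), MISSING)
             for l in labels}]
    back = []  # back[i][l]: best predecessor of label l at position i+1
    for tok in tokens[1:]:
        scores, pointers = {}, {}
        for cur in labels:
            prev = max(labels,
                       key=lambda p: best[-1][p] + trans.get((p, cur), MISSING))
            scores[cur] = (best[-1][prev] + trans.get((prev, cur), MISSING)
                           + emit.get((cur, tok), MISSING))
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final label.
    last = max(labels, key=lambda l: best[-1][l])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[-1][last]

# Toy model with made-up probabilities, stored as log-scores.
labels = ["NAME", "OTHER"]
start = {"NAME": math.log(0.5), "OTHER": math.log(0.5)}
trans = {("NAME", "NAME"): math.log(0.3), ("NAME", "OTHER"): math.log(0.7),
         ("OTHER", "NAME"): math.log(0.4), ("OTHER", "OTHER"): math.log(0.6)}
emit = {("NAME", "daisy"): math.log(0.9), ("OTHER", "daisy"): math.log(0.1),
        ("NAME", "wrote"): math.log(0.2), ("OTHER", "wrote"): math.log(0.8)}

path, score = viterbi(["daisy", "wrote"], labels, start, trans, emit)
```

Viterbi returns only the single best labeling. Queries that need marginal probabilities would instead call for sum-product or a sampling method, which is the kind of trade-off the abstract's algorithm-selection rules address.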

Similar Papers

Probabilistic declarative information extraction
Daisy Zhe Wang ... Michael J Franklin
01 Jan 2009

Optimizing Statistical Information Extraction Programs over Evolving Text
Fei Chen ... Min Wang
01 Apr 2012

InfoXtract
Rohini K Srihari ... Wei Li
01 Jan 2003

InfoXtract: A customizable intermediate level information extraction engine
Rohini K Srihari ... Thomas Cornell
Natural Language Engineering | VOL. 14
09 Jun 2006
