Information extraction meets relation databases

Davood Rafiei, Patrick Pantel, Andrei Broder, Edward Chang

doi:10.1145/1645953.1646067

Abstract

Information extraction from unstructured text has much in common with querying in database systems. Despite some differences in how data is modeled or represented, the general goal remains the same, i.e., to retrieve data or tag elements that satisfy some user-specified constraints. In recent years, the two paradigms have moved much closer, thanks to the large volume of data on the World Wide Web, the need for more automated search tools for information extraction, and often the need to relate the extracted pieces. Several developments have contributed to the growth of the area, including the work on named entity recognition (marked by MUC-6 and subsequent conferences) and natural language processing, Web information retrieval and mining, and Web query languages inspired by the query languages of the relational world. This panel explores the areas where the two paradigms overlap, the impacts and contributions they have had on each other, and the areas that may be open for further research. The panel will bring together researchers who have worked in established areas closely related to extracting structured information from unstructured text. In the first (role-playing) round, each panelist will take a strong side on where the intersection is heading, arguing that one area will subsume the other in the near future. In the second round, the panelists will counter one or two others, pointing out the challenges that one area would face in subsuming the other and the implications for future research directions.

Full Text