ObjectRunner

Talel Abdessalem, Bogdan Cautis, Nora Derouiche

doi:10.14778/1920841.1921045

Abstract

We present ObjectRunner, a system for extracting, integrating and querying structured data from the Web. The system harvests real-world items from template-based HTML pages (the so-called structured Web). It illustrates a two-phase querying of the Web, in which an intentional description of the targeted data is first provided, in a flexible and widely applicable manner. ObjectRunner then follows a lightweight, best-effort approach that leverages both the input description and the source structure. This process is domain-independent, in the sense that it applies to any relation, flat or nested, that describes real-world items. We argue through our prototype that fully automatic extraction and integration of structured data can be done quickly and effectively when the redundancy of the Web meets knowledge about the to-be-extracted data. We present the technical details and the overall platform through several application scenarios on real-life Web sources.

Journal: Proceedings of the VLDB Endowment
Publication Date: Sep 1, 2010
Citations: 13

Similar Papers

A Framework for the Automatic Integration and Diagnosis of Building Energy Consumption Data.
Shuang Yuan ... Yun-Yi Zhang
Sensors | VOL. 21
17 Feb 2021

Automatic Extraction of Structured Web Data with Domain Knowledge
Nora Derouiche ... Talel Abdessalem
01 Apr 2012

QODI: Query as Context in Automatic Data Integration
Aibo Tian ... Juan F Sequeda
01 Jan 2013

Schema clustering and retrieval for multi-domain pay-as-you-go data integration systems
Hatem A Mahmoud ... Ashraf Aboulnaga
06 Jun 2010
