Data Extraction for Deep Web Using WordNet

Jer Lang Hong

doi:10.1109/tsmcc.2010.2089678

Abstract

Our survey shows that the techniques used to extract data from the deep web need to be improved before automatic wrappers can achieve the required efficiency and accuracy. Further investigation indicates that a lightweight ontological technique built on an existing lexical database for English (WordNet) can check the similarity of data records and detect the correct data region with higher precision by using the semantic properties of these data records. The method has two advantages: it can extract three types of data records (single-section, multiple-section, and loosely structured data records), and it provides options for aligning iterative and disjunctive data items. Experimental results show that our technique is robust and performs better than existing state-of-the-art wrappers. Tests also show that our wrapper can extract data records from multilingual web pages and that it is domain independent.
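As a rough illustration of the idea in the abstract (scoring candidate data regions by the semantic similarity of their records), the sketch below may help. It is not the paper's algorithm: the tiny `HYPERNYM` table is a hypothetical stand-in for WordNet, the path-based score is one common choice among several, and all function names and sample records are assumptions made for this example.

```python
from itertools import combinations

# Toy hypernym links (child -> parent). Hypothetical stand-in for WordNet;
# a real wrapper would look these relations up in the lexical database.
HYPERNYM = {
    "price": "attribute",
    "cost": "attribute",
    "title": "attribute",
    "author": "person",
    "writer": "person",
    "person": "entity",
    "attribute": "entity",
    "login": "action",
    "search": "action",
    "action": "entity",
}


def ancestors(word):
    """Return the chain [word, parent, grandparent, ...] up to the root."""
    chain = [word]
    while chain[-1] in HYPERNYM:
        chain.append(HYPERNYM[chain[-1]])
    return chain


def word_similarity(a, b):
    """Path-based score 1 / (1 + edges between a and b); 0.0 if unrelated."""
    up_a, up_b = ancestors(a), ancestors(b)
    for steps_a, node in enumerate(up_a):
        if node in up_b:
            return 1.0 / (1 + steps_a + up_b.index(node))
    return 0.0


def record_similarity(rec_a, rec_b):
    """Average, over the words in rec_a, of the best match found in rec_b."""
    if not rec_a or not rec_b:
        return 0.0
    best = (max(word_similarity(a, b) for b in rec_b) for a in rec_a)
    return sum(best) / len(rec_a)


def region_score(records):
    """Mean pairwise record similarity inside one candidate region."""
    pairs = list(combinations(records, 2))
    if not pairs:
        return 0.0
    return sum(record_similarity(a, b) for a, b in pairs) / len(pairs)


def pick_data_region(regions):
    """Return the name of the candidate region with the highest score."""
    return max(regions, key=lambda name: region_score(regions[name]))


# Hypothetical candidate regions, each a list of records (lists of labels).
regions = {
    "results": [["title", "author", "price"], ["title", "writer", "cost"]],
    "navigation": [["login"], ["price"]],
}
print(pick_data_region(regions))
```

In this toy setup the records in the result list share related labels, so that region scores higher than the navigation block, whose items have little in common.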



Journal: IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews)
Publication Date: Nov 1, 2011
Citations: 62

Similar Papers

A Deep Web Data Extraction and Application System Based on Cloud Technology
Zi Yang Han ... Feng Ying Wang
Advanced Materials Research | VOL. 756-759
01 Sep 2013

Automated Data Extraction with Multiple Ontologies
Jer Lang Hong
International Journal of Grid and Distributed Computing | VOL. 9
30 Jun 2016

Aligning Data Records Using WordNet
Jer Lang Hong ... Simon Egerton
01 Jan 2009

Information extraction for search engines using fast heuristic techniques
Jer Lang Hong ... Simon Egerton
Data & Knowledge Engineering | VOL. 69
24 Oct 2009
