Abstract
Extracting data from deep Web pages is a challenging problem because of the intricate underlying structure of such pages. A large number of techniques have been proposed to address this problem, but all of them have inherent limitations because they depend on the programming language of the Web page. The contents of Web pages, however, are always displayed in a regular manner for users to browse. This regularity suggests a different way to extract deep Web data, one that overcomes the limitations of previous work by exploiting visual features common to deep Web pages. In this paper, a vision-based approach that is independent of the Web page programming language is proposed. The approach uses the visual features of deep Web pages to extract data, covering both data record extraction and data item extraction. We also propose a new evaluation measure, revision, to capture the human effort needed to produce an exact extraction of the data. Our implementation, evaluated on a large set of Web databases, shows that the proposed vision-based approach is highly effective for data extraction from deep Web pages.