Web Query Research Articles

With the development of network technology, web information is growing rapidly, and information storage is gradually shifting from HTML pages to databases, so web information can be divided into the surface web and the deep web. The deep web is defined in contrast to the surface web: it is the content of web pages that ordinary search engines have difficulty discovering. Traditional crawlers only crawl surface web content, so current search engines do not retrieve deep web data. Compared with the surface web, the deep web offers large volume, high quality, focused topics, and good structure. Because of these advantages, building deep web data integration systems is becoming a research hotspot. The deep web query interface is the only entrance to the back-end database, so determining which web forms are query interfaces is important for deep web information access. However, only a very small proportion of pages on the internet contain a query interface, so crawling with the traditional breadth-first strategy and keyword filtering downloads many unrelated pages and consumes a lot of resources. A deep web crawling strategy is therefore needed to find and collect query interfaces efficiently. We propose a novel query planning approach for executing different types of complex attribute queries over multiple inter-dependent deep web data sources. The approach accelerates query searching based on attribute selection and execution, and we propose optimization techniques including query plan merging and grouping optimization. Keywords: Novel query planning approach, Deep web, Semantic Deep Web, Ontologies, attribute.


The amount of information contained in databases available on the Web has grown explosively in recent years. This information, known as the Deep Web, is heterogeneous and dynamically generated by querying back-end (relational) databases through Web Query Interfaces (WQIs), which are a special type of HTML form. Accessing Deep Web information is a major challenge because it is usually not indexed by general-purpose search engines. Therefore, efficient mechanisms are needed to access, extract, and integrate the information contained in the Deep Web. Since WQIs are the only means of accessing the Deep Web, their automatic identification plays an important role: it helps traditional search engines increase their coverage and reach interesting information not available on the indexable Web. The accurate identification of Deep Web data sources is a key issue in the information retrieval process. In this paper we propose a new strategy for the automatic discovery of WQIs. The proposal selects HTML elements extracted from HTML forms and uses them in a set of heuristic rules that help identify WQIs. The strategy uses machine learning algorithms to classify HTML forms as searchable (WQI) or non-searchable (non-WQI), together with a prototype selection algorithm that removes irrelevant or redundant data from the training set. The internal content of Web Query Interfaces was analyzed to identify only those HTML elements that appear frequently and provide relevant information for WQI identification. For testing, we use three groups of datasets: two available at the UIUC repository and a new dataset, created using a generic crawler supported by human experts, that includes advanced and simple query interfaces. The experimental results show that the proposed strategy outperforms previously reported works.
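As a rough illustration of the idea of selecting HTML elements from forms and applying heuristic rules to them, the following Python sketch counts a few form elements and applies one simple rule. The feature set, the function names `extract_form_features` and `looks_searchable`, and the rule itself are assumptions made for this example; they are not the rules, features, or classifier used in the paper.

```python
from html.parser import HTMLParser


class FormFeatureParser(HTMLParser):
    """Collects simple element counts for each <form> in a page."""

    def __init__(self):
        super().__init__()
        self.forms = []       # one feature dict per completed form
        self._current = None  # features of the form being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {"text": 0, "password": 0, "select": 0, "submit": 0}
        elif self._current is not None:
            if tag == "input":
                # An <input> without a type attribute behaves as a text field.
                kind = (attrs.get("type") or "text").lower()
                if kind in ("text", "search"):
                    self._current["text"] += 1
                elif kind == "password":
                    self._current["password"] += 1
                elif kind == "submit":
                    self._current["submit"] += 1
            elif tag == "select":
                self._current["select"] += 1

    def handle_endtag(self, tag):
        if tag == "form" and self._current is not None:
            self.forms.append(self._current)
            self._current = None


def extract_form_features(html):
    """Return a list of feature dicts, one per form found in the HTML."""
    parser = FormFeatureParser()
    parser.feed(html)
    parser.close()
    return parser.forms


def looks_searchable(features):
    """Hypothetical rule: no password field, a submit control,
    and at least one text field or drop-down list."""
    if features["password"] > 0:
        return False
    has_query_field = (features["text"] + features["select"]) >= 1
    return features["submit"] >= 1 and has_query_field
```

In a learning-based setting, feature dictionaries like these could be turned into vectors and passed to a trained classifier in place of the hand-written rule; the rule here only stands in for that step.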


Articles published on Web Query

Exploiting location information for Web search

Efficient Processing of Semantic Web Queries in HBase and MySQL Cluster

Deep Web query interface schema matching based on matching degree and semantic similarity

Syndromic Surveillance for Outbreak Detection and Investigation

Query Intent Disambiguation of Keyword-Based Semantic Entity Search in Dataspaces

Mining subtopics from text fragments for a web query

Spatio-Temporal Web Sensors Using Web Queries vs. Documents

Evaluating Question Answering Over Linked Data

Efficient Approach for Knowledge Management Using Deep Web Information Retrieval System

Advance Frameworks for Hidden Web Retrieval Using Innovative Vision-Based Page Segmentation

A dynamic composition of ontology modules approach: application to web query reformulation

Moving towards Positive Security Model for Web Application Firewall

Head Lice Surveillance on a Deregulated OTC-Sales Market: A Study Using Web Query Data

Joint Top-K Spatial Keyword Query Processing

BENCHMARKING THE QUALITY OF GEOWEB: INFORMATION AND TACIT KNOWLEDGE ABOUT RESTAURANTS IN THREE ITALIAN CITIES

A Highest Sense Count Based Method for Disambiguation of Web Queries for Hindi Language Web Information Retrieval

Web Supported Query Taxonomy Classifier

Entity Synonyms for Structured Web Search

Automatic discovery of Web Query Interfaces using machine learning techniques

Data management on the spatial web
