Some web applications use an ontology to integrate multiple data sources, because ontology-based data integration is an ideal solution for handling semantic conflicts. Accessing the data in an ontology requires an ontology query language such as SPARQL. End users, however, enter unstructured input (words, statements, etc.) when they search the web for the information they need. It is therefore necessary to extract the triplets (i.e. subjects, predicates and objects) from the input query in order to build the SPARQL query that browses the ontology. Although many triplet extraction algorithms exist, they either cannot fully identify all triple patterns in the incoming query or are time-consuming. The algorithm proposed in this paper addresses this triplet incompleteness problem. The aim of the system is to extract the specific triplets from the incoming query and to add the information needed to support SPARQL query generation in a time-saving manner.

[...] triple patterns from the user input query. The mediator then automatically generates the SPARQL query that browses the ontology, using the triplets obtained from the triplet extraction algorithm. Once the SPARQL query has been generated, the data held in the ontology can be retrieved and returned to the user. An ontology-based data integration system has three main processing phases: ontology creation, ontology mapping and query service. As described above, extracting the triplets from the user query is one of the most important tasks of the query service. When a user query is submitted to the system, it must be translated into the ontology query language SPARQL. SPARQL has a graph-based structure, and a SPARQL query can be built by combining the triple patterns extracted from the user input query (2). If the user input query is unstructured, it is more difficult to determine which words are subjects and which are objects.
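To make the idea of building a SPARQL query by combining triple patterns concrete, the following is a minimal sketch in Python. It is not the algorithm proposed in this paper: the function names (`format_term`, `build_sparql`), the `ex:` namespace, and the sample triplets for a query such as "books written by Orwell" are all assumptions chosen for illustration. It only shows how already-extracted (subject, predicate, object) triplets could be assembled into a basic graph pattern.

```python
from typing import Dict, List, Sequence, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def format_term(term: str) -> str:
    """Render one term of a triple pattern.

    Simplifying assumption: variables start with '?', prefixed names
    contain ':', and anything else is treated as a plain string literal.
    A literal that itself contains ':' would be misclassified here.
    """
    if term.startswith("?") or ":" in term:
        return term
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_sparql(triples: Sequence[Triple], prefixes: Dict[str, str]) -> str:
    """Combine triple patterns into a single SELECT query."""
    # Collect variables in order of first appearance for the SELECT clause.
    variables: List[str] = []
    for triple in triples:
        for term in triple:
            if term.startswith("?") and term not in variables:
                variables.append(term)

    lines = [f"PREFIX {name}: <{iri}>" for name, iri in prefixes.items()]
    lines.append("SELECT " + (" ".join(variables) if variables else "*"))
    lines.append("WHERE {")
    for s, p, o in triples:
        lines.append(f"  {format_term(s)} {format_term(p)} {format_term(o)} .")
    lines.append("}")
    return "\n".join(lines)


# Hypothetical namespace and triplets, used only for this example.
EXAMPLE_PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "ex": "http://example.org/onto#",
}
EXAMPLE_TRIPLES: List[Triple] = [
    ("?book", "rdf:type", "ex:Book"),
    ("?book", "ex:author", "?author"),
    ("?author", "ex:name", "Orwell"),
]

if __name__ == "__main__":
    print(build_sparql(EXAMPLE_TRIPLES, EXAMPLE_PREFIXES))
```

Each triplet becomes one line of the `WHERE` block, and shared variables such as `?book` and `?author` join the patterns into a graph. This is why an incomplete triplet (a missing subject or object) prevents a usable query from being generated.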
The proposed algorithm can identify the subjects, predicates and objects in an unstructured query exactly. Many triplet extraction algorithms currently exist to supply triple patterns to the applications that need them. They implement the triplet extraction process on the basis of the parse tree generated by a parser. In the proposed algorithm, the triple patterns in the user input query are instead extracted with the help of a domain-specific ontology and the machine-readable dictionary WordNet, without using a parser. As a result, the processing time of this algorithm is lower than that of parser-based approaches.

This paper is organized as follows. Section II presents the related work. Section III presents the system design in detail. Section IV describes the proposed triplet extraction algorithm. Section V demonstrates sample query testing and compares the results of the proposed algorithm with those of competing algorithms. Section VI concludes with some remarks.