TwigStack-MR: An Approach to Distributed XML Twig Query Using MapReduce

Hongjie Fan, Junfei Liu, Han Yang, Zhiyi Ma

doi:10.1109/bigdatacongress.2016.79

Abstract

Twig pattern matching is a core operation in XML query processing and directly affects the efficiency of XML data queries. Handling massive XML data is challenging, especially on a distributed cluster: the system must guarantee complete and correct query results while minimizing communication costs between machines. In this paper, we present TwigStack-MR, which processes several twig pattern queries simultaneously over a massive volume of XML data using the MapReduce framework. We first split the large-scale XML data file into file splits that serve as input to the distributed storage system. We then present the distributed twig algorithm, which processes different subtrees of the document tree in parallel. Finally, we use the MapReduce framework to exploit the characteristics of the distributed environment and process twig queries efficiently. The experimental results show that our approach is efficient and scalable for this task.
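The pipeline outlined in the abstract (split the document, match twigs on subtrees independently, merge the partial results) can be illustrated with a small single-process sketch. This is not the TwigStack-MR algorithm from the paper: it uses a naive recursive matcher instead of the stack-based TwigStack join, supports parent-child edges only, and simulates the map and reduce phases with plain Python functions. The pattern encoding, the `id` attribute, and all function names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical twig encoding: (tag, [child patterns]); parent-child edges only.

def matches(elem, pattern):
    """Return True if `elem` roots a match of `pattern`."""
    tag, children = pattern
    if elem.tag != tag:
        return False
    # Every child pattern must be matched by some direct child of elem.
    return all(any(matches(c, child) for c in elem) for child in children)

def split_document(xml_text):
    """Split phase: serialize each child subtree of the root independently."""
    root = ET.fromstring(xml_text)
    return [ET.tostring(child, encoding="unicode") for child in root]

def map_split(split_text, queries):
    """Map phase: emit (query name, element id) for every match in one split."""
    subtree = ET.fromstring(split_text)
    return [(name, e.get("id"))
            for e in subtree.iter()
            for name, pattern in queries.items()
            if matches(e, pattern)]

def reduce_pairs(pairs):
    """Reduce phase: group the emitted element ids by query name."""
    grouped = defaultdict(list)
    for name, elem_id in pairs:
        grouped[name].append(elem_id)
    return dict(grouped)

def twig_queries(xml_text, queries):
    pairs = []
    for split_text in split_document(xml_text):  # each split could run on its own worker
        pairs.extend(map_split(split_text, queries))
    return reduce_pairs(pairs)

DOC = ("<lib>"
       "<shelf><book id='b1'><title>A</title><author>X</author></book>"
       "<book id='b2'><title>B</title></book></shelf>"
       "<shelf><book id='b3'><title>C</title><author>Y</author></book></shelf>"
       "</lib>")
QUERIES = {
    "q1": ("book", [("title", []), ("author", [])]),  # book with title and author
    "q2": ("book", [("title", [])]),                  # book with title
}
result = twig_queries(DOC, QUERIES)
print(result)  # {'q1': ['b1', 'b3'], 'q2': ['b1', 'b2', 'b3']}
```

Because this sketch cuts the document at the root's children, a match rooted at the document root itself would be missed. Handling matches that cross split boundaries is the kind of completeness concern the abstract raises, and the sketch does not attempt to address it.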

Similar Papers

Distributed XPath query processing over large XML data based on MapReduce framework
Hongjie Fan ... Junfei Liu
01 Aug 2016

Handling distributed XML queries over large XML data based on MapReduce framework
Hongjie Fan ... Junfei Liu
Information Sciences | VOL. 453
11 Apr 2018

HadoopXML
Hyebong Choi ... Yoon-Joon Lee
29 Oct 2012

Parallel labeling of massive XML data with MapReduce
Hyebong Choi ... Kyong-Ha Lee
The Journal of Supercomputing | VOL. 67
29 Aug 2013
