XML has become a standard technology for exchanging a wide variety of data on the web because of its structure, tagging, portability, and extensibility. Querying XML documents efficiently has therefore become an urgent task. At present, most XML indexing and query methods rely on encoding the XML document tree, and many encoding schemes have been proposed. Region coding is the mainstream and most widely used approach; examples include Dietz coding, Li-Moon coding, Zhang coding, and Wan coding. This paper proposes an extended region coding built on region coding. The XML document tree is traversed in preorder, and the preorder numbers of all of a node's descendants are taken as that node's region. During a structural join, if the preorder number of a node falls within this region, the ancestor/descendant relationship is established. The extended region coding thus allows structural relationships to be determined efficiently without traversing the XML document tree. In addition, structural join algorithms for XML path queries have received considerable attention recently, and researchers have proposed several good algorithms. Stack-Tree-Desc is one of them: it scans the ancestor list and the descendant list once each to determine ancestor/descendant relationships, but it still scans nodes that take part in no join. If the element nodes in the ancestor and descendant lists that do not participate in the structural join can be skipped, query efficiency improves. This paper therefore proposes an improved algorithm, based on Stack-Tree-Desc, that introduces an index structure to avoid scanning unneeded nodes, so that a full sequential scan is no longer necessary and query time is reduced accordingly. The improved algorithm can quickly determine structural relationships using the extended region coding presented in this paper. Experiments are conducted to test the effectiveness of the extended region coding and the Indexed Stack-Tree-Desc algorithm.
The experimental results show that the proposed method is effective.
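The region test described in the abstract can be illustrated with a minimal Python sketch. It assumes the region of a node is the interval from its own preorder number to the largest preorder number in its subtree; the abstract does not give the exact encoding, so this layout and the names `encode`, `is_ancestor`, and `join` are hypothetical. The join shown is not the paper's Indexed Stack-Tree-Desc algorithm. It is a simpler binary-search join that only illustrates how an index over sorted preorder numbers lets non-participating nodes be skipped.

```python
from bisect import bisect_right
import xml.etree.ElementTree as ET


def encode(root):
    """Assign (tag, pre, last) to every element, in document order.

    pre  : preorder number of the element (1-based)
    last : largest preorder number in its subtree, so the element's
           descendants are exactly the nodes numbered pre+1 .. last
    (Assumed layout; the paper's actual code may differ.)
    """
    codes = []

    def visit(elem):
        pre = len(codes) + 1
        codes.append(None)  # reserve the slot for this element
        for child in elem:
            visit(child)
        # After the subtree is visited, len(codes) is its largest number.
        codes[pre - 1] = (elem.tag, pre, len(codes))

    visit(root)
    return codes


def is_ancestor(a, d):
    """True if code tuple a is a proper ancestor of code tuple d."""
    return a[1] < d[1] <= a[2]


def join(codes, anc_tag, desc_tag):
    """Return (ancestor_pre, descendant_pre) pairs for anc_tag//desc_tag.

    Binary search over the sorted descendant numbers jumps straight to
    the descendants inside each ancestor's region instead of scanning
    the whole descendant list.
    """
    ancestors = [c for c in codes if c[0] == anc_tag]
    desc_pres = [c[1] for c in codes if c[0] == desc_tag]  # already sorted
    pairs = []
    for _, pre, last in ancestors:
        lo = bisect_right(desc_pres, pre)   # first number greater than pre
        hi = bisect_right(desc_pres, last)  # one past the last number <= last
        pairs.extend((pre, d) for d in desc_pres[lo:hi])
    return pairs


doc = ET.fromstring("<a><b><c/></b><d><b/></d></a>")
codes = encode(doc)
print(codes)
print(join(codes, "d", "b"))
```

With this layout, deciding whether one node is an ancestor of another takes two integer comparisons and never touches the document tree, which is the property the abstract attributes to the extended region coding.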