Abstract

An efficient mining algorithm for discovering recent frequent XML user query patterns is crucial, as many applications use XML to represent data in their disciplines over the Internet. These recent frequent XML user query patterns can be used to design an index mechanism or can be cached, thus enhancing XML query performance. Several XML query pattern stream mining algorithms have been proposed that record user queries in the system and discover the recent frequent XML query patterns over a stream. Using these recent frequent XML query patterns improves query performance over XML data streams. In this paper, user queries are modeled as a stream of XML queries, and the recent frequent XML query patterns are mined over that stream. Data-stream mining differs from traditional data mining in that its input is a data stream, whereas traditional data mining focuses on static databases. To facilitate the one-pass mining process, two novel schemes (XstreamCode and XstreamList) are devised for the mining algorithm (X2StreamMiner). X2StreamMiner not only reduces memory usage but also improves mining performance. Simulation results show that the X2StreamMiner algorithm is both efficient and scalable. This paper makes two major contributions. First, novel schemes are proposed to encode and store the information of user queries in an XML query stream. Second, based on these two schemes, an efficient XML query stream mining algorithm, X2StreamMiner, is proposed to discover the recent frequent XML query patterns.
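
To make the notion of "recent frequent" patterns over a query stream concrete, the following minimal sketch counts queries in a single pass over a sliding window of the most recent queries. It is not X2StreamMiner and does not implement XstreamCode or XstreamList, whose details the abstract does not give. The class name, the sliding-window model, the `window_size` and `min_support` parameters, and the simplification of treating each whole query string as one pattern are all assumptions made for illustration.

```python
from collections import Counter, deque


class RecentFrequentQueries:
    """Hypothetical one-pass counter over the most recent `window_size` queries.

    Assumption: each query string is treated as a single pattern; no
    subtree or sub-pattern enumeration is attempted.
    """

    def __init__(self, window_size, min_support):
        self.window_size = window_size    # number of recent queries kept
        self.min_support = min_support    # fraction of the window, e.g. 0.5
        self.window = deque()             # recent queries, oldest first
        self.counts = Counter()           # occurrences inside the window

    def add(self, query):
        """Process one incoming query; each query is seen exactly once."""
        self.window.append(query)
        self.counts[query] += 1
        if len(self.window) > self.window_size:
            expired = self.window.popleft()
            self.counts[expired] -= 1
            if self.counts[expired] == 0:
                del self.counts[expired]  # keep memory bounded by the window

    def frequent(self):
        """Return the queries whose count meets the support threshold."""
        threshold = self.min_support * len(self.window)
        return sorted(q for q, c in self.counts.items() if c >= threshold)


miner = RecentFrequentQueries(window_size=4, min_support=0.5)
for q in ["/a/b", "/a/c", "/a/b", "/a/b", "/a/d"]:
    miner.add(q)
print(miner.frequent())
```

Because expired queries are subtracted as they leave the window, a pattern that was frequent earlier in the stream stops being reported once it is no longer recent, which is the property that separates stream mining from mining a static database.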
