Abstract
With the advent of the big data era, enormous volumes of data are generated every second. Various data processing algorithms and architectures have been proposed to execute data mining algorithms more efficiently. One such task is extracting the most frequently occurring patterns from a transactional database. The dependency of transactions on time and location makes frequent itemset mining more complex. The present work aims to identify and extract frequent patterns from such time- and location-aware transactional data. Primarily, the spatio-temporal dependency of air quality data is leveraged to find frequently co-occurring pollutants across several locations in Delhi, the capital city of India. Various approaches have been proposed to extract frequent patterns efficiently, but this work suggests a generalized approach that can be applied to any numeric spatio-temporal transactional data, including air quality data. A comprehensive description of the algorithm, along with a running example on an air quality dataset, is presented. A detailed experimental evaluation is carried out on synthetically generated, benchmark, and real-world datasets, together with a comparison against spatio-temporal Apriori and other state-of-the-art non-Apriori-based algorithms. Results suggest that the proposed algorithm outperforms the existing approaches in terms of execution time and memory usage.
Highlights
The Web generates enormous volumes of heterogeneous data every second via sources such as social media, sensors, and business enterprises.
We propose a hashing-based spatio-temporal frequent itemset mining algorithm that can be applied to varied spatio-temporal datasets, including the air quality dataset.
With location and temporal information now associated with transactions, efficient algorithms are required for extracting frequent itemsets from such databases.
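As a rough illustration of mining frequent itemsets from location- and time-aware transactions, the sketch below counts itemsets in hash tables keyed by (location, time slot). It is not the algorithm proposed in this work, whose details are not given here. The function name `mine_frequent_itemsets`, the record layout, the per-partition definition of support, the `max_size` cap, and the sample readings are all assumptions made for illustration; numeric pollutant values are assumed to have already been converted into item labels.

```python
from collections import defaultdict
from itertools import combinations


def mine_frequent_itemsets(transactions, min_support, max_size=3):
    """Count itemsets per (location, time_slot) partition using hash tables.

    transactions: iterable of (location, time_slot, items) tuples (assumed layout).
    min_support:  minimum fraction of a partition's transactions that must
                  contain an itemset (assumed definition of support).
    Returns {(location, time_slot): {itemset_tuple: support}}.
    """
    # Outer hash table: partition key -> inner hash table: itemset -> count.
    counts = defaultdict(lambda: defaultdict(int))
    totals = defaultdict(int)

    for location, time_slot, items in transactions:
        key = (location, time_slot)
        totals[key] += 1
        # Sort so that each itemset has one canonical tuple form.
        unique_items = sorted(set(items))
        for size in range(1, max_size + 1):
            for itemset in combinations(unique_items, size):
                counts[key][itemset] += 1

    # Keep only itemsets that meet the support threshold in their partition.
    frequent = {}
    for key, itemset_counts in counts.items():
        n = totals[key]
        frequent[key] = {
            itemset: count / n
            for itemset, count in itemset_counts.items()
            if count / n >= min_support
        }
    return frequent


# Hypothetical sample: pollutants flagged at two monitoring sites.
sample = [
    ("site_A", "morning", ["PM2.5", "NO2"]),
    ("site_A", "morning", ["PM2.5", "NO2", "CO"]),
    ("site_A", "morning", ["PM2.5"]),
    ("site_B", "morning", ["O3", "NO2"]),
    ("site_B", "morning", ["O3"]),
]

result = mine_frequent_itemsets(sample, min_support=0.6)
```

Enumerating every combination up to `max_size` keeps the sketch short but grows quickly with transaction width; a practical miner would prune candidates rather than count them all.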
Summary
The Web generates enormous volumes of heterogeneous data every second via sources such as social media, sensors, and business enterprises. One kind of data focused upon in this work is the air quality dataset, in addition to other transactional and synthetically generated datasets. Association rule mining plays a critical role in applications such as market basket analysis, business [1], etc. Prominent among them are applications that generate spatio-temporal transactional data. Such datasets have the property that information generated at one place and time behaves differently from information generated at another place and time. To mine association rules among such databases, considering space-time information