A Novel Integrated Approach for Companion Vehicle Discovery Based on Frequent Itemset Mining on Spark

Abdulrahman Al-Badwi, Kamal Al-Sabahi, Mohammed Al-Habib, Zhe Long, Zuping Zhang

doi:10.1007/s13369-019-03831-9

Abstract

Companion vehicle discovery has received much attention from the research community. It has been widely adopted by traffic management departments for tasks such as the tracking of involved vehicles. Because massive volumes of traffic data contain complex and inaccurate accompanying-vehicle relationships, companion vehicle discovery has become a challenging yet active research topic. Several algorithms have been proposed to solve this problem on transactional datasets, some of which are based on frequent itemset mining algorithms that are used to extract knowledge from data in several real-world applications, such as market basket analysis, crime detection and prevention, and crowd mining. However, most of these algorithms perform poorly on large-scale datasets because they must scan the dataset iteratively several times, which makes them time-consuming and infeasible for big data. To this end, we propose a novel algorithm, HD-FIM, to extract companion vehicles from massive amounts of traffic data with high execution efficiency on the Spark platform. It uses a hybrid of depth-first and breadth-first search to handle big data on distributed clusters. Experimental results on practical companion vehicle extraction tasks show that HD-FIM outperforms typical existing frequent itemset mining algorithms and that it can be applied to large-scale traffic data.
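To make the transactional framing concrete, the following is a minimal single-machine sketch of how companion vehicle discovery can be cast as frequent itemset mining. It is not HD-FIM and does not use Spark: it is a plain level-wise (breadth-first, Apriori-style) search, shown only to illustrate the repeated dataset scans that the abstract identifies as the bottleneck. The record format `(plate, checkpoint, timestamp)`, the fixed time-window bucketing, and all function names are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_transactions(records, window_seconds):
    """Group sightings into transactions: one set of plates per
    (checkpoint, time window). Fixed windows are a simplification;
    vehicles on either side of a window boundary end up separated."""
    buckets = defaultdict(set)
    for plate, checkpoint, timestamp in records:
        buckets[(checkpoint, timestamp // window_seconds)].add(plate)
    return list(buckets.values())


def frequent_itemsets(transactions, min_support, max_size=3):
    """Level-wise search: each level rescans every transaction."""
    frequent = {}
    plates = sorted({p for t in transactions for p in t})
    candidates = [frozenset([p]) for p in plates]
    size = 1
    while candidates and size <= max_size:
        # One full pass over the data per level.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join step: merge pairs of frequent sets into sets one item larger.
        size += 1
        candidates = list(
            {a | b for a, b in combinations(level, 2) if len(a | b) == size}
        )
    return frequent


# Hypothetical sightings: (plate, checkpoint, timestamp in seconds).
records = [
    ("A", "cp1", 0), ("B", "cp1", 10), ("C", "cp1", 20),
    ("A", "cp2", 300), ("B", "cp2", 310),
    ("A", "cp3", 600), ("B", "cp3", 620), ("D", "cp3", 630),
    ("C", "cp4", 900), ("D", "cp4", 905),
]

transactions = build_transactions(records, window_seconds=60)
itemsets = frequent_itemsets(transactions, min_support=3)

# Companion groups are the frequent itemsets with at least two plates.
companions = {s: n for s, n in itemsets.items() if len(s) >= 2}
print(companions)
```

In this toy data, plates A and B pass three checkpoints within the same window, so they form the only companion group at a support threshold of 3. Each additional itemset size costs another full pass over the transactions, which is the cost that grows prohibitive on large traffic datasets.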
