Abstract

One of the main challenges in data-intensive fields such as scientific research, data mining, and machine learning is analyzing enormous datasets efficiently. The N-list is a popular data structure in similarity search algorithms for speeding up the retrieval of nearest neighbors. This paper presents EN-list, a high-performance method for mining frequent item sets. It represents item sets using N-lists and discovers frequent item sets directly using a set-enumeration search tree. In particular, it drastically reduces the search space by applying a powerful pruning technique known as Children-Parent Equivalency pruning. We conducted extensive experiments comparing EN-list against three state-of-the-art algorithms, Fin, PrePost, and DiffNodesets, on four real datasets. The experimental results show that EN-list is the fastest approach on all four datasets. Furthermore, EN-list is memory-efficient: it requires less memory than DiffNodesets and PrePost and only slightly more than Fin.

