Abstract

In this paper, we propose a new algorithm for efficiently mining association rules in a corpus. Compared with classical transactional association rule mining problems, a corpus contains a large number of items and far more itemsets, so traditional association rule mining algorithms cannot handle it efficiently. To address this issue, we design a new algorithm that combines inverted hashing with the advantages of the FP-Growth structure and takes the characteristics of a corpus into account. Experimental results demonstrate that the new algorithm achieves a substantial improvement in performance.

Keywords: Text mining; Association rules; Inverted hashing; Apriori algorithm

