Mining of Association Rules in Text Databases Using Inverted Hashing and Pruning

John D Holt, Soon M Chung

doi:10.1007/3-540-44466-1_29

Publication Date: Jan 1, 2000

Citations: 7

#Text Databases #Existing Mining Algorithms

Abstract

In this paper, we propose a new algorithm, Inverted Hashing and Pruning (IHP), for mining association rules between words in text databases. Text databases differ markedly from retail transaction databases, and existing mining algorithms cannot handle them efficiently because of the large number of itemsets (i.e., sets of words) that must be counted. Two well-known mining algorithms, the Apriori algorithm [1] and the Direct Hashing and Pruning (DHP) algorithm [5], are evaluated in the context of mining text databases and compared with the proposed IHP algorithm. The results show that the IHP algorithm performs better for large text databases.
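
To make the counting problem concrete, the following is a minimal sketch of Apriori-style counting of frequent word pairs, treating each document as a transaction and each distinct word as an item. It is not the IHP algorithm, whose details the abstract does not give. The function name `frequent_word_pairs`, the `min_support` parameter, and the sample documents are assumptions introduced here for illustration only.

```python
from collections import Counter
from itertools import combinations


def frequent_word_pairs(documents, min_support):
    """Return word pairs that occur together in at least min_support documents.

    Illustrative Apriori-style sketch (hypothetical helper, not IHP).
    """
    # Pass 1: count how many documents contain each word.
    word_counts = Counter()
    doc_word_sets = []
    for text in documents:
        words = set(text.lower().split())
        doc_word_sets.append(words)
        word_counts.update(words)

    frequent_words = {w for w, c in word_counts.items() if c >= min_support}

    # Pass 2: count only pairs made of frequent words. This is the Apriori
    # pruning step: a pair cannot be frequent unless both words are frequent.
    pair_counts = Counter()
    for words in doc_word_sets:
        kept = sorted(words & frequent_words)
        pair_counts.update(combinations(kept, 2))

    return {pair: c for pair, c in pair_counts.items() if c >= min_support}


# Made-up sample documents, for illustration only.
docs = [
    "data mining finds patterns",
    "text mining uses data",
    "association rules in data mining",
]
pairs = frequent_word_pairs(docs, min_support=2)
print(pairs)
```

Even in this toy setting, the number of candidate pairs grows roughly with the square of the number of frequent words per document, which hints at why a large vocabulary makes counting expensive in text databases.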
