Dense Databases Research Articles

Concise representations of sequential patterns, namely maximal sequential patterns, closed sequential patterns and sequential generator patterns, play an important role in data mining because they offer several benefits over the full set of sequential patterns. One of the most important is that their cardinalities are generally much smaller than that of the set of all sequential patterns. They can therefore be mined more efficiently and stored in less space, and the information they provide is easier for users to analyze. In addition, the set of all maximal sequential patterns can be used to recover the complete set of sequential patterns, while closed sequential patterns and sequential generators can be used together to generate non-redundant sequential rules and to quickly recover all sequential patterns and their frequencies. Several algorithms have been proposed to mine the concise representations separately; that is, each is designed to discover only one type of concise representation. However, mining them remains time-consuming and memory-intensive. To address this problem, we propose three novel efficient algorithms, FMaxSM, FGenCloSM and MaxGenCloSM, which respectively mine only maximal sequential patterns, simultaneously mine both closed sequential patterns and generators, and discover all three concise representations in the same process. To our knowledge, MaxGenCloSM is the first algorithm for concurrently mining the three concise representations of sequential patterns. The proposed algorithms are based on two novel local pruning strategies, LPMAX and LPMaxGenClo, which are designed to prune non-maximal, non-closed and non-generator patterns earlier and more efficiently at two and three successive levels of the prefix tree, without subsequence relation checking. Extensive experiments on real-life and synthetic databases show that FMaxSM, FGenCloSM and MaxGenCloSM are up to two orders of magnitude faster than state-of-the-art algorithms and consume much less memory, especially for low minimum support thresholds and for dense databases.
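To make the three concise representations concrete, the following is a minimal brute-force sketch of their standard definitions on a toy database. It is not FMaxSM, FGenCloSM or MaxGenCloSM, and it uses none of the pruning strategies named in the abstract. It assumes that each sequence is a tuple of single items (no multi-item itemsets) and that the empty pattern is excluded when testing generators; all function names are hypothetical.

```python
from itertools import combinations

def is_subsequence(pattern, sequence):
    """True if `pattern` occurs in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    return all(item in it for item in pattern)

def support(pattern, database):
    """Number of database sequences that contain `pattern`."""
    return sum(is_subsequence(pattern, seq) for seq in database)

def frequent_patterns(database, minsup):
    """Brute force: every frequent pattern is a subsequence of some input sequence."""
    candidates = set()
    for seq in database:
        for k in range(1, len(seq) + 1):
            for idx in combinations(range(len(seq)), k):
                candidates.add(tuple(seq[i] for i in idx))
    supports = {p: support(p, database) for p in candidates}
    return {p: s for p, s in supports.items() if s >= minsup}

def concise_representations(database, minsup):
    freq = frequent_patterns(database, minsup)

    def proper_super(p, q):
        """True if q is a proper super-sequence of p."""
        return len(q) > len(p) and is_subsequence(p, q)

    # Maximal: no frequent proper super-sequence.
    maximal = {p for p in freq
               if not any(proper_super(p, q) for q in freq)}
    # Closed: no proper super-sequence with the same support.
    closed = {p for p, s in freq.items()
              if not any(proper_super(p, q) and freq[q] == s for q in freq)}
    # Generator: no proper (non-empty) sub-sequence with the same support.
    generators = {p for p, s in freq.items()
                  if not any(proper_super(q, p) and freq[q] == s for q in freq)}
    return freq, maximal, closed, generators

# Toy database of four sequences (illustrative data only).
db = [("a", "b", "c"), ("a", "c"), ("a", "b", "c"), ("b", "c")]
freq, maximal, closed, generators = concise_representations(db, minsup=2)
print(len(freq), len(maximal), len(closed), len(generators))  # 7 1 4 4
```

On this toy database the seven frequent patterns collapse to a single maximal pattern, and every frequent pattern is a subsequence of it, which illustrates the recovery property mentioned above. The closed and generator sets each contain four patterns.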

Among the several reported methods for extracting association rules, a new evolutionary method named Genetic Network Programming (GNP) has shown its effectiveness for small databases, that is, databases with a relatively small number of attributes. However, the conventional GNP method cannot deal with large databases with a huge number of attributes, because its search space becomes very large and running time suffers. The aim of this paper is to propose a new method for extracting association rules from large and dense databases with a huge number of attributes by combining the conventional GNP-based mining method with a specially designed genetic algorithm (GA). Each of these evolutionary methods works at its own processing level, and the two are highly synchronized to act as one system. Our strategy consists of dividing a large and dense database into many small databases. These small databases are treated as individuals and form a population. The conventional GNP-based mining method is then applied to extract association rules for each individual. Finally, the population is evolved through several generations using GA with special genetic operators that take the acquired information into account. Two complementary processing levels are defined, the Global Level and the Local Level, each with its own independent tasks and processes. In the Global Level, mainly the GA process is carried out, whereas in the Local Level the conventional GNP-based mining method is carried out in parallel, with each run generating its own local pool of association rules. Several special genetic operations for GA in the Global Level are proposed, and the performance of each of them and of their combination is shown and compared. In our simulations, the conventional GNP-based mining method and our proposed method are compared using a real-world large and dense database with a huge number of attributes. The results show that extending the conventional GNP-based mining method with GA makes it possible to extract association rules from large and dense databases directly and more efficiently than with the conventional GNP method.
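The two-level idea can be sketched roughly as follows. This is an illustration under strong assumptions, not the authors' method. It assumes the database is divided by attributes, so each individual is a small subset of attribute indices. The local miner is a brute-force stand-in rather than GNP, and the fitness measure and mutation operator are invented for the example. All names and thresholds are hypothetical.

```python
import random
from itertools import permutations

def mine_rules(rows, attrs, min_support=0.5, min_conf=0.8):
    """Stand-in local miner (NOT GNP): brute-force rules a -> b over attribute pairs."""
    rules = set()
    n = len(rows)
    for a, b in permutations(attrs, 2):
        count_a = sum(row[a] for row in rows)
        count_ab = sum(row[a] and row[b] for row in rows)
        if count_a and count_ab / n >= min_support and count_ab / count_a >= min_conf:
            rules.add((a, b))
    return rules

def evolve(rows, n_attrs, pop_size=6, subset_size=3, generations=5, seed=0):
    """Global Level GA over attribute subsets; returns the accumulated rule pool."""
    rng = random.Random(seed)
    # Each individual is a small "database": a subset of attribute indices.
    population = [rng.sample(range(n_attrs), subset_size) for _ in range(pop_size)]
    global_pool = set()
    for _ in range(generations):
        # Local Level: mine each individual independently (could run in parallel).
        local_pools = [mine_rules(rows, ind) for ind in population]
        for pool in local_pools:
            global_pool |= pool
        # Global Level: rank individuals by rules found (assumed fitness).
        scored = sorted(zip((len(p) for p in local_pools), population),
                        key=lambda t: t[0], reverse=True)
        survivors = [ind for _, ind in scored[:max(1, pop_size // 2)]]
        # Assumed mutation: swap one attribute for one the individual lacks.
        children = []
        for i in range(pop_size - len(survivors)):
            child = list(survivors[i % len(survivors)])
            unused = [a for a in range(n_attrs) if a not in child]
            if unused:
                child[rng.randrange(subset_size)] = rng.choice(unused)
            children.append(child)
        population = survivors + children
    return global_pool

# Toy binary database: 6 rows, 6 attributes (illustrative data only).
rows = [
    (1, 1, 0, 1, 0, 1),
    (1, 1, 1, 0, 0, 1),
    (1, 1, 0, 1, 1, 0),
    (0, 0, 1, 1, 0, 1),
    (1, 1, 1, 0, 1, 1),
    (1, 1, 0, 1, 0, 1),
]
pool = evolve(rows, n_attrs=6)
print(sorted(pool))
```

Because each individual only sees a few attributes, the local search space stays small. The accumulated pool is always a subset of what the same stand-in miner would find on the full attribute set.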

Articles published on Dense Databases

An efficient strategy for mining high-efficiency itemsets in quantitative databases

Efficient mining of cross-level high-utility itemsets in taxonomy quantitative databases

Efficient algorithms for mining closed high utility itemsets in dynamic profit databases

Generating a Condensed Representation for Positive and Negative Association Rules

Frequent Itemset Mining in a Unique Scan using Transaction Database

Efficient algorithms for simultaneously mining concise representations of sequential patterns based on extended pruning conditions

A new framework for mining weighted periodic patterns in time series databases

Vertical Pattern Mining Algorithm for Multiple Support Thresholds

Scalable Indoor Localization via Mobile Crowdsourcing and Gaussian Process

Finding sequential patterns with TF-IDF metrics in health-care databases

Novel parallel method for association rule mining on multi-core shared memory systems

Efficient Algorithms for Mining Frequent Patterns from Sparse and Dense Databases

Research Methods in Child Language: A Practical Guide. By Erika Hoff, ed. First Edition. Blackwell Publishing Ltd., 2012. 362 pp. $39.72 (paperback).

Clustering local frequency items in multiple databases

An adaptive approach to mining frequent itemsets efficiently

TripleEye: Mining Closed Itemsets with Minimum Length Thresholds Based on Ordered Inclusion Tree

Mining Sequential Patterns in Dense Databases

Combination of Two Evolutionary Methods for Mining Association Rules in Large and Dense Databases

On pushing weight constraints deeply into frequent itemset mining

Comparative Association Rules Mining Using Genetic Network Programming (GNP) with Attributes Accumulation Mechanism and its Application to Traffic Systems
