ABSTRACT

Building on a study of the ant colony algorithm, this paper proposes an Ant Colony Decision Rule (ACDR) algorithm for rule mining that combines ideas from ant colonies with decision-tree rule mining. The algorithm is then applied to rule mining and compared with C4.5, and the results are shown by simulation.

INTRODUCTION

In general, the goal of data mining is to extract knowledge from data. Data mining is an interdisciplinary field whose core lies at the intersection of machine learning, statistics and databases (Quinlan, 1986). There are several data mining tasks, including classification, regression, clustering and dependence modeling (Quinlan, 1993). Each of these tasks can be regarded as a kind of problem to be solved by a data mining algorithm. Therefore, the first step in designing a data mining algorithm is to define which task the algorithm will address.

Recently, many mining tools, such as neural networks, genetic algorithms, decision trees and rule inference, have been used to predict future developments [1] and to help people make good decisions. However, these methods have shortcomings, such as incomprehensible results, overfitted rules and difficulty of application in distributed simulation.

In this paper we propose an Ant Colony Optimization (ACO) algorithm (Dorigo, Maniezzo & Colorni, 1996; Stutzle & Hoos, 1997; de A. Silla & Ramalho, 2001) for the classification task of data mining. In this task the goal is to assign each case (object, record, or instance) to one class, out of a set of predefined classes, based on the values of some attributes (called predictor attributes) for the case.

In the context of the classification task of data mining, discovered knowledge is often expressed in the form of IF-THEN rules, as follows: IF <conditions> THEN <class>. The rule antecedent (IF part) contains a set of conditions, usually connected by a logical conjunction operator (AND).
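The IF-THEN rule form described above can be illustrated with a minimal Python sketch. This is not the paper's implementation: the names `Rule`, `covers` and `classify` are hypothetical, conditions are assumed to be simple equality tests on categorical attributes, and the ordered rule list with a default class is an assumption made only for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """IF all conditions hold THEN predict `predicted_class` (hypothetical type)."""
    conditions: tuple  # (attribute, value) pairs, implicitly joined by AND
    predicted_class: str

    def covers(self, case: dict) -> bool:
        # The antecedent is a conjunction: every condition must be satisfied.
        return all(case.get(attr) == value for attr, value in self.conditions)


def classify(rules, case, default_class):
    """Return the class of the first rule that covers the case.

    Trying rules in order and falling back to a default class is an
    assumption of this sketch, not something stated in the text.
    """
    for rule in rules:
        if rule.covers(case):
            return rule.predicted_class
    return default_class
```

For instance, with the made-up rule `Rule((("outlook", "sunny"), ("windy", "no")), "play")`, a case is assigned the class `play` only when both conditions match; any other case falls through to the default class.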
In this paper we will refer to each rule condition as a term, so that the rule antecedent is a logical conjunction of terms in the form: IF term1 AND term2 AND ... Each term is a triple <attribute, operator, value>. The rule consequent (THEN part) specifies the class predicted for cases whose predictor attributes satisfy all the terms specified in the rule antecedent. From a data mining viewpoint, this kind of knowledge representation has the advantage of being intuitively comprehensible for the user, as long as the number of discovered rules and the number of terms in rule antecedents are not large.

In this paper, the ACDR algorithm is applied to mining distributed databases. The simulation results show that ACDR is a feasible and correct algorithm for distributed mining.

THE ANT COLONY OPTIMIZATION ALGORITHM

The ant colony optimization technique has emerged recently as a novel meta-heuristic that belongs to the class of problem-solving strategies derived from nature (other categories include neural networks, simulated annealing, and evolutionary algorithms). The ant system optimization algorithm is basically a multi-agent system where low-level interactions between single agents (i.e., artificial ants) result in a complex behavior of the whole ant colony. Ant system optimization algorithms have been inspired by colonies of real ants, which deposit a chemical substance (called pheromone) on the ground. It was found that the medium used to communicate information among individuals regarding paths, and used to decide where to go, consists of pheromone trails. A moving ant lays some pheromone (in varying quantities) on the ground, thus marking the path with a trail of this substance. While an isolated ant moves essentially at random, an ant encountering a previously laid trail can detect it and decide with high probability to follow it, thus reinforcing the trail with its own pheromone. …
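The trail-following behavior described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's algorithm: the function names are hypothetical, the choice probability is assumed to be directly proportional to the pheromone level, each ant deposits a fixed amount, and evaporation is left out.

```python
import random


def choice_probabilities(pheromone):
    """Probability of following each path, proportional to its pheromone level."""
    total = sum(pheromone)
    return [p / total for p in pheromone]


def choose_path(pheromone, rng):
    """Pick a path index at random, biased toward stronger trails."""
    return rng.choices(range(len(pheromone)), weights=pheromone, k=1)[0]


def simulate(n_ants, pheromone, deposit=1.0, seed=0):
    """Let ants pick paths one after another, each reinforcing its chosen trail.

    The fixed `deposit` and the absence of evaporation are simplifying
    assumptions of this sketch.
    """
    rng = random.Random(seed)
    pheromone = list(pheromone)  # work on a copy, leave the input untouched
    for _ in range(n_ants):
        path = choose_path(pheromone, rng)
        pheromone[path] += deposit  # the ant lays pheromone on the path it took
    return pheromone
```

Calling `simulate(100, [1.0, 1.0])` starts from two equally marked paths; because each choice strengthens the chosen trail, early random choices tend to be amplified, which is the positive feedback the text describes.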