Abstract

Ant colony optimization (ACO) algorithms have been successfully applied to identify classification rules in data mining. This paper proposes a new ant colony optimization algorithm, named hmAntMinerorder, for the hierarchical multilabel classification problem in protein function prediction. The proposed algorithm is characterized by an orderly roulette selection strategy that distinguishes the merits of the data attributes by ranking attribute importance during classification model construction. A new pheromone update strategy is introduced to prevent the algorithm from getting trapped in local optima, which leads to more efficient identification of classification rules. Comparison studies against other closely related algorithms on 16 publicly available datasets demonstrate the efficiency of the proposed algorithm.

Highlights

  • In the last few decades, various techniques have been successfully proposed to solve classification problems in the fields of machine learning and data mining [1,2,3].

  • This paper proposes a new ant colony optimization algorithm, named hmAntMinerorder, for the hierarchical multilabel classification problem in protein function prediction.

  • Most of the existing classification techniques are designed to handle data with binary or nominal class labels. They cannot handle problems with multiple class labels organized in a hierarchical structure (CHS) [4].


Summary

Introduction

In the last few decades, various techniques have been successfully proposed to solve classification problems in the fields of machine learning and data mining [1,2,3]. Most of the existing classification techniques are designed to handle data with binary or nominal class labels (where class labels are independent). They cannot handle problems with multiple class labels organized in a hierarchical structure (CHS) [4]. Such problems are commonly known as hierarchical classification, in contrast to one-level flat classification problems. Samples classified in the lower levels of the hierarchy must satisfy the parent-child relationships; that is, they must also fall within the corresponding parent classes. A sample can also be assigned to multiple classes that have no parent-child relationship with one another.
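The parent-child constraint described above can be illustrated with a minimal Python sketch. The class names, the `PARENT` map, and the helper functions are hypothetical and are not taken from the paper; the sketch also assumes a tree-shaped hierarchy in which every class has at most one parent. It shows only the label-consistency requirement, not the proposed algorithm.

```python
from typing import Dict, Optional, Set

# Hypothetical toy hierarchy: child -> parent (None marks a top-level class).
# Assumption: a tree, i.e. each class has at most one parent.
PARENT: Dict[str, Optional[str]] = {
    "1": None,
    "2": None,
    "1.1": "1",
    "1.2": "1",
    "1.1.1": "1.1",
    "2.1": "2",
}


def ancestors(label: str, parent: Dict[str, Optional[str]]) -> Set[str]:
    """Collect every ancestor of `label` by walking up the parent map."""
    result: Set[str] = set()
    current = parent[label]
    while current is not None:
        result.add(current)
        current = parent[current]
    return result


def close_under_ancestors(labels: Set[str],
                          parent: Dict[str, Optional[str]]) -> Set[str]:
    """Extend a label set so that it also contains all ancestor classes."""
    closed = set(labels)
    for label in labels:
        closed |= ancestors(label, parent)
    return closed


def is_consistent(labels: Set[str],
                  parent: Dict[str, Optional[str]]) -> bool:
    """True if every non-root label in the set has its parent in the set."""
    return all(parent[l] is None or parent[l] in labels for l in labels)


# One sample with two labels from unrelated branches (multilabel case).
predicted = {"1.1.1", "2.1"}
full = close_under_ancestors(predicted, PARENT)
print(sorted(full))
print(is_consistent(predicted, PARENT), is_consistent(full, PARENT))
```

In this sketch the sample belongs to two classes with no parent-child relationship between them ("1.1.1" and "2.1"), and the label set only becomes hierarchy-consistent once the ancestors of both are included.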

