Conventional Classification Algorithms Research Articles

Text mining is an important research direction that spans several fields, such as information retrieval, information extraction, and text categorization. In this paper, we propose an efficient multiple classifier approach to text categorization based on swarm-optimized topic modelling. Latent Dirichlet allocation (LDA) can overcome the high dimensionality problem of the vector space model, but identifying appropriate parameter values is critical to the performance of LDA. The swarm-optimized approach estimates the parameters of LDA, including the number of topics and all the other parameters involved in LDA. The hybrid ensemble pruning approach, based on combined diversity measures and clustering, aims to obtain a multiple classifier system with high predictive performance and better diversity. In this scheme, four different diversity measures among the classifiers of the ensemble (namely, the disagreement measure, the Q-statistic, the correlation coefficient, and the double fault measure) are combined. Based on the combined diversity matrix, a swarm intelligence based clustering algorithm is employed to partition the classifiers into a number of disjoint groups, and one classifier (the one with the highest predictive performance) is selected from each cluster to build the final multiple classifier system. Experiments were conducted on five biomedical text benchmarks. In the swarm-optimized LDA, different metaheuristic algorithms (such as genetic algorithms, particle swarm optimization, the firefly algorithm, the cuckoo search algorithm, and the bat algorithm) are considered. In the ensemble pruning, five metaheuristic clustering algorithms are evaluated. The experimental results indicate that swarm-optimized LDA yields better predictive performance than conventional LDA. In addition, the proposed multiple classifier system outperforms conventional classification algorithms, ensemble learning methods, and ensemble pruning methods.
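The four pairwise diversity measures named in this abstract have standard definitions in terms of how often two classifiers are jointly right or wrong. The following is a minimal sketch of those textbook definitions, not the authors' implementation: the function name `pairwise_diversity` and the boolean-list input format are assumptions made for illustration, and the abstract does not say how the four values are combined, so the sketch stops at computing them.

```python
from math import sqrt


def pairwise_diversity(correct_a, correct_b):
    """Compute four pairwise diversity measures for two classifiers.

    correct_a, correct_b: equal-length lists of booleans, True where the
    respective classifier labelled the sample correctly (hypothetical
    input format chosen for this sketch).
    """
    n11 = n10 = n01 = n00 = 0
    for a, b in zip(correct_a, correct_b):
        if a and b:
            n11 += 1  # both correct
        elif a:
            n10 += 1  # only the first classifier is correct
        elif b:
            n01 += 1  # only the second classifier is correct
        else:
            n00 += 1  # both wrong
    n = n11 + n10 + n01 + n00

    cross = n11 * n00 - n01 * n10
    q_den = n11 * n00 + n01 * n10
    rho_den = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))

    return {
        "disagreement": (n01 + n10) / n,
        # Degenerate denominators are mapped to 0.0 (an assumption).
        "q_statistic": cross / q_den if q_den else 0.0,
        "correlation": cross / rho_den if rho_den else 0.0,
        "double_fault": n00 / n,
    }


if __name__ == "__main__":
    a = [True, True, False, False]
    b = [True, False, True, False]
    print(pairwise_diversity(a, b))
```

Lower Q-statistic, correlation, and double fault values, and higher disagreement, indicate a more diverse pair of classifiers.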


Purpose: The immense quantity of available unstructured text documents serves as one of the largest sources of information. Text classification is an essential task for many purposes in information retrieval, such as document organization, text filtering and sentiment analysis. Ensemble learning has been extensively studied to construct efficient text classification schemes with higher predictive performance and generalization ability. The purpose of this paper is to provide diversity among the classification algorithms of an ensemble, which is a key issue in ensemble design.

Design/methodology/approach: An ensemble scheme based on hybrid supervised clustering is presented for text classification. In the presented scheme, supervised hybrid clustering, which is based on the cuckoo search algorithm and k-means, is introduced to partition the data samples of each class into clusters so that training subsets with higher diversity can be provided. Each classifier is trained on the diversified training subsets, and the predictions of the individual classifiers are combined by the majority voting rule. The predictive performance of the proposed classifier ensemble is compared to conventional classification algorithms (such as Naïve Bayes, logistic regression, support vector machines and the C4.5 algorithm) and ensemble learning methods (such as AdaBoost, bagging and random subspace) using 11 text benchmarks.

Findings: The experimental results indicate that the presented classifier ensemble outperforms the conventional classification algorithms and ensemble learning methods for text classification.

Originality/value: The presented ensemble scheme is the first to use supervised clustering to obtain a diverse ensemble for text classification.
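The pipeline described in this abstract (cluster each class separately, build training subsets from the clusters, combine predictions by majority vote) can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's method: plain Lloyd's k-means stands in for the cuckoo search and k-means hybrid, the rule that subset i collects cluster i of every class is an assumption (the abstract does not specify how subsets are assembled), and all function names are hypothetical.

```python
from collections import Counter, defaultdict


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means on tuples; the first k points seed the centroids."""
    centroids = [tuple(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        assign = [
            min(
                range(len(centroids)),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assign


def diversified_subsets(X, y, k):
    """Cluster each class on its own, then pool cluster i of every class
    into training subset i (the pooling rule is an assumption)."""
    by_class = defaultdict(list)
    for x, label in zip(X, y):
        by_class[label].append(x)
    subsets = [[] for _ in range(k)]
    for label, pts in by_class.items():
        assign = kmeans(pts, min(k, len(pts)))
        for p, a in zip(pts, assign):
            subsets[a].append((p, label))
    return subsets


def majority_vote(predictions):
    """Return the label predicted by the largest number of classifiers."""
    return Counter(predictions).most_common(1)[0][0]


if __name__ == "__main__":
    X = [(0.0,), (0.1,), (5.0,), (5.1,), (10.0,), (10.1,), (15.0,), (15.1,)]
    y = ["a", "a", "a", "a", "b", "b", "b", "b"]
    for i, subset in enumerate(diversified_subsets(X, y, k=2)):
        print(i, subset)
    print(majority_vote(["a", "b", "a"]))
```

Each subset would then be used to train one base classifier, and `majority_vote` would be applied to the list of their predictions for a given document.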



Articles published on Conventional Classification Algorithms

When Collective Knowledge Meets Crowd Knowledge in a Smart City: A Prediction Method Combining Open Data Keyword Analysis and Case-Based Reasoning.

Convolutional neural network for classifying space target of the same shape by using RCS time series

Biomedical Text Categorization Based on Ensemble Pruning and Optimized Topic Modelling.

Hyperspectral Image Classification With Imbalanced Data Based on Orthogonal Complement Subspace Projection

Identifying tweets of personal health experience through word embedding and LSTM neural network

Prediction of Effective Drug Combinations by an Improved Naïve Bayesian Algorithm.

Optimised multikernels based extreme learning machine for face recognition

Multiple kernel approach to semi-supervised fuzzy clustering algorithm for land-cover classification

EVALUATION OF MULTIPLE KERNEL LEARNING ALGORITHMS FOR CROP MAPPING USING SATELLITE IMAGE TIME-SERIES DATA

Instance-based classification with Ant Colony Optimization

ASSESSMENT OF LIBRARY USERS’ FEEDBACK USING MODIFIED MULTILAYER PERCEPTRON NEURAL NETWORKS

Medical image classification based on multi-scale non-negative sparse coding.

Dynamic extreme learning machine for data stream classification

Hybrid supervised clustering based ensemble scheme for text classification

Gender classification based on CCTV video using a convolutional neural network

Dynamic Reconfigurable Ternary Content Addressable Memory for OpenFlow-Compliant Low-Power Packet Processing

Face recognition using class specific dictionary learning for sparse representation and collaborative representation

Considerations for Design and Implementation of a RF Emitter Localization System with Array Antennas

All-in Text: Learning Document, Label, and Word Representations Jointly

Separation and localization of multiple distributed wideband chirps using the fractional Fourier transform

