Feature Set Partitioning Research Articles

Multi-view ensemble learning has the potential to address issues related to the high dimensionality of data. It attempts to use all the relevant features, discarding only the irrelevant ones. A view of a dataset is the sub-table of the training data with respect to a subset of the feature set. Discarding the irrelevant features and obtaining subsets of the relevant features is useful for dimension reduction and for dealing with the problem of having fewer training examples than even the reduced set of relevant features. A feature set partitioning that yields blocks of relevant features may not, however, yield multi-view classifiers with good classification performance. This work proposes an optimal feature set partitioning (OFSP) approach. Further, ensemble learning from the views aims to maximize the performance of the classifier. The experiments study the performance of random feature set partitioning, attribute bagging, view generation using attribute clustering, view construction using a genetic algorithm, and the proposed OFSP method. The blocks of relevant feature subsets are used to construct the multi-view classifier ensemble using the K-nearest neighbor, Naive Bayesian, and support vector machine algorithms, applied to sixteen high-dimensional data sets from the UCI machine learning repository. The performance measures considered for comparison are classification accuracy, disagreement among the classifiers, execution time, and percentage reduction of attributes.
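To make the notion of views concrete, the following is a minimal sketch of the simplest baseline named in the abstract, random feature set partitioning, combined with a majority vote over one classifier per view. It is not the OFSP method. The function names, the 1-nearest-neighbour base learner, and the round-robin split are all assumptions chosen for illustration.

```python
import random
from collections import Counter


def random_feature_partition(n_features, n_views, seed=0):
    """Split feature indices 0..n_features-1 into n_views disjoint blocks."""
    rng = random.Random(seed)
    indices = list(range(n_features))
    rng.shuffle(indices)
    # Deal the shuffled indices round-robin so block sizes differ by at most one.
    return [sorted(indices[i::n_views]) for i in range(n_views)]


def project(row, block):
    """Restrict one example to the features of a single view."""
    return [row[j] for j in block]


def nn_predict(train_X, train_y, x):
    """1-nearest-neighbour prediction using squared Euclidean distance."""
    best = min(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    return train_y[best]


def multi_view_predict(X, y, blocks, x):
    """Classify x in every view separately, then take a majority vote."""
    votes = [
        nn_predict([project(row, block) for row in X], y, project(x, block))
        for block in blocks
    ]
    return Counter(votes).most_common(1)[0][0]
```

Each block of the partition defines one view, so every feature is used by exactly one classifier. Methods such as OFSP differ in how the blocks are chosen, not in this overall structure.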

In privacy-preserving data mining (PPDM), a widely used method for achieving data mining goals while preserving privacy is based on k-anonymity. This method, which protects subject-specific sensitive data by anonymizing it before it is released for data mining, demands that every tuple in the released table be indistinguishable from no fewer than k subjects. The most common approach for achieving compliance with k-anonymity is to replace certain values with less specific but semantically consistent values. In this paper we propose a different approach for achieving k-anonymity: partitioning the original dataset into several projections such that each one of them adheres to k-anonymity. Moreover, any attempt to rejoin the projections results in a table that still complies with k-anonymity. A classifier is trained on each projection, and an unlabelled instance is subsequently classified by combining the classifications of all classifiers. Guided by classification accuracy and k-anonymity constraints, the proposed data mining privacy by decomposition (DMPD) algorithm uses a genetic algorithm to search for an optimal feature set partitioning. DMPD was evaluated on ten separate datasets in order to compare its classification performance with that of other k-anonymity-based methods. The results suggest that DMPD performs better than existing k-anonymity-based algorithms and that there is no need to apply domain-dependent knowledge. Using multiobjective optimization methods, we also examine the tradeoff between the two conflicting objectives in PPDM: privacy and predictive performance.
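The constraint that each projection must adhere to k-anonymity can be sketched as a simple count over attribute-value combinations. This is an illustrative check only, not the DMPD algorithm: it covers neither the genetic search nor the rejoin property. The function names are hypothetical, and the sketch assumes every attribute in a block is treated as a quasi-identifier.

```python
from collections import Counter


def is_k_anonymous(rows, attributes, k):
    """True if every value combination over `attributes` occurs at least k times."""
    counts = Counter(tuple(row[a] for a in attributes) for row in rows)
    return all(count >= k for count in counts.values())


def partition_is_k_anonymous(rows, partition, k):
    """True if the projection onto every block of the partition is k-anonymous."""
    return all(is_k_anonymous(rows, block, k) for block in partition)


# Toy table: the full table is not 2-anonymous, but each single-attribute
# projection is, which is the situation a partitioning approach exploits.
rows = [
    {"age": "30s", "zip": "100"},
    {"age": "30s", "zip": "200"},
    {"age": "40s", "zip": "100"},
    {"age": "40s", "zip": "200"},
]
full_table_ok = is_k_anonymous(rows, ["age", "zip"], 2)
projections_ok = partition_is_k_anonymous(rows, [["age"], ["zip"]], 2)
```

In a search over partitions, a check of this kind would act as the feasibility test, while classification accuracy would serve as the objective.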

Articles published on Feature Set Partitioning

Ensemble multi-view feature set partitioning method for effective multi-view learning

Enhancing multi-view ensemble learning with zig-zag pattern-based feature set partitioning

A review of feature set partitioning methods for multi-view ensemble learning

Collaboration graph for feature set partitioning in data classification

Non-sequential partitioning approaches to decision tree classifier

Multi-view ensemble learning: an optimal feature set partitioning for high-dimensional data classification

Multi-view Ensemble Learning Using Optimal Feature Set Partitioning: An Extended Experiments and Analysis in Low Dimensional Scenario

Privacy-preserving data mining: A feature set partitioning approach

Genetic algorithm-based feature set partitioning for classification problems
