A new evolutionary algorithm for mining top-k discriminative patterns in high dimensional data

Tarcísio Lucas, Túlio C.P.B Silva, Renato Vimieiro, Teresa B Ludermir

doi:10.1016/j.asoc.2017.05.048

Abstract

This paper presents an evolutionary algorithm for Discriminative Pattern (DP) mining that focuses on high-dimensional data sets. DP mining aims to identify the sets of characteristics that best differentiate a target group from the others (e.g., successful vs. unsuccessful medical treatments). As the volume of data stored in the world grows (30 GB/s on the Internet alone), it becomes increasingly natural to extract information from high-dimensional data sets. There are several evolutionary approaches for DP mining, but none focuses on high-dimensional data. We propose an evolutionary approach with features that reduce memory and processing costs in the context of high-dimensional data. The new algorithm seeks the best (top-k) patterns and hides from the user many parameters common in other evolutionary heuristics, such as population size, mutation and crossover rates, and the number of evaluations. We carried out experiments with real-world high-dimensional data sets and traditional low-dimensional data sets. The results showed that the proposed algorithm was superior to other approaches from the literature on high-dimensional data sets and competitive on the traditional data sets.
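To make the notion of a top-k discriminative pattern concrete, the sketch below scores item sets by how well they separate a target group from the rest and keeps the k best. It is an illustration only, not the algorithm proposed in the paper: the quality measure (weighted relative accuracy, WRAcc), the exhaustive enumeration in place of an evolutionary search, the function names `wracc` and `top_k_patterns`, and the toy treatment data are all assumptions introduced here.

```python
from itertools import combinations


def wracc(pattern, rows, labels):
    """Weighted relative accuracy of `pattern` for the target class (label 1).

    Assumed quality measure for illustration; the abstract does not name one.
    """
    n = len(rows)
    # Labels of the rows that contain every item of the pattern.
    covered = [lab for row, lab in zip(rows, labels) if pattern <= row]
    if not covered:
        return 0.0
    coverage = len(covered) / n
    precision = sum(covered) / len(covered)
    base_rate = sum(labels) / n
    return coverage * (precision - base_rate)


def top_k_patterns(rows, labels, k=3, max_size=2):
    """Return the k best-scoring patterns with at most `max_size` items.

    Exhaustive enumeration: feasible only for a handful of items, which is
    exactly why heuristic search is used on high-dimensional data.
    """
    items = sorted(set().union(*rows))
    candidates = [
        frozenset(combo)
        for size in range(1, max_size + 1)
        for combo in combinations(items, size)
    ]
    scored = [(wracc(p, rows, labels), sorted(p)) for p in candidates]
    # Highest score first; ties broken by item names for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:k]


# Hypothetical toy data: each row is a set of characteristics,
# label 1 marks a successful treatment (the target group).
rows = [
    {"drug_a", "age<50"},
    {"drug_a", "age<50"},
    {"drug_a", "age>=50"},
    {"drug_b", "age<50"},
    {"drug_b", "age>=50"},
    {"drug_b", "age>=50"},
]
labels = [1, 1, 1, 0, 0, 0]

top = top_k_patterns(rows, labels, k=3)
for score, pattern in top:
    print(f"{score:.3f}  {pattern}")
```

With d binary characteristics there are on the order of 2^d candidate patterns, so enumeration like this stops being practical very quickly as dimensionality grows.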

Journal: Applied Soft Computing
Publication Date: Jun 8, 2017
Citations: 13

Similar Papers

Multivariate Procedure for Variable Selection and Classification of High Dimensional Heterogeneous Data
Tahir Mehmood ... Zahid Rasheed
Communications for Statistical Applications and Methods | VOL. 22
30 Nov 2015

Feature Selection for High-Dimensional and Imbalanced Biomedical Data Based on Robust Correlation Based Redundancy and Binary Grasshopper Optimization Algorithm
Garba Abdulrauf Sharifai ... Zurinahni Zainol
Genes | VOL. 11
27 Jun 2020

Resampling Imbalanced Data and Impact of Attribute Selection Methods in High Dimensional Data
K Ulaga Priya ... S Pushpa
01 Jan 2021

Statistical analysis of high-dimensional biomedical data: a gentle introduction to analytical goals, common approaches and challenges
Jörg Rahnenführer ... Eugenia Migliavacca
BMC Medicine | VOL. 21
15 May 2023
