Abstract

High-level data classification techniques consider not only the physical aspects of the data, such as space, distance, proximity, and distribution, but also their functional, topological, and structural aspects. These techniques are commonly defined in two major steps: constructing a network from the feature vector data and uncovering its underlying patterns using complex network properties. In the network construction step, heuristics based on k-nearest neighbors strategies have been widely adopted, while several complex network measures (e.g., PageRank) have been modeled to learn high-level patterns of the input data. Because the two steps are directly related, i.e., the network configuration directly affects the results obtained by the classifier, in this paper we develop a genetic algorithm (GA) to optimize the network construction step. Specifically, we hypothesize that the salient features of GAs, such as their robust search mechanism and binary representation, may provide a more powerful network representation in the context of high-level classification based on importance characterization. Extensive experiments with real data sets demonstrate that the networks provided by our GA strategy achieved higher predictive accuracy than those of a widely adopted method based on the nearest neighbors heuristic, and competitive results against state-of-the-art methods.
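To make the two steps concrete, the following is a minimal sketch of the baseline pipeline the abstract refers to: a k-nearest neighbors network built from feature vectors, followed by a PageRank-style importance score for each node. It is an illustration under stated assumptions, not the authors' implementation. The function names `knn_graph` and `pagerank`, the Euclidean distance, the damping factor of 0.85, and the toy points are all assumptions introduced here; the GA-based network construction proposed in the paper is not shown.

```python
import math


def knn_graph(points, k):
    """Build a directed network: each node points to its k nearest neighbors.

    Assumes Euclidean distance; the abstract does not specify the metric.
    """
    n = len(points)
    graph = {}
    for i in range(n):
        # Rank every other node by its distance to node i.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )
        graph[i] = others[:k]
    return graph


def pagerank(graph, damping=0.85, iterations=100):
    """Generic PageRank by power iteration over an adjacency dict."""
    n = len(graph)
    rank = {v: 1.0 / n for v in graph}
    for _ in range(iterations):
        # Every node receives the uniform "teleport" share first.
        new_rank = {v: (1.0 - damping) / n for v in graph}
        for v, neighbors in graph.items():
            if neighbors:
                # Split the damped rank of v evenly among its out-neighbors.
                share = damping * rank[v] / len(neighbors)
                for u in neighbors:
                    new_rank[u] += share
            else:
                # Dangling node: spread its damped rank over all nodes.
                for u in graph:
                    new_rank[u] += damping * rank[v] / n
        rank = new_rank
    return rank


# Hypothetical toy data: two small clusters in the plane.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6)]
graph = knn_graph(points, k=2)
ranks = pagerank(graph)
```

In this sketch the value of `k` fixes the network configuration by hand, which is the choice the abstract argues affects the classifier's results and which the proposed GA is meant to optimize instead.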
