Abstract

Many application domains involve the consideration of multiple data sources. Typically, each of these data views provides a different perspective on a given set of entities. Inspired by early work on multiview (supervised) learning, multiview algorithms for data clustering offer the opportunity to consider and integrate all this information in an unsupervised setting. In practice, some complex real-world problems may give rise to a handful or more data views, each with a different level of reliability. However, existing algorithms are often limited to only two views, or they assume that all views are equally important. Here, we describe the design of an evolutionary algorithm for the problem of multiview cluster analysis, exploiting recent advances in the field of evolutionary optimization to address settings with a larger number of views. The method can handle views that are represented as distinct feature sets, as distinct dissimilarity matrices, or as a combination of the two. Our experimental results on standard benchmark datasets (including real-world data) confirm that the adoption of a many-objective evolutionary algorithm addresses limitations of previous work and can easily scale to settings with four or more data views. Finally, we illustrate the potential of the approach in an application to breast lesion classification.

