Learning representations from dendrograms

Morteza Haghir Chehreghani, Mostafa Haghir Chehreghani

doi:10.1007/s10994-020-05895-3

Open Access

Abstract

We propose unsupervised representation learning and feature extraction from dendrograms. The commonly used Minimax distance measures correspond to building a dendrogram with the single linkage criterion and defining specific forms of a level function and a distance function over it. We therefore extend this method to arbitrary dendrograms. We develop a generalized framework in which different distance measures and representations can be inferred from different types of dendrograms, level functions and distance functions. Via an appropriate embedding, we compute a vector-based representation of the inferred distances, so that many numerical machine learning algorithms can employ such distances. Then, to address the model selection problem, we study the aggregation of different dendrogram-based distances in solution space and in representation space, respectively, in the spirit of deep representations. In the first approach, taking the clustering problem as an example, we build a graph with positive and negative edge weights according to the consistency of the clustering labels of different objects among different solutions, in the context of ensemble methods. We then use an efficient variant of correlation clustering to produce the final clusters. In the second approach, we investigate the sequential combination of different distances and features, in the spirit of multi-layered architectures, to obtain the final features. Finally, we demonstrate the effectiveness of our approach via several numerical studies.
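
The graph construction in the first aggregation approach can be illustrated with a minimal sketch. The voting rule used here is an assumption made for illustration, not necessarily the weighting used in the paper: each clustering solution votes +1 for a pair of objects it places in the same cluster and -1 otherwise, and the edge weight is the mean vote. The function name `consensus_edge_weights` is hypothetical, and the correlation clustering step itself is not shown.

```python
from itertools import combinations


def consensus_edge_weights(labelings):
    """Signed edge weights from several clusterings of the same n objects.

    Assumption (illustrative only): a solution votes +1 for a pair it puts
    in the same cluster and -1 otherwise; the edge weight is the mean vote.
    """
    n = len(labelings[0])
    if any(len(labels) != n for labels in labelings):
        raise ValueError("all labelings must cover the same objects")
    weights = {}
    for i, j in combinations(range(n), 2):
        votes = sum(1 if labels[i] == labels[j] else -1 for labels in labelings)
        weights[(i, j)] = votes / len(labelings)
    return weights


# Three toy clustering solutions over four objects.
labelings = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
weights = consensus_edge_weights(labelings)
```

Pairs that all solutions cluster together receive weight +1, pairs that no solution clusters together receive -1, and pairs on which the solutions disagree fall in between. A correlation clustering routine would then take this signed graph as input.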

Highlights

Real-world datasets often consist of complex and a priori unknown patterns and structures, which requires improving the basic representation
We investigate inferring pairwise distances from a dendrogram computed according to an arbitrary criterion, i.e., beyond the single linkage criterion
We investigate the different feature extraction methods with three different clustering algorithms
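
As a rough sketch of what inferring pairwise distances from a dendrogram can look like, the code below builds an agglomerative dendrogram with a selectable linkage and reads a distance off it. Two choices are assumptions made for illustration: the level of a cluster is taken to be the linkage value at which it is formed, and the distance between two objects is the level of the first cluster that contains both. The function name `dendrogram_distances` is hypothetical, and only single and complete linkage are covered.

```python
from itertools import combinations


def dendrogram_distances(base, linkage="single"):
    """Pairwise distances induced by an agglomerative dendrogram.

    base: symmetric matrix (list of lists) of base pairwise dissimilarities.
    Assumed level function: the linkage value at which a cluster is formed.
    Assumed distance: the level of the first cluster containing both objects.
    """
    combine = {"single": min, "complete": max}[linkage]
    n = len(base)
    clusters = {i: [i] for i in range(n)}
    # Inter-cluster dissimilarities, keyed by sorted pairs of cluster ids.
    between = {(i, j): base[i][j] for i, j in combinations(range(n), 2)}
    induced = [[0.0] * n for _ in range(n)]
    next_id = n
    while len(clusters) > 1:
        # Merge the two closest clusters under the chosen linkage.
        (a, b), level = min(between.items(), key=lambda item: item[1])
        for i in clusters[a]:
            for j in clusters[b]:
                induced[i][j] = induced[j][i] = level
        merged = clusters.pop(a) + clusters.pop(b)
        del between[(a, b)]
        # Update dissimilarities from the new cluster to every remaining one.
        for c in clusters:
            d_ac = between.pop((min(a, c), max(a, c)))
            d_bc = between.pop((min(b, c), max(b, c)))
            between[(c, next_id)] = combine(d_ac, d_bc)
        clusters[next_id] = merged
        next_id += 1
    return induced


# Four points on a line; base dissimilarity is the absolute difference.
points = [0, 1, 3, 7]
base = [[abs(p - q) for q in points] for p in points]
single = dendrogram_distances(base, "single")
complete = dendrogram_distances(base, "complete")
```

Under these assumed choices, the single linkage result equals the largest gap on the path between two points, which is the Minimax distance on this toy example, while complete linkage yields a different induced distance from the same base matrix.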

Summary

Introduction

Real-world datasets often consist of complex and a priori unknown patterns and structures, which requires improving the basic representation. Kernel methods are commonly used for this purpose (Hofmann et al. 2008; Shawe-Taylor and Cristianini 2004). Their applicability is confined by several limitations (von Luxburg 2007; Nadler and Galun 2007; Chehreghani 2017b). For instance, the proper values of the parameters usually occur inside a very narrow range, which makes cross-validation critical, even in the presence of labeled data. To overcome such challenges, some graph-based distance measures have been developed in the context of algorithmic graph theory. In one such measure, the final distance is obtained by summing up the path-specific distances of all paths between the two nodes. This distance measure can be obtained by inverting the Laplacian of the base distance matrix and is related to the Markov diffusion kernel (Fouss et al. 2012; Yen et al. 2008). It requires an O(n^3) runtime, with n the number of objects.
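
To make the cubic cost concrete, here is a minimal sketch of one well-known distance derived from an inverted Laplacian: the resistance distance, computed from the Moore-Penrose pseudo-inverse of a graph Laplacian. This is a stand-in chosen for illustration and may differ from the exact measure in the cited works. It assumes NumPy is available, that the input is a symmetric matrix of non-negative edge affinities for a connected graph, and the function name `laplacian_pinv_distances` is hypothetical.

```python
import numpy as np


def laplacian_pinv_distances(weights):
    """Resistance-style distances from the pseudo-inverse of a graph Laplacian.

    weights: symmetric matrix of non-negative edge affinities (an assumption;
    the summary above speaks of the Laplacian of the base distance matrix).
    """
    W = np.asarray(weights, dtype=float)
    laplacian = np.diag(W.sum(axis=1)) - W
    # The pseudo-inverse is the O(n^3) step for n objects.
    l_pinv = np.linalg.pinv(laplacian)
    diag = np.diag(l_pinv)
    # d(i, j) = L+[i, i] + L+[j, j] - 2 * L+[i, j]
    return diag[:, None] + diag[None, :] - 2.0 * l_pinv


# Path graph 0 - 1 - 2 with unit edge weights.
W = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
D = laplacian_pinv_distances(W)
```

On this path graph the distance between the two end nodes is twice the distance between neighbours, as expected for unit resistors in series.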

Journal: Machine Learning
Publication Date: Aug 16, 2020
Citations: 6
License type: open-access
