Abstract

This paper presents an unsupervised clustering random-forest-based metric for affinity estimation in large and high-dimensional data. The criterion used for node splitting during forest construction can handle rank deficiency when measuring cluster compactness. The binary forest-based metric is extended to continuous metrics by exploiting both the common traversal path and the smallest shared parent node. The proposed forest-based metric estimates affinity efficiently by passing data pairs down the forest using a limited number of decision trees. A pseudo-leaf-splitting (PLS) algorithm is introduced to account for spatial relationships; it regularizes the affinity measures and overcomes inconsistent leaf assignments. The random-forest-based metric with PLS facilitates the establishment of consistent, point-wise correspondences. The proposed method has been applied to automatic phrase recognition using color and depth videos and to point-wise correspondence. Extensive experiments demonstrate the effectiveness of the proposed method for affinity estimation in comparison with the state of the art.
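
To make the binary forest-based metric concrete, the sketch below scores a pair of instances by the fraction of trees in which they reach the same leaf. This is the usual random-forest proximity and is assumed here for illustration; the abstract does not spell out the exact definition, and the function name and the leaf-index encoding are hypothetical.

```python
from typing import Sequence


def binary_forest_affinity(leaves_i: Sequence[int], leaves_j: Sequence[int]) -> float:
    """Fraction of trees in which two instances land in the same leaf.

    leaves_i[k] and leaves_j[k] are the leaf ids reached in tree k
    (an assumed encoding, not taken from the paper).
    """
    if len(leaves_i) != len(leaves_j) or len(leaves_i) == 0:
        raise ValueError("both instances need one leaf id per tree")
    # Each tree contributes 1 if the leaves match and 0 otherwise.
    shared = sum(1 for a, b in zip(leaves_i, leaves_j) if a == b)
    return shared / len(leaves_i)


# Two instances routed through a four-tree forest; they share a leaf in two trees.
affinity = binary_forest_affinity([3, 5, 2, 7], [3, 4, 2, 1])
```

Because each tree contributes only 0 or 1, two instances that separate just before the leaves score the same as two that separate at the root, which is the limitation a continuous metric addresses.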

Manuscript received: 2021-03-29; accepted: 2021-05-26

Highlights

  • Affinity estimation is an essential step in various computer vision and image processing tasks

  • Aside from the binary affinity, we propose a continuous forest-based metric based on the common path Pij of two instances ti and tj as they traverse from the root to the leaves

  • We have presented unsupervised random-forest-based metrics for affinity estimation in large and high-dimensional data, taking advantage of both the common traversal path and the smallest shared parent node
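
The two ingredients named in the highlights, the common path Pij and the smallest shared parent node, can be sketched for a single tree as follows. The encoding of a path as a tuple of node ids, the `node_size` lookup, and the normalized overlap score are all assumptions made for illustration; the paper's actual weighting of path length and node cardinality is not reproduced here.

```python
from typing import Dict, Tuple

Path = Tuple[int, ...]  # node ids visited from root to leaf (hypothetical encoding)


def common_path(path_i: Path, path_j: Path) -> Path:
    """Longest shared prefix of two root-to-leaf paths in the same tree."""
    shared = []
    for a, b in zip(path_i, path_j):
        if a != b:
            break
        shared.append(a)
    return tuple(shared)


def pair_statistics(path_i: Path, path_j: Path,
                    node_size: Dict[int, int]) -> Tuple[float, int]:
    """Return (path overlap in [0, 1], cardinality of the smallest shared parent).

    The overlap ratio is an illustrative normalization, not the paper's formula.
    node_size maps a node id to the number of instances that reached it.
    """
    shared = common_path(path_i, path_j)
    if not shared:
        # Paths from different trees share no node at all.
        return 0.0, 0
    overlap = len(shared) / max(len(path_i), len(path_j))
    # The last node on the common path is the deepest, hence smallest, shared parent.
    return overlap, node_size[shared[-1]]


sizes = {0: 100, 1: 40, 3: 12, 4: 28, 7: 5, 9: 6}
overlap, parent_size = pair_statistics((0, 1, 3, 7), (0, 1, 4, 9), sizes)
```

A longer common path and a smaller shared parent both indicate that the pair stayed together deeper in the tree, so a continuous score can separate pairs that the binary metric treats alike.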

Summary

Introduction

Affinity estimation is an essential step in various computer vision and image processing tasks. This paper presents an unsupervised random-forest-based metric for efficient affinity estimation and demonstrates its efficacy on automatic phrase recognition and point-wise correspondence in a shape corpus. The mixed metric random forest (MMRF) utilized self-learning of data distributions for matching consistencies between images [28], taking advantage of weak labeling and a classification criterion to optimize node splitting. The main contributions of this work are: (i) a continuous forest-based metric exploiting both the common traversal path and the cardinality of the smallest shared parent node, enabling efficient and effective affinity estimation in large and high-dimensional data, and (ii) a PLS scheme that regularizes the forest-based metric to account for global spatial and structural relationships, overcoming inconsistent leaf assignments. The covariance matrices need to be evaluated repeatedly for randomly selected parameters, and evaluating the covariance matrix σ from scratch while searching for the optimal splitting parameters is time-consuming.
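
One common way to avoid recomputing a covariance matrix from scratch for every candidate split is to keep running sums and move instances between the two child nodes one at a time. The sketch below illustrates that idea under stated assumptions: the class name and interface are hypothetical, it uses the population covariance, and it is not claimed to be the update scheme used in the paper.

```python
from typing import List, Sequence


class RunningCovariance:
    """Covariance maintained from running sums, updated in O(d^2) per instance."""

    def __init__(self, dim: int) -> None:
        self.dim = dim
        self.n = 0
        self.s = [0.0] * dim                          # sum of x
        self.ss = [[0.0] * dim for _ in range(dim)]   # sum of outer products x x^T

    def _update(self, x: Sequence[float], sign: int) -> None:
        self.n += sign
        for a in range(self.dim):
            self.s[a] += sign * x[a]
            for b in range(self.dim):
                self.ss[a][b] += sign * x[a] * x[b]

    def add(self, x: Sequence[float]) -> None:
        self._update(x, +1)

    def remove(self, x: Sequence[float]) -> None:
        self._update(x, -1)

    def covariance(self) -> List[List[float]]:
        """Population covariance E[x x^T] - mean mean^T of the current instances."""
        if self.n <= 0:
            raise ValueError("covariance is undefined for an empty node")
        mean = [v / self.n for v in self.s]
        return [[self.ss[a][b] / self.n - mean[a] * mean[b]
                 for b in range(self.dim)] for a in range(self.dim)]


# Move one instance out of a node, as a split threshold sweep would do.
node = RunningCovariance(dim=2)
for point in [(1.0, 2.0), (3.0, 6.0), (5.0, 10.0)]:
    node.add(point)
cov_before = node.covariance()
node.remove((5.0, 10.0))
cov_after = node.covariance()
```

The sum-of-products form is simple but can lose precision when the mean is large relative to the spread; a Welford-style update is a more stable alternative with the same per-instance cost.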

Binary forest-based metric
Continuous forest-based metric
Pseudo leaf splitting
Datasets and metric
Affinity estimation
Method
Consistent correspondence in shape corpus
Phrase recognition
Comparison with forest-based correspondence
Findings
Conclusions