Abstract

This work presents a novel visual-tactile fused clustering framework, called Lifelong Visual-Tactile Spectral Clustering (LVTSC), to effectively learn consecutive object clustering tasks for robotic perception. Lifelong learning has become an important topic in recent machine learning research, aiming to imitate "human learning" and reduce the computational cost of consecutively learning new tasks. Our proposed LVTSC model explores knowledge transfer and representation correlation from a local modality-invariant perspective under the guidance of a modality-consistent constraint. For the modality-invariant part, we design a set of modality-invariant basis libraries to capture the latent clustering centers of each modality and a set of modality-invariant feature libraries to explicitly embed the manifold information of each modality. A modality-consistent constraint reinforces the correlation between the visual and tactile modalities by maximizing the feature manifold correspondences. As object clustering tasks arrive continuously, the overall objective is optimized by an effective alternating direction method with guaranteed convergence. The effectiveness and efficiency of the proposed LVTSC framework have been extensively validated on three challenging real-world robotic object perception datasets.
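To make the high-level pipeline concrete, the following is a minimal illustrative sketch of multi-modal spectral clustering with a modality-consistency alignment step, written under generic assumptions: the Gaussian-kernel affinity, the Procrustes-style alignment of the tactile embedding to the visual one, and all variable names are assumptions for illustration, not the actual LVTSC formulation, its library updates, or its optimization procedure.

```python
# Hypothetical sketch: per-modality spectral embeddings fused under a
# consistency-style alignment, then clustered. Not the paper's LVTSC model.
import numpy as np
from scipy.linalg import eigh, svd
from sklearn.cluster import KMeans

def spectral_embedding(X, k, sigma=1.0):
    """Spectral embedding of one modality from a Gaussian-kernel affinity graph."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    W = np.exp(-d2 / (2.0 * sigma**2))
    np.fill_diagonal(W, 0.0)
    D = np.diag(W.sum(axis=1))
    L = D - W                                  # unnormalized graph Laplacian
    _, vecs = eigh(L)
    return vecs[:, :k]                         # k smallest eigenvectors

def fuse_modalities(Fv, Ft):
    """Rotate the tactile embedding onto the visual one (orthogonal Procrustes),
    mimicking a modality-consistent correspondence, then average the embeddings."""
    U, _, Vt = svd(Ft.T @ Fv)
    R = U @ Vt                                 # rotation maximizing correspondence
    return 0.5 * (Fv + Ft @ R)

# Toy usage on random "visual" and "tactile" features for a single task.
rng = np.random.default_rng(0)
Xv, Xt = rng.normal(size=(60, 16)), rng.normal(size=(60, 8))
k = 3
F = fuse_modalities(spectral_embedding(Xv, k), spectral_embedding(Xt, k))
labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(F)
```

In the lifelong setting described above, the per-task embeddings would additionally be coupled to shared basis and feature libraries carried across tasks; that coupling is omitted here since the abstract does not specify its form.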
