Abstract

Few-shot learning poses a critical challenge due to the deviation problem caused by the scarcity of available samples. In this work, we aim to address deviations in both feature representations and prototypes. To achieve this, we propose a cross-modal de-deviation (CMDD) framework that leverages class semantic information to provide robust prior knowledge for the samples. This framework begins with a visual-to-semantic autoencoder trained on the labeled samples to predict semantic features for the unlabeled samples. We then devise a binary linear programming model to match the initial prototypes with the cluster centers of the unlabeled samples. To circumvent potential mismatches between the cluster centers and the initial prototypes, we perform the label assignment process in the semantic space by transforming the cluster centers into semantic representations and using the ground-truth class semantic features as reference points. Moreover, we model a linear classifier whose initial weights are the concatenation of the refined prototypes and the ground-truth class semantic features, and we propose a novel optimization strategy based on an alternating least squares (ALS) model. From the ALS model, we derive two closed-form solutions with respect to the features and the weights, enabling their alternating optimization. Extensive experiments on few-shot learning benchmarks demonstrate the competitive advantages of our CMDD method over state-of-the-art approaches, confirming its effectiveness in reducing deviation. The code is available at: https://github.com/pmhDL/CMDD.git.
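The abstract describes matching initial prototypes to cluster centers via a binary linear programming model. The paper's exact formulation is not given here, but a one-to-one matching of this kind relaxes to the classic assignment problem, which can be solved exactly with the Hungarian algorithm. The sketch below is a hypothetical illustration, assuming matching is done by Euclidean distance in the semantic space; the function name and cost choice are assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_centers_to_classes(centers_sem, class_sem):
    """Assign each cluster center to one class via an assignment problem.

    centers_sem: (K, d) cluster centers mapped into the semantic space.
    class_sem:   (K, d) ground-truth class semantic features (reference points).
    Returns `perm` such that cluster i is assigned to class perm[i].
    Hypothetical sketch: cost = Euclidean distance in semantic space.
    """
    # Pairwise distances between every center and every class embedding.
    cost = np.linalg.norm(centers_sem[:, None, :] - class_sem[None, :, :], axis=-1)
    # The binary-linear-programming matching reduces to the assignment
    # problem, solved exactly by the Hungarian algorithm.
    row_ind, col_ind = linear_sum_assignment(cost)
    perm = np.empty(len(centers_sem), dtype=int)
    perm[row_ind] = col_ind
    return perm
```

Performing the matching in the semantic space, with ground-truth class embeddings as anchors, is what lets mismatched visual cluster centers be relabeled consistently.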
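The ALS model mentioned above alternates between two closed-form sub-problems, one over the features and one over the classifier weights. The paper's objective is not reproduced here, so the following is a generic sketch assuming ridge-regularized least squares in both directions; the loss, regularizer, and function names are assumptions chosen to make each update a closed-form solve.

```python
import numpy as np

def als_refine(X, Y, W0, lam=0.1, iters=10):
    """Alternately refine features Z and weights W, each in closed form.

    X:  (n, d) initial feature representations.
    Y:  (n, c) one-hot labels.
    W0: (d, c) initial classifier weights.
    Assumed objective (sketch): ||Z W - Y||^2 + lam ||Z - X||^2 + lam ||W||^2.
    """
    d = W0.shape[0]
    W, Z = W0, X.copy()
    for _ in range(iters):
        # Fix W, update Z:  argmin_Z ||Z W - Y||^2 + lam ||Z - X||^2
        Z = (Y @ W.T + lam * X) @ np.linalg.inv(W @ W.T + lam * np.eye(d))
        # Fix Z, update W:  argmin_W ||Z W - Y||^2 + lam ||W||^2
        W = np.linalg.solve(Z.T @ Z + lam * np.eye(d), Z.T @ Y)
    return Z, W
```

Because each sub-problem is quadratic in its variable, every iteration monotonically decreases the joint objective without any step-size tuning, which is the practical appeal of the alternating closed-form scheme.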
