Zero-shot learning is one of the most challenging tasks in machine learning, and learning stable, transferable knowledge from seen classes plays a pivotal role in it. To improve the currently unsatisfactory performance of zero-shot object recognition, this paper proposes a novel image representation method called micro-knowledge. In our method, a self-attention mechanism unifies the segmentation of micro-regions with the subsequent learning of micro-knowledge. A zero-shot classification framework is then designed on top of the micro-knowledge of images. Under this framework, multiple micro-region descriptions are first obtained by embedding micro-knowledge and are then merged to carry out the final classification of unseen objects. A capsule-unified framework is employed as a graphical programming tool to implement these steps. Experiments on public datasets show that the proposed framework generally achieves competitive results for the classification of unseen objects. Specifically, the results verify that micro-knowledge learned from one dataset can be applied directly to others without complicated adjustments, and demonstrate that using visual features instead of semantic features can reduce classification error. This research brings new ideas to the field of zero-shot learning and offers an appealing option for addressing the problem of domain shift.