EINet: camouflaged object detection with pyramid vision transformer

Chen Li, Ge Jiao

doi:10.1117/1.jei.31.5.053002

Abstract

Camouflaged object detection (COD) is an emerging computer vision task that aims to locate and identify camouflaged objects in complex scenes. Camouflaged objects resemble their surroundings more closely than conventional objects do, and they vary considerably in size and shape, which makes accurate identification difficult. We therefore propose an enhanced identification network (EINet) to strengthen identification capability for the COD task. First, the pyramid vision transformer is used as an encoder to extract more robust multiscale features. Second, multiple texture refinement modules are used to refine the multiscale features. Third, an improved neighbor and hop connection decoder is designed to produce a coarse estimation map that then guides the detailed identification of camouflaged objects. Finally, multiple new reverse criss-cross block attention modules, which gradually recognize fine-grained features at various scales, are designed to enable accurate recognition of camouflaged objects. Extensive experiments were conducted on four camouflaged object benchmark datasets. The results show that EINet is a powerful COD model that outperforms current state-of-the-art models.
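The abstract does not specify how the coarse estimation map guides the later attention modules. As a rough illustration only, the sketch below shows the generic reverse-attention idea on which such modules are commonly built: features are reweighted by one minus the sigmoid of the coarse map, so that later stages focus on regions the coarse prediction has not yet claimed. This is an assumption about the general technique, not the authors' reverse criss-cross block attention module; the names `reverse_attention`, `features`, and `coarse_logits` are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def reverse_attention(features, coarse_logits):
    """Generic reverse attention (illustrative, not the EINet module).

    features:      list of channels, each an H x W nested list of floats
    coarse_logits: H x W nested list of logits from a coarse prediction
    """
    # Weight is high where the coarse map is NOT confident an object is present.
    weights = [[1.0 - sigmoid(v) for v in row] for row in coarse_logits]
    # Apply the same spatial weights to every feature channel.
    return [
        [
            [f * w for f, w in zip(f_row, w_row)]
            for f_row, w_row in zip(channel, weights)
        ]
        for channel in features
    ]


# Toy 2 x 2 example: one confident-object pixel, one confident-background
# pixel, and two uncertain pixels (logit 0).
coarse = [[10.0, 0.0], [-10.0, 0.0]]
feats = [[[1.0, 1.0], [1.0, 1.0]]]
out = reverse_attention(feats, coarse)
```

In this toy case the pixel the coarse map already marks as object is suppressed, the confident-background pixel passes through almost unchanged, and the uncertain pixels are halved.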

Full Text