Abstract
Current attention and transform modules in Convolutional Neural Networks (CNNs) are designed to be lightweight and in-place. Typically, the channel dimension of the input feature maps is first reduced to lower the computation cost; a transformation is then applied, for example to extract weight maps or to project the features into another space; finally, the channel dimension is increased back so that the output feature maps match the size of the input. The layers commonly used to change the channel dimension, $1\times 1$ convolutional layers or fully connected layers, are simple and effective, but they require learnable parameters and consume additional memory and other computation resources. We propose a novel parameter-free method, named Channel Transformer Network (CTN), that decreases or increases the channel dimension for these modules while preserving most of the information at lower computational complexity. We also introduce a Video Co-segment Attentive Network (VCAN) for person re-identification (ReID), which strengthens a pedestrian's salient representation across multiple video frames. We embed CTN into Non-local, CBAM, COSAM, and VCAN blocks to replace their $1\times 1$ convolutional or fully connected layers. Experiments with VCAN and the CTN-embedded models on the MARS dataset for person ReID show strong computation efficiency and accuracy; in particular, VCAN reaches 90.05% Rank-1. We believe CTN can also be applied to other vision tasks such as image classification and object detection.
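As context for the design problem the abstract describes, the following is a minimal sketch of the conventional squeeze-transform-restore pattern used by attention modules such as Non-local or CBAM, implemented here with learnable $1\times 1$ convolutions. The class name, the reduction ratio, and the sigmoid re-weighting are illustrative assumptions, not the paper's implementation; CTN's contribution is to replace these parameterized channel-changing layers with a parameter-free transform.

```python
import torch
import torch.nn as nn

class BottleneckAttention(nn.Module):
    """Hypothetical example of the conventional channel squeeze-and-restore
    pattern that CTN aims to replace; not the paper's method."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        reduced = channels // reduction
        # 1x1 convolutions: simple and effective, but they add learnable
        # parameters and extra memory/computation cost.
        self.squeeze = nn.Conv2d(channels, reduced, kernel_size=1)   # decrease channel dimension
        self.transform = nn.Conv2d(reduced, reduced, kernel_size=1)  # placeholder transformation
        self.restore = nn.Conv2d(reduced, channels, kernel_size=1)   # increase channel dimension back

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Produce a weight map the same size as the input and re-weight in place.
        w = torch.sigmoid(self.restore(self.transform(self.squeeze(x))))
        return x * w


if __name__ == "__main__":
    feats = torch.randn(2, 64, 32, 32)
    print(BottleneckAttention(64)(feats).shape)  # torch.Size([2, 64, 32, 32])
```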