Abstract
VT-BPAN: vision transformer-based bilinear pooling and attention network fusion of RGB and skeleton features for human action recognition