Abstract

Temporal action detection is an important research topic in computer vision. Temporal action proposal (TAP) generation is an essential step for quickly and accurately extracting semantically important action proposals from untrimmed video. The proposals generated by a temporal action proposal network should have three attributes: (1) flexible temporal length, (2) accurate temporal boundaries, and (3) reliable confidence scores. We therefore propose an effective and efficient general-purpose action proposal generation network for untrimmed videos, named the Boundary Sensitive and Category Sensitive (BSCS) network. First, because the method combines frame-level appearance features with clip-level optical flow features, a two-stream network is adopted. Second, a sub-network is designed that outputs 203 probability sequences, which represent the temporal distributions of the start frames, the end frames, and the action categories. Using the probability sequences of the 200 action categories, start and end frames that belong to the same kind of action are selected. Finally, the network generates proposals by combining the selected start and end frames according to certain rules. We compare our method experimentally with existing networks on ActivityNet-1.3 [1], and the results show that our method is more effective.
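
The pairing step described above can be illustrated with a minimal sketch. The abstract only says that proposals are formed "according to certain rules", so everything specific here is an assumption made for illustration: the peak-picking rule, the `threshold` value, the use of the most probable category at each boundary, and the product of boundary probabilities as a confidence score. The function names `local_peaks` and `generate_proposals` are hypothetical and do not come from the paper.

```python
def local_peaks(probs, threshold=0.5):
    """Return indices that are local maxima with probability >= threshold.

    Assumption: boundary candidates are picked this way; the abstract
    does not specify the selection rule.
    """
    peaks = []
    for t, p in enumerate(probs):
        left = probs[t - 1] if t > 0 else float("-inf")
        right = probs[t + 1] if t < len(probs) - 1 else float("-inf")
        if p >= threshold and p >= left and p >= right:
            peaks.append(t)
    return peaks


def generate_proposals(start_probs, end_probs, class_probs, threshold=0.5):
    """Pair start and end candidates that share the same top category.

    start_probs, end_probs: one probability per temporal position.
    class_probs: class_probs[t] is a list of per-category probabilities at t.
    Returns (start, end, category, score) tuples, highest score first.
    """
    def top_class(t):
        row = class_probs[t]
        return max(range(len(row)), key=row.__getitem__)

    starts = local_peaks(start_probs, threshold)
    ends = local_peaks(end_probs, threshold)

    proposals = []
    for s in starts:
        for e in ends:
            # Keep only pairs where the end follows the start and both
            # boundaries are assigned to the same action category.
            if e > s and top_class(s) == top_class(e):
                # Assumed confidence: product of the boundary probabilities.
                score = start_probs[s] * end_probs[e]
                proposals.append((s, e, top_class(s), score))
    return sorted(proposals, key=lambda p: p[3], reverse=True)


# Toy input with 6 temporal positions and 2 categories (the paper uses 200).
start_probs = [0.9, 0.1, 0.1, 0.8, 0.1, 0.1]
end_probs = [0.1, 0.1, 0.9, 0.1, 0.1, 0.7]
class_probs = [[0.9, 0.1], [0.9, 0.1], [0.8, 0.2],
               [0.2, 0.8], [0.2, 0.8], [0.1, 0.9]]
proposals = generate_proposals(start_probs, end_probs, class_probs)
```

In this toy input, the start at position 0 and the end at position 5 are both strong boundary candidates, but they fall in different categories, so the category check discards that pair and keeps only the two within-category segments.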
