Unmanned aerial vehicles (UAVs) can be used to great effect for wide-area searches such as search and rescue operations, enabling teams to cover large areas more efficiently and in less time. However, using UAVs for this purpose generates large amounts of data, typically video, which must be analyzed before any potential findings can be uncovered and acted upon. This analysis is slow and expensive, and can significantly delay the response after a target has been captured by the UAV. To address this problem, a deep model using a visual saliency approach to automatically analyze UAV video and detect anomalies is proposed. The temporal contextual saliency model builds on the state of the art in visual saliency detection using deep convolutional neural networks and considers both local and scene context, with novel additions: a convolutional LSTM layer that exploits temporal information, together with modifications to the base model. Additionally, the impact of temporal versus non-temporal reasoning for this task is evaluated. The model achieves improved results on a benchmark dataset, with the addition of temporal reasoning yielding significant gains over the state of the art in saliency detection.
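The abstract does not give architectural details, but the core temporal component it names, a convolutional LSTM layer applied to video frames, can be sketched generically. The cell below is a minimal, self-contained PyTorch implementation (all class and parameter names are assumptions, not the paper's); the final hidden state is projected to a one-channel saliency map.

```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """Minimal ConvLSTM cell: all four gates are computed by a single
    convolution over the concatenated input frame and hidden state."""
    def __init__(self, in_ch, hid_ch, kernel=3):
        super().__init__()
        self.hid_ch = hid_ch
        # one conv produces the i, f, o, g gates at once
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch,
                               kernel, padding=kernel // 2)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        c = f * c + i * torch.tanh(g)   # update cell state
        h = o * torch.tanh(c)           # new hidden state
        return h, c

# Feed a short frame sequence through the cell, then read out a
# per-pixel saliency map from the final hidden state.
frames = torch.randn(4, 3, 32, 32)      # T=4 frames, 3 channels, 32x32
cell = ConvLSTMCell(in_ch=3, hid_ch=8)
h = torch.zeros(1, 8, 32, 32)
c = torch.zeros(1, 8, 32, 32)
for t in range(frames.shape[0]):
    h, c = cell(frames[t:t + 1], (h, c))
saliency = torch.sigmoid(nn.Conv2d(8, 1, 1)(h))  # 1-channel map in (0, 1)
print(saliency.shape)                   # torch.Size([1, 1, 32, 32])
```

In a full model this cell would sit on top of CNN feature maps rather than raw frames, so the recurrence aggregates appearance features across time before the saliency readout.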