Abstract

We present GridNet, a deep autoencoder-based anomaly detection method for indoor surveillance. Unlike similar studies, GridNet is image-agnostic: it takes a grid representation of each scene as input rather than the raw image. The grid representation encodes the spatial layout of the objects in the scene. This approach isolates the anomaly detection problem from vision-related issues such as illumination variations. In addition to the grid representation, GridNet takes a location vector for the scene as input, so that it learns what is normal for each scene conditioned on its location. We also propose a novel loss function that improves the model's ability to reconstruct grid representations, increasing both the precision and the recall of the reconstruction. In our experiments, we compare our method with existing methods on simulated and real-world data. The experimental results show that our method outperforms the baseline methods. The code, data, and simulation environments will be available at https://bozcani.github.io/gridnet .
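To make the idea of a grid representation conditioned on location concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify how grids or location vectors are built, so everything here is an assumption: the names `boxes_to_grid`, `one_hot_location`, and `build_input` are hypothetical, objects are assumed to arrive as normalized bounding boxes, a cell is marked occupied when a box center falls inside it, and the location is assumed to be a one-hot vector concatenated to the flattened grid.

```python
from typing import List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max), normalized to [0, 1].
Box = Tuple[float, float, float, float]


def boxes_to_grid(boxes: List[Box], rows: int = 4, cols: int = 4) -> List[List[int]]:
    """Mark each grid cell that contains the center of at least one box."""
    grid = [[0] * cols for _ in range(rows)]
    for x_min, y_min, x_max, y_max in boxes:
        cx = (x_min + x_max) / 2.0
        cy = (y_min + y_max) / 2.0
        # Clamp so a center lying exactly on the right/bottom edge stays in range.
        col = min(int(cx * cols), cols - 1)
        row = min(int(cy * rows), rows - 1)
        grid[row][col] = 1
    return grid


def one_hot_location(index: int, num_locations: int) -> List[int]:
    """Encode a scene location as a one-hot vector (an assumed encoding)."""
    vec = [0] * num_locations
    vec[index] = 1
    return vec


def build_input(grid: List[List[int]], location: List[int]) -> List[int]:
    """Flatten the grid row by row and append the location vector."""
    flat = [cell for row in grid for cell in row]
    return flat + location


if __name__ == "__main__":
    boxes = [(0.0, 0.0, 0.2, 0.2), (0.8, 0.8, 1.0, 1.0)]
    grid = boxes_to_grid(boxes, rows=4, cols=4)
    x = build_input(grid, one_hot_location(1, num_locations=3))
    print(grid)
    print(len(x))
```

Because the input in this sketch contains only cell occupancy and a location code, it carries no pixel information, which is the property the abstract relies on to separate anomaly detection from illumination and other vision-related effects.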
