Abstract

In this article we investigate whether an artificial agent can successfully perform the complex task of object detection in static natural images. For this task, we evolve situated agents that map local image samples to gaze shifts in order to find object locations in the image. We apply the agents to both a task of mug detection in a domestic environment and a task of face detection in an office environment. The analysis of evolved agents shows that they employ sensory-motor coordination to exploit the object's visual context. The experimental results show that the situated agents achieve a detection performance equal to that of existing object detection methods, while extracting ~50 times fewer local image samples. This advantage comes at the expense of limited generalisation performance: evolved agents exploit scene-specific contextual clues that may be confined to a single type of visual environment and may therefore not generalise to other types of visual environments. We conclude that the studied situated agents can efficiently and successfully perform object detection at the cost of application generality.
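The core mechanism described above — an agent that repeatedly samples a local image patch and converts it into a gaze shift until the gaze settles on an object location — can be sketched as follows. This is a minimal illustrative sketch only: the linear sensor-to-motor mapping, the patch size, the convergence threshold, and the `detect` interface are all assumptions for illustration, not the evolved controllers from the paper.

```python
import numpy as np

def detect(image, weights, start, patch_size=9, max_steps=50):
    """Sketch of a situated gaze-shift agent (hypothetical interface).

    At each step the agent extracts a local patch around its current
    gaze position and maps it to a gaze shift via a linear controller
    (`weights`); the loop stops when the shift becomes negligible,
    i.e. the gaze has settled on a candidate object location.
    """
    h, w = image.shape
    r = patch_size // 2
    y, x = start
    for _ in range(max_steps):
        # Clamp the gaze so the sampled patch stays inside the image.
        y = int(np.clip(y, r, h - r - 1))
        x = int(np.clip(x, r, w - r - 1))
        patch = image[y - r:y + r + 1, x - r:x + r + 1].ravel()
        dy, dx = weights @ patch  # linear sensor-to-motor mapping (assumed)
        if abs(dy) < 0.5 and abs(dx) < 0.5:
            break  # gaze has converged; report this location
        y += dy
        x += dx
    return int(np.clip(y, 0, h - 1)), int(np.clip(x, 0, w - 1))
```

Because the agent only ever reads the small patches along its gaze trajectory, it touches far fewer pixels than a sliding-window detector that scans every location, which is the efficiency advantage the abstract reports.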
