This paper discusses approaches to determining the distance to an object from an image produced by a monocular video camera, all of which use artificial neural networks at various stages of processing. It analyzes a method based on computing a depth map, detecting an object, and then projecting the object's coordinates onto the depth map. It describes a method that uses the relationship between the real size of an object and its size in the image. It also considers a method based on a modification of YOLO that extends the output descriptor with an additional vector characterizing the distance to the object. The data sets used to train the neural networks in algorithms for calculating the absolute distance to an object from an image are analyzed. The paper discusses the effectiveness of the methods considered, their advantages and disadvantages, and the prospects for using them in practical solutions.
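As an illustration of the size-based approach mentioned above, the following is a minimal sketch that assumes an ideal pinhole camera, for which the distance Z satisfies Z = f * H / h, where f is the focal length in pixels, H is the real height of the object, and h is its height in the image in pixels. The function name, parameter names, and numeric values are hypothetical and are not taken from the paper; the methods it reviews may obtain H and h differently, for example from a detector's bounding box and a per-class size prior.

```python
def distance_from_size(focal_length_px: float,
                       real_height_m: float,
                       pixel_height: float) -> float:
    """Estimate the distance to an object (in metres) with the pinhole model.

    Assumes an ideal pinhole camera with no lens distortion and an object
    roughly parallel to the image plane: Z = f * H / h.
    """
    if pixel_height <= 0:
        raise ValueError("pixel_height must be positive")
    return focal_length_px * real_height_m / pixel_height


# Hypothetical values: f = 1000 px, object 1.5 m tall, bounding box 100 px tall.
print(distance_from_size(1000.0, 1.5, 100.0))  # 15.0
```

Under this assumption the estimate is only as good as the assumed real size and the measured pixel height, so errors in either quantity carry over proportionally into the distance.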