Abstract

High-dimensional outputs in object detection are often subject to constraints, such as the orientation vector (cos θ, sin θ) of a two-dimensional object or the attitude quaternion of a three-dimensional object. The components of a traditional neural network's output are unconstrained in range, which makes it difficult to satisfy the requirements of practical problems. To address this, this paper designs a transformation network layer based on high-dimensional space transformation theory and constructs a constrained neural network model that detects object pose from a single aerial image. First, in the YOLOv3 network structure, three scale transformation network layers are added, one for each of the three feature-map scales, to implement a constrained unit quaternion field. Second, a special loss function is proposed according to the properties of quaternions. A new constrained neural network, the quaternion field pose network (qfield PoseNet), is then constructed, which predicts the object probability field and the corresponding unit quaternion field. Next, the object probability field is used to determine the 2D bounding box of the object, and the unit quaternion field is used to determine the 3D rotation R. Finally, the rotation matrix R and the 2D bounding box are combined to compute the 3D translation T. We evaluated our method on the DOTA1.5 and HRSC2016 datasets. The experimental results show that our method detects object pose well.
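The abstract does not give the exact form of the transformation layer or the quaternion loss. A minimal sketch of the two ideas, under the assumption that the constraint layer normalizes a raw 4-D output onto the unit sphere and that the loss treats q and −q as the same rotation (both function names are hypothetical, not from the paper):

```python
import numpy as np

def unit_quaternion_layer(raw):
    """Hypothetical constraint layer: project a raw 4-D network output
    onto the unit sphere so it is always a valid unit quaternion."""
    raw = np.asarray(raw, dtype=float)
    norm = np.linalg.norm(raw, axis=-1, keepdims=True)
    return raw / np.maximum(norm, 1e-12)  # guard against a zero vector

def quaternion_loss(q_pred, q_true):
    """Hypothetical quaternion loss: 1 - |<q_pred, q_true>|.
    It is zero when the rotations agree and is invariant to the
    sign ambiguity of quaternions (q and -q encode the same rotation)."""
    dot = np.sum(np.asarray(q_pred) * np.asarray(q_true), axis=-1)
    return 1.0 - np.abs(dot)
```

For example, a raw output of (2, 0, 0, 0) is mapped to the unit quaternion (1, 0, 0, 0), and the loss between (1, 0, 0, 0) and (−1, 0, 0, 0) is zero, since both represent the identity rotation.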
