Abstract
Monocular 3D detection aims to recover the 3D information of an object from an image. Mainstream methods mainly use an L1 or L1-like loss to supervise instance depth prediction. However, these methods have not achieved satisfactory results. One main reason is that an L1 or L1-like loss does not accurately reflect the fit between the predicted instance depth and the corresponding ground truth. Another main reason is that the instance depth is hard for the network to learn directly from the RGB image. To address these problems, this paper proposes a novel thermodynamic loss based on the principle of free energy minimisation and a novel depth decoupling method. The proposed method is called the monocular 3D object detection network with thermodynamic loss and decoupled instance depth (TDN). In TDN, the optimisation of the instance depth prediction is regarded as a thermodynamic process, so the thermodynamic loss is designed according to the principle of free energy minimisation. TDN decouples the instance depth into three different depths. The final instance depth is obtained by combining the thermodynamic loss with these different types of depth.