Abstract
AI chips have developed rapidly in recent years and deliver remarkable acceleration for the algorithms they target. However, deep learning algorithms change quickly and often include operators that AI chips and inference frameworks cannot support in the short term. To address the difficulty of deploying binocular stereo matching algorithms on AI chips, this paper proposes TradNet, a multi-stage, unsupervised, lightweight, real-time depth estimation algorithm for AI chips. TradNet combines the traditional matching algorithm with a convolutional neural network and uses convolutions directly supported by AI chips to realize the structure of the traditional matching algorithm. Because TradNet is composed of operators that current AI chips support directly, it reduces the computational complexity of the algorithm and greatly improves the compatibility of stereo matching with existing AI chips. Compared with AnyNet, a deep learning-based multi-stage binocular disparity algorithm, the accuracy is improved by 5.12%, and the inference speed is only 12.7%. Compared with BM, a matching-based binocular disparity algorithm, the accuracy is improved by 25.24%, and the inference speed is only 48.7%. Our final model processes 1280×720 images at 60–80 FPS on an NVIDIA TITAN Xp. On a custom 1 TOPS (tera operations per second) AI chip, it achieves 28 FPS with a power consumption of 0.88 W.
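To illustrate the general idea of realizing a traditional matching algorithm with convolution-style operators, the following is a minimal sketch of block matching written only as image shifts, absolute differences, a box filter (a convolution with an all-ones kernel), and an argmin over candidate disparities. It is not the TradNet architecture, which the abstract does not specify; the function names `box_filter` and `block_match`, the window size `k`, and the sum-of-absolute-differences cost are assumptions chosen for illustration.

```python
import numpy as np


def box_filter(img, k):
    """Sum each k x k window (convolution with an all-ones kernel, zero padding).

    k is assumed to be odd so the window is centered on each pixel.
    """
    pad = k // 2
    padded = np.pad(img, pad, mode="constant")
    h, w = img.shape
    out = np.zeros((h, w), dtype=np.float64)
    # Accumulate the k*k shifted copies of the padded image.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out


def block_match(left, right, max_disp, k=3):
    """Hypothetical SAD block matching: returns an integer disparity per pixel.

    For each candidate disparity d, the right image is shifted by d pixels,
    the absolute difference to the left image is aggregated with a box
    filter, and the disparity with the lowest aggregated cost is selected.
    """
    left = left.astype(np.float64)
    right = right.astype(np.float64)
    h, w = left.shape
    # Columns that have no valid match at disparity d keep an infinite cost.
    costs = np.full((max_disp + 1, h, w), np.inf)
    for d in range(max_disp + 1):
        diff = np.abs(left[:, d:] - right[:, :w - d])
        costs[d, :, d:] = box_filter(diff, k)
    return np.argmin(costs, axis=0)
```

Calling `block_match(left, right, max_disp)` on a rectified grayscale pair returns a disparity map with the same height and width as the inputs. Every step is a shift, an element-wise operation, or a fixed-kernel convolution, which is the kind of operator set the abstract describes as directly supported by AI chips.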