Abstract
Existing LiDAR point cloud compression (PCC) methods tend to treat compression as a fidelity issue, without sufficiently addressing its machine perception aspect. The latter arises when decoder-side agents aim only to conduct scene-understanding tasks, such as computing localization information. To tackle this challenge, a novel LiDAR PCC system is proposed to compress the point cloud geometry; it contains a back channel that allows the decoder to initiate such a request to the encoder. The key to the success of our PCC method lies in the proposed semantic prior representation (SPR) and its variable-precision lossy encoding algorithm, which generates the final bitstream; the entire process is fast and achieves real-time performance. Our SPR is a compact and effective representation of the three-dimensional (3D) input point clouds, consisting of labels, predictions, and residuals. This information is generated by first applying a scene-aware object segmentation to each of a set of 2D range images (frames), which are produced from the 3D point clouds via a projection process. Based on the generated labels, pixels associated with moving objects are treated as noise and removed, which not only saves transmission bit budget but, most importantly, improves the accuracy of the localization computed at the decoder. Experimental results on a commonly used test dataset show that the proposed system outperforms MPEG's G-PCC (TMC13-v14.0) over a large bitrate range. The performance gap becomes even larger when more and/or larger moving objects are present in the input point clouds.
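The abstract does not provide implementation details, but the two preprocessing steps it describes, projecting the 3D point cloud onto a 2D range image and masking pixels labeled as moving objects, follow a well-known pattern for spinning LiDAR. Below is a minimal sketch in Python with NumPy under that assumption; the image resolution, sensor field of view, function names, and moving-object label IDs are all illustrative choices, not the authors' code.

```python
import numpy as np

# Hypothetical label IDs for moving classes (the paper's label set is not
# given in the abstract); pixels carrying these labels are dropped before
# encoding to save bits and to avoid corrupting decoder-side localization.
MOVING_LABELS = {1, 2}  # e.g., vehicle, pedestrian

def spherical_projection(points, h=64, w=2048, fov_up=3.0, fov_down=-25.0):
    """Project an (N, 3) LiDAR point cloud onto an (h, w) range image.

    Assumes a typical 64-beam spinning sensor; the vertical field-of-view
    limits (in degrees) are illustrative. Returns the range image plus the
    (row, col) pixel index of every input point.
    """
    fov_up_r = np.radians(fov_up)
    fov_down_r = np.radians(fov_down)
    fov = fov_up_r - fov_down_r

    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1)                 # range of each point
    yaw = np.arctan2(y, x)                             # azimuth in [-pi, pi]
    pitch = np.arcsin(np.clip(z / np.maximum(r, 1e-8), -1.0, 1.0))

    # Normalize angles to integer pixel coordinates.
    u = (0.5 * (1.0 - yaw / np.pi) * w).astype(np.int32)           # column
    v = ((1.0 - (pitch - fov_down_r) / fov) * h).astype(np.int32)  # row
    u = np.clip(u, 0, w - 1)
    v = np.clip(v, 0, h - 1)

    img = np.zeros((h, w), dtype=np.float32)
    img[v, u] = r        # points mapping to the same pixel overwrite earlier ones
    return img, v, u

def drop_moving_pixels(range_img, label_img):
    """Zero out range pixels whose per-pixel semantic label marks a moving object."""
    mask = np.isin(label_img, list(MOVING_LABELS))
    out = range_img.copy()
    out[mask] = 0.0      # emptied pixels cost few bits and no longer mislead localization
    return out
```

In this sketch, the per-pixel labels standing in for the segmentation output would come from the scene-aware object segmentation stage; the masked range image would then feed the SPR construction (labels, predictions, and residuals) before variable-precision lossy encoding.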