Advanced driver assistance systems (ADASs) and autonomous vehicles rely on different types of sensors, such as cameras, radar, ultrasonic sensors, and LiDAR, to sense the surrounding environment. Compared with the other sensor types, millimeter-wave automotive radar has advantages in cost and in reliability under bad weather conditions (e.g., snow, rain, fog), and it does not suffer from varying light conditions (e.g., darkness). Typical radar devices used in today's commercial vehicles with ADAS features have a limited number of antennas and produce sparse point clouds with low angular resolution. In this paper, we present a machine-learning-aided signal processing chain that suppresses the radar imaging blur introduced by phase migration in time-division multiplexing multiple-input multiple-output (MIMO) radar, generating low-level, high-resolution radar bird's-eye-view (BEV) spectra with rich object features. Unlike radar point clouds, radar BEV spectra incur no information loss. We then propose a Temporal-fusion, Distance-tolerant single-stage object detection Network, termed TDRadarNet, and an enhanced version, TDRadarNet+, to robustly detect vehicles at both long and short ranges on radar BEVs. We introduce a first-of-its-kind multi-modal dataset containing 14,800 frames of high-resolution, low-level radar BEV spectra with synchronized stereo-camera RGB images and three-dimensional (3D) LiDAR point clouds. Our dataset achieves 0.39 m range resolution and $1.2^\circ$ azimuth angular resolution with a 100 m maximum detectable range. Moreover, we create a sub-dataset, the Doppler Unfolding dataset, containing 244,140 beam vectors extracted from the 3D radar data cube. With extensive testing and evaluation, we demonstrate that our Doppler unfolding network achieves 93.46% Doppler unfolding accuracy. Compared to YOLOv7, a state-of-the-art image-based object detection network, TDRadarNet achieves 70.3% Average Precision (AP) for vehicle detection, a 21.0% improvement; TDRadarNet+ achieves 73.9% AP, a 24.6% improvement.
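As background on this blur mechanism, here is a minimal sketch in standard FMCW radar notation; the symbols below ($N_{\mathrm{TX}}$, $T_c$, $v$, $\lambda$, $d$, $\theta$) are generic placeholders and not notation taken from the paper. In TDM MIMO, the $N_{\mathrm{TX}}$ transmitters fire sequentially with chirp interval $T_c$, so a target moving at radial velocity $v$ accumulates an extra Doppler phase between the chirps of consecutive transmitters,

$$\Delta\varphi = 2\pi f_D T_c, \qquad f_D = \frac{2v}{\lambda},$$

which superimposes on the virtual-array spatial phase $2\pi d \sin\theta / \lambda$ used for angle estimation; left uncompensated, it biases the angle response and smears (blurs) the BEV spectrum. Because each transmitter's effective chirp interval grows to $N_{\mathrm{TX}} T_c$, the unambiguous velocity also shrinks by a factor of $N_{\mathrm{TX}}$, which is why the Doppler ambiguity must first be resolved (unfolded) before the migration phase can be compensated.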