Abstract

Although a through-the-wall radar imaging (TWRI) system operating in an appropriate frequency band can penetrate nonmetallic obstacles and sense targets behind them, its low spatial imaging resolution hinders the acquisition of more detailed information, such as human pose and shape. This article discusses a deep learning-based method for recovering human pose and shape from TWRI images. Inspired by cross-modal learning, the method follows a teacher–student pipeline that avoids the heavy cost of manual labeling. Specifically, a camera is attached to the self-developed radar system to simultaneously capture paired red-green-blue (RGB) and TWRI images in a scenario without wall occlusion. A pose estimation framework (Hourglass) and a semantic segmentation framework (UNet) serve as the teacher networks, converting the RGB images into pose keypoints and shape masks. Taking inspiration from the topological architecture of these frameworks, a student network, the radar pose shape network (RPSNet), is designed to extract information from the corresponding radar images and predict keypoints and masks that closely match the teacher outputs. Instead of learning the two single-task objectives independently, multitask learning is introduced to adaptively learn common features. In wall-occlusive scenarios, only the radar images are collected and fed into the student network for pose and shape recovery. The advantages of this method over computer vision-based methods for human recovery are demonstrated in scenarios both without and with wall occlusion.
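To make the teacher–student multitask setup concrete, the sketch below shows one plausible training step in PyTorch. Everything here is an illustrative assumption rather than the authors' implementation: the toy RPSNet architecture, the tensor shapes, the use of MSE on keypoint heatmaps and binary cross-entropy on masks, and in particular the homoscedastic-uncertainty weighting used to combine the two objectives adaptively (the paper may use a different adaptive scheme).

```python
# Minimal sketch of the teacher-student multitask training step described
# in the abstract. Architecture, shapes, and the adaptive-weighting scheme
# are assumptions for illustration, not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RPSNet(nn.Module):
    """Toy stand-in for the student network: a shared encoder with two
    task heads, one for pose keypoint heatmaps and one for the shape mask."""
    def __init__(self, num_keypoints: int = 17):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        )
        self.pose_head = nn.Conv2d(64, num_keypoints, 1)  # keypoint heatmaps
        self.mask_head = nn.Conv2d(64, 1, 1)              # shape mask logits

    def forward(self, radar_img):
        feat = self.encoder(radar_img)  # features shared across both tasks
        return self.pose_head(feat), self.mask_head(feat)

# Learnable log-variances for adaptive task weighting
# (homoscedastic-uncertainty style; an assumption, see lead-in).
log_var_pose = nn.Parameter(torch.zeros(()))
log_var_mask = nn.Parameter(torch.zeros(()))

student = RPSNet()
optimizer = torch.optim.Adam(
    list(student.parameters()) + [log_var_pose, log_var_mask], lr=1e-4)

# Dummy batch: radar images paired with teacher outputs (heatmaps from
# Hourglass on the RGB frame, masks from UNet on the same frame).
radar = torch.randn(4, 1, 64, 64)
teacher_heatmaps = torch.rand(4, 17, 64, 64)            # pseudo-labels, not GT
teacher_masks = (torch.rand(4, 1, 64, 64) > 0.5).float()

pred_heatmaps, pred_mask_logits = student(radar)
loss_pose = F.mse_loss(pred_heatmaps, teacher_heatmaps)
loss_mask = F.binary_cross_entropy_with_logits(pred_mask_logits, teacher_masks)

# Combine the two objectives with learned weights instead of fixed ones.
loss = (torch.exp(-log_var_pose) * loss_pose + log_var_pose
        + torch.exp(-log_var_mask) * loss_mask + log_var_mask)

optimizer.zero_grad()
loss.backward()
optimizer.step()
```

The key design point the abstract highlights is visible here: the two pseudo-label objectives share an encoder and are balanced by learned weights, so the student absorbs common features from both tasks instead of optimizing each in isolation.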
