Abstract

In this study, we propose a method for estimating human height from color and depth information. Mask R-CNN is applied to color images to detect the human body and the human head separately. If color images cannot be used to extract the human body region because of a low-light environment, the region is instead extracted by comparing the current frame of the depth video with a pre-stored background depth image. The topmost point of the human head region is taken as the top of the head, and the bottommost point of the human body region as the bottom of the foot. The depth value of the head-top point is corrected to a pixel value that has high similarity to a neighboring pixel. The position of the foot-bottom point is corrected by calculating the depth gradient between vertically adjacent pixels. The head-top and foot-bottom points are then converted into 3D real-world coordinates using depth information, and human height is estimated as the Euclidean distance between the two coordinates. Estimation errors are reduced by averaging the accumulated heights. In the experiments, the estimation errors for a standing person are 0.7% and 2.2% when the human body region is extracted by Mask R-CNN and by the background depth image, respectively.
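The depth-only branch of the pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a pinhole camera model with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`, a hypothetical foreground threshold `threshold_mm`, and depth values in millimetres where 0 means "no reading". It takes both extreme points from a single body mask, whereas the paper takes the top point from a separately detected head region. The head-top depth correction and the foot-bottom gradient correction are omitted.

```python
import math

def foreground_mask(depth, background, threshold_mm=100):
    """Mark pixels whose depth differs from the stored background depth.
    threshold_mm is an assumed value, not taken from the paper."""
    return [[d > 0 and abs(b - d) > threshold_mm for d, b in zip(drow, brow)]
            for drow, brow in zip(depth, background)]

def extreme_points(mask):
    """Return (u, v) of the topmost and bottommost foreground pixels, or None."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    top, bottom = rows[0], rows[-1]
    return (mask[top].index(True), top), (mask[bottom].index(True), bottom)

def pixel_to_world(u, v, depth_mm, fx, fy, cx, cy):
    """Back-project a pixel and its depth to 3D camera coordinates
    (pinhole model, assumed here)."""
    x = (u - cx) * depth_mm / fx
    y = (v - cy) * depth_mm / fy
    return (x, y, depth_mm)

def height_from_points(head, foot, intrinsics):
    """Euclidean distance between the back-projected head-top and
    foot-bottom points; head and foot are (u, v, depth_mm) tuples."""
    return math.dist(pixel_to_world(*head, *intrinsics),
                     pixel_to_world(*foot, *intrinsics))

class RunningHeight:
    """Average of the heights accumulated over frames."""
    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update(self, height):
        self.total += height
        self.count += 1
        return self.total / self.count

# Usage with made-up numbers: a 4x3 depth frame and a flat background.
background = [[3000] * 3 for _ in range(4)]
frame = [[3000, 3000, 3000],
         [3000, 2000, 3000],
         [3000, 2000, 3000],
         [3000, 3000, 3000]]
top_px, bottom_px = extreme_points(foreground_mask(frame, background))

intrinsics = (500.0, 500.0, 320.0, 240.0)  # hypothetical fx, fy, cx, cy
height = height_from_points((320, 40, 2000), (320, 440, 2000), intrinsics)

tracker = RunningHeight()
tracker.update(height)
mean_height = tracker.update(1700.0)
```

Averaging over frames is what damps per-frame noise in the two extreme points; a real system would also need the point corrections the abstract describes before back-projection.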

Highlights

  • The physical measurements of a person such as human height, body width and stride length are important bases for identifying a person from video

  • We propose a human-height estimation method using color and depth information

  • The human body region is extracted by applying a pre-trained Mask R-CNN (region-based convolutional neural network) to color video


Summary

Introduction

Physical measurements of a person, such as height, body width, and stride length, are an important basis for identifying a person from video. The height of a person captured in surveillance video is important evidence for identifying a suspect. Such physical quantities are also used to continuously track a specific person in a video surveillance system consisting of multiple cameras [1]. Human height can be estimated by obtaining 3D information about the human body from color video [2,3,4,5,6]. However, height estimation methods based on color video have the disadvantage that they require the camera parameters or information about a reference object.

