Abstract

By combining depth information with color images, D-RGB cameras readily detect humans and provide the associated 3D skeleton joint data, facilitating, if not revolutionizing, conventional image-centric research in computer vision, surveillance, and human activity analysis, among other fields. The applicability of a D-RGB camera, however, is restricted by its depth frustum, which is limited to a range of 0.8 to 4 meters. Although a D-RGB camera network, constructed by deploying several D-RGB cameras at various locations, could extend the coverage, it requires precise localization of the camera network, that is, the relative location and orientation of neighboring cameras. By introducing a skeleton-based viewpoint invariant transformation (SVIT), which derives the location and orientation of a detected human's upper torso relative to a D-RGB camera, this paper presents a reliable automatic localization technique that needs no additional instruments or human intervention. Applying SVIT in each of two neighboring D-RGB cameras to a commonly observed skeleton yields the position and orientation of that skeleton relative to each camera; combining the two results gives the relative position and orientation of the two cameras, thus solving the localization problem. Experiments were conducted with two Kinects placed at bearing differences of about 45 degrees and 90 degrees; the coverage can be extended by up to 70% by installing an additional Kinect. The same localization technique can be applied repeatedly to a larger number of D-RGB cameras, thus extending the applicability of D-RGB cameras to camera networks that support human behavior analysis and context-aware services over a larger surveillance area.
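
The step of combining two per-camera torso poses into one camera-to-camera pose can be illustrated with a minimal sketch. It assumes that each camera's estimate of the torso can be written as a rotation matrix R and a translation vector t such that p_cam = R p_torso + t; the function names, the choice of the y axis as vertical, and the synthetic numbers are illustrative assumptions, not the paper's implementation of SVIT.

```python
import math

def mat_mul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(m):
    """Transpose of a 3x3 matrix (the inverse, for a rotation)."""
    return [list(row) for row in zip(*m)]

def mat_vec(m, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]

def yaw(deg):
    """Rotation about the vertical axis (assumed here to be y)."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def relative_camera_pose(R_as, t_as, R_bs, t_bs):
    """Pose of camera B in camera A's frame (p_A = R_ab p_B + t_ab),
    from the pose of the same torso as seen by A and by B."""
    R_ab = mat_mul(R_as, transpose(R_bs))
    rotated = mat_vec(R_ab, t_bs)
    t_ab = [t_as[i] - rotated[i] for i in range(3)]
    return R_ab, t_ab

# Synthetic check: place camera B at a known pose relative to camera A.
R_true, t_true = yaw(45.0), [2.0, 0.0, 0.5]

# Torso pose as seen by camera A (hypothetical values, in meters).
R_as, t_as = yaw(20.0), [0.3, 0.0, 2.5]

# Torso pose camera B would observe, derived from the ground truth.
R_bs = mat_mul(transpose(R_true), R_as)
t_bs = mat_vec(transpose(R_true), [t_as[i] - t_true[i] for i in range(3)])

# Recover the camera-to-camera pose from the two torso observations.
R_est, t_est = relative_camera_pose(R_as, t_as, R_bs, t_bs)
bearing_deg = math.degrees(math.atan2(R_est[0][2], R_est[0][0]))
```

In this synthetic setup the recovered bearing and translation match the pose used to generate the second observation, up to floating-point error. With real sensors, the two torso poses would come from noisy skeleton data, so averaging over several frames would be a reasonable extension.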
