Automatic pose estimation has become a valuable tool for the study of human behavior, including dyadic interactions. It allows researchers to analyze the nuanced dynamics of interactions more effectively and facilitates the integration of behavioral data with other modalities (e.g., EEG). However, many technical difficulties remain. In particular, automatic pose estimation for infants in parent-infant interactions is unreliable; the immature proportions and smaller bodies of infants can cause misdetections. OpenPose is one tool that has shown high performance in pose tracking from video, even in infants. However, OpenPose is limited to 2D (i.e., coordinates relative to the image space), which can be undesirable in many paradigms (e.g., naturalistic settings). We developed a method that extends the functionality of OpenPose to 3D, tailored to parent-infant interaction paradigms. The method merges the estimations from OpenPose with the depth information from a depth camera to obtain a 3D pose that works even for young infants.

•Video recordings of parent-infant interactions are taken with a dual color-depth camera.

•2D positions of parents and their infants are estimated from the color video.

•Using the depth camera, the 2D estimations are transformed into real-world 3D positions, allowing movement analysis in full 3D space.
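The last step above, lifting 2D keypoints to 3D with per-pixel depth, can be sketched via standard pinhole back-projection. This is a minimal illustration, not the authors' implementation: the function name, the NumPy representation, and the camera intrinsics (`fx`, `fy`, `cx`, `cy`) are assumptions, and it presumes the depth map is already aligned to the color image.

```python
import numpy as np

def deproject_keypoints(keypoints_2d, depth_map, fx, fy, cx, cy):
    """Back-project 2D keypoints (u, v) in pixel coordinates to 3D camera
    coordinates using per-pixel depth and the pinhole camera model.

    keypoints_2d : (N, 2) array of pixel coordinates (u, v), e.g. from OpenPose
    depth_map    : (H, W) array of depth values in meters, aligned to the color image
    fx, fy       : focal lengths of the color camera, in pixels
    cx, cy       : principal point of the color camera, in pixels
    """
    points_3d = np.zeros((len(keypoints_2d), 3))
    for i, (u, v) in enumerate(keypoints_2d):
        # Depth at the keypoint's pixel location (meters)
        z = depth_map[int(round(v)), int(round(u))]
        # Pinhole back-projection: pixel offset from the principal point,
        # scaled by depth over focal length, gives metric x and y
        x = (u - cx) * z / fx
        y = (v - cy) * z / fy
        points_3d[i] = (x, y, z)
    return points_3d
```

Applied per frame to each person's keypoints, this yields 3D trajectories in camera space; in practice one would also handle missing depth (holes in the depth map) and low-confidence keypoints, which are omitted here for brevity.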