Inferring Camera Intrinsics Based on Surfaces of Revolution: A Single Image Geometric Network Approach for Camera Calibration

Abstract

Camera calibration is a prerequisite for many robotics applications, especially robot vision, where it is needed to obtain a metric reconstruction from a 2D image. In this paper, we address the problem of determining the camera intrinsic parameters from a single image of a surface of revolution (SOR) using deep learning. Geometric constraints based on the symmetry properties of the SOR structure are incorporated into our proposed learning-based camera calibration framework. To enable calibration from a single view, we also propose a learning-based conic detection model that fits the geometric primitive of a cylinder. Calibration from a single view is then completed by minimizing the geometric constraints on two conics detected by the learning-based model, with cylinder images as input. Objects with a surface of revolution, such as cans, bottles, and bowls, are common in daily life, which makes this research both significant and practical. Finally, traditional calibration techniques are compared against our single-image calibration. Experiments conducted on a newly generated dataset demonstrate the effectiveness and robustness of the proposed method.
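For readers less familiar with the term, the intrinsic parameters that the abstract refers to are usually collected in a pinhole camera matrix K. The Python sketch below is only an illustration under a zero-skew pinhole assumption; the `Intrinsics` class, its field names, and `intrinsics_from_iac` are hypothetical and are not taken from the paper. It projects a camera-frame point to pixels and shows how the intrinsics can be read back from the image of the absolute conic, ω = K^{-T} K^{-1}, which is the quantity that geometric calibration constraints are typically written against.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Intrinsics:
    """Zero-skew pinhole intrinsics (illustrative container, not from the paper)."""
    fx: float  # focal length in pixels along x
    fy: float  # focal length in pixels along y
    u0: float  # principal point, x coordinate
    v0: float  # principal point, y coordinate

    def project(self, X, Y, Z):
        """Project a 3D point given in the camera frame to pixel coordinates."""
        if Z <= 0:
            raise ValueError("point must lie in front of the camera")
        return (self.fx * X / Z + self.u0, self.fy * Y / Z + self.v0)

    def iac(self):
        """Image of the absolute conic, w = K^-T K^-1, in closed form for zero skew."""
        a, b = 1.0 / self.fx ** 2, 1.0 / self.fy ** 2
        return [
            [a, 0.0, -self.u0 * a],
            [0.0, b, -self.v0 * b],
            [-self.u0 * a, -self.v0 * b,
             self.u0 ** 2 * a + self.v0 ** 2 * b + 1.0],
        ]


def intrinsics_from_iac(w):
    """Recover zero-skew intrinsics from w, which may carry an arbitrary scale."""
    u0 = -w[0][2] / w[0][0]
    v0 = -w[1][2] / w[1][1]
    # The unknown scale of w drops out once the principal-point terms are removed.
    s = w[2][2] - w[0][2] ** 2 / w[0][0] - w[1][2] ** 2 / w[1][1]
    return Intrinsics(math.sqrt(s / w[0][0]), math.sqrt(s / w[1][1]), u0, v0)
```

Because ω is only defined up to scale, the recovery step divides the scale out before taking square roots; a method that estimates ω from image constraints can then hand the result to a routine of this kind.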

Similar Papers
  • Research Article
  • Citations: 118
  • 10.1109/tpami.2005.14
Metric 3D reconstruction and texture acquisition of surfaces of revolution from a single uncalibrated view
  • Jan 1, 2005
  • IEEE Transactions on Pattern Analysis and Machine Intelligence
  • C Colombo + 2 more

Image analysis and computer vision can be effectively employed to recover the three-dimensional structure of imaged objects, together with their surface properties. In this paper, we address the problem of metric reconstruction and texture acquisition from a single uncalibrated view of a surface of revolution (SOR). Geometric constraints induced in the image by the symmetry properties of the SOR structure are exploited to perform self-calibration of a natural camera, 3D metric reconstruction, and texture acquisition. By exploiting the analogy with the geometry of single axis motion, we demonstrate that the imaged apparent contour and the visible segments of two imaged cross sections in a single SOR view provide enough information for these tasks. Original contributions of the paper are: single view self-calibration and reconstruction based on planar rectification, previously developed for planar surfaces, has been extended to deal also with the SOR class of curved surfaces; self-calibration is obtained by estimating both camera focal length (one parameter) and principal point (two parameters) from three independent linear constraints for the SOR fixed entities; the invariant-based description of the SOR scaling function has been extended from affine to perspective projection. The solution proposed exploits both the geometric and topological properties of the transformation that relates the apparent contour to the SOR scaling function. Therefore, with this method, a metric localization of the SOR occluded parts can be made, so as to cope with them correctly. For the reconstruction of textured SORs, texture acquisition is performed without requiring the estimation of external camera calibration parameters, but only using internal camera parameters obtained from self-calibration.

  • Research Article
  • Citations: 5
  • 10.1016/j.radonc.2024.110566
Development of learning-based predictive models for radiation-induced atrial fibrillation in non-small cell lung cancer patients by integrating patient-specific clinical, dosimetry, and diagnostic information
  • Oct 1, 2024
  • Radiotherapy and Oncology
  • Sang Kyun Yoo + 8 more


  • Book Chapter
  • Citations: 39
  • 10.1007/11744023_21
Camera Calibration with Two Arbitrary Coaxial Circles
  • Jan 1, 2006
  • Carlo Colombo + 2 more

We present an approach for camera calibration from the image of at least two circles arranged in a coaxial way. Such a geometric configuration arises in static scenes of objects with rotational symmetry or in scenes including generic objects undergoing rotational motion around a fixed axis. The approach is based on the automatic localization of a surface of revolution (SOR) in the image, and its use as a calibration artifact. The SOR can either be a real object in a static scene, or a “virtual surface” obtained by frame superposition in a rotational sequence. This provides a unified framework for calibration from single images of SORs or from turntable sequences. Both the internal and external calibration parameters (square pixels model) are obtained from two or more imaged cross sections of the SOR, whose apparent contour is also exploited to obtain a better calibration accuracy. Experimental results show that this calibration approach is accurate enough for several vision applications, encompassing 3D realistic model acquisition from single images, and desktop 3D object scanning.

  • Research Article
  • Citations: 8
  • 10.1109/temc.2022.3212860
Induced Electric Field in Learning-Based Head Models With Smooth Conductivity for Exposure to Uniform Low-Frequency Magnetic Fields
  • Dec 1, 2022
  • IEEE Transactions on Electromagnetic Compatibility
  • Yinliang Diao + 2 more

Computational human models generated from medical images have been widely used to assess induced electric field for exposure to electromagnetic field. Traditional methods to develop human models include tissue segmentation, which involves huge effort in identifying tissues from medical images. When such models are applied to low-frequency electromagnetic dosimetry, computational artifacts result in substantial error. Deep learning techniques have been utilized to map medical images directly to tissue electrical conductivity, generating human models with smooth transitions in tissue conductivity across tissue boundaries and even within the same tissue. In this study, eight head models with smoothed conductivities were generated using the deep learning network. The induced electric fields in the models were assessed for exposure to a uniform low-frequency magnetic field and were compared with traditional segmented models. Computational results showed that the induced electric field distributions in learning-based and segmented models were consistent, and the former was smoother. The differences in the 99th to 99.99th percentile values between nonuniform and segmented models were within 8% and 13% for gray and white matter, respectively. The staircasing errors were suppressed in the learning-based models because of the smooth transition of the conductivity values, especially at the tissue interface. The intersubject variation of the maximum electric fields was smaller for the nonuniform models than for the segmented models, with a relative standard deviation within 12% for nonuniform models and 22% for segmented models. This difference is much smaller than the reduction factor of 3 associated with the numerical uncertainty set in the International Commission on Non-Ionizing Radiation Protection 2010 guidelines. 
Our findings could be helpful in deriving an appropriate reduction factor in international guidelines, which is used for setting the limit from the threshold of adverse health effects.

  • Book Chapter
  • Citations: 3
  • 10.1007/978-3-540-89646-3_76
Automatic Segmentation of the Apparent Contour for 3D Modeling of Cutting Tools from Single View
  • Jan 1, 2008
  • Xi Zhang + 4 more

One of the industrial applications for vision-based model reconstruction of surfaces of revolution (SOR) is to rebuild 3D models of rotating mill cutters. For this application, the automation of the process is crucial. One of the critical issues with the automation is the segmentation of the apparent contour. Therefore, this paper introduces a new approach for the automatic apparent contour extraction of SORs. It consists of three parts. First, the region of the SOR is located in the image and the contour of the SOR is extracted inside this region. Second, the extracted contour is partitioned into several portions based on curvature analysis. A property of the SOR is used to verify the partitioning. Finally, the contour is classified into the apparent contours and the imaged cross sections by exploiting both 2D and 3D information. An experiment on a machine tool verifies that the proposed algorithms are reliable and accurate in an industrial environment.

  • Conference Article
  • Citations: 8
  • 10.1109/humanoids.2014.7041338
Intrinsic camera and hand-eye calibration for a robot vision system using a point marker
  • Nov 1, 2014
  • Ivan Lundberg + 2 more

Accurate robot camera calibration is a requirement for vision guided robots to perform precision assembly tasks. In this paper, we address the problem of doing intrinsic camera and hand-eye calibration on a robot vision system using a single point marker. This removes the need for bulky special-purpose calibration objects and also facilitates online accuracy checking and re-calibration when needed, without altering the robot's production environment. The proposed solution provides a calibration routine that produces high quality results on par with the robot accuracy and completes a calibration in 3 minutes without need of manual intervention. We also present a method for automatic testing of camera calibration accuracy. Results from experimental verification on the dual arm concept robot FRIDA are presented.

  • Conference Article
  • Citations: 18
  • 10.1109/tdpvt.2002.1024072
Uncalibrated 3D metric reconstruction and flattened texture acquisition from a single view of a surface of revolution
  • Nov 7, 2002
  • C Colombo + 2 more

We describe a geometric approach for reconstructing 3D textured graphical models of surface of revolution (SOR) objects from a single uncalibrated view. Our approach is based on the fact that, for the object class of interest, the structure of the scene provides enough constraints for camera calibration even from a single view. Reconstruction (up to a scaling factor) of 3D shape is complemented with the extraction of flattened 2D texture, so as to support visual retrieval from 2D/3D cues and to generate realistic 3D visualization models. The approach developed is quite simple, yet accurate and robust; its applications range from the preservation, analysis and classification of cultural heritage, to advanced graphics and multimedia.

  • Conference Article
  • Citations: 3
  • 10.1061/9780784413029.089
A Transformational Approach to Explicit Stereo Camera Calibration for Improved Euclidean Accuracy of Infrastructure 3D Reconstruction
  • Jun 24, 2013
  • Computing in Civil Engineering
  • H Fathi + 1 more

The accuracy of the results in stereo image-based 3D reconstruction is very sensitive to the intrinsic and extrinsic camera parameters determined during camera calibration. The existing camera calibration algorithms induce a significant amount of error due to poor estimation accuracies in camera parameters when they are used for long-range scenarios such as mapping civil infrastructure. This leads to unusable results, and may result in the failure of the whole reconstruction process. This paper proposes a novel way to address this problem. Instead of incremental improvements to the accuracy typically induced by new calibration algorithms, the authors hypothesize that a set of multiple calibrations created by videotaping a moving calibration pattern along a specific path can increase overall calibration accuracy. This is achieved by using conventional camera calibration algorithms to perform separate estimations for some predefined distance values. The result, which is a set of camera parameters for different distances, is then uniquely input in the Structure from Motion process to improve the Euclidean accuracy of the reconstruction. The proposed method has been tested on infrastructure scenes and the experimental analyses indicate the improved performance.

  • Research Article
  • Citations: 98
  • 10.1109/tpami.2003.1177148
Camera calibration from surfaces of revolution
  • Feb 1, 2003
  • IEEE Transactions on Pattern Analysis and Machine Intelligence
  • K.-Y.K Wong + 2 more

This paper addresses the problem of calibrating a pinhole camera from images of a surface of revolution. Camera calibration is the process of determining the intrinsic or internal parameters (i.e., aspect ratio, focal length, and principal point) of a camera, and it is important for both motion estimation and metric reconstruction of 3D models. In this paper, a novel and simple calibration technique is introduced, which is based on exploiting the symmetry of images of surfaces of revolution. Traditional techniques for camera calibration involve taking images of some precisely machined calibration pattern (such as a calibration grid). The use of surfaces of revolution, which are commonly found in daily life (e.g., bowls and vases), makes the process easier as a result of the reduced cost and increased accessibility of the calibration objects. In this paper, it is shown that two images of a surface of revolution will provide enough information for determining the aspect ratio, focal length, and principal point of a camera with fixed intrinsic parameters. The algorithms presented in this paper have been implemented and tested with both synthetic and real data. Experimental results show that the camera calibration method presented is both practical and accurate.
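The image symmetry that this abstract relies on is commonly modeled in the SOR literature as a harmonic homology W = I − 2·v·lᵀ/(vᵀl), where l is the imaged axis of revolution and v is a vanishing point. The Python sketch below only illustrates that transformation and is not the calibration algorithm of the paper; the function names and all numeric values are hypothetical.

```python
def harmonic_homology(v, l):
    """Return W = I - 2 v l^T / (v^T l) as a 3x3 nested list.

    v: homogeneous vanishing point (3 numbers).
    l: homogeneous line of the imaged symmetry axis (3 numbers).
    """
    d = sum(a * b for a, b in zip(v, l))
    if abs(d) < 1e-12:
        raise ValueError("v must not lie on l")
    return [
        [(1.0 if i == j else 0.0) - 2.0 * v[i] * l[j] / d for j in range(3)]
        for i in range(3)
    ]


def apply_homography(W, point):
    """Map a pixel (x, y) through the 3x3 matrix W and dehomogenize."""
    p = (point[0], point[1], 1.0)
    q = [sum(W[i][k] * p[k] for k in range(3)) for i in range(3)]
    return (q[0] / q[2], q[1] / q[2])


# Special case: the axis images to the vertical line x = 320 and the
# vanishing point is at infinity along x, so W is a plain mirror reflection.
W_mirror = harmonic_homology((1.0, 0.0, 0.0), (1.0, 0.0, -320.0))
mirrored = apply_homography(W_mirror, (300.0, 50.0))

# General case: a finite vanishing point and a slanted axis.
W_general = harmonic_homology((1000.0, 200.0, 1.0), (1.0, 0.1, -320.0))
there = apply_homography(W_general, (300.0, 50.0))
back = apply_homography(W_general, there)  # W is an involution
```

In the mirror case the point (300, 50) maps to its reflection about x = 320. In the general case, applying W twice returns the starting point, and points on the axis line stay fixed, which is the property a symmetry-based constraint can exploit.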

  • Book Chapter
  • Citations: 1
  • 10.1007/978-3-540-88458-3_72
Motion Recovery for Uncalibrated Turntable Sequences Using Silhouettes and a Single Point
  • Jan 1, 2008
  • Hui Zhang + 2 more

This paper addresses the problem of self-calibration and motion recovery for turntable sequences. Previous works exploited silhouette correspondences induced by epipolar tangencies to estimate the image invariants under turntable motion and recover the epipolar geometry. These approaches, however, require the camera intrinsics in order to obtain a Euclidean motion, and a dense sequence is required to provide a precise initialization of the image invariants. This paper proposes a novel approach to estimate the camera intrinsics, the image invariants and the rotation angles from a sparse turntable sequence. The silhouettes and a single point correspondence are extracted from the image sequence. The point traces out a conic in the sequence, from which the fixed entities (i.e., the image of the rotation axis, the horizon, the vanishing point of the coordinates, the circular points and a scalar) can be recovered given a simple initialization of the camera intrinsic matrix. The rotation angles are then recovered by estimating the epipoles that minimize the transfer errors of the outer epipolar tangents to the silhouettes for each pair of images. The camera intrinsics can be further refined by the above optimization. Based on a given range of the initial focal length, a robust method is proposed to give the best estimate of the camera intrinsics, the image invariants, the full camera positions and orientations, and hence a Euclidean reconstruction. Experimental results demonstrate the simplicity of this approach and the accuracy in the estimated motion and reconstruction.

  • Conference Article
  • Citations: 3
  • 10.1109/icra46639.2022.9811629
Unified Data Collection for Visual-Inertial Calibration via Deep Reinforcement Learning
  • May 23, 2022
  • Yunke Ao + 5 more

Visual-inertial sensors have a wide range of applications in robotics. However, good performance often requires different sophisticated motion routines to accurately calibrate camera intrinsics and inter-sensor extrinsics. This work presents a novel formulation to learn a motion policy to be executed on a robot arm for automatic data collection for calibrating intrinsics and extrinsics jointly. Our approach models the calibration process compactly using model-free deep reinforcement learning to derive a policy that guides the motions of a robotic arm holding the sensor to efficiently collect measurements that can be used for both camera intrinsic calibration and camera-IMU extrinsic calibration. Given the current pose and collected measurements, the learned policy generates the subsequent transformation that optimizes sensor calibration accuracy. The evaluations in simulation and on a real robotic system show that our learned policy generates favorable motion trajectories and collects enough measurements efficiently that yield the desired intrinsics and extrinsics with short path lengths. In simulation, we are able to perform calibrations 10× faster than hand-crafted policies, which transfers to a real-world speed up of 3× over a human expert. The code of this work is publicly available at: https://github.com/ethz-asl/Learn-to-Calibrate.

  • Research Article
  • Citations: 16
  • 10.1007/s00371-005-0335-x
Single view compositing with shadows
  • Sep 1, 2005
  • The Visual Computer
  • Xiaochun Cao + 3 more

In this paper, we describe how geometrically correct and visually realistic shadows may be computed for objects composited into a single view of a target scene. Compared to traditional single view compositing methods, which either do not deal with the shadow effects or manually create the shadows for the composited objects, our approach efficiently utilizes the geometric and photometric constraints extracted from a single target image to synthesize the shadows consistent with the overall target scene for the inserted objects. In particular, we explore (i) the constraints provided by imaged scene structure, e.g. vanishing points of orthogonal directions, for camera calibration and thus explicit determination of the locations of the camera and the light source; (ii) the relatively weaker geometric constraint, the planar homology, that models the imaged shadow relations when explicit camera calibration is not possible; and (iii) the photometric constraints that are required to match the color characteristics of the synthesized shadows with those of the original scene. For each constraint, we demonstrate the working examples followed by our observations. To show the accuracy and the applications of the proposed method, we present the results for a variety of target scenes, including footage from commercial Hollywood movies and 3D video games.

  • Research Article
  • Citations: 10
  • 10.1109/tpami.2023.3269641
A Perceptual Measure for Deep Single Image Camera and Lens Calibration
  • Sep 1, 2023
  • IEEE Transactions on Pattern Analysis and Machine Intelligence
  • Yannick Hold-Geoffroy + 5 more

Image editing and compositing have become ubiquitous in entertainment, from digital art to AR and VR experiences. To produce beautiful composites, the camera needs to be geometrically calibrated, which can be tedious and requires a physical calibration target. In place of the traditional multi-image calibration process, we propose to infer the camera calibration parameters such as pitch, roll, field of view, and lens distortion directly from a single image using a deep convolutional neural network. We train this network using automatically generated samples from a large-scale panorama dataset, yielding competitive accuracy in terms of standard l2 error. However, we argue that minimizing such standard error metrics might not be optimal for many applications. In this work, we investigate human sensitivity to inaccuracies in geometric camera calibration. To this end, we conduct a large-scale human perception study where we ask participants to judge the realism of 3D objects composited with correct and biased camera calibration parameters. Based on this study, we develop a new perceptual measure for camera calibration and demonstrate that our deep calibration network outperforms previous single-image based calibration methods both on standard metrics as well as on this novel perceptual measure. Finally, we demonstrate the use of our calibration network for several applications, including virtual object insertion, image retrieval, and compositing.
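The field of view mentioned in this abstract and the focal length used by geometric calibration methods describe the same quantity under an ideal pinhole model. The short Python sketch below shows the conversion, assuming no lens distortion and a principal point at the image center; the function names are hypothetical and not part of the cited work.

```python
import math


def focal_from_fov(fov_deg, size_px):
    """Focal length in pixels for a field of view spanning size_px pixels."""
    if not 0.0 < fov_deg < 180.0:
        raise ValueError("field of view must be in (0, 180) degrees")
    return 0.5 * size_px / math.tan(math.radians(fov_deg) / 2.0)


def fov_from_focal(focal_px, size_px):
    """Field of view in degrees for a focal length given in pixels."""
    if focal_px <= 0.0:
        raise ValueError("focal length must be positive")
    return math.degrees(2.0 * math.atan(0.5 * size_px / focal_px))
```

For example, `focal_from_fov(90.0, 640)` gives a focal length of about 320 pixels, so a network that predicts a horizontal field of view can be compared against a method that estimates a focal length directly.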

  • Research Article
  • Citations: 12
  • 10.1186/s12885-023-11499-6
Radiation pneumonia predictive model for radiotherapy in esophageal carcinoma patients
  • Oct 17, 2023
  • BMC Cancer
  • Liming Sheng + 9 more

Background: Machine learning models with dose factors and deep learning models with a dose distribution matrix have been used to build lung toxicity models for radiotherapy and have achieved promising results. However, few studies have integrated clinical features into deep learning models. This study aimed to explore the role of three-dimensional dose distribution and clinical features in predicting radiation pneumonitis (RP) in esophageal cancer patients after radiotherapy and designed a new hybrid deep learning network to predict the incidence of RP. Methods: A total of 105 esophageal cancer patients previously treated with radiotherapy were enrolled in this study. The three-dimensional (3D) dose distributions within the lung were extracted from the treatment planning system, converted into 3D matrices and used as inputs to predict RP with ResNet. In total, 15 clinical factors were normalized and converted into one-dimensional (1D) matrices. A new prediction model (HybridNet) was then built based on a hybrid deep learning network, which combined 3D ResNet18 and 1D convolution layers. Machine learning-based prediction models, which use the traditional dosiomic factors with and without the clinical factors as inputs, were also constructed and their predictive performance compared with that of HybridNet using tenfold cross validation. Accuracy and area under the receiver operator characteristic curve (AUC) were used to evaluate the model effect. The DeLong test was used to compare the prediction results of the models. Results: The deep learning-based model achieved superior prediction results compared with machine learning-based models. ResNet performed best in the group that only considered dose factors (accuracy, 0.78 ± 0.05; AUC, 0.82 ± 0.25), whereas HybridNet performed best in the group that considered both dose factors and clinical factors (accuracy, 0.85 ± 0.13; AUC, 0.91 ± 0.09). HybridNet had higher accuracy than ResNet (p = 0.009). Conclusion: Based on prediction results, the proposed HybridNet model could predict RP in esophageal cancer patients after radiotherapy with significantly higher accuracy, suggesting its potential as a useful tool for clinical decision-making. This study demonstrated that the information in dose distribution is worth further exploration, and that combining multiple types of features contributes to predicting radiotherapy response.

  • Research Article
  • Citations: 9
  • 10.1109/access.2021.3109865
Accuracy in Depth Recovery and 3D Image Synthesis From Single Image Using Multi-Color Filter Aperture and Shallow Depth of Field
  • Jan 1, 2021
  • IEEE Access
  • Rashmi R Deshpande + 2 more

A computational 3D image generation using a single view with multi-color filter aperture (MCA) and multi-plane representation is a cost-effective approach and most useful when there is no option to acquire either stereo or multi-views with orientation at all. Although this approach generates 3D perception image that includes multiple objects with both similar and dissimilar colors having occluded by each other, it may be insufficient for virtual/augmented reality applications due to inaccurate depth. In this article, we obtain a more accurate geometric depth estimation by formulating a suitable relationship between inter-objects depth of the 3D scene in the depth-of-field (DoF) zone and its corresponding inter-image plane depths of a 3D perception image in depth-of-focus (DoFo) zone of a given camera under shallow DoF zone constraint. But, this shallow depth zone is configured to be dependent only on the focal distance between the lens and object while the remaining parameters such as aperture diameter, focal length, and sensor sensitivity are held at constant values. All-in-focus 3D perception image is synthesized from multi-plane images (MPIs) by utilizing the inter-image plane depths computed from the disparities caused across the boundaries and its smooth surface from image textures inside the respective boundaries of the 2D MCA image. The 2.1D sketch is used as a semantic segmentation technique to determine the number of objects in the 3D scene as one in-focus region and the rest as out-of-focus regions due to the circle of confusions (CoCs) on the fixed image sensor plane. The same enables both ordering of the image regions and identifying occlusion wherever applicable. An accurate depth 3D image is synthesized, replacing accurate inter-depths in place of inter-depth between MPIs used for 3D perception image. 
In the end, the paper summarizes a few experimental validations of the proposed approach, with some salient examples having depth gaps between 0.5 cm and 10.5 cm.
