Abstract
We present photometric stereo algorithms that are robust to non-Lambertian reflection. They are based on a convolutional neural network that estimates the surface normals of objects with complex geometry and surface reflectance from an arbitrary number of images taken from the same viewpoint under different directional illumination conditions. The proposed method focuses on surface normal estimation: multi-scale feature aggregation is used to obtain more accurate surface normals, and max pooling is adopted to obtain an intermediate order-agnostic representation of the input images. The multi-scale feature aggregation scheme, which uses feature concatenation, is easily incorporated into existing photometric stereo network architectures. Experiments on the DiLiGenT photometric stereo benchmark dataset, which consists of ten real objects, demonstrated that our calibrated and uncalibrated photometric stereo approaches were more accurate than the baseline methods. In particular, our uncalibrated photometric stereo outperformed the state-of-the-art method. Our work is the first to consider multi-scale feature aggregation in photometric stereo, and we showed that the proposed multi-scale fusion scheme estimates the surface normal accurately and improves performance.
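The two mechanisms named in the abstract, max pooling across images and multi-scale aggregation by feature concatenation, can be illustrated with a minimal NumPy sketch. This is not the authors' network: the function names `fuse_max` and `aggregate_multiscale`, the array shapes, and the nearest-neighbour upsampling are all assumptions made for illustration only.

```python
import numpy as np


def fuse_max(features):
    """Element-wise max over the image axis.

    features: array of shape (num_images, channels, height, width).
    Returns an array of shape (channels, height, width), whatever
    the number of input images is.
    """
    return np.max(features, axis=0)


def aggregate_multiscale(fine, coarse):
    """Concatenate a fine and a coarse feature map along the channel axis.

    fine:   (c_fine, H, W)
    coarse: (c_coarse, H // s_h, W // s_w) with integer scale factors.
    The coarse map is upsampled by nearest-neighbour repetition so both
    maps share a spatial size (an assumption of this sketch).
    """
    scale_h = fine.shape[1] // coarse.shape[1]
    scale_w = fine.shape[2] // coarse.shape[2]
    upsampled = coarse.repeat(scale_h, axis=1).repeat(scale_w, axis=2)
    return np.concatenate([fine, upsampled], axis=0)


rng = np.random.default_rng(0)

# Hypothetical per-image features: 5 images, 8 channels, 16x16 pixels.
per_image = rng.standard_normal((5, 8, 16, 16))
fused = fuse_max(per_image)

# Shuffling the image order leaves the fused representation unchanged.
shuffled = fuse_max(per_image[rng.permutation(5)])
order_agnostic = np.array_equal(fused, shuffled)

# Hypothetical coarser-scale features: 4 channels at half resolution.
coarse = rng.standard_normal((4, 8, 8))
combined = aggregate_multiscale(fused, coarse)
```

Because the maximum is taken over the image axis, the fused tensor has the same shape for any number of inputs and does not depend on their order, which is the property the abstract refers to as order-agnostic.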
Highlights
In Woodham’s work, the orientation on the surface of an object is determined from a set of at least three images captured from a fixed orthographic camera under different illumination directions
We present a convolutional neural network (CNN)-based method to discover the relationship between the surface normal and a set of an arbitrary number of images taken under a photometric stereo setup
Most of the hyper-parameter settings of the proposed normal estimation network (NENet) for calibrated and uncalibrated photometric stereo follow those of the photometric stereo fully convolutional network (PS-FCN) and SDPS-Net, respectively
Summary
In Woodham’s work, the orientation on the surface of an object is determined from a set of at least three images captured from a fixed orthographic camera under different illumination directions. Example-based photometric stereo uses reference objects with a homogeneous material property, such as multiple types of spheres, that are placed with the target objects in the same scene [34,35,36]. This approach adopts an orientation consistency cue: the same image irradiance value is observed at two different points on the surface of objects having identical surface appearances and surface normals under the same illumination [34]. Deep learning algorithms have recently achieved remarkable progress in various domains, such as computer vision, speech recognition, and natural language processing. Following this trend, recent advances in calibrated and uncalibrated photometric stereo have used deep learning to produce high-fidelity surface normals, with the deep photometric stereo network learning the mapping from the multiple images to the surface normal vector [40,41,42,43,44,45,46].
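The classical setting described at the start of this summary can be sketched as a per-pixel least-squares problem. The sketch assumes a Lambertian surface, known unit light directions, and no shadows, so that each intensity equals the albedo times the dot product of light direction and normal. These assumptions, the function name `lambertian_normals`, and the synthetic values are illustrative and are not taken from the source.

```python
import numpy as np


def lambertian_normals(intensities, light_dirs):
    """Recover per-pixel normals and albedo under a Lambertian model.

    intensities: (num_images, num_pixels) observed intensities.
    light_dirs:  (num_images, 3) unit light directions, num_images >= 3.
    Returns (normals, albedo) with shapes (num_pixels, 3) and (num_pixels,).
    """
    # Solve light_dirs @ g = intensities for g = albedo * normal.
    g, *_ = np.linalg.lstsq(light_dirs, intensities, rcond=None)
    albedo = np.linalg.norm(g, axis=0)
    normals = (g / np.maximum(albedo, 1e-12)).T
    return normals, albedo


# Synthetic single-pixel example with four non-coplanar lights.
light_dirs = np.array([
    [0.0, 0.0, 1.0],
    [0.6, 0.0, 0.8],
    [0.0, 0.6, 0.8],
    [-0.6, 0.0, 0.8],
])
true_normal = np.array([0.2, -0.1, 0.97])
true_normal /= np.linalg.norm(true_normal)
true_albedo = 0.7

# Render noise-free Lambertian intensities (all dot products are positive).
intensities = (true_albedo * light_dirs @ true_normal).reshape(-1, 1)

normals, albedo = lambertian_normals(intensities, light_dirs)
```

Three images are the minimum because the scaled normal has three unknowns; additional images over-determine the system, and the least-squares solution averages out noise. Surfaces that deviate from the Lambertian model break this linear relation, which is the case the learning-based methods discussed above aim to handle.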