Sensing the Sensor: Estimating Camera Properties with Minimal Information
Public outdoor surveillance cameras often have limited metadata describing their properties. Frequently, a public camera’s precise position, orientation, focal length, and image center are unknown; these attributes are necessary to precisely pinpoint the location of events seen in the camera. In this article, we ask: what is the minimal information needed to accurately estimate these properties for public cameras? We show, using a judicious combination of projective geometry, neural networks, and crowd-sourced annotations from human workers, that it is possible to, for example, localize 95% of the cameras in our test data set to within 12 m using a single image taken from the camera. This performance is an order of magnitude better than PoseNet, a state-of-the-art neural network that needs significantly more information than our approach and can only estimate position and orientation (and not other properties). Finally, we show that the camera’s inferred pose and properties can help design a number of virtual sensors, all of which have good accuracy.
- Conference Article
12
- 10.1109/embc46164.2021.9630868
- Nov 1, 2021
Visual inspection of microscopic samples is still the gold-standard diagnostic methodology for many global health diseases. Soil-transmitted helminth infection affects 1.5 billion people worldwide and is the most prevalent of the Neglected Tropical Diseases. It is diagnosed by manual examination of stool samples by microscopy, a time-consuming task that requires trained, highly specialized personnel. Artificial intelligence could automate this task and make diagnosis more accessible, but it needs a large amount of annotated training data coming from experts. In this work, we proposed the use of crowdsourced annotated medical images to train AI models (neural networks) for the detection of soil-transmitted helminthiasis in microscopy images of stool samples, leveraging non-expert knowledge collected through playing a video game. We collected annotations made by both school-age children and adults, and we showed that, although the quality of crowdsourced annotations made by school-age children is slightly inferior to that of annotations made by adults, AI models trained on these crowdsourced annotations perform similarly (AUC of 0.928 and 0.939, respectively) and reach performance similar to that of an AI model trained on expert annotations (AUC of 0.932). We also showed the impact of the training sample size and continuous training on the performance of the AI models. In conclusion, the workflow proposed in this work combines collective and artificial intelligence for detecting soil-transmitted helminthiasis. Embedded within a digital health platform, it can be applied to any other medical image analysis task and contribute to reducing the burden of disease.
- Book Chapter
73
- 10.1007/978-0-85729-327-5_5
- Jan 1, 1999
We describe how 3D affine measurements may be computed from a single perspective view of a scene given only minimal geometric information determined from the image. This minimal information is typically the vanishing line of a reference plane and a vanishing point for a direction not parallel to the plane. It is shown that affine scene structure may then be determined from the image, without knowledge of the camera's internal calibration (e.g. focal length), nor of the explicit relation between camera and world (pose). In particular we show how to: compute the distance between planes parallel to the reference plane (up to a common scale factor); compute area and length ratios on any plane parallel to the reference plane; determine the camera's (viewer's) location. Simple geometric derivations are given for these results. We also develop an algebraic representation which unifies the three types of measurement and, amongst other advantages, permits a first order error propagation analysis to be performed, associating an uncertainty with each measurement. We demonstrate the technique for a variety of applications, including height measurements in forensic images and 3D graphical modelling from single images.
- Research Article
642
- 10.1023/a:1026598000963
- Nov 1, 2000
- International Journal of Computer Vision
We describe how 3D affine measurements may be computed from a single perspective view of a scene given only minimal geometric information determined from the image. This minimal information is typically the vanishing line of a reference plane, and a vanishing point for a direction not parallel to the plane. It is shown that affine scene structure may then be determined from the image, without knowledge of the camera's internal calibration (e.g. focal length), nor of the explicit relation between camera and world (pose). In particular, we show how to (i) compute the distance between planes parallel to the reference plane (up to a common scale factor); (ii) compute area and length ratios on any plane parallel to the reference plane; (iii) determine the camera's location. Simple geometric derivations are given for these results. We also develop an algebraic representation which unifies the three types of measurement and, amongst other advantages, permits a first order error propagation analysis to be performed, associating an uncertainty with each measurement. We demonstrate the technique for a variety of applications, including height measurements in forensic images and 3D graphical modelling from single images.
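The single-view metrology relation underlying measurements (i) and (ii) can be sketched concretely. The snippet below is a minimal illustration, not the authors' implementation: it assumes a synthetic pinhole camera (all numbers are made up for the demo) and uses the metrology identity that the quantity ||b × t|| / ((l · b) ||v × t||) is proportional to the height of a vertical segment with base b and top t, given the reference plane's vanishing line l and the vertical vanishing point v, so the unknown scale cancels in ratios.

```python
import numpy as np

def rel_height(b1, t1, b2, t2, v, l):
    """Ratio Z1/Z2 of two vertical heights from one image.

    b*, t*: homogeneous image points of each segment's base (on the
    reference plane) and top; v: vertical vanishing point; l: vanishing
    line of the reference plane.  The per-segment measure is invariant
    to the homogeneous scale of b and t, and the common scale fixed by
    v and l cancels in the ratio.
    """
    def m(b, t):
        return np.linalg.norm(np.cross(b, t)) / (
            abs(l @ b) * np.linalg.norm(np.cross(v, t)))
    return m(b1, t1) / m(b2, t2)

# Synthetic check: camera 2 m above the ground plane z = 0, pitched down 20 deg.
th = np.deg2rad(20.0)
K = np.diag([800.0, 800.0, 1.0])
R = np.array([[1.0, 0.0, 0.0],
              [0.0, -np.sin(th), -np.cos(th)],
              [0.0, np.cos(th), -np.sin(th)]])
t = -R @ np.array([0.0, 0.0, 2.0])
P = K @ np.hstack([R, t[:, None]])

proj = lambda X: P @ np.append(X, 1.0)      # project a 3D point
v = P @ np.array([0.0, 0.0, 1.0, 0.0])      # vertical vanishing point
l = np.cross(P @ np.array([1.0, 0.0, 0.0, 0.0]),   # vanishing line of z = 0
             P @ np.array([0.0, 1.0, 0.0, 0.0]))

# Two vertical segments of true heights 1.5 m and 0.75 m.
r = rel_height(proj([0, 6, 0]), proj([0, 6, 1.5]),
               proj([1, 7, 0]), proj([1, 7, 0.75]), v, l)
# r recovers the true ratio 1.5 / 0.75 = 2
```

Because the relation is projectively exact, the recovered ratio matches the true one up to floating-point error; with one known reference height, absolute heights follow.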
- Book Chapter
3
- 10.1007/978-3-642-40409-2_30
- Jan 1, 2013
A vision system gives a robotic system the ability to see and to model real-world objects. Many factors can affect robot vision, such as lens distortion, a camera position that is not always at the center of the robot's environment, and the movement of the robot and other objects. In this research, we design a neural-network architecture for global vision in an autonomous mobile robot engine. The scheme concerns the development of a camera calibration technique using a neural network for precise and accurate robot position and orientation. Its goal is to develop a robust camera calibration technique that estimates the parameters of the transformation from real-world coordinates to image coordinates in autonomous mobile robots. The objective of our research is to propose and develop calibration techniques for a global overhead vision system for autonomous mobile robots, aiming to map and identify the identity of a robot under various conditions and camera positions. An artificial neural network (ANN) has been proposed as a method for solving the coordinate transformation problem for non-linear lens distortion. The coordinate transformation was tested by placing cameras at various heights and setting the camera angle with various zoom and focal-length values.
- Research Article
13
- 10.1016/j.asr.2022.07.050
- Jul 25, 2022
- Advances in Space Research
Neural network-based ionospheric modeling and predicting—To enhance high accuracy GNSS positioning and navigation
- Book Chapter
49
- 10.1007/978-3-642-17274-8_15
- Jan 1, 2010
For images taken in man-made scenes, the vanishing points and the focal length of the camera play important roles in scene understanding. In this paper, we present a novel method to quickly, accurately, and simultaneously estimate three orthogonal vanishing points (TOVPs) and the focal length from a single image. Our method is based on the following observation: if we establish a polar coordinate system on the image plane whose origin is at the image center, the angle coordinates of vanishing points can be robustly estimated by seeking peaks in a histogram. From the detected angle coordinates, the altitudes of the triangle formed by the TOVPs are determined, and novel constraints on both the vanishing points and the focal length can be obtained from the three altitudes. Using these constraints, the radial coordinates of the TOVPs and the focal length can be estimated simultaneously. Our method decomposes a 2D Hough parameter space into two cascaded 1D Hough parameter spaces, which makes it much faster and more robust than previous methods without losing accuracy. Extensive experiments on real images have been carried out to test the feasibility and correctness of our method.
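The coupling between orthogonal vanishing points and focal length that such methods exploit can be seen in a standard constraint (not necessarily the exact form the paper derives from the triangle altitudes): for two vanishing points v1, v2 of orthogonal scene directions and principal point c, one has (v1 − c)·(v2 − c) = −f². The sketch below checks this on a synthetic camera with made-up intrinsics and rotation.

```python
import numpy as np

def focal_from_orthogonal_vps(v1, v2, c):
    """Focal length from two vanishing points of orthogonal directions,
    given the principal point c, via (v1 - c).(v2 - c) = -f**2."""
    d = np.dot(np.asarray(v1) - c, np.asarray(v2) - c)
    if d >= 0:
        raise ValueError("points inconsistent with orthogonal directions")
    return np.sqrt(-d)

# Synthetic check: camera with known intrinsics and a generic rotation.
f_true, c = 1000.0, np.array([320.0, 240.0])
ca, sa = np.cos(0.4), np.sin(0.4)
cb, sb = np.cos(0.5), np.sin(0.5)
Rx = np.array([[1.0, 0.0, 0.0], [0.0, ca, -sa], [0.0, sa, ca]])
Ry = np.array([[cb, 0.0, sb], [0.0, 1.0, 0.0], [-sb, 0.0, cb]])
R = Rx @ Ry
K = np.array([[f_true, 0.0, c[0]], [0.0, f_true, c[1]], [0.0, 0.0, 1.0]])

def vp(axis):
    """Vanishing point (in pixels) of one world coordinate axis."""
    h = K @ R[:, axis]
    return h[:2] / h[2]

f_est = focal_from_orthogonal_vps(vp(0), vp(1), c)
# f_est matches f_true up to floating-point error
```

The identity follows from the orthonormality of the rotation columns, which is also why knowing the image center (or recovering it, e.g. as the orthocenter of the TOVP triangle) is enough to pin down the focal length.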
- Conference Article
4
- 10.1109/nsip.2005.1502243
- Jan 1, 2005
Summary form only given, as follows. We constructed an individual identification system for fingerprints using three-layered neural networks, and investigated the effects of the preprocessing method for determining the centers of fingerprint images, and of the neural networks, on the performance of the system. The fingerprint images were classified into four directions (0°, 45°, 90° and 135°), two kinds of smoothing were applied, and the neural networks were optimized. From the results, we found that the preprocessing and the layered neural networks were useful for a high-performance individual identification system, and that the preprocessing method determining the centers of fingerprint images produced higher performance than the system without preprocessing.
- Research Article
6
- 10.5194/isprs-archives-xlii-2-w13-61-2019
- Jun 4, 2019
- The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Abstract. Highly precise, globally available ground control points can be derived from the SAR satellite TerraSAR-X. This opens up many new applications, such as precise aerial image orientation. In this paper, we propose a method for precise aerial image orientation using spaceborne geodetic Synthetic Aperture Radar Ground Control Points (SAR-GCPs). The precisely oriented aerial imagery can then be used, e.g., for mapping urban landmarks that support the ego-positioning of autonomous cars. The method was validated on two aerial image data sets: SAR-GCPs were measured in the images, and the image orientation was then improved by bundle adjustment. Results based on check points show that the accuracy of the image orientation is better than 5 cm in the X and Y coordinates.
- Conference Article
3
- 10.1109/robot.1995.525704
- May 21, 1995
Three-dimensional vision applications, such as robot vision, require modelling of the relationship between the 2D images and the 3D world. Camera calibration is a process which accurately models this relationship. The calibration procedure determines the geometric parameters of the camera, such as focal length and center of the image. Most of the existing calibration techniques use predefined patterns and a static camera. Recently, A. Basu (1993) developed a novel calibration technique for computing the focal length and image center which uses an active camera. This technique does not require any predefined patterns or point-to-point correspondence between images; it needs only a set of scenes with some stable edges. It was observed that the algorithms developed for the image center are sensitive to noise and hence unreliable in real situations. The article extends the techniques provided by Basu to develop a simpler, yet more robust method for computing the image center.
- Research Article
52
- 10.1109/3477.584964
- Jun 1, 1997
- IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics)
Three-dimensional vision applications, such as robot vision, require modeling of the relationship between the two-dimensional images and the three-dimensional world. Camera calibration is a process which accurately models this relationship. The calibration procedure determines the geometric parameters of the camera, such as focal length and center of the image. Most of the existing calibration techniques use predefined patterns and a static camera. Recently, a novel calibration technique for computing the focal length and image center, which uses an active camera, has been developed. This technique does not require any predefined patterns or point-to-point correspondence between images; it needs only a set of scenes with some stable edges. It was observed that the algorithms developed for the image center are sensitive to noise and hence unreliable in real situations. This report extends the techniques provided to develop a simpler, yet more robust method for computing the image center.
- Research Article
20
- 10.1002/mmce.20711
- Feb 14, 2013
- International Journal of RF and Microwave Computer-Aided Engineering
This article presents a systematic analysis and design of the X-band Minkowski reflectarray antenna (RA) using a 3-D Computer Simulation Technology Microwave Studio (CST MWS)-based multilayer perceptron neural network (MLP NN) model of a unit element. This MLP NN model is utilized efficiently as a fast and accurate model within a particle swarm optimization procedure to determine the calibration phasing characteristic belonging to the resultant optimum patch geometry and substrate. In the design stage, the MLP NN analysis model is reversed to determine the variable size of each reflectarray element to meet the necessary phase delay with an adaptive iterative step. In the final stage, the optimum Minkowski RA, consisting of variable-size Minkowski patches interspaced by 0.5 wavelength at a frequency of 11 GHz on Taconic RF-35 with εr = 3.54, tan δ = 0.0018 and the optimum thickness (hopt), is analyzed using 3D CST MWS and compared with the counterpart square and parabolic reflectors. Compared with the counterpart RA with square elements and the parabolic reflector, the optimized Minkowski RA proves capable of providing higher realized gain and a lower sidelobe level (SLL). Furthermore, the effect of feed movement along the focal length on the gain-bandwidth and radiation pattern is also worked out and demonstrated. It is concluded that this method can also be applied as a robust method for the design and analysis of an RA built from arbitrarily shaped patches. © 2013 Wiley Periodicals, Inc. Int J RF and Microwave CAE 23:272-284, 2013.
- Conference Article
235
- 10.1109/icpr.1994.576402
- Oct 9, 1994
Given two arbitrary views of a scene under central projection, if the motion of points on a parametric surface is compensated, the residual parallax displacement field on the reference image is an epipolar field. If the surface aligned is a plane, the parallax magnitude at an image point is directly proportional to the height of the point from the plane and inversely proportional to its depth from the camera. The authors exploit the above theorem to infer 3D height information from oblique aerial 2D images. The authors use direct methods to register the aerial images and develop methods to infer height information under the following three conditions: (i) focal length and image center are both known, (ii) only the focal length is known, and (iii) both are unknown.
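The stated proportionality can be written compactly. In one common plane-plus-parallax formulation (the notation here is a standard one and not necessarily the authors' own symbols), the residual displacement of an image point p after compensating the motion of the reference plane is

```latex
\[
\boldsymbol{\mu}(\mathbf{p})
  \;=\; \gamma \,\frac{T_z}{d_\pi}\,\bigl(\mathbf{p}-\mathbf{e}\bigr),
\qquad
\gamma \;=\; \frac{H}{Z},
\]
```

where $\mathbf{e}$ is the epipole, $T_z$ the forward component of the camera translation, $d_\pi$ the distance of the camera from the reference plane, $H$ the height of the scene point above the plane, and $Z$ its depth. The parallax magnitude is thus directly proportional to $H$ and inversely proportional to $Z$, which is what allows height inference once the three calibration cases (focal length and image center known, only focal length known, both unknown) are resolved.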
- Book Chapter
1
- 10.1007/978-3-642-03156-4_15
- Jan 1, 2009
The objective of camera calibration is to obtain the correlation between camera image coordinates and 3D real-world coordinates. In this paper, we propose a new approach based on a neural network model instead of a physical camera model comprising position, orientation, focal length, and optical center. The neural network employed in this paper is an MLPNN (MultiLayer Perceptron Type Neural Network), which is primarily used as a mapper between 2D image points and points of a certain space in the 3D real world. The neural network model implicitly contains all the physical parameters, some of which are very difficult to estimate with conventional calibration methods. To show the performance of the proposed method, images from two different cameras at three different camera angles were used to calibrate the cameras. The performance of the proposed neural network approach is compared with the well-known Tsai two-stage method in terms of calibration errors. The results show that the proposed approach gives more consistent and acceptable calibration errors than Tsai's two-stage method, regardless of the quality of the camera and the camera angles.
- Book Chapter
7
- 10.1007/978-981-10-0934-1_16
- Jan 1, 2016
Recently, unmanned aerial vehicle (UAV) technology has been growing rapidly and is widely used in civil applications such as aerial mapping, precision agriculture, and power line patrolling. However, most high-precision position and orientation systems (POS) are still heavy and expensive, so they remain impractical to mount on UAVs. In this study, we investigated the feasibility of providing a precise navigation and positioning service for UAVs with a single-frequency GPS/BDS receiver. The challenges of precise positioning for UAVs with low-cost receivers are threefold: (1) signals captured with low-cost receivers and antennas have poorer quality and interference immunity; (2) UAVs move fast and flexibly; (3) UAV navigation requires a real-time, highly reliable positioning solution. Single-frequency GPS RTK can provide centimeter-accuracy positioning in friendly environments, but its availability and reliability hardly meet the requirement. In this study, we introduced GPS/BDS combined RTK positioning, which increases redundancy and improves the precision and reliability of the float solution. Meanwhile, we analyzed the signal and antenna-gain characteristics of a low-cost receiver and antenna, and established a refined stochastic model for GPS and BDS observations. We also made use of Doppler observations to improve the kinematic model and quality control, which significantly improves the success rate of carrier-phase ambiguity resolution in the single-frequency RTK case. To validate the performance of our algorithm, we carried out in-flight tests with a Tersus single-frequency GPS/BDS receiver and analyzed the results. The results indicate that our RTK algorithm can improve the fix rate of ambiguity resolution from 36% to around 60% in challenging environments. We also validated that GPS/BDS RTK has better availability and reliability than GPS stand-alone RTK.
- Research Article
29
- 10.1109/tmag.2018.2827397
- Nov 1, 2018
- IEEE Transactions on Magnetics
Micro and nano technology is an indispensable part of modern science and technology. Because of their excellent advantages of high energy density, rapid response, and large mechanical force, giant magnetostrictive actuators (GMAs) are increasingly promising in the precision positioning, microelectronics, and biomedicine fields [1], [2]. However, the relationship between the input current and output displacement of a GMA is a hysteresis nonlinearity: the output of the GMA depends not only on the current input value but also on previous output values. Moreover, the hysteresis nonlinearity is rate-dependent, so the output of the GMA depends on the input frequency. This intrinsic rate-dependent hysteresis nonlinearity is the main obstacle preventing the application of GMAs in high-precision positioning [3], [4]. Modeling of the GMA has therefore long been a difficult problem and has attracted the attention of researchers. In this paper, a Prandtl-Ishlinskii (PI) model with parameter self-tuning ability is established using an internal time-delay recurrent neural network (RNN) to describe the hysteresis nonlinearity of the GMA. The PI model consists of a play operator and a density function. The play operator is a continuous hysteresis operator whose output depends not only on the current input but also on previous inputs; it is, however, rate-independent. Identifying a suitable density function is an important part of building the PI model of the GMA. Neural networks offer nonlinear mapping and highly parallel processing, which makes them suitable for identifying the nonlinear model. In this paper, the internal time-delay RNN is used to replace the density function of the PI model. The PI model structure identified by the internal time-delay RNN for the GMA is shown in Fig. 1. The internal time-delay RNN consists of an input layer, an output layer, and a hidden layer.
Here the output of the play operator $v_{i}(t)$ is the $i$th input sample of the network at time $t$, and $y^{\ast }(t)$ is the output of the network. $^{1}w_{ji}$ and $^{2}w_{j}$ are the weights of the input and output layers, respectively, and $^{h}w_{jk}$ is the weight for the nodes of the hidden layer. Compared with a feedforward neural network, this network has memory, owing to the time-delay recurrence in the hidden layer. Due to its inherent feedback structure, the internal time-delay RNN possesses dynamic characteristics and can adjust the parameters of the PI model adaptively. The simulation results of the PI model identified by the internal time-delay RNN at different input frequencies are shown in Fig. 2: the red solid lines are the hysteresis loops measured in the experiments, the blue dotted lines are the hysteresis loops of the PI model based on the internal time-delay RNN, and the blue solid lines are the modeling-error curves. To facilitate simulation, normalized data are adopted. As shown in Fig. 2, the maximum modeling error rate at 1 Hz, 10 Hz, 50 Hz, and 100 Hz is 0.81%, 1.07%, 1.41%, and 2.14%, respectively. The PI model identified in this paper can accurately describe the hysteresis nonlinearity of the GMA as the input frequency increases, verifying the ability of the internal time-delay RNN-based PI model to model the actuator precisely. The proposed PI model can be used to compensate for the hysteresis nonlinearity in the control of the GMA and to promote the application of GMAs in the precision positioning field in the future.
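The building blocks of the PI model described above are simple to state in code. The sketch below is a minimal discrete-time illustration, not the paper's identified model: the play operator uses the standard recursion y_k = max(x_k − r, min(x_k + r, y_{k−1})), and the fixed `thresholds` and `weights` are placeholder values standing in for the density function that the paper identifies with the time-delay RNN.

```python
def play(x, r, y0=0.0):
    """Discrete-time play (backlash) operator with threshold r:
    y_k = max(x_k - r, min(x_k + r, y_{k-1}))."""
    y, prev = [], y0
    for xk in x:
        prev = max(xk - r, min(xk + r, prev))
        y.append(prev)
    return y

def pi_model(x, thresholds, weights):
    """Prandtl-Ishlinskii output: weighted superposition of play
    operators.  In the paper the weights (the density function) come
    from the internal time-delay RNN; here they are fixed placeholders."""
    outs = [play(x, r) for r in thresholds]
    return [sum(w * o[k] for w, o in zip(weights, outs))
            for k in range(len(x))]

# A short input sequence run through a two-operator PI model.
u = [0.0, 2.0, 3.0, 1.0, -2.0]
y = pi_model(u, thresholds=[0.0, 1.0], weights=[0.5, 0.5])
# y == [0.0, 1.5, 2.5, 1.5, -1.5]
```

Each play operator's output lags the input inside a band of width 2r, which is exactly the memory effect the abstract describes; making the weights depend on input rate (as the RNN does) is what extends this rate-independent core to the rate-dependent hysteresis of the GMA.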