A human eye has about 120 million rod cells and 6 million cone cells. These light-sensing cells continuously produce a huge quantity of visual signals that flow into the brain for processing. Yet the real-time processing of these signals does not cause excessive energy consumption by the brain. This suggests that human-like vision does not rely on complicated and expensive formulas to compute depth, displacement, and colors. Moreover, the human eye behaves like a camera with pan-tilt (PT) motions. In computer vision, each set of PT parameters (i.e., coefficients of the pan and tilt motions) requires a dedicated calibration to determine the camera’s projection matrix. Since a human eye can produce an infinite number of PT parameter sets, it is unlikely that the brain stores a calibration matrix for each of them. These observations inspire us to look for a simpler and computationally inexpensive solution to three-dimensional (3D) projection in human-like binocular vision. In other words, we ask whether simpler and learning-friendly formulas for computing depth and displacement exist. If they do, these formulas must also be calibration-friendly (i.e., easy to calibrate on the fly or on the go). In this paper, we present the discovery of a new solution to 3D projection in a human-like binocular vision system. This solution is computationally simpler and can easily be learned or calibrated on the fly. The purpose of 3D projection in binocular vision is to undertake forward and inverse transformations (or mappings) between coordinates in 2D digital images and coordinates in a 3D analogue scene.
The formulas underlying the new solution are accurate, easily computable, easily tunable (i.e., they can be calibrated on the fly or on the go), and can be easily implemented by a neural system (i.e., a network of neurons or a network of computational flows). Experimental results have validated the newly derived formulas, which outperform textbook solutions.