The research described in the present article was designed to identify the minimal conditions for the visual perception of 3-dimensional structure from motion by comparing the theoretical limitations of ideal observers with the perceptual performance of actual human subjects on a variety of psychophysical tasks. The research began with a mathematical analysis, which showed that 2-frame apparent motion sequences are theoretically sufficient to distinguish between rigid and nonrigid motion and to identify structural properties of an object that remain invariant under affine transformations, but that 3 or more distinct frames are theoretically necessary to adequately specify properties of Euclidean structure, such as the relative 3-dimensional lengths of, or angles between, nonparallel line segments. A series of four experiments was then performed to verify the psychological validity of this analysis. The results demonstrated that the determination of structure from motion in actual human observers may be restricted to the use of first-order temporal relations, which are available within 2-frame apparent motion sequences. That is to say, the accuracy of observers' judgments did not improve in any of these experiments as the number of distinct frames in an apparent motion sequence was increased from 2 to 8, and performance on tasks involving affine structure was an order of magnitude better than performance on similar tasks involving Euclidean structure.
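
The 2-frame ambiguity summarized above can be illustrated with a minimal Python sketch. It assumes orthographic projection and a rigid rotation about the vertical axis; neither assumption is stated in the abstract. The function names (`rotate_y`, `project`, `depth_from_two_frames`), the sample points, and the two angles are hypothetical choices made for this example only. Under these assumptions, a structure reconstructed with the wrong rotation angle still reproduces both frames exactly. It differs from the true structure only by an affine stretch and shear in depth, so the length ratio of parallel segments is preserved while that of nonparallel segments is not.

```python
import math

def rotate_y(point, theta):
    """Rotate a 3D point about the vertical (y) axis by theta radians."""
    x, y, z = point
    return (x * math.cos(theta) + z * math.sin(theta),
            y,
            -x * math.sin(theta) + z * math.cos(theta))

def project(point):
    """Orthographic projection: keep image coordinates (x, y), drop depth z."""
    return (point[0], point[1])

def depth_from_two_frames(frame1, frame2, assumed_theta):
    """Solve x2 = x1*cos(theta) + z*sin(theta) for z, given an assumed angle."""
    return [(x2 - x1 * math.cos(assumed_theta)) / math.sin(assumed_theta)
            for (x1, _), (x2, _) in zip(frame1, frame2)]

def length(p, q):
    """Euclidean distance between two 3D points."""
    return math.dist(p, q)

# Hypothetical object: segment CD is parallel to AB and twice as long;
# segment AE is not parallel to AB.
A = (0.0, 0.0, 0.0)
B = (1.0, 0.5, 0.3)
C = (-0.4, 0.2, 0.8)
D = (1.6, 1.2, 1.4)   # D = C + 2 * (B - A)
E = (0.5, -0.7, -0.6)
points = [A, B, C, D, E]

true_theta = math.radians(10)    # actual rotation between the two frames
wrong_theta = math.radians(25)   # an observer's incorrect assumption

frame1 = [project(p) for p in points]
frame2 = [project(rotate_y(p, true_theta)) for p in points]

# Reconstruct depth under the wrong angle to get an alternative structure.
z_alt = depth_from_two_frames(frame1, frame2, wrong_theta)
alt = [(x, y, z) for (x, y), z in zip(frame1, z_alt)]

# The alternative structure, rotated by the wrong angle, matches frame 2.
alt_frame2 = [project(rotate_y(p, wrong_theta)) for p in alt]
max_err = max(abs(u[0] - v[0]) + abs(u[1] - v[1])
              for u, v in zip(alt_frame2, frame2))

# Affine property: ratio of parallel segment lengths (CD / AB).
parallel_true = length(C, D) / length(A, B)
parallel_alt = length(alt[2], alt[3]) / length(alt[0], alt[1])

# Euclidean property: ratio of nonparallel segment lengths (AE / AB).
nonparallel_true = length(A, E) / length(A, B)
nonparallel_alt = length(alt[0], alt[4]) / length(alt[0], alt[1])

print("max reprojection error:", max_err)
print("parallel ratio (true, alt):", parallel_true, parallel_alt)
print("nonparallel ratio (true, alt):", nonparallel_true, nonparallel_alt)
```

In this sketch, a third frame taken at a different rotation angle would constrain the assumed angle and remove the ambiguity, which is consistent with the claim that 3 or more distinct frames are needed to specify Euclidean structure.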