In this paper, we introduce the three-dimensional aerial image interface, 3DAII. This interface reconstructs and aerially projects a three-dimensional object image that can be observed with the naked eye from various viewpoints or by multiple users simultaneously. A pyramid reflector reconstructs the object image, and a pair of parabolic mirrors projects it into the air. A user can directly manipulate the three-dimensional object image by superimposing a finger or a rod on it. A motion capture sensor detects the user’s finger as it manipulates the projected image, and the system responds immediately with a reaction such as deformation, displacement, or color change of the object image, accompanied by sound effects. We conducted a performance test to confirm the functions of 3DAII, comparing the execution time of end-tip positioning of a robotic arm across four operating devices: touchscreen, gamepad, joystick, and 3DAII. The results show the advantages of 3DAII: its three-dimensional Euclidean vector output directly specifies the movement direction and movement speed of the end-tip, so the user can intuitively move the end-tip of the robotic arm in three-dimensional space. 3DAII is therefore an important candidate for an intuitive spatial user interface, e.g., an operating device for aerial robots, a center console for automobiles, or a 3D modelling system. We also conducted a survey to evaluate comfort and fatigue based on ISO/TS 9241-411, and ease of learning and satisfaction based on the USE questionnaire. We identified several challenges related to visibility, workspace, and sensory feedback to users, which we would like to address in future work.
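
To illustrate how a three-dimensional Euclidean vector can specify both the movement direction and the movement speed of an end-tip, the following minimal sketch splits a vector into a unit direction and a speed. It is not the implementation used in 3DAII; the function name `vector_to_motion` and the parameters `gain`, `max_speed`, and `dead_zone` are hypothetical and chosen only for illustration.

```python
import math

def vector_to_motion(v, gain=1.0, max_speed=0.1, dead_zone=1e-3):
    """Split a 3D vector into (unit direction, speed).

    All parameters are illustrative assumptions, not values from the paper:
      gain      -- scale factor from vector length to speed
      max_speed -- upper bound on the commanded speed
      dead_zone -- vectors shorter than this produce no motion
    """
    x, y, z = v
    magnitude = math.sqrt(x * x + y * y + z * z)

    # Ignore very small vectors so sensor jitter does not move the arm.
    if magnitude < dead_zone:
        return (0.0, 0.0, 0.0), 0.0

    # The direction is the normalized vector; the speed grows with its
    # length and is clamped to max_speed.
    direction = (x / magnitude, y / magnitude, z / magnitude)
    speed = min(gain * magnitude, max_speed)
    return direction, speed
```

Under these assumptions, a longer vector commands a faster motion in the same direction, and a single three-dimensional input sets all three axes of the end-tip motion at once.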