Video quality assessment (VQA) is important for many video processing applications, e.g., compression, archiving, restoration, and enhancement. An ideal video quality metric should achieve consistency between predicted video distortion and the perceptual judgments of the human visual system. Unlike single-image quality assessment, VQA must carefully consider motion information and temporal distortion. Most previous VQA algorithms handle motion information in one of two ways: either incorporating motion characteristics into a temporal weighting scheme to account for their effects on spatial distortion, or modeling temporal distortion and spatial distortion independently. Both approaches require optical flow estimation. In this paper, we propose a different methodology for handling motion information. Instead of explicitly calculating optical flow and independently modeling temporal distortion, we account for both spatial edge features and temporal motion characteristics through structural features in localized space-time regions. We propose to represent the structural information by two descriptors extracted from the 3-D structure tensor: the largest eigenvalue and its corresponding eigenvector. Experimental results on the LIVE database and the VQEG FR-TV Phase-I database show that the proposed VQA metric is competitive with state-of-the-art VQA metrics, while maintaining relatively low computational complexity.
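The tensor-based descriptors described above can be sketched as follows. This is an illustrative implementation only, not the paper's exact pipeline: the function name, the Gaussian smoothing of the gradient outer products over the local space-time neighborhood, and the use of NumPy/SciPy are all assumptions made for the sketch.

```python
import numpy as np
from scipy import ndimage

def structure_tensor_descriptors(video, sigma=1.5):
    """Illustrative 3-D structure-tensor descriptors for a video volume.

    video: ndarray of shape (T, H, W), grayscale frames.
    Returns the largest eigenvalue and its unit eigenvector at each voxel.
    Hypothetical sketch; parameters and smoothing scheme are assumptions.
    """
    # Spatio-temporal gradients along (t, y, x)
    gt, gy, gx = np.gradient(video.astype(np.float64))

    # Outer products of the gradient, averaged over a localized
    # space-time region via Gaussian smoothing.
    smooth = lambda a: ndimage.gaussian_filter(a, sigma)
    Jxx, Jxy, Jxt = smooth(gx * gx), smooth(gx * gy), smooth(gx * gt)
    Jyy, Jyt, Jtt = smooth(gy * gy), smooth(gy * gt), smooth(gt * gt)

    # Assemble a 3x3 structure tensor per voxel: shape (T, H, W, 3, 3)
    J = np.stack([
        np.stack([Jxx, Jxy, Jxt], axis=-1),
        np.stack([Jxy, Jyy, Jyt], axis=-1),
        np.stack([Jxt, Jyt, Jtt], axis=-1),
    ], axis=-2)

    # eigh returns eigenvalues in ascending order, so index -1
    # gives the largest eigenvalue and its eigenvector.
    w, v = np.linalg.eigh(J)
    lam_max = w[..., -1]       # largest eigenvalue per voxel
    vec_max = v[..., :, -1]    # corresponding unit eigenvector
    return lam_max, vec_max
```

Because the structure tensor is a smoothed sum of gradient outer products, it is positive semidefinite, so the largest eigenvalue is nonnegative; its eigenvector indicates the dominant local space-time orientation.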