This paper addresses the problem of recognizing human actions from depth videos. We propose a 3D action recognition method that combines a depth-based local descriptor with locality-constrained affine subspace coding (LASC). First, each depth video sequence is divided into a set of subsequences (i.e., multi-scale sub-actions) based on the normalized motion energy vector. Next, gradient local auto-correlation features computed on depth motion maps are used to capture the shape information and motion cues of each sub-action. To obtain a discriminative and compact representation, we extract the local high-order information of the depth video using LASC. Experiments show that LASC outperforms existing methods such as locality-constrained linear coding. On four datasets, we compare the proposed method with state-of-the-art methods based on similar principles that use features extracted from a single modality, as well as with methods that use multiple features or nonlinear recognition machines. The results demonstrate the effectiveness of the proposed method.
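The energy-based segmentation step can be illustrated with a minimal sketch. The abstract does not define the normalized motion energy vector, so the definition used here is an assumption: the energy of a frame transition is the number of pixels whose depth changes by more than a threshold, the energies are accumulated over time and scaled to end at 1, and the sequence is cut where the cumulative energy crosses equal fractions. The function names `normalized_motion_energy` and `split_by_energy`, the `threshold` parameter, and the list-of-lists frame format are hypothetical and chosen only for illustration.

```python
from bisect import bisect_left


def normalized_motion_energy(frames, threshold=0):
    """Cumulative motion energy per frame, scaled so the last value is 1.0.

    frames: list of 2D lists (rows of depth values), all of the same shape.
    Assumed definition: the energy of a transition is the number of pixels
    whose absolute depth change exceeds `threshold`.
    """
    cumulative = [0.0]
    for prev, curr in zip(frames, frames[1:]):
        moved = sum(
            1
            for row_p, row_c in zip(prev, curr)
            for p, c in zip(row_p, row_c)
            if abs(c - p) > threshold
        )
        cumulative.append(cumulative[-1] + moved)

    total = cumulative[-1]
    if total == 0:
        # Static clip: fall back to a vector that grows uniformly in time.
        n = len(frames) - 1
        return [i / n if n else 0.0 for i in range(len(frames))]
    return [e / total for e in cumulative]


def split_by_energy(frames, num_segments, threshold=0):
    """Frame-index boundaries that cut the clip into equal-energy parts."""
    energy = normalized_motion_energy(frames, threshold)
    bounds = [0]
    for k in range(1, num_segments):
        # First frame whose cumulative energy reaches the k-th fraction.
        bounds.append(bisect_left(energy, k / num_segments))
    bounds.append(len(frames) - 1)
    return bounds
```

Calling `split_by_energy` with several values of `num_segments` would give sub-actions at several temporal scales. Under this assumed definition, segments are short where motion is fast and long where the subject is nearly still.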