Motivated by the discriminative ability of shape information and local patterns in object recognition, this paper proposes a window-based object descriptor that integrates both cues. In particular, contour templates representing object shape are used to derive a set of key points at which local appearance features are extracted. These key points are located using an improved template matching method that utilises both spatial and orientation information in a simple and effective way. At each extracted key point, a new local appearance feature, the non-redundant local binary pattern (NR-LBP), is computed. The object descriptor is formed by concatenating the NR-LBP features from all key points, so that it encodes both the shape and the appearance of the object. The proposed descriptor was extensively tested on the task of detecting humans in static images, using the commonly used MIT and INRIA datasets. The experimental results show that the proposed descriptor can effectively describe non-rigid objects with high articulation and improves the detection rate compared to other state-of-the-art object descriptors.
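As an illustration of the descriptor construction summarised above, the following is a minimal sketch rather than the implementation evaluated in the paper. It assumes that NR-LBP maps an 8-neighbour LBP code and its bitwise complement to the same value, i.e. min(LBP, 255 - LBP), so that a pattern and its contrast-inverted counterpart are not counted separately. It further assumes that the feature at each key point is a normalised histogram of these codes over a small square patch. The key points are taken as given; the template matching step is not reproduced. All function names, the neighbour ordering and the `half_size` parameter are hypothetical.

```python
import numpy as np

# Offsets (dy, dx) of the 8 neighbours at radius 1, in clockwise order.
# The ordering is an assumed convention; any fixed order works.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_codes(image):
    """Basic 8-neighbour LBP code for every interior pixel of a 2-D array."""
    img = np.asarray(image, dtype=np.int32)
    h, w = img.shape
    centre = img[1:h - 1, 1:w - 1]
    codes = np.zeros_like(centre)
    for bit, (dy, dx) in enumerate(NEIGHBOURS):
        neighbour = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit wherever the neighbour is at least as bright as the centre.
        codes |= (neighbour >= centre).astype(np.int32) << bit
    return codes


def nr_lbp_codes(image):
    """Assumed NR-LBP: merge each LBP code with its 8-bit complement."""
    codes = lbp_codes(image)
    return np.minimum(codes, 255 - codes)  # values lie in 0..127


def nr_lbp_histogram(patch):
    """Normalised 128-bin histogram of NR-LBP codes over a patch."""
    codes = nr_lbp_codes(patch)
    hist = np.bincount(codes.ravel(), minlength=128).astype(np.float64)
    return hist / max(hist.sum(), 1.0)


def describe(image, key_points, half_size=4):
    """Concatenate per-key-point histograms into one descriptor.

    key_points are (row, col) pairs assumed to lie at least half_size
    pixels away from the image border.
    """
    img = np.asarray(image)
    feats = []
    for y, x in key_points:
        patch = img[y - half_size:y + half_size + 1,
                    x - half_size:x + half_size + 1]
        feats.append(nr_lbp_histogram(patch))
    return np.concatenate(feats)
```

Under these assumptions each key point contributes 128 values, so a window with K key points yields a descriptor of length 128K. Merging complementary codes halves the histogram size relative to plain 8-bit LBP and, ties aside, leaves the feature unchanged when foreground and background intensities are swapped.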