Proposed is a discriminative scale-invariant feature transform (D-SIFT) for facial expression recognition. Keypoint descriptors of the SIFT features are used to construct distinctive facial feature vectors. Kullback-Leibler divergence is used for the initial classification of the localised facial expressions, and a weighted majority voting classifier is employed to fuse the decisions obtained from the localised rectangular facial regions into an overall decision. Experiments on the 3D-BUFE database illustrate that D-SIFT is effective and efficient for facial expression recognition.
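
The two decision stages described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the per-class template vectors, the region weights, and the step that normalises each non-negative descriptor vector into a probability distribution before computing the divergence are all assumptions made for illustration.

```python
import numpy as np


def kl_divergence(p, q, eps=1e-10):
    """Kullback-Leibler divergence D(p || q) between two non-negative vectors.

    Assumption: each vector is smoothed by eps and normalised to sum to 1,
    so that it can be treated as a discrete probability distribution.
    """
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(p * np.log(p / q)))


def classify_region(feature, templates):
    """Initial classification of one facial region.

    `templates` (hypothetical) maps each expression label to a reference
    feature vector; the label with the smallest divergence is returned.
    """
    return min(templates, key=lambda label: kl_divergence(feature, templates[label]))


def weighted_majority_vote(region_labels, region_weights):
    """Fuse per-region decisions: sum the weights per label, return the largest."""
    scores = {}
    for label, weight in zip(region_labels, region_weights):
        scores[label] = scores.get(label, 0.0) + weight
    return max(scores, key=scores.get)


if __name__ == "__main__":
    # Toy 3-bin vectors standing in for SIFT-based region features.
    templates = {"happy": [0.7, 0.2, 0.1], "sad": [0.1, 0.2, 0.7]}
    regions = [[0.6, 0.3, 0.1], [0.2, 0.2, 0.6], [0.1, 0.3, 0.6]]
    weights = [0.9, 0.3, 0.3]  # hypothetical per-region reliabilities

    labels = [classify_region(r, templates) for r in regions]
    print(labels, weighted_majority_vote(labels, weights))
```

In this toy run, one heavily weighted region outvotes two lightly weighted ones, which is the behaviour that distinguishes weighted from plain majority voting. How the weights are obtained is not stated in the abstract, so they are simply given here.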