Facial expression recognition (FER) is an essential part of effective human–computer interaction and serves as a helpful medium for children and patients who have difficulty communicating. However, most previous studies build FER models using either supervised or unsupervised approaches. This paper focuses on a semi-supervised deep belief network (DBN) approach to predict facial expressions from the CK+, Oulu-CASIA, MMI, and JAFFE datasets. To achieve accurate classification of the facial expressions, a gravitational search algorithm (GSA) is applied to optimize some of the DBN parameters. Histogram of oriented gradients (HOG) and the 2D discrete wavelet transform (2D-DWT) are used to extract features from the lip, cheek, brow, eye, and furrow patches. Unwanted information present in the image is eliminated using a feature selection approach. Kernel principal component analysis (kernel PCA) is applied to capture higher-order correlations between input variables and to detect non-linear components. The HOG features extracted from the lip patch provide the best performance for accurate facial expression classification. Finally, the proposed model is compared with different machine learning techniques based on the evaluation criteria. The results demonstrate that the DBN-GSA-based classifier is more accurate than the other classifiers.
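
To make the role of the GSA concrete, the following is a minimal sketch of a basic gravitational search loop, not the implementation used in this work. It assumes a box-constrained minimization problem; the function name `gsa_minimize`, the constants `g0` and `alpha`, and the objective `toy_error` are all hypothetical. In particular, `toy_error` is only a stand-in for the validation error of a DBN as a function of two of its parameters, which the abstract does not specify.

```python
import math
import random


def gsa_minimize(objective, bounds, n_agents=20, n_iter=100,
                 g0=1.0, alpha=20.0, seed=0):
    """Minimize `objective` over the box `bounds` with a basic GSA sketch.

    g0 and alpha are illustrative constants, not values from the paper.
    Returns (best_position, best_fitness, best_so_far_history).
    """
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the box, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_agents)]
    vel = [[0.0] * dim for _ in range(n_agents)]
    best_pos, best_fit, history = None, math.inf, []

    for t in range(n_iter):
        fit = [objective(p) for p in pos]
        # Track the best solution seen so far.
        i_best = min(range(n_agents), key=fit.__getitem__)
        if fit[i_best] < best_fit:
            best_fit, best_pos = fit[i_best], list(pos[i_best])
        history.append(best_fit)

        # Masses: a lower (better) fitness gives a heavier agent.
        worst, best = max(fit), min(fit)
        if worst == best:
            raw = [1.0] * n_agents
        else:
            raw = [(worst - f) / (worst - best) for f in fit]
        total = sum(raw)
        mass = [m / total for m in raw]

        # The gravitational constant decays and the attracting elite shrinks.
        g = g0 * math.exp(-alpha * t / n_iter)
        k_best = max(1, round(n_agents * (1 - t / n_iter)))
        elite = sorted(range(n_agents), key=fit.__getitem__)[:k_best]

        # Acceleration of each agent from randomly weighted pulls of the elite.
        acc = [[0.0] * dim for _ in range(n_agents)]
        for i in range(n_agents):
            for j in elite:
                if j == i:
                    continue
                dist = math.dist(pos[i], pos[j])
                for d in range(dim):
                    pull = g * mass[j] * (pos[j][d] - pos[i][d]) / (dist + 1e-12)
                    acc[i][d] += rng.random() * pull

        # Update velocities and positions, clamping positions to the box.
        for i in range(n_agents):
            for d, (lo, hi) in enumerate(bounds):
                vel[i][d] = rng.random() * vel[i][d] + acc[i][d]
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))

    return best_pos, best_fit, history


def toy_error(params):
    """Hypothetical stand-in for a DBN validation error over two parameters."""
    a, b = params
    return (a - 0.1) ** 2 + (b - 0.9) ** 2


best_params, best_err, history = gsa_minimize(toy_error, [(0.0, 1.0), (0.0, 1.0)])
print(best_params, best_err)
```

In a real pipeline the objective would train and evaluate the DBN for each candidate parameter vector, so every fitness evaluation is expensive; the number of agents and iterations would need to be chosen with that cost in mind.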