Abstract

Facial expressions are closely related to drivers' emotions, so recognizing them can support safe-driving detection in advanced driver assistance systems (ADAS). Recently, deep learning techniques have become prevalent for facial expression recognition. In driving scenarios, however, facial expression recognition is challenged mainly by small sample sizes and by the difficulty of learning effective feature representations. To address these issues, this paper proposes a transfer learning‐based method for driver facial expression recognition that exploits other sources of facial expression data, which may follow different distributions. Specifically, an enhanced feature attention module is first devised so that rich multi‐scale features can be extracted and refined using spatial and channel attention mechanisms. Then, a joint correlation alignment loss is introduced so that samples from the source and target domains are mapped into a shared subspace, reducing the discrepancy between both their marginal and conditional distributions. Multiple transfer learning tasks on real‐world data are carried out to evaluate the proposed method. The experimental results show that the proposed model achieves higher recognition accuracy for driver facial expressions than several traditional and deep learning‐based transfer learning algorithms.
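
To make the idea of aligning both marginal and conditional distributions concrete, the following is a minimal sketch rather than the paper's actual loss. It assumes the standard correlation alignment (CORAL) distance, i.e. the squared Frobenius norm between source and target feature covariances scaled by 1/(4d²), for the marginal term, and adds a per-class version of the same distance for the conditional term. The function names, the equal weighting of the two terms, and the use of pseudo-labels for unlabeled target samples are all illustrative assumptions not stated in the abstract.

```python
import numpy as np


def coral_loss(source, target):
    """Standard CORAL distance between two feature batches of shape (n, d)."""
    d = source.shape[1]
    cov_s = np.cov(source, rowvar=False)  # covariance of source features
    cov_t = np.cov(target, rowvar=False)  # covariance of target features
    return float(np.sum((cov_s - cov_t) ** 2) / (4.0 * d * d))


def joint_coral_loss(source, source_labels, target, target_pseudo_labels, num_classes):
    """Hypothetical joint loss: marginal CORAL plus the sum of per-class CORAL terms.

    Assumption: target labels are unavailable, so pseudo-labels (for example,
    the classifier's current predictions) stand in for them.
    """
    marginal = coral_loss(source, target)
    conditional = 0.0
    for c in range(num_classes):
        s_c = source[source_labels == c]
        t_c = target[target_pseudo_labels == c]
        # A covariance estimate needs at least two samples per class and domain.
        if len(s_c) > 1 and len(t_c) > 1:
            conditional += coral_loss(s_c, t_c)
    return marginal + conditional
```

In a deep model this quantity would typically be computed on mini-batch features with an automatic differentiation framework and added to the classification loss; NumPy is used here only to keep the sketch self-contained.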