Face recognition is widely used and has become a ubiquitous part of daily life. However, original face images are often sensitive, and unauthorized access to them could compromise personal information. To protect the privacy and security of multimodal face data, we propose an approach that integrates differential privacy with a lightweight convolutional neural network built on unsupervised, predefined quaternion-type filters. First, the different modalities of a face image are combined in a full quaternion matrix representation so that they can be processed simultaneously. Then, the proposed quaternion two-dimensional discrete orthogonal Stockwell transform is used to implement differential privacy, with perturbation introduced through the Laplace and exponential mechanisms. Next, we devise a three-stage stacked quaternion two-dimensional principal component analysis network to learn representative features. The proposed network enhances discriminative characteristics by inserting nonlinearity and by uniting feature maps from different levels so that their information is complementary. Experiments conducted on four multimodal face datasets show that the proposed method achieves superior recognition performance in comparison with other state-of-the-art approaches.
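
As an illustration of two of the steps named above, the following minimal Python sketch packs four image components into one quaternion-valued array and perturbs an array of coefficients with the standard Laplace mechanism. It is a sketch under stated assumptions, not the implementation evaluated in this work: the function names `pack_quaternion` and `laplace_mechanism`, the `(H, W, 4)` storage layout, the assignment of modalities to quaternion components, and the sensitivity and epsilon values are all hypothetical, and the quaternion Stockwell transform itself is not implemented (a placeholder array stands in for its coefficients).

```python
import numpy as np


def pack_quaternion(real, i_part, j_part, k_part):
    """Stack four equally sized 2-D arrays into an (H, W, 4) quaternion array.

    Assumption: the real part and the three imaginary parts each carry one
    modality or channel; the actual assignment is not specified here.
    """
    parts = [np.asarray(p, dtype=np.float64) for p in (real, i_part, j_part, k_part)]
    if len({p.shape for p in parts}) != 1:
        raise ValueError("all four components must share one shape")
    return np.stack(parts, axis=-1)


def laplace_mechanism(values, sensitivity, epsilon, rng=None):
    """Add i.i.d. Laplace noise with scale sensitivity / epsilon.

    This is the textbook Laplace mechanism applied element-wise; the values
    of `sensitivity` and `epsilon` are illustrative assumptions.
    """
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    rng = np.random.default_rng() if rng is None else rng
    scale = sensitivity / epsilon
    values = np.asarray(values, dtype=np.float64)
    return values + rng.laplace(loc=0.0, scale=scale, size=values.shape)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Placeholder data standing in for four aligned face-image components.
    components = [rng.random((8, 8)) for _ in range(4)]
    q = pack_quaternion(*components)
    # Placeholder: in the described pipeline the noise would be applied to
    # transform coefficients; here it is applied to the packed array directly.
    noisy = laplace_mechanism(q, sensitivity=1.0, epsilon=0.5, rng=rng)
    print(q.shape, noisy.shape)
```

In this sketch a smaller `epsilon` gives a larger noise scale and therefore stronger perturbation, at the cost of distorting the data more heavily.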