Emotion recognition from EEG signals has become a significant research area in recent years because of its use in brain-machine interfaces (BMI). Improving the robustness of emotion classification is one of the most fundamental ways to improve the quality of emotion recognition systems. Approaches to this problem fall into two main branches: one extracts features through manual engineering, while the other relies on artificial intelligence methods that infer features directly from the EEG data. This study proposes a novel method that accounts for the characteristic behavior of EEG recordings and is based on the artificial intelligence approach. The EEG signal is noisy, non-stationary, and non-linear. Its frequency components are obtained using the Empirical Wavelet Transform (EWT) signal decomposition method, and frequency-based, linear, and non-linear features are then extracted. The resulting features are mapped onto a 2-D plane according to the positions of the EEG electrodes, and these 2-D images are merged to construct 3-D images. In this way, the multichannel frequency content of the EEG recordings is combined with their spatial and temporal relationships. Lastly, a 3-D deep learning framework is constructed that combines a convolutional neural network (CNN), bidirectional long short-term memory (BiLSTM), and a gated recurrent unit (GRU) with self-attention (AT). This model is named EWT-3D–CNN–BiLSTM-GRU-AT. The result is a framework in which handcrafted features are cascaded into state-of-the-art deep learning models. The framework is evaluated on the DEAP recordings using a person-independent approach. The experimental findings demonstrate that the developed model achieves classification accuracies of 90.57% and 90.59% on the valence and arousal axes, respectively, for the DEAP database.
Compared with existing state-of-the-art emotion classification models, the proposed framework achieves superior results in classifying human emotions.
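The electrode-position mapping described above can be sketched as follows. This is a minimal illustration, not the paper's actual implementation: the 9×9 grid size, the electrode-to-cell coordinates, and the function names are all assumptions made for the example, and only a subset of DEAP's 32 channels is shown.

```python
import numpy as np

# Hypothetical subset of DEAP's 32 electrodes placed on a 9x9 scalp grid;
# real coordinates would follow the 10-20 montage used in the paper.
ELECTRODE_GRID = {
    "Fp1": (0, 3), "Fp2": (0, 5),
    "F3":  (2, 2), "Fz":  (2, 4), "F4": (2, 6),
    "C3":  (4, 2), "Cz":  (4, 4), "C4": (4, 6),
    "P3":  (6, 2), "Pz":  (6, 4), "P4": (6, 6),
    "O1":  (8, 3), "O2":  (8, 5),
}

def features_to_2d(channel_features):
    """Place one scalar feature per electrode onto a 9x9 grid (2-D image)."""
    grid = np.zeros((9, 9))
    for name, value in channel_features.items():
        row, col = ELECTRODE_GRID[name]
        grid[row, col] = value
    return grid

def stack_to_3d(per_band_features):
    """Merge one 2-D feature map per frequency band into a 3-D image."""
    return np.stack([features_to_2d(f) for f in per_band_features], axis=-1)

# Example: two frequency bands, one feature value per electrode.
bands = [
    {name: 1.0 for name in ELECTRODE_GRID},
    {name: 2.0 for name in ELECTRODE_GRID},
]
tensor = stack_to_3d(bands)
print(tensor.shape)  # (9, 9, 2): height x width x bands
```

A tensor of this shape is the kind of input a 3-D CNN front end can consume, with spatial structure in the first two axes and frequency-band structure in the third.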