In motor imagery brain-computer interface (MI-BCI) research, traditional electroencephalogram (EEG) recognition algorithms are often inefficient at extracting EEG features and yield limited classification accuracy. In this paper, we address this problem with a novel step-by-step method of feature extraction and pattern classification for multiclass MI-EEG signals. First, the training data from all subjects are merged and augmented with an autoencoder, meeting the need for large amounts of data while mitigating the adverse effects of the randomness, instability, and inter-subject variability of EEG signals on recognition. Second, an end-to-end shared structure, an attention-based time-incremental shallow convolutional neural network, is proposed. A shallow convolutional neural network (SCNN) and a bidirectional long short-term memory (BiLSTM) network extract the frequency-spatial domain features and the time-series features of the EEG signals, respectively. An attention model is then introduced into the feature fusion layer to dynamically weight these temporal-frequency-spatial domain features, which substantially reduces feature redundancy and improves classification accuracy. Finally, validation on the BCI Competition IV 2a data set shows that the classification accuracy and kappa coefficient reach 82.7 ± 5.57% and 0.78 ± 0.074, demonstrating the method's advantages in improving classification accuracy and reducing inter-subject differences within a single shared network.
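The attention-based fusion step described above can be illustrated with a minimal sketch: feature vectors from two branches (analogous to the SCNN's frequency-spatial features and the BiLSTM's temporal features) are scored by a small learned function, the scores are normalized with a softmax, and the branches are combined as a weighted sum. The scoring function, dimensions, and parameter names here are illustrative assumptions, not the paper's exact architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the max before exponentiating.
    e = np.exp(x - np.max(x, axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_fuse(features, w, b=0.0):
    """Dynamically weight a stack of branch features.

    features: (n_branches, d) array, e.g. row 0 = frequency-spatial
              features, row 1 = temporal features, projected to a
              common dimension d (an assumption for this sketch).
    w, b:     parameters of a one-layer scoring function (illustrative).
    Returns the attention-weighted fused feature vector of shape (d,).
    """
    scores = features @ w + b          # one scalar score per branch
    alpha = softmax(scores)            # attention weights sum to 1
    return alpha @ features            # convex combination of branches

rng = np.random.default_rng(0)
feats = rng.standard_normal((2, 8))   # two branches, 8-dim features
w = rng.standard_normal(8)            # hypothetical learned scoring weights
fused = attention_fuse(feats, w)
print(fused.shape)                    # (8,)
```

In a trained network, `w` and `b` would be learned jointly with the SCNN and BiLSTM so that the more informative branch receives the larger weight for each trial.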