Motor imagery brain-computer interfaces (MI-BCIs) based on electroencephalography (EEG) enable direct communication between the human brain and external devices. This paper proposes MTFB-CNN, a parallel multi-scale time-frequency block convolutional neural network with a channel attention module, for decoding EEG signals. The model adaptively extracts time-domain, frequency-domain, and time-frequency-domain features through parallel multi-scale time-frequency blocks, and then fuses and filters these features through an attention mechanism and a residual module. Experimental results on the BCI Competition IV 2a and 2b datasets and the High Gamma Dataset show that the model achieves higher average accuracy and kappa than existing baseline models. MTFB-CNN is a novel and effective end-to-end model that decodes EEG signals without complex signal pre-processing, and its multi-scale feature extraction capability makes it well suited to MI-BCI applications.
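To make the data flow concrete, the following is a minimal NumPy sketch of the general idea: parallel temporal convolutions at several kernel lengths, concatenation of the branches, squeeze-and-excitation style channel gating, and a residual sum. It is not the MTFB-CNN implementation. The function names, kernel sizes, reduction ratio, input shape, and random (untrained) weights are all assumptions chosen for illustration, and the frequency and time-frequency branches are omitted.

```python
import numpy as np


def temporal_conv(x, kernel):
    """Convolve every channel along time with 'same' output length.

    x: array of shape (channels, time); kernel: 1-D array shorter than time.
    """
    return np.stack([np.convolve(ch, kernel, mode="same") for ch in x])


def channel_attention(feats, w1, w2):
    """Squeeze-and-excitation style gating (assumed form of the attention).

    feats: array of shape (maps, time). Returns gated features and the gates.
    """
    squeezed = feats.mean(axis=1)                 # global average over time
    hidden = np.maximum(w1 @ squeezed, 0.0)       # bottleneck + ReLU
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))  # sigmoid, one gate per map
    return feats * gates[:, None], gates


def multi_scale_block(x, kernel_sizes, rng, reduction=4):
    """Parallel multi-scale branches -> concat -> attention -> residual sum."""
    # One branch per kernel length; short kernels see fast dynamics,
    # long kernels see slow dynamics. Weights are random placeholders.
    branches = [
        temporal_conv(x, rng.standard_normal(k) / np.sqrt(k))
        for k in kernel_sizes
    ]
    fused = np.concatenate(branches, axis=0)      # stack branch outputs
    n_maps = fused.shape[0]
    n_hidden = max(n_maps // reduction, 1)
    w1 = rng.standard_normal((n_hidden, n_maps)) * 0.1
    w2 = rng.standard_normal((n_maps, n_hidden)) * 0.1
    attended, gates = channel_attention(fused, w1, w2)
    return fused + attended, gates                # residual connection


rng = np.random.default_rng(0)
trial = rng.standard_normal((22, 250))            # hypothetical (channels, samples)
features, gates = multi_scale_block(trial, kernel_sizes=(15, 31, 63), rng=rng)
print(features.shape)                             # (66, 250): 3 branches x 22 channels
```

In a trained network the kernels and attention weights would be learned end to end; here they are random, so only the shapes and the order of operations are meaningful.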