Motor imagery (MI) based brain-computer interfaces (BCIs) decode a user's intentions from electroencephalography (EEG) signals to enable control of, and interaction with, external devices. In this paper, we first apply Riemannian geometry to the covariance matrices obtained after spatial filtering to extract robust and discriminative features. We then develop a multiscale temporal-spectral segmentation scheme to enrich the feature set. To determine the optimal feature configuration, we use a linear learning-based temporal window and spectral band (TWSB) selection method that evaluates the contribution of each feature, removing redundant features and improving decoding efficiency without excessive loss of accuracy. Finally, support vector machines predict the class labels from the selected MI features. We evaluate the model on the publicly available BCI Competition IV datasets 2a and 2b. The method achieves average accuracies of 79.1% and 83.1%, respectively, outperforming existing methods. Using TWSB feature selection instead of all features improves accuracy by up to about 6% and also reduces the computational burden. We believe that the framework reveals more interpretable feature information in motor imagery EEG signals, yields highly discriminative neural response features, and facilitates real-time MI-BCI.
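As a rough illustration of the Riemannian feature step over several temporal windows, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the trials have already been band-pass filtered and spatially filtered upstream, uses the log-Euclidean mean as the reference point (the paper may use a different mean), and the function names `window_covariances` and `tangent_space_features` are hypothetical.

```python
import numpy as np


def _sym_fn(mat, fn):
    """Apply a scalar function to the eigenvalues of a symmetric matrix."""
    vals, vecs = np.linalg.eigh(mat)
    return (vecs * fn(vals)) @ vecs.T


def window_covariances(trials, fs, windows):
    """Sample covariance of each trial within each temporal window.

    trials:  array of shape (n_trials, n_channels, n_samples)
    fs:      sampling rate in Hz
    windows: list of (start_s, end_s) tuples in seconds
    Returns a dict mapping each window to an (n_trials, n_channels, n_channels) array.
    """
    out = {}
    for t0, t1 in windows:
        seg = trials[:, :, int(t0 * fs):int(t1 * fs)]
        seg = seg - seg.mean(axis=2, keepdims=True)  # remove per-channel mean
        out[(t0, t1)] = seg @ seg.transpose(0, 2, 1) / (seg.shape[2] - 1)
    return out


def tangent_space_features(covs, ref=None):
    """Project SPD matrices onto the tangent space at a reference point.

    Assumption: when no reference is given, the log-Euclidean mean of the
    input matrices is used as the reference point.
    """
    covs = np.asarray(covs, dtype=float)
    if ref is None:
        mean_log = np.mean([_sym_fn(c, np.log) for c in covs], axis=0)
        ref = _sym_fn(mean_log, np.exp)
    ref_isqrt = _sym_fn(ref, lambda v: 1.0 / np.sqrt(v))
    iu = np.triu_indices(covs.shape[1])
    # Off-diagonal entries get a sqrt(2) weight so the vector norm matches
    # the Frobenius norm of the symmetric tangent matrix.
    weights = np.where(iu[0] == iu[1], 1.0, np.sqrt(2.0))
    feats = [weights * _sym_fn(ref_isqrt @ c @ ref_isqrt, np.log)[iu] for c in covs]
    return np.array(feats)
```

Features computed for each window (and, in a fuller version, each spectral band) could then be concatenated, ranked by a linear model's weights, and passed to a support vector machine; those steps are omitted here.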