Objective. To increase the number of states that a brain–computer interface (BCI) can classify, we used a motor imagery task in which subjects imagined both the force and the speed of hand clenching. Approach. The BCI used simultaneously recorded electroencephalographic (EEG) and functional near-infrared spectroscopy (fNIRS) signals. A time-phase-frequency feature was extracted from the EEG, whereas the HbD feature [the difference between oxy-hemoglobin (HbO) and deoxy-hemoglobin (Hb)] was used to improve the classification accuracy of fNIRS. The EEG and fNIRS features were combined and optimized using the joint mutual information (JMI) feature selection criterion, and the resulting features were then classified with extreme learning machines (ELMs). Main results. The average classification accuracy of EEG signals obtained with the time-phase-frequency feature improved by 7% to 18% over the single-type features and by 15% over the common spatial pattern (CSP) feature. The HbD feature of fNIRS signals improved the accuracy by 1% to 4% over Hb, HbO, or HbT (total hemoglobin). The combined EEG–fNIRS feature decoded motor imagery of both force and speed of hand clenching with an accuracy of 89% ± 2%, an improvement of 1% to 5% over the EEG or fNIRS feature alone. Significance. Our novel motor imagery paradigm improves BCI performance by increasing the number of extracted commands. The time-phase-frequency and HbD features improve the classification accuracy of EEG and fNIRS signals, respectively, and the hybrid EEG–fNIRS technique achieves a higher decoding accuracy for two-class motor imagery, which may provide a framework for future multi-modal online BCI systems.
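The Approach step can be illustrated with a minimal Python sketch, under stated assumptions. Only the definition HbD = HbO − Hb and the use of an ELM classifier on combined EEG and fNIRS features come from the abstract. The class `SimpleELM`, the function `demo`, the sigmoid hidden layer, the hidden-layer size, and the synthetic data are hypothetical choices made for illustration. The time-phase-frequency extraction and the JMI selection step are omitted, so this is not the authors' pipeline.

```python
import numpy as np


def hbd(hbo, hb):
    """HbD as defined in the abstract: oxy- minus deoxy-hemoglobin."""
    return np.asarray(hbo, dtype=float) - np.asarray(hb, dtype=float)


class SimpleELM:
    """Basic extreme learning machine (illustrative, not the authors' code).

    Hidden weights are random and fixed; only the output weights are
    solved for, in closed form, by least squares.
    """

    def __init__(self, n_hidden=20, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Sigmoid activation of the random hidden layer (assumed choice).
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        # One-hot targets, one column per class.
        T = (y[:, None] == self.classes_[None, :]).astype(float)
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        # Output weights via the Moore-Penrose pseudoinverse.
        self.beta = np.linalg.pinv(self._hidden(X)) @ T
        return self

    def predict(self, X):
        H = self._hidden(np.asarray(X, dtype=float))
        return self.classes_[np.argmax(H @ self.beta, axis=1)]


def demo(seed=0):
    """Train on synthetic two-class data and return held-out accuracy."""
    rng = np.random.default_rng(seed)
    n = 100  # synthetic trials per class
    y = np.repeat([0, 1], n)
    shift = np.where(y == 0, -2.0, 2.0)[:, None]
    eeg = shift + 0.5 * rng.normal(size=(2 * n, 3))  # stand-in EEG features
    hbo = shift + 0.5 * rng.normal(size=(2 * n, 2))  # stand-in HbO values
    hb = -shift + 0.5 * rng.normal(size=(2 * n, 2))  # stand-in Hb values
    # Hybrid feature vector: EEG features concatenated with HbD.
    X = np.hstack([eeg, hbd(hbo, hb)])
    idx = rng.permutation(2 * n)
    train, test = idx[:150], idx[150:]
    model = SimpleELM(n_hidden=20, seed=seed).fit(X[train], y[train])
    return float(np.mean(model.predict(X[test]) == y[test]))


if __name__ == "__main__":
    print(f"held-out accuracy on synthetic data: {demo():.2f}")
```

Because the hidden layer is random and never trained, fitting reduces to a single least-squares solve, which is one reason ELMs are attractive for fast BCI calibration. The synthetic classes here are deliberately well separated, so the printed accuracy says nothing about real EEG–fNIRS data.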