Objective. Bio-signals such as electroencephalography (EEG) and electromyography (EMG) are widely used for the rehabilitation of physically disabled people and for the characterization of cognitive impairments. Decoding these bio-signals successfully is, however, non-trivial because they are time-varying and non-stationary. Furthermore, the presence of both short- and long-range dependencies in these time-series signals makes decoding even more challenging. State-of-the-art studies have proposed Convolutional Neural Network (CNN) based architectures for classifying these bio-signals, and these architectures have proven useful for learning spatial representations. However, because of their fixed-size convolutional kernels and shared weights, CNNs pay only uniform attention to the input and are suboptimal at learning short- and long-term dependencies simultaneously, which could be pivotal in decoding EEG and EMG signals. It is therefore important to address these limitations of CNNs. Transformer neural network-based architectures can play a significant role here, because they learn short- and long-range dependencies simultaneously and pay more attention to the more relevant parts of the input signal. Transformers, however, require a large corpus of training data, whereas EEG and EMG decoding studies produce only a limited amount of data. As a result, standalone Transformer neural networks produce ordinary results. In this study, we ask whether we can overcome the limitations of CNNs and Transformer neural networks and provide a robust and generalized model that, with limited training data, can simultaneously learn spatial patterns, learn long- and short-term dependencies, and pay a variable amount of attention to a time-varying, non-stationary input signal. Approach. In this work, we introduce a novel single hybrid model called ConTraNet, which is based on CNN and Transformer architectures and combines the strengths of both.
ConTraNet uses a CNN block to introduce inductive bias into the model and to learn local dependencies, whereas the Transformer block uses the self-attention mechanism to learn the short- and long-range, or global, dependencies in the signal and to pay different amounts of attention to different parts of the signal. Main results. We evaluated ConTraNet and compared it with state-of-the-art methods on four publicly available datasets (BCI Competition IV dataset 2b, Physionet MI-EEG dataset, Mendeley sEMG dataset, and Mendeley sEMG V1 dataset), which belong to the EEG-HMI and EMG-HMI paradigms. ConTraNet outperformed its counterparts in all task categories (2-class, 3-class, 4-class, 7-class, and 10-class decoding tasks). Significance. With limited training data, ConTraNet significantly improves classification performance on four publicly available datasets for 2-, 3-, 4-, 7-, and 10-class tasks compared to its counterparts.
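
To make the division of labor between the two blocks concrete, the following is a minimal, illustrative sketch in plain Python of a convolution stage followed by scaled dot-product self-attention. It is not the ConTraNet implementation: the function names (`conv1d`, `self_attention`, `hybrid_forward`), the two hand-picked kernels, the synthetic single-channel signal, and the use of identity query/key/value projections are all assumptions made for illustration. The sketch only shows that fixed-size kernels see a local window, whereas each attention row assigns a different weight to every time step.

```python
import math
import random


def conv1d(signal, kernel):
    """'Valid' 1-D cross-correlation of one channel with one fixed-size kernel."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention.

    Assumption: queries, keys, and values are the tokens themselves
    (identity projections); a trained model would learn these projections.
    """
    d = len(tokens[0])
    outputs, weights = [], []
    for q in tokens:
        # Similarity of this time step to every time step, near or far.
        scores = [sum(a * b for a, b in zip(q, key)) / math.sqrt(d)
                  for key in tokens]
        w = softmax(scores)
        weights.append(w)
        outputs.append([sum(w[t] * tokens[t][c] for t in range(len(tokens)))
                        for c in range(d)])
    return outputs, weights


def hybrid_forward(signal, kernels):
    """Hypothetical CNN-then-attention forward pass for one channel."""
    # CNN stage: each kernel yields one feature map with a local receptive field.
    feature_maps = [conv1d(signal, k) for k in kernels]
    # One token per time step; feature dimension = number of kernels.
    tokens = [list(step) for step in zip(*feature_maps)]
    # Attention stage: every token can attend to every other token.
    attended, weights = self_attention(tokens)
    # Global average pooling over time gives a fixed-size feature vector.
    pooled = [sum(col) / len(col) for col in zip(*attended)]
    return pooled, weights


random.seed(0)
# Synthetic stand-in for a bio-signal: noisy sinusoid, 32 samples.
signal = [math.sin(0.3 * t) + 0.1 * random.gauss(0, 1) for t in range(32)]
# Illustrative kernels: a difference filter and a smoothing filter.
kernels = [[1.0, 0.0, -1.0], [0.25, 0.5, 0.25]]
features, attn = hybrid_forward(signal, kernels)
```

Here `features` holds one pooled value per kernel, and each row of `attn` is a probability distribution over all time steps, which is the sense in which attention is non-uniform across the signal. A practical model would add learned projections, multiple heads, positional information, and a classification layer.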