Research on Lower Limb Motion Recognition (LLMR) based on various wearable sensors has been widely applied in fields such as exoskeleton robots and exercise rehabilitation. Employing multimodal information typically yields higher accuracy and stronger robustness than using unimodal information. To avoid the reliance on hand-crafted feature engineering inherent in shallow machine-learning-based LLMR methods, this study leverages the powerful nonlinear feature-mapping capability of deep learning (DL) to construct several end-to-end LLMR frameworks, including Convolutional Neural Networks (CNNs), CNN-Recurrent Neural Networks (CNN-RNNs), and CNN-Graph Neural Networks (CNN-GNNs). The effectiveness of the proposed frameworks is verified on distinct tasks: recognizing seven types of lower limb motions in healthy subjects, recognizing three types of motions in patients with stroke, and recognizing phases during the sit-to-stand (SitTS) process in patients with stroke, achieving highest mean accuracies of 95.198%, 99.784%, and 99.845%, respectively. Further integrating two transfer learning techniques, adaptive Batch Normalization (BN) and model fine-tuning, significantly enhances the applicability of the proposed frameworks to inter-subject prediction. Additionally, systematic analyses are conducted to assess the strengths and weaknesses of the different models in terms of recognition performance, complexity, and adaptability to variations in the number of modalities and sensor channels. Experimental results indicate that the proposed frameworks hold promise for supporting the development of human-robot collaborative lower limb exoskeletons and rehabilitation robots.