Poor generalization and a heavy training burden of gesture classification models are two main barriers to the commercialization of surface electromyography (sEMG)-based human-machine interaction (HMI) systems. To overcome these challenges, eight unsupervised transfer learning (TL) algorithms built on convolutional neural networks (CNNs) were explored and compared on a dataset of 10 gestures from 35 subjects. The highest classification accuracy, obtained with CORrelation ALignment (CORAL), exceeds 90%, which is 10% higher than that of methods without TL. In addition, the proposed model outperforms four common traditional classifiers (KNN, LDA, SVM, and Random Forest) when only minimal calibration data are available (two repeated trials per gesture). The results also demonstrate that the model transfers robustly and flexibly in cross-gesture and cross-day scenarios: it achieves an accuracy of 87.94% when the calibration gestures differ from those used in model training, and 84.26% when the calibration data are collected on a different day. These outcomes indicate that the proposed CNN-based TL method offers a practical way to free new users from the complicated acquisition paradigm of the calibration process before they use sEMG-based HMI systems.
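For readers unfamiliar with CORAL, the following minimal sketch shows the covariance-alignment loss in its commonly used Deep CORAL form: the squared Frobenius distance between the source and target feature covariances, scaled by 1/(4d^2). It is illustrative only and is not the implementation evaluated in this work. The function name `coral_loss` and the input arrays are hypothetical, and the features are assumed to be CNN activations of shape (samples, features) from the training subjects (source) and from a new user's unlabeled calibration data (target).

```python
import numpy as np


def coral_loss(source_feats, target_feats):
    """Deep CORAL-style loss between two feature matrices (hypothetical helper).

    Both inputs have shape (n_samples, d) with at least 2 samples each.
    Returns ||C_s - C_t||_F^2 / (4 d^2), where C_s and C_t are the
    unbiased sample covariance matrices of the features.
    """
    source_feats = np.asarray(source_feats, dtype=float)
    target_feats = np.asarray(target_feats, dtype=float)
    d = source_feats.shape[1]

    # rowvar=False treats columns as variables; atleast_2d covers d == 1.
    cov_s = np.atleast_2d(np.cov(source_feats, rowvar=False))
    cov_t = np.atleast_2d(np.cov(target_feats, rowvar=False))

    # Squared Frobenius norm of the covariance difference, scaled by 1/(4 d^2).
    return float(np.sum((cov_s - cov_t) ** 2) / (4.0 * d * d))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    src = rng.normal(size=(200, 16))        # stand-in for source-subject features
    tgt = 1.5 * rng.normal(size=(40, 16))   # stand-in for new-user features
    print("CORAL loss:", coral_loss(src, tgt))
```

In an unsupervised TL setting, a term of this kind would typically be added to the classification loss computed on labeled source data, so that no gesture labels are needed from the new user.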