Food fraud is widespread in the aquatic food market, so fast and non-destructive methods for identifying fish flesh are needed. In this study, multispectral imaging (MSI) was used to screen flesh slices from 20 edible fish species commonly found in the sea around Yantai, China, in combination with species identification based on the mitochondrial COI gene. We found that nCDA images transformed from the MSI data showed significant differences among the flesh slices of the 20 fish species. We then compared the prediction performance of eight models using the hold-out method, with 70% of the data used for training and 30% for testing. The convolutional neural network (CNN), quadratic discriminant analysis (QDA), support vector machine (SVM), and linear discriminant analysis (LDA) models performed well on both the cross-validation and test data, and CNN and QDA achieved more than 99% accuracy on the test set. When the features extracted by the CNN were used for optimization, a very high degree of separation was obtained for all species. Furthermore, based on the Gini index in random forest (RF), 11 bands were selected as key classification features for the CNN, and an accuracy of 98% was achieved. Our study establishes a successful pipeline for applying machine learning models (especially CNN) to MSI-based identification of fish flesh, and provides a convenient, non-destructive method for authenticating marketed fish flesh in the future.
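The two evaluation steps named above, a 70%/30% hold-out split and Gini-based band ranking, can be illustrated with a minimal sketch. It is not the study's code: the data are synthetic (three hypothetical species, five hypothetical bands), the function names are invented for illustration, and the band score is the impurity decrease of a single best threshold per band, a simplified stand-in for the Gini importance that a random forest averages over many trees.

```python
import random
from collections import Counter, defaultdict


def stratified_holdout(labels, train_fraction=0.7, seed=0):
    """Split sample indices so each class is divided ~70/30 (hypothetical helper)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train_idx, test_idx = [], []
    for y in sorted(by_class):
        idx = by_class[y][:]
        rng.shuffle(idx)
        cut = round(train_fraction * len(idx))
        train_idx.extend(idx[:cut])
        test_idx.extend(idx[cut:])
    return sorted(train_idx), sorted(test_idx)


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_gini_decrease(values, labels):
    """Largest impurity decrease over all single-threshold splits of one band."""
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    n = len(pairs)
    parent = gini(labels)
    best = 0.0
    for k in range(1, n):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # no threshold fits between two equal values
        left = [y for _, y in pairs[:k]]
        right = [y for _, y in pairs[k:]]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best


def rank_bands(spectra, labels, top_k):
    """Return the indices of the top_k bands by impurity decrease, plus all scores."""
    n_bands = len(spectra[0])
    scores = [
        best_gini_decrease([row[b] for row in spectra], labels)
        for b in range(n_bands)
    ]
    order = sorted(range(n_bands), key=lambda b: scores[b], reverse=True)
    return order[:top_k], scores


# Synthetic stand-in data (an assumption, not the study's measurements):
# only band 2 carries species information; the other bands are noise.
rng = random.Random(1)
band2_mean = {"A": 0.2, "B": 0.5, "C": 0.8}
spectra, labels = [], []
for name in ("A", "B", "C"):
    for _ in range(20):
        row = [rng.gauss(0.5, 0.1) for _ in range(5)]
        row[2] = rng.gauss(band2_mean[name], 0.02)
        spectra.append(row)
        labels.append(name)

train_idx, test_idx = stratified_holdout(labels, train_fraction=0.7)
train_X = [spectra[i] for i in train_idx]
train_y = [labels[i] for i in train_idx]

# Rank bands on the training portion only, so the test set stays untouched.
top_bands, scores = rank_bands(train_X, train_y, top_k=2)
print("train/test sizes:", len(train_idx), len(test_idx))
print("top bands:", top_bands)
```

Because band 2 is the only informative band in this toy data set, it ranks first by construction. With real MSI data, the same ranking step would more plausibly be performed with a full random forest, and the retained bands would then be passed to the downstream classifier.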