Multi-spectral transmission imaging offers a possible route to the detection of early breast cancer. However, heterogeneities are difficult to recognize in multi-spectral transmission images, because light scattering in biological tissue blurs the image and the transmitted signal is weak. This paper proposes a method that combines the modulation–demodulation and frame accumulation technique with pattern recognition to classify heterogeneities. First, an experiment is designed to acquire multi-spectral images of a phantom. Then, the signal-to-noise ratio (SNR) of the images is improved by the modulation–demodulation and frame accumulation technique, and 14 features of the heterogeneous region (firmness, angular second moment, contrast, gray-level correlation, entropy, inverse difference moment, smoothness, dissimilarity, consistency, center of gravity, area, perimeter, long diameter of the irregular image, and short diameter of the irregular image) are extracted from the high-SNR images. Finally, the accuracy with which different models classify the heterogeneities is compared. The results show that, on the data set of this paper, the random forest (RF) and extreme learning machine (ELM) models classify the four types of heterogeneity more accurately than traditional multi-spectral image classification models. The RF and ELM models built on the four-wavelength combination data set perform best, reaching a classification accuracy of 100%, followed by the three-wavelength combination models; the single-wavelength models perform worst. In addition, the operating efficiency of ELM is significantly higher than that of RF. In conclusion, the modulation–demodulation and frame accumulation technique improves image quality, and the RF and ELM models established in this paper classify heterogeneities more accurately than traditional multi-spectral image classification models, which may promote the application of multi-spectral transmission imaging in early screening of breast tumors.
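The effect of frame accumulation on SNR can be illustrated with a minimal sketch. It is not the acquisition pipeline of this paper: the synthetic image, the noise level, the frame count, and the function names (`accumulate_frames`, `snr_db`, `simulate`) are all assumptions made for illustration, and the modulation–demodulation step is left out.

```python
import numpy as np


def accumulate_frames(frames):
    """Average a stack of frames with shape (n_frames, height, width)."""
    return np.mean(np.asarray(frames, dtype=np.float64), axis=0)


def snr_db(reference, measured):
    """SNR in dB of `measured` relative to a noise-free `reference` image."""
    noise = measured - reference
    return 10.0 * np.log10(np.mean(reference ** 2) / np.mean(noise ** 2))


def simulate(n_frames=64, size=64, noise_sigma=0.5, seed=0):
    """Return (single-frame SNR, accumulated SNR) in dB for synthetic data."""
    rng = np.random.default_rng(seed)
    # Hypothetical weak transmission image: a dim background with one
    # slightly brighter square standing in for a heterogeneity.
    clean = np.full((size, size), 0.2)
    clean[24:40, 24:40] = 0.3
    # Each frame is the clean image plus independent zero-mean Gaussian noise.
    noise = rng.normal(0.0, noise_sigma, size=(n_frames, size, size))
    frames = clean + noise
    single = snr_db(clean, frames[0])
    accumulated = snr_db(clean, accumulate_frames(frames))
    return single, accumulated


if __name__ == "__main__":
    single, accumulated = simulate()
    print(f"single-frame SNR:        {single:.1f} dB")
    print(f"64-frame accumulation:   {accumulated:.1f} dB")
```

For independent zero-mean noise, averaging N frames divides the noise variance by N, so the expected SNR gain is about 10·log10(N) dB, roughly 18 dB for the 64 frames assumed here. Real transmission images may also contain correlated or fixed-pattern noise, which accumulation alone does not remove.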