Background and aim: Near-infrared spectroscopy (NIRS) is a non-invasive and convenient tool that captures features related to the chemical components of biological samples. Machine learning (ML) is increasingly used in medical diagnosis. This study aimed to investigate a novel cancer diagnosis strategy based on ML models built from NIRS data.

Methods: Plasma samples were collected from a total of 247 participants, comprising patients with lung cancer, cervical cancer, or nasopharyngeal cancer and healthy controls, and were randomly split into a training set and a test set. After NIRS analysis, the training set was used to train ML models, including partial least-squares (PLS), random forest (RF), gradient boosting machine (GBM), and support-vector machine (SVM) models. The prediction performance of these models was then evaluated on the test set.

Results: All ML models showed high prediction performance in differentiating cancers from controls, and SVM showed high prediction accuracy in distinguishing the different types of cancer. SVM was considered the most suitable model because of its minimal computational cost and its high accuracy in both binary and quaternary classification.

Conclusions: Coupling NIRS with ML is a promising strategy that may aid clinical cancer diagnosis. Further studies should validate these results in a larger and more representative cohort.
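The evaluation workflow described in the Methods (random train/test split, model fitting, then scoring on the held-out set for both quaternary and binary classification) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the spectra are synthetic toy numbers rather than NIRS measurements, the class sizes and the 30% test fraction are arbitrary, and a simple nearest-centroid classifier stands in for the PLS, RF, GBM, and SVM models used in the study. The binary score here is obtained by collapsing the four-class predictions into cancer versus control, which may differ from how the study trained its binary models.

```python
import random
from collections import defaultdict

# Hypothetical class names mirroring the four groups named in the abstract.
CLASSES = ["lung", "cervical", "nasopharyngeal", "control"]


def make_synthetic_spectra(n_per_class=30, n_features=20, seed=0):
    """Generate toy 'spectra': each class gets a different baseline offset.

    These are made-up numbers, not NIRS measurements.
    """
    rng = random.Random(seed)
    samples = []
    for class_index, label in enumerate(CLASSES):
        for _ in range(n_per_class):
            spectrum = [class_index + rng.gauss(0.0, 0.3) for _ in range(n_features)]
            samples.append((spectrum, label))
    return samples


def stratified_split(samples, test_fraction=0.3, seed=0):
    """Randomly split samples into train/test, preserving class proportions."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for sample in samples:
        by_label[sample[1]].append(sample)
    train, test = [], []
    for label in sorted(by_label):
        group = by_label[label][:]
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


def to_binary(label):
    """Collapse the four groups into cancer vs. control."""
    return "control" if label == "control" else "cancer"


def fit_centroids(train):
    """Stand-in classifier: mean spectrum per class (nearest centroid)."""
    sums, counts = {}, defaultdict(int)
    for spectrum, label in train:
        if label not in sums:
            sums[label] = [0.0] * len(spectrum)
        sums[label] = [s + x for s, x in zip(sums[label], spectrum)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def predict(centroids, spectrum):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def distance(label):
        return sum((a - b) ** 2 for a, b in zip(centroids[label], spectrum))
    return min(centroids, key=distance)


def accuracy(centroids, test, relabel=lambda label: label):
    """Fraction of test samples whose (optionally relabelled) prediction is correct."""
    hits = sum(relabel(predict(centroids, x)) == relabel(y) for x, y in test)
    return hits / len(test)


samples = make_synthetic_spectra()
train, test = stratified_split(samples)
model = fit_centroids(train)
quaternary_accuracy = accuracy(model, test)
binary_accuracy = accuracy(model, test, relabel=to_binary)
print(f"quaternary accuracy: {quaternary_accuracy:.2f}")
print(f"binary accuracy: {binary_accuracy:.2f}")
```

Because every correct four-class prediction is also a correct cancer-versus-control prediction, the collapsed binary accuracy in this sketch can never fall below the quaternary accuracy. In practice, the stand-in classifier would be replaced by the model families named above, and the split would be applied to the measured spectra.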