Protein-energy malnutrition (PEM) is a global concern, particularly in underdeveloped and underprivileged regions dependent on plant-based diets. Plant-based meat alternatives offer a potential solution, with rice bean emerging as a promising crop due to its high protein content. However, traditional methods for determining protein content in rice bean are labor-intensive, costly, and time-consuming. This study addresses the need for an efficient, rapid technique by integrating Near-Infrared Spectroscopy (NIRS) with machine learning models to predict protein content in rice bean (Vigna umbellata L.) germplasm. We developed predictive models using Support Vector Regressor (SVR), Random Forest Regressor (RFR), and Modified Partial Least Squares (MPLS), each coupled with key wavelength selection algorithms: Genetic Algorithm (GA), Ant Colony Optimization (ACO), Particle Swarm Optimization (PSO), and Competitive Adaptive Reweighted Sampling (CARS). The MPLS model performed best, achieving an R² (Coefficient of Determination) of 0.81 and an RPD (Residual Prediction Deviation) of 2.14 when combined with the ACO algorithm. The SVR model also performed well when combined with ACO and PSO, yielding an R² of 0.78 and an RPD of ∼1.95. In contrast, the RFR model showed lower performance (R² of ∼0.45 and RPD ≤ 1.74). These results indicate that MPLS and SVR, combined with appropriate wavelength selection algorithms, significantly improve prediction accuracy. The study highlights the importance of appropriate model and algorithm selection in enhancing predictive performance for protein content. The developed method provides a rapid and efficient way to select nutritionally superior crop germplasm held in global repositories.
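For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how R² and RPD are commonly computed from paired reference and predicted values. It assumes RPD is taken as the sample standard deviation of the reference values divided by the root mean square error of prediction (RMSEP); the abstract does not state the exact formula used, and some studies use a bias-corrected standard error of prediction instead. The function name `r2_and_rpd` and the protein values in the example are hypothetical and are not data from this study.

```python
import math
import statistics


def r2_and_rpd(reference, predicted):
    """Return (R2, RPD) for paired reference and predicted values.

    Assumption: RPD = sample SD of the reference values / RMSEP.
    """
    if len(reference) != len(predicted) or len(reference) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    mean_ref = statistics.fmean(reference)
    # Residual and total sums of squares
    ss_res = sum((y - p) ** 2 for y, p in zip(reference, predicted))
    ss_tot = sum((y - mean_ref) ** 2 for y in reference)
    if ss_res == 0 or ss_tot == 0:
        raise ValueError("metrics are undefined for zero residual or zero variance")

    r2 = 1.0 - ss_res / ss_tot
    rmsep = math.sqrt(ss_res / len(reference))
    rpd = statistics.stdev(reference) / rmsep
    return r2, rpd


if __name__ == "__main__":
    # Hypothetical protein contents (%), for illustration only
    reference = [18.2, 20.1, 22.4, 19.5, 24.0, 21.3]
    predicted = [18.9, 19.6, 21.8, 20.3, 23.1, 21.0]
    r2, rpd = r2_and_rpd(reference, predicted)
    print(f"R2 = {r2:.2f}, RPD = {rpd:.2f}")
```

Under this definition the two metrics are closely linked (R² is roughly 1 − 1/RPD² for large samples), which is why models with a higher R² in the results above also show a higher RPD.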