Mildew infestation is a significant cause of loss during grain storage. The growth and metabolism of mildew lead to changes in gas composition and temperature within granaries. Recent advances in sensor technology and machine learning enable the prediction of grain mildew during storage. However, current research primarily focuses on predicting mildew occurrence or grade using simple machine learning methods, without in-depth exploration of the time-series characteristics of mildew process data. In this study, a monitoring device was designed and developed to capture high-quality microenvironment parameters and image data during simulated mildew process experiments. Using the "Yongyou 15" rice variety from Zhejiang Province, five simulation experiments were conducted under varying temperature and humidity conditions between January and May 2023. Mildew grades were defined through manual analysis to construct a multimodal dataset of the rice mildew process. This study proposes a combined model (CNN-LSTM-A) that integrates convolutional neural networks (CNN), long short-term memory (LSTM) networks, and attention mechanisms to predict the mildew grade of stored rice. The proposed model was compared with LSTM, CNN-LSTM, and LSTM-Attention models. The results indicate that the proposed model outperforms the others in both accuracy and stability, achieving a prediction accuracy of 98%. The generalization performance of the prediction model was evaluated using four experimental datasets with varying storage temperature and humidity conditions. The results show that the model achieves its most stable predictions when the training set contains similar storage temperatures, with prediction accuracy exceeding 99.8%. This indicates that the model can effectively predict the mildew grade of rice under varying environmental conditions, demonstrating significant potential for grain mildew prediction and early warning systems.
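
To illustrate the attention component of a model such as CNN-LSTM-A, the following is a minimal sketch of attention-weighted pooling over a sequence of hidden states, as is commonly placed after an LSTM layer. It is not the authors' implementation: the dot-product scoring, the fixed `query` vector, and the toy `states` values are assumptions made for illustration, and in a trained model the query would be a learned parameter.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Collapse T hidden vectors into one context vector.

    hidden_states: list of T vectors (lists of D floats), e.g. LSTM outputs.
    query: list of D floats (assumed fixed here; learned in a real model).
    Returns the per-time-step attention weights and the context vector.
    """
    # Dot-product score of each time step against the query vector.
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    # Context vector: weighted sum of hidden states along the time axis.
    dim = len(query)
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return weights, context


# Hypothetical hidden states for three consecutive time steps.
states = [[0.1, 0.0], [0.5, 0.2], [2.0, 1.0]]
weights, context = attention_pool(states, query=[1.0, 1.0])
```

The weights sum to one, so time steps whose hidden state aligns more strongly with the query contribute more to the context vector that a downstream classifier would map to a mildew grade.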