Student Performance Prediction (SPP) has received considerable attention because of its educational implications, such as personalized instruction. Among the many approaches to SPP, recommendation-based methods (e.g., matrix factorization; MF) are especially attractive because they are inherently designed for personalization. However, they commonly struggle with the cold-start problem and with incorporating side information. While existing works address these problems by utilizing additional information, such as students’ personalities, none of them covers multimodal auxiliary information, which could unleash the full potential of recommendation-based SPP. In this work, we leverage multimodal auxiliary information in SPP for the first time by adopting large language models (GPT-J and Llama 2) as information blenders that produce extra guidance signals for the MF method. Specifically, our language model-guided matrix factorization (LMgMF) consumes the verbalized multimodal information, produces semantically rich embeddings for educational interactions, and uses them as auxiliary signals for MF. In doing so, we harness API-level black-box language models, without requiring access to their parameters to adapt them. We evaluate LMgMF on two real-world datasets: (1) a large-scale Korean dataset that contains 1.6M instances from 10K pre-high school students; and (2) ASSISTments2009, covering 346K interactions between students and math questions. Through extensive validation, we demonstrate that LMgMF consistently outperforms baseline methods in various SPP scenarios, including the cold-start setting.
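To make the idea of auxiliary signals for MF concrete, the following is a minimal sketch, not the LMgMF formulation itself. It assumes one simple way of combining the two sources: a standard MF score plus a linear term over a precomputed embedding of each student-question interaction. The names `train_guided_mf` and `predict`, the hyperparameters, and the synthetic data are all hypothetical, and random vectors stand in for the language model embeddings that would be obtained from the verbalized information.

```python
import numpy as np

def train_guided_mf(users, items, scores, aux, n_users, n_items,
                    k=8, lr=0.05, reg=0.01, epochs=50, seed=0):
    """SGD for: score ~ mu + P[u].Q[i] + w.aux (an assumed, simplified model)."""
    rng = np.random.default_rng(seed)
    P = 0.1 * rng.standard_normal((n_users, k))   # student factors
    Q = 0.1 * rng.standard_normal((n_items, k))   # question factors
    w = np.zeros(aux.shape[1])                    # weights on the auxiliary embedding
    mu = scores.mean()                            # global mean score
    for _ in range(epochs):
        for idx in rng.permutation(len(scores)):
            u, i = users[idx], items[idx]
            err = scores[idx] - (mu + P[u] @ Q[i] + w @ aux[idx])
            pu = P[u].copy()                      # keep old value for Q's update
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * pu - reg * Q[i])
            w += lr * (err * aux[idx] - reg * w)
    return P, Q, w, mu

def predict(P, Q, w, mu, u, i, e):
    """Predicted score for student u on question i with auxiliary embedding e."""
    return mu + P[u] @ Q[i] + w @ e

# Synthetic stand-in data (hypothetical): 'aux' plays the role of LM embeddings.
rng = np.random.default_rng(42)
n_users, n_items, n_obs, d = 20, 15, 300, 4
users = rng.integers(0, n_users, n_obs)
items = rng.integers(0, n_items, n_obs)
aux = rng.standard_normal((n_obs, d))
scores = 0.6 + aux @ np.array([0.3, -0.2, 0.1, 0.0]) + 0.05 * rng.standard_normal(n_obs)

P, Q, w, mu = train_guided_mf(users, items, scores, aux, n_users, n_items)
preds = np.array([predict(P, Q, w, mu, u, i, e) for u, i, e in zip(users, items, aux)])
rmse_model = np.sqrt(np.mean((scores - preds) ** 2))
rmse_mean = np.sqrt(np.mean((scores - scores.mean()) ** 2))
print(f"train RMSE: model={rmse_model:.3f}, mean baseline={rmse_mean:.3f}")
```

Under this assumed form, a student with no interaction history keeps near-zero factors, so the prediction falls back on the global mean and the auxiliary term, which is one intuition for why side information can help in cold-start cases.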