Knowledge Tracing (KT) aims to predict students' future performance on questions from their historical exercise sequences. To alleviate data sparsity in KT, recent works have introduced auxiliary information to mine question similarity and thereby enhance question embeddings. Nonetheless, there remains a gap in developing an approach that effectively incorporates the various forms of auxiliary information, including relational information (e.g., question-student and question-skill relations), relationship attributes (e.g., correctness, which indicates a student's performance on a question), and node attributes (e.g., student ability). To tackle this challenge, we propose the Similarity-enhanced Question Embedding (SimQE) method for KT, whose central feature is the use of weighted and attributed meta-paths to extract question similarity. To capture multi-dimensional question similarity semantics by integrating multiple relations, several meta-paths are constructed and a question embedding is learned separately on each. These embeddings, each encoding different similarity semantics, are then fused for the KT task. To capture finer-grained similarity, a biased random walk algorithm is designed that leverages the relationship attributes and node attributes along the meta-paths. In addition, an auxiliary node generation method is proposed to capture high-order question similarity. Finally, extensive experiments on six datasets demonstrate that SimQE performs best among 10 representative question embedding methods. Furthermore, SimQE proves more effective in alleviating data sparsity.
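As a rough illustration of what a biased random walk on an attributed meta-path might look like, the sketch below walks the question-student-question meta-path and prefers edges whose correctness matches that of the previous edge. It is not the SimQE algorithm itself: the toy `responses` data, the function names, and the `same_weight`/`diff_weight` parameters are all assumptions made for this example, and node attributes such as student ability are left out.

```python
import random

# Hypothetical toy data: (student, question) -> correctness (1 = correct, 0 = incorrect).
responses = {
    ("s1", "q1"): 1, ("s1", "q2"): 1, ("s1", "q3"): 0,
    ("s2", "q1"): 0, ("s2", "q3"): 0,
    ("s3", "q2"): 1, ("s3", "q3"): 1,
}

def neighbors(node, kind):
    """Return (neighbor, correctness) pairs; kind is the type of `node` ('q' or 's')."""
    if kind == "q":
        return [(s, c) for (s, q), c in responses.items() if q == node]
    return [(q, c) for (s, q), c in responses.items() if s == node]

def biased_step(node, kind, prev_correct, rng, same_weight=3.0, diff_weight=1.0):
    """Pick the next node, favouring edges whose correctness matches the previous edge.

    The weighting scheme is an assumption for illustration only.
    """
    cands = neighbors(node, kind)
    weights = [
        1.0 if prev_correct is None
        else (same_weight if c == prev_correct else diff_weight)
        for _, c in cands
    ]
    return rng.choices(cands, weights=weights, k=1)[0]

def qsq_walk(start_question, length, rng):
    """Walk the question-student-question meta-path and return the visited questions."""
    walk, node, kind, prev = [start_question], start_question, "q", None
    while len(walk) < length:
        node, prev = biased_step(node, kind, prev, rng)
        kind = "s" if kind == "q" else "q"  # node types alternate along the meta-path
        if kind == "q":
            walk.append(node)
    return walk

walk = qsq_walk("q1", 5, random.Random(0))
```

Question sequences produced this way could then be passed to a skip-gram style model to learn one embedding per meta-path, with the per-meta-path embeddings fused afterwards as the abstract describes.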