Abstract
The exponential growth of academic papers necessitates sophisticated classification systems to manage and navigate vast information repositories effectively. Despite the proliferation of such systems, traditional approaches often rely on embeddings that do not permit easy interpretation of classification decisions, creating a gap in transparency and understanding. To address these challenges, we propose an explainable paper classification system that combines latent semantic analysis (LSA) for topic modeling with explainable artificial intelligence (XAI) techniques. Our objective is to identify which topics significantly influence classification outcomes, incorporating Shapley additive explanations (SHAP) as the key XAI technique. Our system extracts topic assignments and word assignments from paper abstracts using LSA topic modeling. The topic assignments are then employed as embeddings in a multilayer perceptron (MLP) classification model, while the word assignments are used alongside SHAP to interpret the classification results at the corpus, document, and word levels, providing a clear rationale for each classification decision. We applied our model to a Web of Science dataset focused on the field of nanomaterials, where it demonstrates superior classification performance compared to several baseline models. Ultimately, the proposed model advances both the performance and the explainability of paper classification, as validated by case studies that illustrate its effectiveness in real-world applications.
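To make the described pipeline concrete, the sketch below outlines one plausible implementation of its three stages (LSA topic extraction, MLP classification on topic embeddings, and SHAP-based explanation) using scikit-learn and the shap library. It is a minimal sketch under stated assumptions, not the paper's actual code: the two-document corpus, the topic count, and all hyperparameters are illustrative placeholders.

```python
# Minimal sketch of the described pipeline. Assumes scikit-learn and shap;
# the corpus, labels, topic count, and hyperparameters are placeholders.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.neural_network import MLPClassifier
import shap

abstracts = ["graphene oxide synthesis and characterization ...",
             "silver nanoparticle toxicity in aqueous media ..."]  # placeholder corpus
labels = [0, 1]                                                    # placeholder categories

# 1) LSA topic modeling: TF-IDF followed by truncated SVD.
vectorizer = TfidfVectorizer(stop_words="english")
tfidf = vectorizer.fit_transform(abstracts)
lsa = TruncatedSVD(n_components=2, random_state=0)   # n_components = number of topics
topic_assignments = lsa.fit_transform(tfidf)         # document-topic matrix (embeddings)
word_assignments = lsa.components_                   # topic-word matrix

# 2) MLP classifier trained on the topic embeddings.
clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0)
clf.fit(topic_assignments, labels)

# 3) Model-agnostic SHAP values over the topic features. Explaining the
# class-1 probability keeps the output a plain (n_docs, n_topics) array.
f = lambda X: clf.predict_proba(X)[:, 1]
explainer = shap.KernelExplainer(f, topic_assignments)
shap_values = explainer.shap_values(topic_assignments)

# Word-level view (one plausible mapping, our assumption): for a document,
# take its most influential topic and list that topic's top-weighted words.
feature_names = vectorizer.get_feature_names_out()
doc_idx = 0
top_topic = int(np.argmax(np.abs(shap_values[doc_idx])))
top_words = feature_names[np.argsort(word_assignments[top_topic])[::-1][:5]]
print(f"Document {doc_idx}: topic {top_topic} dominates; top words: {list(top_words)}")
```

Averaging the absolute SHAP values over all documents would give the corpus-level topic importance mentioned above, while per-document vectors give the document-level view; how the original system distributes topic attributions to individual words is not specified here, so the word-level step above is only an assumed approximation.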