Abstract

Educational concepts are the core of teaching and learning. From the perspective of educational technology, concepts are essential metadata: representative terms that connect different learning materials and form the foundation for many downstream tasks. Some studies on automatic concept extraction have been conducted, but none looks at the K-12 level and focuses on the Swedish language. In this paper, we use a state-of-the-art Swedish BERT model to build an automatic concept extractor for the subject of Biology, using finely annotated digital textbook data that cover all K-12 content. The model achieves a recall of 72% and has the potential to be used in real-world settings for use cases that require high recall. We also investigate how input data features influence model performance and provide guidance on how to use text data effectively to achieve optimal results when building a named entity recognition (NER) model.

Keywords: Concept extraction, NLP, BERT, Sequence model, NER
