Abstract

This paper classifies exercises from college courses for electronics majors and compares the accuracy of the Naive Bayes and Support Vector Machine (SVM) algorithms on this task. The data set consists of exercises from a Data Structure course. Based on the course content and the characteristics of the exercises, true/false (judgment) and multiple-choice questions are used as experimental data, and the exercises are divided into seven categories according to chapter content. First, the Jieba word segmentation package is used to segment the exercise text, and a custom dictionary is imported so that the proper nouns of the course content are recognized. Because the feature words of short texts are quite sparse, TF-IDF weighting is applied directly to represent the text, without a separate feature selection step. Finally, Naive Bayes and SVM classifiers are trained and used to classify the exercise text. The experimental results show that the SVM-based classifier has a lower error rate and higher classification accuracy than Naive Bayes.
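
The TF-IDF weighting step can be illustrated with a minimal sketch. The abstract does not state which TF-IDF variant was used, so the plain form tf × log(N / df) is assumed here; the function name `tf_idf` and the sample exercises are hypothetical, and the token lists are English stand-ins for text that has already been segmented (for example by Jieba).

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    docs is a list of token lists (already segmented).
    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    n_docs = len(docs)

    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    weights = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical pre-segmented exercises, for illustration only.
exercises = [
    ["stack", "is", "last-in", "first-out"],
    ["queue", "is", "first-in", "first-out"],
    ["binary", "tree", "is", "a", "tree"],
]
vectors = tf_idf(exercises)
```

A term that occurs in every exercise (here "is") receives weight zero, while a term concentrated in one exercise (here "tree") receives a high weight. The resulting sparse vectors are the kind of representation that could then be passed to a Naive Bayes or SVM classifier.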
