Abstract

As university enrollment continues to grow, academic management systems accumulate large amounts of student performance data. However, these data are currently used only for simple queries and statistics, and there is no precedent for using them to improve the English teaching model. With the application of fuzzy theory in machine learning and artificial intelligence, the fuzzy decision tree algorithm emerged from the integration of fuzzy set theory with the decision tree algorithm. In this paper, we propose obtaining the cluster centroids of each continuous attribute with the K-means algorithm and combining them with triangular fuzzy numbers to fuzzify the continuous data. In addition, this paper analyzes the influence of nearest-neighbor distance on classification, introduces a Gaussian weight function that gives neighbors different voting weights according to their distance, and establishes a weighted K-nearest neighbor classification algorithm. To address the low classification efficiency of the K-nearest neighbor algorithm on large datasets, this paper further improves the algorithm and establishes the partitioned weighted K-nearest neighbor algorithm. The classification time was shortened from 11.39 seconds to 5.22 seconds, a reduction of about 54%.
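The fuzzification step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the quantile-based initialization, the shoulder-shaped outer fuzzy sets, and the function names `kmeans_1d` and `triangular_memberships` are all assumptions made for illustration. K-means supplies the centroids of one continuous attribute, and each centroid becomes the peak of a triangular membership function.

```python
from typing import List, Sequence


def kmeans_1d(values: Sequence[float], k: int, iters: int = 100) -> List[float]:
    """Plain Lloyd's K-means on one continuous attribute; returns sorted centroids."""
    data = sorted(values)
    # Assumption: deterministic initialization at evenly spaced quantiles.
    centroids = [float(data[(2 * i + 1) * len(data) // (2 * k)]) for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous centroid.
        updated = [sum(c) / len(c) if c else centroids[j]
                   for j, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids)


def triangular_memberships(x: float, centroids: Sequence[float]) -> List[float]:
    """Membership of x in each fuzzy set; set j peaks at centroids[j].

    Assumption: interior sets are triangles whose feet sit on the neighboring
    centroids, and the first and last sets are shoulders (membership 1 beyond
    the outermost centroid).
    """
    mu = [0.0] * len(centroids)
    if x <= centroids[0]:
        mu[0] = 1.0
        return mu
    if x >= centroids[-1]:
        mu[-1] = 1.0
        return mu
    for j in range(len(centroids) - 1):
        left, right = centroids[j], centroids[j + 1]
        if left <= x < right:
            # Linear interpolation between the two adjacent peaks.
            mu[j] = (right - x) / (right - left)
            mu[j + 1] = (x - left) / (right - left)
            break
    return mu
```

With hypothetical centroids at 40, 60, and 80, a score of 50 would belong equally to the first two fuzzy sets, since it lies midway between their peaks. Under this construction the memberships of any value sum to 1.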

Highlights

  • It is well known that English is the most widely spoken language in the world and its importance cannot be overstated [1]

  • To improve the English teaching model and predict the results of the college English IV exam, this paper first preprocesses the data by filling in the missing entrance English scores, factorizing the categorical variables, and normalizing the numerical variables; it then selects the prediction input features and the number of nearest neighbors K based on various indicators and establishes the K-nearest neighbor classification model and the weighted K-nearest neighbor classification model

  • The K-nearest neighbor algorithm and the partitioned weighted K-nearest neighbor algorithm are applied to the classification and prediction of university English IV test results, and the input features are filtered using statistical techniques to investigate the factors that affect the English IV test results
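The weighted voting idea in the highlights can be sketched as follows. This is an illustrative sketch only: the Euclidean distance, the bandwidth parameter `sigma`, and the function names are assumptions, since the excerpt does not give the paper's exact formulation. Each of the K nearest neighbors votes with a Gaussian weight that decays with its distance from the query, so close neighbors count for more than distant ones.

```python
import math
from collections import defaultdict
from typing import Hashable, Sequence, Tuple

Sample = Tuple[Sequence[float], Hashable]


def gaussian_weight(distance: float, sigma: float = 1.0) -> float:
    """Gaussian kernel: weight 1 at distance 0, decaying as distance grows."""
    return math.exp(-(distance ** 2) / (2 * sigma ** 2))


def weighted_knn_predict(train: Sequence[Sample], query: Sequence[float],
                         k: int = 3, sigma: float = 1.0) -> Hashable:
    """Classify `query` by Gaussian-weighted voting among its k nearest neighbors."""
    # Assumption: Euclidean distance on already-normalized features.
    by_distance = sorted(
        ((math.dist(features, query), label) for features, label in train),
        key=lambda pair: pair[0],
    )
    votes = defaultdict(float)
    for distance, label in by_distance[:k]:
        votes[label] += gaussian_weight(distance, sigma)
    # The label with the largest total weight wins.
    return max(votes, key=votes.get)


# Hypothetical toy data: one close "fail" neighbor, two distant "pass" neighbors.
train = [((0.0, 0.0), "fail"), ((4.0, 0.0), "pass"), ((4.5, 0.0), "pass")]
prediction = weighted_knn_predict(train, (1.0, 0.0), k=3, sigma=1.0)
```

In this toy example an unweighted majority vote over three neighbors would choose "pass", while the distance weighting lets the single close neighbor dominate. A very large `sigma` flattens the weights and recovers ordinary majority voting.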

Summary

Introduction

It is well known that English is the most widely spoken language in the world and its importance cannot be overstated [1]. To use the data more effectively for classification and prediction, samples with entrance English test scores below 30 were removed from the dataset.
