Abstract

Mass education at Russian universities in programs (fields of study) related to the exact and technical sciences is characterized by a high dropout rate, beginning in the first year of study. The current level of school education and the system of selecting applicants through the Unified State Exam (USE) often do not guarantee that incoming students will be able to master science-intensive programs. An emphasis on student-centered, individual learning becomes possible only after students have proven themselves in the early stages of their studies. Early identification of whether newly admitted students are able to study effectively is therefore an urgent task. In this paper, we consider methods for constructing decision trees that classify students and single out the subset (the risk group) of those who are highly likely to be expelled after the first academic cycle (trimester). Only the minimal information about freshmen recorded in their personal files is used as input data. The model was built on data on students in the applied mathematics and computer science program at Perm State National Research University for the five admission cohorts of 2014-2018: the 2014-2017 data were used for training, and the 2018 cohort was used as the test set. At the machine learning stage, several decision tree models were considered and tuned using class balancing and constraints on the maximum tree depth and the minimum number of samples in a leaf. The quality of the binary classification was assessed using a confusion matrix and several numerical metrics derived from it. The resulting decision tree assigned to the risk group 16 of the 17 students who were expelled after the first trimester, that is, students who for various reasons proved unable to study in the applied mathematics and computer science program. In addition, it was possible to determine the relative importance of the different types of input data, showing that USE results largely determine student success at this stage of training. Identifying the risk group gives teachers and university psychologists guidance for targeted work, which can ultimately help improve the quality of education and reduce dropout rates. This work demonstrates the capabilities of data mining methods for solving the poorly formalized problems characteristic of this area of human activity.
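
The evaluation step described above can be illustrated with a minimal sketch of how numerical criteria are derived from a confusion matrix. This is not the authors' code. The class name `ConfusionMatrix` and its methods are hypothetical, and the abstract does not say which metrics were used, so the four shown here are common choices rather than the paper's. Only the counts of flagged and missed expelled students (16 and 1) come from the abstract. The false-positive and true-negative counts are placeholders chosen purely for illustration.

```python
from dataclasses import dataclass


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts for a binary classifier; 'positive' means 'expelled' (risk group)."""

    tp: int  # expelled students placed in the risk group
    fn: int  # expelled students the model missed
    fp: int  # retained students placed in the risk group
    tn: int  # retained students correctly left out of the risk group

    def recall(self) -> float:
        # Share of expelled students that the model flagged.
        return _ratio(self.tp, self.tp + self.fn)

    def precision(self) -> float:
        # Share of flagged students who were actually expelled.
        return _ratio(self.tp, self.tp + self.fp)

    def f1(self) -> float:
        # Harmonic mean of precision and recall.
        p, r = self.precision(), self.recall()
        return _ratio(2 * p * r, p + r)

    def accuracy(self) -> float:
        # Share of all students classified correctly.
        total = self.tp + self.fn + self.fp + self.tn
        return _ratio(self.tp + self.tn, total)


# tp and fn follow the abstract (16 of 17 expelled students flagged).
# fp and tn are HYPOTHETICAL placeholders, not results from the paper.
cm = ConfusionMatrix(tp=16, fn=1, fp=10, tn=50)

print(f"recall    = {cm.recall():.3f}")
print(f"precision = {cm.precision():.3f}")
print(f"f1        = {cm.f1():.3f}")
print(f"accuracy  = {cm.accuracy():.3f}")
```

Recall depends only on the two counts reported in the abstract, so it is the one value here that reflects the reported result (16/17). The other three change with the placeholder counts and should not be read as findings.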
