Abstract

<p style="text-align: justify;">This article aims to develop a machine-learning algorithm that can predict student graduation in the Industrial Engineering course at the Federal University of Amazonas based on performance data. The methodology uses a dataset of 364 students admitted between 2007 and 2019 and considers characteristics that can directly or indirectly affect each student’s graduation: type of high school, number of semesters taken, grade-point average, lockouts, dropouts and course terminations. Data treatment involved manually removing several characteristics that did not add value to the output of the algorithm, resulting in a dataset of 2184 instances. Logistic regression, MLP and XGBoost models were then developed and compared; each predicts a binary output (graduation or non-graduation) for each student, using 70% of the dataset for training and 30% for testing. It was thus possible to identify a relationship between the six attributes explored and to achieve, with the best model, 94.15% accuracy in its predictions.</p>
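The workflow summarized in the abstract (a 70/30 train/test split, a binary classifier, and an accuracy score) can be illustrated with a minimal sketch. This is not the authors' implementation. The data are synthetic, only three hypothetical features are used, the labeling rule is invented for illustration, and logistic regression is fitted with plain gradient descent from the Python standard library rather than with whatever tooling the study used.

```python
import math
import random


def make_synthetic_students(n, seed=0):
    """Generate hypothetical (features, label) rows. NOT the paper's data."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        gpa = rng.uniform(0.0, 10.0)      # assumed 0-10 grading scale
        semesters = rng.randint(1, 16)    # semesters taken
        lockouts = rng.randint(0, 4)      # enrollment lockouts
        # Invented labeling rule plus noise, for illustration only.
        score = (0.9 * (gpa - 5.0) + 0.2 * (semesters - 8)
                 - 0.8 * lockouts + rng.gauss(0.0, 1.0))
        rows.append(([gpa, float(semesters), float(lockouts)],
                     1 if score > 0 else 0))
    return rows


def train_test_split(rows, test_fraction=0.3, seed=0):
    """Shuffle and split rows into (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def standardize(train, test):
    """Scale features using statistics computed on the training set only."""
    k = len(train[0][0])
    means = [sum(x[j] for x, _ in train) / len(train) for j in range(k)]
    stds = [math.sqrt(sum((x[j] - means[j]) ** 2 for x, _ in train)
                      / len(train)) or 1.0 for j in range(k)]

    def scale(rows):
        return [([(x[j] - means[j]) / stds[j] for j in range(k)], y)
                for x, y in rows]

    return scale(train), scale(test)


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(train, lr=0.1, epochs=300):
    """Fit weights and bias by batch gradient descent on the log loss."""
    k = len(train[0][0])
    w, b, n = [0.0] * k, 0.0, len(train)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * k, 0.0
        for x, y in train:
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j in range(k):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def accuracy(w, b, rows):
    """Fraction of rows whose thresholded prediction matches the label."""
    hits = 0
    for x, y in rows:
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
        hits += int((1 if p >= 0.5 else 0) == y)
    return hits / len(rows)


rows = make_synthetic_students(364)          # same cohort size as the study
train, test = train_test_split(rows, 0.3)    # 70% train, 30% test
train, test = standardize(train, test)
w, b = fit_logistic(train)
test_accuracy = accuracy(w, b, test)
print(f"train={len(train)} test={len(test)} accuracy={test_accuracy:.4f}")
```

The accuracy printed here reflects only the synthetic labeling rule and says nothing about the 94.15% reported in the abstract. Computing the scaling statistics on the training split alone keeps information from the test split out of the fitted model.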

Highlights

  • The development of technologies using machine learning has shown explosive growth in the processes of creating products or services currently delivered to the market. This area of research emerged as a branch of artificial intelligence and underlies technologies such as speech recognition on smartphones, price forecasting for the stock exchange, film recommendation on streaming platforms and disease identification from ultrasound images, among other applications. It can be described as “the science of programming computers in such a way that they can learn from data” (Géron, 2017)

  • The development of a machine-learning model through supervised learning algorithms is part of the range of predictive methodologies created within the field of artificial intelligence

  • This possibility is based primarily on the development of computer systems that are capable of storing a large amount of data


Summary

Introduction

The development of technologies using machine learning has shown explosive growth in the processes of creating products or services currently delivered to the market. Learning methods, in general terms, seek to answer certain types of problems: those in which the relationships between the input data generate a continuous response (such as car prices), those in which this relationship generates a discrete response (such as vehicle types), and those in which the answer is unknown, so that the algorithm must search for patterns according to the characteristics delivered to it. In this context, a problem common to several undergraduate courses at public universities is identified: the low rate of student graduations. Veenstra et al. (2009) emphasize that in science and engineering fields the retention rate is even lower, and that the reasons can be cognitive, such as GPA or high school grades in general, or noncognitive, such as family support, financial difficulties, health, or characteristics related to the color, gender, habits or expectations of each student.

