Abstract
The random forest algorithm builds better models than a single CART model. However, the method often underfits, especially for small node sizes. The double random forest method was developed to overcome this problem. This research used Education Management Information System (EMIS) data on the incidence of school dropout. The data comprise two data sets: MTs dropout data and MA dropout data. Each data set was first modelled with the random forest algorithm and then evaluated with the double random forest method. The study finds that the double random forest algorithm overcomes the underfitting case well, while in the well-fitted case the goodness-of-fit values of the two methods are roughly the same. The results show that school quality is given more priority at the MTs level than at the MA level, whereas family factors are more important at the MA level. Although the total number of factors used is essentially the same, the two school levels differ in which variables are relevant. No forecasting was done in this study, because the methodology used two different types of data.