Data mining has been widely used in many areas of knowledge, and education is no exception. It applies computational models to analyze data and answer research questions in support of decision making. This article uses data from the PLANEA 2015 Mathematics test for the last year of middle school, which measures academic achievement and records each student's personal, family and school context, in order to find the characteristics related to the academic level of the tested students. An interactive visualization system was developed that reveals interesting patterns and association rules by combining relevant attributes (variables) with the states. To reduce the analysis space, the Correlation-Based Feature Selection method was applied to both categorical and numerical attributes. The results show a significant reduction (93%) in the number of attributes with very little loss of information. In particular, the 232 categorical attributes collected for each student are reduced to only 18, all of which are correlated with the students' results on the PLANEA test. In addition, it was found empirically that choosing the mode of the plausible-value labels as the target class increases the accuracy of the classifiers used to assess the quality of the reduction. Relevant attributes include "AcademicAspiration", "FamilyResources", "MotherStudies" and "FatherStudies". Of the 30 states with available data, only 8 are at the Basic level; the others are at the Below Basic level.
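The target-class choice described above, taking the mode of a student's plausible-value labels, can be illustrated with a minimal sketch. The function name `mode_label`, the tie-breaking rule (first label seen wins), the number of plausible values per student, and the sample record are all assumptions made for illustration; they are not taken from the article or from the PLANEA data files.

```python
from collections import Counter


def mode_label(labels):
    """Return the most frequent label in `labels`.

    Ties are broken in favour of the label that appears first
    (an assumption; the article does not state a tie rule).
    """
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    best = max(counts.values())
    # Walk the labels in their original order so ties resolve deterministically.
    for label in labels:
        if counts[label] == best:
            return label


# Hypothetical record: one achievement-level label per plausible value.
student_levels = ["Below Basic", "Basic", "Below Basic", "Below Basic", "Basic"]
target_class = mode_label(student_levels)
print(target_class)
```

Under these assumptions, each student ends up with a single class label that a classifier can use as its target, instead of one label per plausible value.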