Unfortunately, sustainability is rarely taken into account when developing software and hardware systems. Recently, in order to contribute to the Earth's sustainability, a new concept has emerged: green software, which is computer software that can be developed and used efficiently and effectively with minimal or no impact on the environment. Meanwhile, new teaching methods based on students’ learning processes are being developed in the European Higher Education Area. Most of them aim to promote students’ interest in the course contents and to offer personalized feedback. Online judging is a promising method for encouraging students’ participation in the e-learning process, although further research and development are needed before it can be used widely and more efficiently. The large amount of data available in an online judging tool makes it possible to explore some of the most indicative attributes (e.g., running time, memory) for learning programming concepts, techniques and languages. So far, the most commonly applied methods for automatically gathering information from judging systems have been statistical and, although they provide reasonable correlations, they have not been shown to provide enough information for predicting grades when dealing with a huge amount of data. Therefore, the main novelty of this paper is the development of a data mining approach to predict program correctness as well as the grades of the students’ practices. For this purpose, data mining techniques taken from the artificial intelligence domain have been used. In particular, in this study we have used logistic regression, decision trees, artificial neural networks and support vector machines, which have been identified as the most suitable techniques for prediction tasks in the e-learning domain.
The models achieved an accuracy of around 74% in predicting both program correctness and practice grades. Another contribution of this paper is a comparison of these four techniques to determine which one yields the best accuracy in predicting grades, based on the availability of the data as well as their taxonomy. The decision tree classifier obtained the best confusion matrix, and time and memory efficiency were identified as the most important predictor variables. In view of these results, we can conclude that the development of green software leads programmers to implement correct software.