The proliferation of mobile devices and the rapid development of information and communication technologies (ICT) have led to increasingly large volumes and varieties of data being generated at an unprecedented pace. Big data has started to demonstrate significant value in higher education. This paper makes several contributions to the state of the art for Big data in higher education and learning technologies research. Currently, there is no comprehensive survey or literature review of Big educational data. Most existing literature reviews, contributed by a few authors, focus on one of these fields: educational data mining or learning analytics, with discussion of only one or two aspects, such as Big data technologies without an educational focus or social media data in education. Most of these reviews are also too short to give an inclusive account of Big educational data. In this paper, we present a comprehensive literature review of the current and emerging paradigms for Big educational data. The survey is presented in five parts: (1) the first part presents an overview and classification of Big education research to show the full landscape of the field, which also serves as a concise summary of the overall scope of this paper; (2) the second part discusses the various data sources from education platforms or systems that contribute to Big education data, including learning management systems (LMS), massive open online courses (MOOC), learning object repositories (LOR), OpenCourseWare (OCW), open educational resources (OER), social media, linked data and mobile learning; (3) the third part presents data collection, data mining and databases in Big education data; (4) the fourth part presents the technological aspects, including Big data platforms and architectures such as Hadoop, Spark and Samza, and Big data tools for Big education data; and (5) the fifth part presents different approaches to data analytics for Big education data.
This part provides a more inclusive discussion of data analytics that goes beyond traditional forms of learning analysis in higher education. It covers predictive analytics and learning analytics (including collaborative, behavioral and personal learning and assessment), followed by recommendation systems, graph analytics, visual analytics, and immersive learning and analytics, among others. The final part of the paper discusses social challenges (e.g. privacy and ethical issues) and technological challenges for Big data in education. It also illustrates the technological challenges with an example that uses graph-based analytics in a cross-institution learning analytics scenario.