Putting It All Together: Combining Learning Analytics Methods and Data Sources to Understand Students’ Approaches to Learning Programming
Learning programming is a complex and challenging task for many students. It involves both understanding theoretical concepts and acquiring practical skills. Hence, analyzing learners’ data from online learning environments alone fails to capture the full breadth of students’ actions if part of their learning process takes place elsewhere. Moreover, existing studies on learning analytics applied to programming education have mainly relied on frequency analysis to classify students according to their approach to programming or to predict academic achievement. However, frequency analysis provides limited insights into the individual time-related characteristics of the learning process. The current study examines students’ strategies when learning programming, combining data from the learning management system and from an automated assessment tool used to support students while solving the programming assignments. The study included the data of 292 engineering students (228 men and 64 women, aged 20–26) from the two aforementioned sources. To gain an in-depth understanding of students’ learning process as well as of the types of learners, we used learning analytics methods that account for the temporal order of learning actions. Our results show that students have special preferences for specific learning resources when learning programming, namely, slides that support search, and copy and paste. We also found that videos are relatively less consumed by students, especially while working on programming assignments. Lastly, students resort to course forums to seek help only when they struggle.
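The contrast the abstract draws between frequency analysis and methods that account for temporal order can be illustrated with a minimal sketch. The action labels and the two event logs below are hypothetical, and first-order transition counts are used here only as one simple order-aware summary; the abstract does not state which specific methods the study applied.

```python
from collections import Counter


def action_frequencies(sequence):
    """Count how often each action occurs, ignoring the order of events."""
    return Counter(sequence)


def transition_counts(sequence):
    """Count first-order transitions (an action followed by the next action)."""
    return Counter(zip(sequence, sequence[1:]))


# Hypothetical event logs for two students (labels are illustrative only).
student_a = ["slides", "assignment", "slides", "assignment", "forum"]
student_b = ["slides", "slides", "assignment", "assignment", "forum"]

# Both students perform the same actions the same number of times...
same_frequencies = action_frequencies(student_a) == action_frequencies(student_b)
# ...but the order in which they move between resources differs.
same_transitions = transition_counts(student_a) == transition_counts(student_b)

print("identical frequencies:", same_frequencies)
print("identical transitions:", same_transitions)
```

In this toy case the two logs are indistinguishable by frequency alone, while the transition counts separate a student who alternates between slides and the assignment from one who reads first and then works.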
- Research Article
14
- 10.18608/jla.2015.23.7
- Feb 18, 2016
- Journal of Learning Analytics
Accounts of the nature and role of productive dialogue in fostering educational outcomes are now well established in the learning sciences and are underpinned by bodies of strong empirical research and theorising. Allied to this there has been longstanding interest in fostering computer-supported collaborative learning (CSCL) in support of such dialogue. Learning analytic environments such as massive open online courses (MOOCs) and online learning environments (such as virtual learning environments, VLEs, and learning management systems, LMSs) provide ripe potential spaces for learning dialogue. In prior research, preliminary steps have been taken to detect occurrences of productive dialogue automatically through the use of automated analysis techniques. Such advances have the potential to foster effective dialogue through the use of learning analytic techniques that scaffold, give feedback on, and provide pedagogic contexts promoting, such dialogue. However, the translation of learning science research to the online context is complex, requiring the operationalization of constructs theorized in different contexts (often face to face), and based on different data-sets and structures (often spoken dialogue). In this paper we explore what could constitute the effective analysis of this kind of productive dialogue, arguing that it requires consideration of three key facets of the dialogue: features indicative of productive dialogue; the unit of segmentation; and the interplay of features and segmentation with the temporal underpinning of learning contexts. We begin by outlining what we mean by ‘productive educational dialogue’, before going on to discuss prior work that has been undertaken to date on its manual and automated analysis. We then highlight ongoing challenges for the development of computational analytic approaches to such data, discussing the representation of features, segments, and temporality in computational modelling.
The paper thus foregrounds, to both learning-science-oriented and computationally-oriented researchers, key considerations in respect of the analysis of dialogue data in emerging learning analytics environments. The paper provides a novel, conceptually driven stance on the state of the contemporary analytic challenges faced in the treatment of dialogue as a form of data across online and offline sites of learning.
- Book Chapter
- 10.1007/978-3-030-70258-8_4
- Jan 1, 2021
An understanding of learning analytics has many advantages. Learners can obtain information about their progress and learning gaps in real time; training programmers and facilitators can be better informed about how students and groups of students perform in learning opportunities and can modify their learning programs accordingly. The basic premise of learning analytics is to use the data collected to gain insight into learners’ learning patterns and attitudes, analyze the data, and give recommendations and predictions. The bulk of the data is usually derived from the learning management system (LMS), informal learning networks, face-to-face accounts, and evaluations of participatory work sessions, knowledge sources, and even smartphone use. There are many ways and models for characterizing usage data so that user behavior in the learning management system and in the overall systems can be analyzed. Based on this data, learning analytics carries out various assessments and offers personalized and valuable insights to enhance the learning and teaching processes. A range of data formats is already used effectively with current systems. The benefits and disadvantages of these different user data types need to be discussed in the context of learning analytics. Also, sharing or studying data creates privacy risks for the data subjects, often students. In this chapter, various data and data usage formats are presented and analyzed in the learning analytics context to select the best data model for use. This chapter also provides a report on data privacy problems in learning analytics functions.
- Research Article
49
- 10.1111/bjet.13276
- Sep 12, 2022
- British Journal of Educational Technology
As universities around the world have begun to use learning management systems (LMSs), more learning data have become available to gain deeper insights into students' learning processes and make data‐driven decisions to improve student learning. With the availability of rich data extracted from the LMS, researchers have turned much of their attention to learning analytics (LA) applications using educational data mining techniques. Numerous LA models have been proposed to predict student achievement in university courses. To design predictive LA models, researchers often follow a data‐driven approach that prioritizes prediction accuracy while sacrificing theoretical links to learning theory and its pedagogical implications. In this study, we argue that instead of complex variables (e.g., event logs, clickstream data, timestamps of learning activities), data extracted from online formative assessments should be the starting point for building predictive LA models. Using the LMS data from multiple offerings of an asynchronous undergraduate course, we analysed the utility of online formative assessments in predicting students' final course performance. Our findings showed that the features extracted from online formative assessments (e.g., completion, timestamps and scores) served as strong and significant predictors of students' final course performance. Scores from online formative assessments were consistently the strongest predictor of student performance across the three sections of the course. The number of clicks in the LMS and the time difference between first access and due dates of formative assessments were also significant predictors. Overall, our findings emphasize the need for online formative assessments to build predictive LA models informed by theory and learning design. 
Practitioner notes
What is already known about this topic
Higher education institutions often use learning analytics for the early identification of low‐performing students or students at risk of dropping out. Most predictive models in learning analytics rely on immutable student characteristics (e.g., gender, race and socioeconomic status) and complex variables extracted from log data within a learning management system. Prioritizing prediction accuracy without theory orientation often yields “black‐box” models that fail to inform educators on what remedies need to be taken to improve student learning.
What this paper adds
Predictive models in learning analytics should consider learning theory, pedagogy and learning design to identify key predictors of student learning. Online formative assessments can be a starting point for building predictive models that are not only accurate but also provide educators with actionable insights on how student learning can be improved. Time‐related and score‐related features extracted from online formative assessments are particularly useful for predicting students' course performance.
Implications for practice and/or policy
This study provides strong evidence for using online formative assessments as the foundation for predictive models in learning analytics. Student data from online formative assessments can help educators provide students with feedback while informing future formative assessment cycles. Higher education institutions should avoid the hype around complex data from learning management systems and instead rely on effective learning tools such as online formative assessments to revolutionize the use of learning analytics.
- Research Article
4
- 10.28945/5182
- Jan 1, 2023
- Journal of Information Technology Education: Research
Aim/Purpose: This article proposes a framework based on a sequential explanatory mixed-methods design in the learning analytics domain to enhance the models used to support the success of the learning process and the learner. The framework consists of three main phases: (1) quantitative data analysis; (2) qualitative data analysis; and (3) integration and discussion of results. Furthermore, we illustrated the application of this framework by examining the relationships between learning process metrics and academic performance in the subject of Computer Programming coupled with content analysis of the responses to a students’ perception questionnaire of their learning experiences in this subject.
Background: There is a prevalence of quantitative research designs in learning analytics, which limits the understanding of students’ learning processes. This is due to the abundance and ease of collection of quantitative data in virtual environments and learning management systems compared to qualitative data.
Methodology: This study uses a mixed-methods, non-experimental, research design. The quantitative phase of the framework aims to analyze the data to identify behaviors, trends, and relationships between measures using correlation or regression analysis. On the other hand, the qualitative phase of the framework focuses on conducting a content analysis of the qualitative data. This framework was applied to historical quantitative and qualitative data from students’ use of an automated feedback and evaluation platform for programming exercises in a programming course at the National University of Colombia during 2019 and 2020. The research question of this study is: How can mixed-methods research applied to learning analytics generate a better understanding of the relationships between the variables generated throughout the learning process and the academic performance of students in the subject of Computer Programming?
Contribution: The main contribution of this work is the proposal of a mixed-methods learning analytics framework applicable to computer programming courses, which allows for complementing, corroborating, or refuting quantitatively evidenced results with qualitative data and generating hypotheses about possible causes or explanations for student behavior. In addition, the results provide a better understanding of the learning processes in the Computer Programming course at the National University of Colombia.
Findings: A framework based on sequential explanatory mixed-methods design in the field of learning analytics has been proposed to improve the models used to support the success of the learning process and the learner. The answer to the research question posed is that mixed methods effectively complement quantitative and qualitative data. From the analysis of the data from the application of the framework, it appears that the qualitative data, representing the perceptions of the students, generally supported and extended the quantitative data. The consistency between the two phases allowed us to generate hypotheses about the possible causes of student behavior and provide a better understanding of the learning processes in the course.
Recommendations for Practitioners: We suggest implementing the proposed mixed-methods learning analytics framework in various educational contexts and populations. By doing so, practitioners can gather more diverse data and insights, which can lead to a better understanding of learning processes in different settings and with different groups of learners.
Recommendation for Researchers: Researchers can use the proposed approach in their learning analytics projects, usually based exclusively on quantitative data analysis, to complement their results, find explanations for their students’ behaviors, and understand learning processes in depth thanks to the information provided by the complementary analysis of qualitative data.
Impact on Society: The prevalence of exclusively quantitative research designs in learning analytics can limit our understanding of students’ learning processes. Instead, the mixed-methods approach we propose suggests a more comprehensive approach to learning analytics that includes qualitative data, which can provide deeper insight into students’ learning experiences and processes. Ultimately, this can lead to more effective interventions and improvements in teaching and learning practices.
Future Research: Potential lines of research to continue the work on mixed-method learning analytics methodology include the following: first, implementing the framework on a different population sample, such as students from other universities or other knowledge areas; second, using techniques to correct unbalanced data sets in learning analytics studies; third, analyzing student interactions with the automated grading platform and their academic activities in relation to their activity grades; last, using the findings to design interventions that positively impact academic performance and evaluating the impact statistically through experimental study designs. In the context of introductory programming education, AI/large language models have the potential to revolutionize teaching by enhancing the learning experience, providing personalized support, and enabling more efficient assessment and feedback mechanisms. A further line of future research is to implement the proposed framework on data from an introductory programming course that uses these models.
- Book Chapter
3
- 10.1007/978-3-031-54464-4_1
- Jan 1, 2024
The unique position of learning analytics at the intersection of education and computer science while reaching out to several other disciplines such as statistics, psychometrics, econometrics, mathematics, and linguistics has accelerated the growth and expansion of the field. Therefore, it is a crucial endeavor for learning analytics researchers to stay abreast of the latest methodological and computational advances to drive their research forward. The diversity and complexity of the existing methods can make this task overwhelming both for newcomers to the learning analytics field and for experienced researchers. With the motivation to accompany researchers in this challenging journey, the book “Learning Analytics Methods and Tutorials: A Practical Guide Using R” aims to provide a methodological guide for researchers to study, consult, and take the first steps toward innovation in the learning analytics field. Thanks to the unique wealth of authors’ backgrounds and expertise, which include authors of R packages and experts in methods and applications, the book offers a comprehensive array of methods that are described thoroughly with a primer on their usage in prior research in education. These methods include sequence analysis, Markov models, factor analysis, process mining, network analysis, predictive modeling, and cluster analysis among others. A step-by-step tutorial using the R programming language with real-life datasets and case studies is presented for each method. In addition, the initial chapters are devoted to getting novice researchers up to speed with the R programming language and the basics of data analysis. The present chapter serves as an introduction to the book, describing its main aim and intended audience. It describes the structure of the book and the methods covered by each chapter. It also points readers to the companion code and data repositories to facilitate following the tutorials presented in the book chapters.
- Research Article
3
- 10.3390/app14093615
- Apr 24, 2024
- Applied Sciences
The identification of heterogeneous and homogeneous groups of students using clustering analysis in learning analytics is still rare. The paper describes a study in which the students’ performance data stored in the micro-learning platform Priscilla are analyzed using learning analytics methods. This study aims to identify the groups of students with similar performances in micro-learning courses focused on learning programming and uncover possible changes in the number and composition of the identified groups of students. The CRISP-DM methodology was used to navigate through the complexity of the knowledge discovery process. Six different datasets representing different types of graded activities or term periods were prepared and analyzed for that purpose. The clustering analysis using the K-Means method found two clusters in all cases. Subsequently, performance metrics, the internal composition, and transfers of the students between clusters identified in different datasets were analyzed. As a result, this study confirms that analyzing student performance data from a micro-learning platform using learning analytics methods can reveal distinct groups of students with different academic performances, and these groups change over time. These findings align with teachers’ assumptions that the micro-learning platform with automated evaluation of programming assignments highlights how the students perceive the role of learning tools during learning programming in different term periods. Simultaneously, this study acknowledges that clustering, as an exploratory method, provides a solid basis for further research and can identify distinct groups of students with similar characteristics.
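To make the clustering step in the abstract above concrete, the following is a minimal sketch of K-Means (Lloyd's algorithm) with two clusters on one-dimensional data. It is not the study's pipeline: the scores are invented, the function name `kmeans_1d` is hypothetical, and the deterministic initialisation is a simplification chosen so the example is reproducible.

```python
def kmeans_1d(values, k=2, max_iter=100):
    """Lloyd's algorithm on one-dimensional data; returns (centroids, labels)."""
    if k < 2 or k > len(values):
        raise ValueError("k must be between 2 and the number of values")
    ordered = sorted(values)
    # Deterministic initialisation: spread the initial centroids evenly
    # across the sorted data (an assumption made for reproducibility).
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centroids.append(sum(members) / len(members) if members else centroids[c])
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return centroids, labels


# Hypothetical per-student scores on graded activities (illustrative only).
scores = [35, 42, 38, 81, 90, 77, 40, 85]
centroids, labels = kmeans_1d(scores, k=2)
print(centroids, labels)
```

Running the same function on scores from different term periods and comparing the resulting labels per student is one simple way to observe the kind of between-cluster transfers the abstract describes.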
- Book Chapter
2
- 10.1007/978-981-19-5331-6_67
- Nov 8, 2022
Learning analytics is becoming increasingly popular in higher education. During this COVID-19 pandemic, in particular, e-learning has grown increasingly popular. Students will have additional options for learning based on their interests and requirements as a result of this. To be effective in online and hybrid learning environments, students must develop self-regulation skills and self-directed learning. In online learning, learning analytics appear to have the ability to deliver tailored feedback, personalized recommendations, and support. It is possible to assist students in managing their learning process and self-evaluating their performance through external support and supervision in online learning environments, as well as the usage of learning analytics tools. Sixty-nine undergraduate students participated in the study. Students were challenged on a variety of programming topics. The goal of this research is to use the KNN algorithm to provide automated personalized feedback to novices by conducting continuous assessments and generating recommendations for further improvement using the decision tree algorithm in order to improve their overall skill development in the Java programming language. As a finding of the research, the benefits of individualized recommendations and guided feedback related to learning analytics in increasing students’ overall performance were determined.
Keywords: Personalized feedback; Personalized recommendations; Learning analytics; Quality education; Performance; KNN; NEP 2020; Lifelong learning; COVID-19 pandemic
- Research Article
- 10.17576/jqma.2104.2025.20
- Dec 12, 2025
- Journal of Quality Measurement and Analysis
In recent years, there has been growing interest in online education, including e-learning, Massive Open Online Courses (MOOCs), Intelligent Tutoring Systems (ITS) and university-level distance learning. These online environments generate vast amounts of data, such as activity logs, course interaction data and assessment results, making learning analytics essential to predicting student performance. This study examines the efficacy of a probability-based model, Bayesian Networks (BNs), in predicting academic performance using learning analytics data collected through Learning Management Systems (LMS). It focuses on how a BN can predict student performance in an online learning environment by modeling complex relationships among various learning analytics factors that contribute to academic success. Using LMS data from Universiti Sains Malaysia's distance learning Mathematics course, the study incorporates key learning analytics variables such as engagement metrics, resource utilization, self-directed learning activities and assessment, and academic performance to develop a BN-based predictive model. The BN model revealed that low engagement significantly hinders academic success, demonstrating its potential for early intervention and educational improvement. The model performance was measured using classification metrics such as accuracy, precision, recall, and F1-score. The developed model shows good overall performance, marked by strong precision and balanced recall in predicting the target classes, with some variability. The results revealed that the BN effectively captured dependencies among key learning analytics variables, providing actionable insights for designing personalized interventions in online education.
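The classification metrics named in the abstract above follow standard definitions, sketched here for a single target class. The labels and predictions are invented for illustration and are not the study's data; the function name `classification_metrics` is hypothetical.

```python
def classification_metrics(y_true, y_pred, positive):
    """Accuracy, plus precision, recall and F1 for one chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    # Guard against division by zero when a class is never predicted or never occurs.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical outcomes and model predictions (illustrative only).
y_true = ["pass", "pass", "fail", "fail", "pass", "fail"]
y_pred = ["pass", "fail", "fail", "pass", "pass", "fail"]

metrics = classification_metrics(y_true, y_pred, positive="pass")
print(metrics)
```

For a model with several target classes, the same function can be called once per class and the per-class values compared, which is one way the variability across classes mentioned in the abstract would become visible.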
- Research Article
2
- 10.31763/ijele.v4i2.641
- Aug 21, 2022
- International Journal of Education and Learning
In recent years, interest in Massive Open Online Courses (MOOCs) and Learning Analytics research has increased greatly in the area of educational technologies. The emergence of new learning technologies requires new perspectives on Educational Design. As the areas of MOOCs, Learning Analytics and Instructional Design developed, the connection between these three concepts became important for research. Learning Analytics provides progress information and other individualized support in MOOC settings where teachers are not able to provide learners with the individual attention that would be possible in a traditional face-to-face setting. Through collective views of the learning process, overall progress and performance are indicated. Moreover, results can lead to Educational Design improvements. Every time a learner interacts with the system, data is created and collected. Many Educational Designers do not take advantage of this data, thereby losing the possibility to impact the course design in a powerful way. This research work focuses on the implications of Learning Analytics for Educational Design in MOOCs. Many methods and algorithms are used in the analytical learning process in MOOCs. Currently, a great variety of learning data exists. First, well-known Instructional Design patterns from different models were collected and listed. In a second step, the collected data is used to point out which of these patterns can be addressed by using Learning Analytics methods. The findings of the study show that it is possible to better understand which environments and experiences are best suited for learning by analyzing students' behaviors online. These results have great potential to give educators a faster and easier understanding and optimization of the learning process.
- Research Article
7
- 10.1007/s10758-024-09808-4
- Dec 31, 2024
- Technology, Knowledge and Learning
In digitalized learning processes, learning analytics (LA) can help teachers make pedagogically sound decisions and support pupils’ self-regulated learning (SRL). However, research on the role of the pedagogical dimensions of learning design (LD) in influencing the possibilities of LA remains scarce. Primary school presents a unique LA context characterized by blended learning environments and pupils’ various abilities to regulate their learning, underscoring teachers’ vital importance. This study explores how pedagogically diverse LDs influence pupils’ SRL behaviors and learning management system (LMS) usage, as well as how this is reflected in LA visualizations. Two LDs were designed and implemented in two primary school classes of fifth (n = 30) and sixth (n = 22) graders within authentic pedagogical and technological contexts. We used sequence analysis to examine the pupils’ SRL actions during LDs, using LMS log data and observation data to contextualize these actions. The results show that LA offers less accurate feedback in more open, collaborative LDs as pupils tend to rely less on the LMS to regulate their learning. Furthermore, the teacher powerfully influences LMS usage in blended primary school classrooms. To maximize the potential of using LA to support SRL, its design needs to be grounded in the LD through an understanding of how the regulation of learning is promoted in diverse learning processes.
- Conference Article
- 10.1109/tenconspring.2017.8070017
- Jul 1, 2017
- 2017 IEEE Region 10 Symposium (TENSYMP)
Online learning has long been criticized for not providing helpful insight into students' learning, owing to a lack of necessary and appropriate interaction between instructor and students. Although learning analytics (LA) is growing in popularity in academic institutions, its usefulness has come into question, especially the elements intended to provide effective feedback on students' learning process. The main problem seems to be a lack of analysis of the characteristics of individuals. Therefore, more needs to be done to actually help students' learning process. The study aims to produce a significant body of data by analyzing the content available in online learning environments using Bloom's Taxonomy. Initial results show that most content found in online learning environments consists of offline learning material that was transferred to online learning platforms while taking very limited advantage of the resources of the Internet. It is hypothesized in this study that learning content in online environments does not support learners in moving efficiently to the higher cognitive levels described in Bloom's taxonomy.
- Research Article
19
- 10.37134/ajatel.vol8.4.2018
- Dec 23, 2018
- Asian Journal of Assessment in Teaching and Learning
Learning Analytics is a new field of research that appears as a link between educator data and students. Learning Analytics is also able to provide information for decision making in order to understand and optimise the learning process. Early childhood education is believed to be very important because the learner is open and always tries new things, and it is considered very meaningful for future processes in the development of all aspects of their personality. In this study, we aimed to investigate the application of learning analytics and how it relates to the learning process in child development in early childhood education. The article search was carried out on two databases, ScienceDirect and IEEE. In this study, the most important keywords are Learning Analytics and early childhood. The search returned 45 articles, 31 from ScienceDirect and 14 from IEEE, published from 2012 to 2017. They are examined in terms of the Learning Analytics process: data collection and pre-processing, analysis and action, and post-processing. Data collection is done by implementing online systems or game-based learning: 58% e-Learning systems, 27% Learning Analytics systems, and 15% game-based learning. Much of the research was conducted on samples from postgraduate, high school and elementary school students. The results showed that early childhood education benefited from the use of new technology in enhancing the child’s knowledge and skills, such as creativity and logical intelligence in the introduction of shapes and numbers. In further study, Learning Analytics in the form of assessment and feedback, given to support greater objectivity in the learning process and combined with educational games, may offer another benefit to early childhood education.
Objective assessment and feedback may serve as monitoring and prediction, which will also be analysed for efficiency and effectiveness in the learning process through the use of the technology.
- Research Article
- 10.23917/khif.v10i2.4142
- Mar 24, 2025
- Khazanah Informatika : Jurnal Ilmu Komputer dan Informatika
Business Intelligence (BI) represents a pivotal advancement in leveraging information technology to enhance organizational performance. BI tools serve as crucial aids in decision-making processes by furnishing requisite insights. In higher education institutions, BI can support leaders and managers by providing perspectives related to academics, learning, and management. Central to BI development is the meticulous gathering of requirements, a process pivotal in identifying organizational information and knowledge needs. This involves employing various methods such as interviews, observation, and analysis, including leveraging learning analytics to discern data utility for enhanced learning processes. Various studies show that learning analytics contributes to improving the learning and education process. On the other hand, learning analytics requires activity data that is integrated, subject-oriented, and organized as time series, which aligns with the characteristics of the data warehouse (DWH) as the main component of BI. This research endeavors to develop BI utilizing academic and e-learning data, exemplified through a case study of Telkom University's Academic Systems and Learning Management Systems (LMS). This study aims to provide actionable insights into the intersection of BI and learning analytics, ultimately enhancing educational processes and organizational decision-making capabilities. By integrating learning analytics into BI development, the resultant BI systems can cater not only to current managerial demands but also anticipate future analytical needs. The implementation of the multidimensional schema was successfully executed. This process involved mapping data from the academic information system and the LMS as data sources to the data warehouse, the Extract, Transform, and Load (ETL) process, and development of the prototype.
Testing indicated that the prototype meets the intended requirements and provides valuable insights through its comprehensive reporting capabilities. This demonstrates the effectiveness of the implemented multidimensional schema, ETL process, and the overall design of the reporting dashboard.
- Research Article
3
- 10.3991/ijoe.v18i14.35073
- Nov 22, 2022
- International Journal of Online and Biomedical Engineering (iJOE)
Digital learning environments, such as online laboratories, offer many opportunities for collecting data for Learning Analytics (LA). This article presents a systematic literature review of LA in laboratory-based learning environments for Higher Engineering Education, which yielded 23 key references. The focus of the study was formed by the following research questions (RQ): What types of data are currently collected in online laboratories (RQ 1)? How is LA used to support learning and teaching processes as well as the design of the online laboratory environment (RQ 2)? What design recommendations for the use of LA in laboratory-based learning environments can be derived (RQ 3)? The results show that LA can be used to provide feedback for simple as well as for complex learning processes in online laboratories. Moreover, it assists data-informed decision making for teaching and learning processes as well as for the design of the lab environment. Implications for future research projects were derived based on the findings and should contribute to the advancement of research on LA in online laboratories.
- Research Article
9
- 10.9743/jeo.2019.12.2.13
- Jul 1, 2019
- The Journal of Educators Online
The research aims at a specific analysis of how learning analytics as a metacognitive tool can be used as a method by teachers as reflective professionals and how it can help teachers learn to think and arrive at decisions about learning design and curriculum, the learning and teaching process, and its success. Not only does it build on previous research results by interpreting the description of learning analytics as a metacognitive tool for teachers as reflective professionals, but it also lays out new prospects for investigation into the process of learning analytics application in open and online learning and teaching. The research leads to the use of learning analytics data for the implementation of the teacher inquiry cycle and reflections on open and online teaching, eventually aiming at an improvement of curriculum and learning design. The results of the research demonstrate how learning analytics methods can support teachers as reflective professionals, helping them understand the different learning habits of their students, recognize learners’ behavior, assess their thinking capacities and willingness to engage in the course and, based on this information, make real-time adjustments to their course curriculum.