Common Code Writing Errors Made by Novice Programmers: Implications for the Teaching of Introductory Programming

Abstract

Novices tend to make unnecessary errors when they write programming code. Many of these errors can be attributed to the novices’ fragile knowledge of basic programming concepts. Programming instructors also find it challenging to develop teaching and learning strategies that are aimed at addressing the specific programming challenges experienced by their students. This paper reports on a study aimed at (1) identifying the common programming errors made by a select group of novice programmers, and (2) analyzing how these common errors changed at different stages during an academic semester. This exploratory study employed a mixed-methods approach based on the Framework of Integrated Methodologies (FraIM). Manual, structured content analysis of 684 programming artefacts, created by 38 participants and collected over an entire semester, led to the identification of 21 common programming errors. The identified errors were classified into four categories: syntax, semantic, logic, and type errors. The results indicate that semantic and type errors occurred most frequently. Although common error categories are likely to remain the same from one assignment to the next, the introduction of more complex programming concepts towards the end of the semester could lead to an unexpected change in the most common error category. Knowledge of these common errors and error categories could assist programming instructors in adjusting their teaching and learning approaches for novice programmers.

Keywords: Novice programmer; Common programming errors; CS1; Computer Science education

Similar Papers
  • Research Article
  • Cite count: 9
  • 10.4018/ijopcd.306686
Fostering the Learning Process in a Programming Course With a Chatbot
  • Aug 5, 2022
  • International Journal of Online Pedagogy and Course Design
  • Sohail Iqbal Malik + 5 more

Novice programmers in a Programming 1 course have to learn several programming skills at the same time, so they need additional support in answering their questions about the programming domain. This study developed and offered a chatbot in a Programming 1 course. The chatbot covers course details, fundamental programming concepts, and common programming errors. The perceptions of Programming 1 students and instructors regarding the chatbot in programming education were collected through a survey and a focus group, respectively. The results of the students’ survey revealed that the chatbot supports students in learning programming and common programming errors in the course. The focus group participants agreed that the chatbot provides a one-to-one teaching experience to novices. The chatbot serves as a virtual teaching assistant and promotes student-centered learning. The focus group participants also agreed that the chatbot approach provides additional support to students as they learn the programming domain.

  • Conference Article
  • Cite count: 26
  • 10.1109/fie44824.2020.9274114
Explaining Causes Behind SQL Query Formulation Errors
  • Oct 21, 2020
  • Toni Taipalus

This Full Research Paper presents the most prominent query formulation errors in Structured Query Language (SQL), and maps these errors to their cognitive explanations. Understanding query formulation errors is key to teaching SQL more effectively. However, studies on what kinds of errors novices struggle with are relatively scarce when compared to, for example, programming languages. Although committing errors is a crucial part of learning, some errors are relatively easy to fix, and their commonness is not necessarily an indication of their difficulty. Other errors, however, halt the learning process, and are never fixed by the query writer. Using a previously established error taxonomy and queries from four cohorts with a total of 987 students, we set out to identify common errors which students are unable to correct, i.e., errors that are likely to cause query formulation failures. Our results indicate that on a general level, logical errors are the most common cause of query formulation failures, while syntax and semantic errors are usually fixed by query writers. Although query concepts, for example, expressions, joins and grouping, have a strong influence on what types of errors are committed, some errors are common regardless of query concepts. Specifically, our results indicate that missing expressions, extraneous or omitted grouping columns, incorrect comparison operators, missing joins, and missing ordering columns are the most common errors that novices are unable to fix. Based on the results, we speculate on the reasons behind the most common persistent errors using previously identified cognitive explanations. Finally, we show that solutions for mitigating the causes behind query formulation errors are already available. In order to teach query formulation more effectively, educators should emphasize natural language patterns, query planning, and increasingly ambiguous exercises.

  • Research Article
  • 10.5114/jos.2024.143586
Correlation of root canal morphology of endodontically treated premolar and molar teeth with procedural errors using cone-beam computed tomography in an Iranian population
  • Jan 1, 2024
  • Journal of Stomatology
  • Maryam Foroozandeh + 3 more

Introduction: The prevalence of different root canal morphologies and types of endodontic procedural errors has been previously evaluated. Objectives: This study aimed to assess the quality of endodontic treatments and most common procedural errors as well as to evaluate the correlation of procedural errors with the root canal morphology of premolar and molar teeth according to Vertucci's classification using cone-beam computed tomography (CBCT). Material and methods: In this cross-sectional study, 230 endodontically treated teeth were evaluated on 287 CBCT scans of patients presenting to Hamadan Dental School from 2019 to 2021. Canal type was determined according to Vertucci's classification system. Endodontic procedural errors, including under-filling, over-filling, non-homogenous filling, perforation, and missed canals, were also assessed. Correlation between the type of procedural error and root canal morphology was analyzed with the chi-square test using SPSS version 23, at a 5% level of significance. Results: Under-filling was the most common procedural error in Vertucci's type I canals (27.69%, p = 0.92). In types II and IV canals, the most common procedural error was missed canals, with a prevalence rate of 18.36% and 13.63%, respectively. Non-homogenous filling (p = 0.68) and missed canals (p = 0.003) were the most common errors in type III canals (25%). Over-filling (p = 0.05) and non-homogenous filling (p = 0.68) were the most frequent errors in type V canals (28.58%). Conclusions: Over-filling and missed canals showed a significant correlation with canal type. Moreover, the type of canal demonstrated a significant correlation with the presence of periapical lesions, i.e., periapical lesions showed the highest prevalence in canal types III and V.

  • Research Article
  • 10.52783/jisem.v10i24s.3942
Extracting Error Resolution Patterns for Novice Programming Students using Apriori Algorithm
  • Mar 24, 2025
  • Journal of Information Systems Engineering and Management
  • Niel Francis B Casillano

Introduction: Novice programmers often struggle with error resolution, affecting their learning and performance. This study analyzes error resolution patterns among first-year programming students using the Apriori algorithm. Objectives: The study aims to identify common programming errors, analyze resolution difficulty and time, and uncover patterns using association rules. It also seeks to provide data-driven recommendations to enhance programming education. Methods: A dataset of 150 first-year students was analyzed, focusing on error frequency, severity, and resolution time. The Apriori algorithm was applied to identify associations between error type, resolution attempts, and time required. Results: Syntax errors (319 occurrences) were the most frequent and resolved quickly, while logical (193) and runtime errors (164) were more challenging. Association rules showed that highly difficult errors took over 30 minutes to resolve (80% confidence), whereas low-severity syntax errors were fixed within 30 minutes (75% confidence). Conclusions: The study revealed the relationship between error type, resolution attempts, and correction time. Findings suggest tiered instructional strategies, such as automated feedback for syntax errors and structured debugging workshops, to improve student proficiency and reduce dropout rates.
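The confidence figures quoted for the association rules above can be read as conditional frequencies. The following is a minimal sketch of how support and confidence are computed for a single rule; it is not the Apriori algorithm itself (which also prunes candidate itemsets), and the records, labels, and function names are hypothetical illustrations, not data from the study.

```python
from typing import FrozenSet, Iterable, List

Transaction = FrozenSet[str]


def support(transactions: List[Transaction], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(
    transactions: List[Transaction],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> float:
    """Confidence of the rule antecedent -> consequent:
    support(antecedent and consequent together) / support(antecedent)."""
    antecedent = frozenset(antecedent)
    base = support(transactions, antecedent)
    if base == 0:
        raise ValueError("antecedent never occurs; confidence is undefined")
    joint = support(transactions, antecedent | frozenset(consequent))
    return joint / base


# Hypothetical records: one error type plus a coarse resolution-time label.
RECORDS: List[Transaction] = [
    frozenset({"syntax", "fast"}),
    frozenset({"syntax", "fast"}),
    frozenset({"syntax", "fast"}),
    frozenset({"syntax", "slow"}),
    frozenset({"logic", "slow"}),
    frozenset({"logic", "slow"}),
    frozenset({"logic", "slow"}),
    frozenset({"logic", "fast"}),
]

syntax_support = support(RECORDS, {"syntax"})
syntax_fast_confidence = confidence(RECORDS, {"syntax"}, {"fast"})
```

In this made-up data, three of the four syntax-error records are labelled "fast", so the rule "syntax implies fast" has a confidence of 0.75.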

  • Research Article
  • Cite count: 1
  • 10.1142/s0218194025500329
An Empirical Study of MBFL on Novice Programs Across Different Programming Languages
  • Jul 1, 2025
  • International Journal of Software Engineering and Knowledge Engineering
  • Yating Yang + 2 more

Programming education in computer science is growing rapidly, and debugging is a key challenge for novice programmers due to their limited experience. Mutation-Based Fault Localization (MBFL) is widely used in industry, but its effectiveness and challenges in novice programs need further study. While Python is a popular language in machine learning and data science, there is little research comparing fault localization in Python and Java for novice programmers. To bridge this gap, we conduct an empirical study to evaluate MBFL’s accuracy and execution overhead in common novice programming errors across different languages. We analyze how program features like code coverage and mutation score affect MBFL’s performance and whether these effects differ between languages. We also examine how MBFL’s effectiveness changes when suspiciousness scores are the same and how mutant noise and coincidental correct test cases vary across languages. Additionally, we propose a mutation confidence formula based on repair potential and behavioral difference to assess the usefulness of mutants in MBFL. Our study demonstrates that MBFL works well for novice fault localization in both Java and Python, with Python performing better. MBFL correctly identifies 45, 70, and 92 faults within the TOP-N (N = 1, 3, 5), proving its strong performance. However, tie problems, mutant noise, and coincidental correct test cases weaken MBFL, especially in Java. Results in both languages show a strong positive correlation between mutant confidence and fault localization accuracy, confirming the formula’s effectiveness across languages.

  • Conference Article
  • Cite count: 1
  • 10.28945/4246
Concept–based Analysis of Java Programming Errors among Low, Average and High Achieving Novice Programmers
  • Jan 1, 2019
  • Philip Olu Jegede + 2 more

[This Proceedings paper was revised and published in the 2019 issue of the Journal of Information Technology Education: Innovations in Practice, Volume 18.] Aim/Purpose: The study examined types of errors made by novice programmers in different Java concepts with students of different ability levels in programming as well as the perceived causes of such errors. Background: To improve code writing and debugging skills, efforts have been made to taxonomize programming errors and their causes. However, most of the studies employed omnibus approaches, i.e. without consideration of different programming concepts and ability levels of the trainee programmers. Such concept- and ability-specific error identification and classification are needed to advance appropriate intervention strategies. Methodology: A sequential exploratory mixed method design was adopted. The sample was an intact class of 124 Computer Science and Engineering undergraduate students grouped into three achievement levels based on first semester performance in a Java programming course. The code submitted in the course of second semester exercises was analyzed for possible errors, categorized and grouped across achievement levels. The resulting data were analyzed using descriptive statistics as well as the Pearson product correlation coefficient. Qualitative analyses through interviews and focused group discussion (FGD) were also employed to identify reasons for the committed errors. Contribution: The study provides a useful concept-based and achievement level specific error log for the teaching of Java programming for beginners. Findings: The results identified 598 errors with Missing symbols (33%) and Invalid symbols (12%) constituting the highest and least committed errors respectively. Method and Classes concept houses the highest number of errors (36%) followed by Other Object Concepts (34%), Decision Making (29%), and Looping (10%). Similar error types were found across ability levels. A significant relationship was found between missing symbols and each of Invalid symbols and Inappropriate Naming. Errors made in Methods and Classes were also found to significantly predict those in Other Object concepts. Recommendations for Practitioners: To promote better classroom practice in the teaching of Java programming, the findings of the study suggest that instruction should be based on achievement level. In addition to this, learning Java programming should be done with an unintelligent editor. Recommendations for Researchers: Research could examine logic or semantic errors among novice programmers as the errors analyzed in this study focus mainly on syntactic ones. Impact on Society: The digital age is code-driven, thus error analysis in programming instruction will enhance programming ability, which will ultimately transform novice programmers into experts, particularly in developing countries where most of the software in use is imported. Future Research: Researchers could look beyond novice or beginner programmers as code written by intermediate or even advanced programmers is still often not completely error free.

  • Research Article
  • Cite count: 7
  • 10.28945/4322
Concept–based Analysis of Java Programming Errors among Low, Average and High Achieving Novice Programmers
  • Jan 1, 2019
  • Journal of Information Technology Education: Innovations in Practice
  • Philip Olu Jegede + 3 more

Aim/Purpose: The study examined types of errors made by novice programmers in different Java concepts with students of different ability levels in programming as well as the perceived causes of such errors. Background: To improve code writing and debugging skills, efforts have been made to taxonomize programming errors and their causes. However, most of the studies employed omnibus approaches, i.e. without consideration of different programming concepts and ability levels of the trainee programmers. Such concept- and ability-specific error identification and classification are needed to advance appropriate intervention strategies. Methodology: A sequential exploratory mixed method design was adopted. The sample was an intact class of 124 Computer Science and Engineering undergraduate students grouped into three achievement levels based on first semester performance in a Java programming course. The code submitted in the course of second semester exercises was analyzed for possible errors, categorized and grouped across achievement levels. The resulting data were analyzed using descriptive statistics as well as the Pearson product correlation coefficient. Qualitative analyses through interviews and focused group discussion (FGD) were also employed to identify reasons for the committed errors. Contribution: The study provides a useful concept-based and achievement level specific error log for the teaching of Java programming for beginners. Findings: The results identified 598 errors with Missing symbols (33%) and Invalid symbols (12%) constituting the highest and least committed errors respectively. Method and Classes concept houses the highest number of errors (36%) followed by Other Object Concepts (34%), Decision Making (29%), and Looping (10%). Similar error types were found across ability levels. A significant relationship was found between missing symbols and each of Invalid symbols and Inappropriate Naming. Errors made in Methods and Classes were also found to significantly predict those in Other Object concepts. Recommendations for Practitioners: To promote better classroom practice in the teaching of Java programming, the findings of the study suggest that instruction should be based on achievement level. In addition to this, learning Java programming should be done with an unintelligent editor. Recommendations for Researchers: Research could examine logic or semantic errors among novice programmers as the errors analyzed in this study focus mainly on syntactic ones. Impact on Society: The digital age is code-driven, thus error analysis in programming instruction will enhance programming ability, which will ultimately transform novice programmers into experts, particularly in developing countries where most of the software in use is imported. Future Research: Researchers could look beyond novice or beginner programmers as code written by intermediate or even advanced programmers is still often not completely error free.

  • Conference Article
  • Cite count: 10
  • 10.1145/3209635.3209652
Visualizing Code Patterns in Novice Programmers
  • May 4, 2018
  • Jeff Bulmer + 2 more

Many researchers have investigated the difficulties faced by novice programmers. However, these approaches have so far focused primarily on the identification and correction of common syntax errors, or that of topic difficulty in the CS1 curriculum. Meanwhile, poor coding practices adopted by students have gone mostly unaddressed. While these practices may not necessarily lead to erroneous code, they may nonetheless indicate areas of difficulty and lead to poorly structured programs. To address these issues, our project examines students' coding habits and common errors in CS1 exercises gathered from 77 first-year students. This data was collected in real time so that we may later reconstruct the thought process of the student while solving the programming exercises. To assist our analysis, we built a code visualizer that animates the programming process dynamically and summarizes error metrics simultaneously. Our ultimate goal is to use the code visualizer to help either an instructor or a student to identify poor programming practices during the coding process. With the error metrics gathered, an instructor can inspect potential improvements in coding behaviors for an individual student at a given point in time or over time, and identify bad coding habits common to populations of students.

  • Research Article
  • Cite count: 72
  • 10.1145/3335814
A New Look at Novice Programmer Errors
  • Jul 11, 2019
  • ACM Transactions on Computing Education
  • Davin Mccall + 1 more

The types of programming errors that novice programmers make and struggle to resolve have long been of interest to researchers. Various past studies have analyzed the frequency of compiler diagnostic messages. This information, however, does not have a direct correlation to the types of errors students make, due to the inaccuracy and imprecision of diagnostic messages. Furthermore, few attempts have been made to determine the severity of different kinds of errors in terms other than frequency of occurrence. Previously, we developed a method for meaningful categorization of errors, and produced a frequency distribution of these error categories; in this article, we extend the previous method to also make a determination of error difficulty, in order to give a better measurement of the overall severity of different kinds of errors. An error category hierarchy was developed and validated, and errors in snapshots of students' source code were categorized accordingly. The result is a frequency table of logical error categories rather than diagnostic messages. Resolution time for each of the analyzed errors was calculated, and the average resolution time for each category of error was determined; this defines an error difficulty score. The combination of frequency and difficulty allows us to identify the types of error that are most problematic for novice programmers. The results show that ranking errors by severity (a product of frequency and difficulty) yields a significantly different ordering than ranking them by frequency alone, indicating that error frequency by itself may not be a suitable indicator of which errors are actually the most problematic for students.
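The distinction between ranking by frequency and ranking by severity can be illustrated with a short sketch. It assumes, as the abstract states, that severity is the product of frequency and difficulty, and takes difficulty to be the mean resolution time; the category names, counts, and times below are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class ErrorCategory:
    name: str
    frequency: int            # number of occurrences (hypothetical)
    mean_resolution_s: float  # average time to fix, in seconds (hypothetical)

    @property
    def severity(self) -> float:
        # Severity as frequency multiplied by difficulty (mean resolution time).
        return self.frequency * self.mean_resolution_s


def rank(
    categories: List[ErrorCategory], key: Callable[[ErrorCategory], float]
) -> List[str]:
    """Return category names ordered from highest to lowest `key` value."""
    return [c.name for c in sorted(categories, key=key, reverse=True)]


# Made-up numbers chosen so that the two rankings disagree.
CATEGORIES = [
    ErrorCategory("missing semicolon", frequency=500, mean_resolution_s=20.0),
    ErrorCategory("undeclared variable", frequency=300, mean_resolution_s=60.0),
    ErrorCategory("type mismatch", frequency=120, mean_resolution_s=200.0),
]

by_frequency = rank(CATEGORIES, key=lambda c: c.frequency)
by_severity = rank(CATEGORIES, key=lambda c: c.severity)
```

With these illustrative values the most frequent category ("missing semicolon") ends up last when ranked by severity, because it is quick to fix, while the rarest category ("type mismatch") ranks first.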

  • Research Article
  • Cite count: 4
  • 10.1080/02687038.2024.2361961
The pattern of phonological, semantic, and circumlocution naming errors for nouns and verbs in primary progressive aphasia
  • Jun 8, 2024
  • Aphasiology
  • Aaron M Meyer + 7 more

Background: In the diagnostic criteria for lvPPA (Gorno-Tempini et al. 2011), “speech (phonologic) errors in spontaneous speech and naming” is a secondary criterion, but studies of naming error patterns in PPA have not found evidence to support this criterion. Furthermore, only a few studies have examined naming error patterns in PPA. Aims: In the current study, we examined the pattern of naming errors for nouns and verbs in all three subtypes of PPA, as well as unclassifiable PPA and typical (amnestic) Alzheimer’s disease (AD). Statistical analyses focused on three common error types: phonological, semantic, and circumlocution errors. Methods & Procedures: The final sample included 35 participants with PPA and four participants with typical AD. Participants were asked to name 284 noun pictures and 116 verb pictures. Separately for nouns and verbs, repeated-measures ANCOVA was used to examine the interaction between Error Type and Diagnostic Subtype. Twenty of the participants also completed a structural MRI scan. For these participants, we examined the relationships between naming errors and brain volume within ten left hemisphere regions of interest (ROIs). Outcomes & Results: In lvPPA, the proportion of phonological errors was significantly lower than the proportion of semantic errors for verbs. In svPPA, uPPA, and typical AD, semantic errors were significantly greater than phonological errors for both nouns and verbs. In between-subtype analyses, the proportion of semantic errors for nouns was significantly greater for participants with svPPA and uPPA, compared to those with nfvPPA. For nouns, the MRI analyses revealed significant negative correlations between the proportion of circumlocution errors and volume in the left inferior temporal gyrus and the left fusiform gyrus. For verbs, there were significant negative correlations between circumlocution errors and volume in the left insula, and between semantic errors and volume in the left superior temporal pole. Conclusions: The findings of this study indicate that semantic naming errors may be common for both nouns and verbs in typical AD and all subtypes of PPA, with the possible exception of nouns in nfvPPA. In contrast, phonological naming errors were not significantly more common than semantic errors in any diagnostic subtype. Furthermore, phonological naming errors were not significantly more common in lvPPA, compared to any other diagnostic subtype.

  • Research Article
  • 10.22108/rall.2020.113932.1174
Analysis of Linguistic and Written Errors of Arabic Language Learners in Iranian Universities (A Case Study of MA Theses Submitted to Tarbiat Modares University in Field of Arabic language Teaching)
  • Oct 1, 2021
  • DOAJ (DOAJ: Directory of Open Access Journals)
  • Hadi Nazari Monazam + 2 more

Writing skill is one of the language skills in teaching Arabic in Iranian Universities. Some courses are dedicated to teaching this skill in Arabic. Nevertheless, students still face problems in writing in Arabic and experience linguistic and written errors at various educational levels. Therefore, in the present study, the authors tried to analyze linguistic and writing errors of 20 MA theses submitted to the Department of Arabic Language Teaching Tarbiat Modares University. The study was conducted to provide solutions for reforming the Arabic teaching curriculum, especially the writing skill in Iran, and reducing errors of students in the future. The descriptive-analytical and error analysis methods and statistical analyses were used. The results showed that the linguistic errors in these theses were 1005 errors. Syntactic errors were the most common language errors, and the errors related to the additional words (prepositions) were mostly grammatical. Other common errors were spelling errors, errors related to the humazat, alqate morphological errors, errors related to the detection of definite (marefah) from indefinite (nakarah), semantic errors, and the use of non-functional words and phrases in the Arabic language. Then, the interpretation of the errors revealed the traces of interlingual interference. Some of the most important causes for making errors by students in their theses were Arabic language teaching methods, educational environment, Arabic language difficulties and rules, the indolence of students, and lacking sufficient knowledge about some language rules.

  • Dissertation
  • Cite count: 1
  • 10.37099/mtu.dc.etdr/981
MatlabTA: A Style Critiquer For Novice Engineering Students
  • Jan 1, 2020
  • Marissa L Walther

Novice programmers, considered to be those who have yet to understand the fundamentals of programming, exist in both engineering and computing fields. Within computing, various resources exist to help novice programmers understand fundamentals and style guidelines such as WebTA, a code critique program that gives Java students feedback about their error and style issues. There is, however, a gap in automated code critique for MATLAB, a programming language that is popular in the engineering community. When it comes to MATLAB, there are not many programs that help novices understand their errors, and even fewer that help them understand style guidelines. To help assist these engineering novices, I created a program called MatlabTA. Based on feedback from Engineering Fundamentals instructors on the most common errors they encounter in student code, MatlabTA exists to give novices more intuitive feedback for a few of the most common MATLAB errors, along with providing them different style guidelines for different MATLAB antipatterns such as inconsistent tabbing and function output variable matching. This report will provide an overview of the process in developing MatlabTA, along with examples of the different outputs the application produces.

  • Dissertation
  • 10.18130/v34v52
Validating a Mathematics Interim Assessment with Cognitively Diagnostic Error Categories
  • Jan 1, 2014
  • Libra
  • Christine Hutchison

The stressors of the No Child Left Behind Act have thrust educators into a data-driven accountability culture. As school divisions are racing to keep up with increasingly higher achievement demands, educators are scrambling to find testing and instructional methods for improving mathematics achievement prior to students sitting for end-of-grade (EOG) and end-of-course (EOC) tests. Over the past several years, interim assessments have emerged as a possible solution, although there is a paucity of empirical research to support interim assessments as vehicles for improving mathematics achievement. The purpose of this mixed methods study was to create and validate a 7th grade mathematics interim assessment which incorporated cognitively diagnostic error categories. The interim assessment followed an ordered multiple-choice test design where distractors represented students’ common errors. Inspiration for the development of the error categories came from the cognitive and school improvement literature. The error categories comprise: conceptual, procedural, and attention errors. Validity evidence was gathered from qualitative sources (i.e., student cognitive think-alouds, expert teacher reviews), and quantitative sources (i.e., classical test theory analysis, distractor analysis, differential item functioning, and a partial credit item response theory analysis). Results suggest that there is validity evidence to support the development of the cognitively diagnostic error categories and the overall test design. Of the three error categories, the attention error category was the most problematic and erratic. Validity evidence to support the ordering of the error categories was not consistent. More research needs to be done in the development of the attention error category and the ordering of all three error categories. Limitations to the study and opportunities for future research were discussed.

  • Research Article
  • 10.9744/katakita.12.1.26-33
Errors on Sentence Structures Made by Students in Writing 2 and Writing 4 Classes
  • Mar 4, 2024
  • k@ta kita
  • Manuela Octaviani + 1 more

This study analyzed the types of errors on sentence structures by students in Writing 2 and Writing 4 classes in the English Department, Petra Christian University, and their similarities and differences. Eight categories of common errors on sentence structures proposed by Ho (2005) were used to analyze the drafts. The types of common errors in this study were limited to four: Run-on Sentence, Fragmented Sentence, Inappropriate Subordinate Conjunction, and Misordering. The findings showed that both classes made all types of the errors and the most common error made was run-on sentences in the form of comma splices. Fragmented sentences with a missing verb and subject were only found in Writing 2 drafts, while subordinating clauses in fragmented sentences were found in Writing 4 drafts. Moreover, errors in relation to subordinating conjunctions were more prominent in Writing 4. In conclusion, difficulties in utilizing conjunctions and the different complexity of the drafts might affect the errors on sentence structures.

  • Research Article
  • Cite count: 2
  • 10.58245/ipsi.tir.2401.05
Common Errors in High School Novice Programming
  • Jan 1, 2024
  • IPSI Transactions on Internet Research
  • Davorka Radaković + 1 more

Identifying and classifying the commonness of errors made by novices learning to write computer programs has long been of interest to both researchers and educators. When teachers understand the nature of these errors and how students correct them, instruction can be more effective. Some errors occur more frequently than others. In this paper, we examine the most common programming errors made by beginning first-year high school gifted mathematics students in Mathematical High School. Notwithstanding the extensive coverage of these error types in lectures and learning materials, we found that these errors still occur when students write programs. Our results suggest that students who habitually make all common errors have lower grades, but even excellent students make logical errors in loop conditions. Therefore, we advise more practice in logical reasoning for novice programmers and an introduction to formal semantics.
