
Unit test code generator for lua programming language

Abstract

Software testing is an important step in the software development lifecycle. One of the most time-consuming parts of the process is developing the test code. We propose automatic unit test code generation to speed up the process and help avoid repetition. We develop the unit test code generator using the Lua programming language. Lua is a fast, lightweight, embeddable scripting language. It has been used in many industrial applications, with a focus on embedded systems and games. Unlike other popular scripting languages such as JavaScript, Python, and Ruby, Lua does not have any unit test generator developed to help its software testing process. The final product, the Lua unit test generator (LUTG), is integrated into one of the most popular Lua IDEs, ZeroBrane Studio, as a plugin to seamlessly connect the coding and testing processes. The code generator can generate unit test code, save test case data in Lua and XML file formats, and generate the test data automatically using a search-based technique, a genetic algorithm, to achieve the full branch coverage test criterion. Using this generator to test several Lua source code files shows that the developed unit test generator can help the unit testing process. The unit test generator is expected to improve the productivity, quality, consistency, and abstraction of the unit testing process.
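The search-based idea in the abstract, a genetic algorithm looking for inputs that exercise every branch, can be illustrated with a minimal sketch. It is written in Python rather than Lua, and it is not LUTG's implementation: the toy function `classify`, the `BRANCH_DISTANCE` table, and the helpers `evolve_input` and `generate_suite` are hypothetical names introduced here. Fitness is a simple branch distance, assumed for illustration, where 0 means the input reaches the target branch.

```python
import random


def classify(a, b):
    """Toy function under test with three branches to cover (hypothetical)."""
    if a == b:
        return "equal"
    if a > b:
        return "greater"
    return "less"


# Branch distance per target branch: 0 means the input reaches the branch,
# and smaller positive values mean the input is closer to reaching it.
BRANCH_DISTANCE = {
    "equal": lambda a, b: abs(a - b),
    "greater": lambda a, b: 0 if a > b else b - a + 1,
    "less": lambda a, b: 0 if a < b else a - b + 1,
}


def evolve_input(target, rng, pop_size=20, generations=500, low=-1000, high=1000):
    """Search for an (a, b) pair that reaches `target`; return None on failure."""
    distance = BRANCH_DISTANCE[target]
    population = [(rng.randint(low, high), rng.randint(low, high))
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda ind: distance(*ind))
        if distance(*population[0]) == 0:
            return population[0]
        # Elitist selection: the fitter half survives unchanged.
        parents = population[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            # Uniform crossover: each gene comes from either parent.
            child = [rng.choice((p1[0], p2[0])), rng.choice((p1[1], p2[1]))]
            # Mutation: nudge one gene by a small random step.
            child[rng.randrange(2)] += rng.randint(-10, 10)
            children.append(tuple(child))
        population = parents + children
    best = min(population, key=lambda ind: distance(*ind))
    return best if distance(*best) == 0 else None


def generate_suite(rng):
    """Return one input per branch, i.e. test data aiming at full branch coverage."""
    return {target: evolve_input(target, rng) for target in BRANCH_DISTANCE}


if __name__ == "__main__":
    for branch, args in generate_suite(random.Random(0)).items():
        print(branch, args)
```

The "equal" branch is the interesting case: random inputs rarely satisfy `a == b`, so the distance `abs(a - b)` gives the search a gradient to follow. A real generator would derive such distances by instrumenting the code under test rather than writing them by hand.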

Similar Papers
  • Conference Article
  • Cite Count Icon 4
  • 10.1109/mercon50084.2020.9185378
Unit Test Code Generation Tool Support for Lower Level Programming Languages
  • Jul 1, 2020
  • Rasika Bandara + 1 more

In the software development lifecycle, the sub-phase most likely to be overlooked within the testing phase is unit testing. One of the main reasons for such negligence is the cost overhead of unit testing. Often, project managers and tech leads either ignore unit testing or carry it out at a shallow level, trading off unit testing against the cost it would incur. This research suggests a model-based unit testing specification and a code generator based on model specifications. While formalisms involving huge amounts of unit test inputs, complex specifications, and complex technologies exist and can be used, one must consider the practical usability of the proposed solution in the industry. A generic spreadsheet-based tool is used to create the unit test specification; C++ unit test code is generated for Google Test. It provides comprehensive unit test specifications, complete unit test code, and informative unit test reports. The tool is applied to five different industrial software projects, each having six target functions (sum n=36 target functions). The results have been further validated by experienced expert architects. The evaluation confirmed that the proposed solution provides an efficient and rapid way to write error-free unit test cases and generate unit test code.

  • Research Article
  • Cite Count Icon 7
  • 10.15388/lmitt.2024.20
Unit Test Generation Using Large Language Models: A Systematic Literature Review
  • May 13, 2024
  • Vilnius University Open Series
  • Dovydas Marius Zapkus + 1 more

Unit testing is a fundamental aspect of software development, ensuring the correctness and robustness of code implementations. Traditionally, unit tests are manually crafted by developers based on their understanding of the code and its requirements. However, this process can be time-consuming and error-prone, and may overlook certain edge cases. In recent years, there has been growing interest in leveraging large language models (LLMs) for automating the generation of unit tests. LLMs, such as GPT (Generative Pre-trained Transformer), CodeT5, StarCoder, and LLaMA, have demonstrated remarkable capabilities in natural language understanding and code generation tasks. By using LLMs, researchers aim to develop techniques that automatically generate unit tests from code snippets or specifications, thus optimizing the software testing process. This paper presents a literature review of articles that use LLMs for unit test generation tasks. It also discusses the history of the most commonly used large language models and their parameters, including the first time they were used for code generation tasks. The result of this study presents the large language models for code and unit test generation tasks and their increasing popularity in the code generation domain, indicating great promise for the future of unit test generation using LLMs.

  • Conference Article
  • Cite Count Icon 23
  • 10.1145/3422392.3422412
An empirical study of automatically-generated tests from the perspective of test smells
  • Oct 21, 2020
  • Tássio Virgínio + 5 more

Developing test code can be as expensive as, or more expensive than, developing production code. Commonly, developers use automated unit test generators to speed up software testing. The purpose of such tools is to shorten production time without decreasing code quality. Nonetheless, unit tests usually do not have a quality check layer above the testing code, which might make it hard to guarantee the quality of the generated tests. A strategy to verify test quality is to analyze the presence of test smells in test code. Test smells are characteristics in the test code that possibly indicate weaknesses in test design and implementation. Their presence could be used as a quality indicator. In this paper, we present an empirical study to analyze the quality of unit test code generated by automated test tools. We compare the tests generated by two tools (Randoop and EvoSuite) with the existing unit test suite of twenty-one open-source Java projects. We analyze the unit test code to detect the presence of nineteen types of test smells. The results indicated significant differences in the unit test quality when comparing data from the automated unit test generators and existing unit test suites.

  • Research Article
  • Cite Count Icon 80
  • 10.1145/3660783
Evaluating and Improving ChatGPT for Unit Test Generation
  • Jul 12, 2024
  • Proceedings of the ACM on Software Engineering
  • Zhiqiang Yuan + 6 more

Unit testing plays an essential role in detecting bugs in functionally-discrete program units (e.g., methods). Manually writing high-quality unit tests is time-consuming and laborious. Although the traditional techniques are able to generate tests with reasonable coverage, they are shown to exhibit low readability and still cannot be directly adopted by developers in practice. Recent work has shown the large potential of large language models (LLMs) in unit test generation. By being pre-trained on a massive developer-written code corpus, the models are capable of generating more human-like and meaningful test code. In this work, we perform the first empirical study to evaluate the capability of ChatGPT (i.e., one of the most representative LLMs with outstanding performance in code generation and comprehension) in unit test generation. In particular, we conduct both a quantitative analysis and a user study to systematically investigate the quality of its generated tests in terms of correctness, sufficiency, readability, and usability. We find that the tests generated by ChatGPT still suffer from correctness issues, including diverse compilation errors and execution failures (mostly caused by incorrect assertions); but the passing tests generated by ChatGPT almost resemble manually-written tests by achieving comparable coverage, readability, and even sometimes developers’ preference. Our findings indicate that generating unit tests with ChatGPT could be very promising if the correctness of its generated tests could be further improved. Inspired by our findings above, we further propose ChatTester, a novel ChatGPT-based unit test generation approach, which leverages ChatGPT itself to improve the quality of its generated tests. ChatTester incorporates an initial test generator and an iterative test refiner. Our evaluation demonstrates the effectiveness of ChatTester by generating 34.3% more compilable tests and 18.7% more tests with correct assertions than the default ChatGPT. In addition to ChatGPT, we further investigate the generalization capabilities of ChatTester by applying it to two recent open-source LLMs (i.e., CodeLlama-Instruct and CodeFuse) and our results show that ChatTester can also improve the quality of tests generated by these LLMs.

  • Research Article
  • Cite Count Icon 1
  • 10.1145/3765758
Reference-Based Retrieval-Augmented Unit Test Generation
  • Dec 3, 2025
  • ACM Transactions on Software Engineering and Methodology
  • Zhe Zhang + 5 more

Automated unit test generation has been widely studied, with Large Language Models (LLMs) recently showing significant potential. LLMs like GPT-4, trained on vast text and code data, excel in various code-related tasks, including unit test generation. However, existing LLM-based approaches often focus solely on the context within the code itself, such as referenced variables, while neglecting broader task-specific contexts, such as the utility of referring to existing tests of relevant methods in unit test generation. Moreover, in the context of unit test generation, these tools prioritize high code coverage, often at the expense of practical usability, correctness, and maintainability. In response, we propose Reference-Based Retrieval Augmentation, a novel mechanism that extends LLM-based Retrieval-Augmented Generation (RAG) to retrieve relevant information by considering task-specific context. In the unit test generation task, for a given focal method, the reference relationship is defined as the reusability or referentiality of tests between the focal method and other methods. To generate high-quality unit tests for the focal method, the test reference relationships are then used to retrieve relevant methods and their existing unit tests. Specifically, we account for the unique structure of unit tests by dividing the test generation process into Given, When, and Then phases. When generating unit tests for a focal method, we retrieve pre-existing tests of other relevant methods, which can provide valuable insights for any of the Given, When, and Then phases. We implement this approach in a tool called RefTest, which sequentially performs preprocessing, test reference retrieval, and unit test generation, using an incremental strategy in which newly generated tests guide the creation of subsequent ones. We evaluated RefTest on 12 open-source projects with 1515 methods, and the results demonstrate that RefTest consistently outperforms existing tools in terms of correctness, completeness, and maintainability of the generated tests.

  • Research Article
  • Cite Count Icon 1
  • 10.29207/resti.v6i2.3940
Towards Generating Unit Test Codes Using Generative Adversarial Networks
  • Apr 29, 2022
  • Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi)
  • Muhammad Johan Alibasa + 3 more

Unit testing is one of the important software development steps to ensure the software’s quality. Despite its importance, unit testing is often neglected since writing unit tests requires a significant amount of time and effort from software developers. Existing automated test generation systems from past research still have shortcomings due to the limitations of the Genetic Algorithm (GA) in generating appropriate unit test code. This study explores the feasibility of using Generative Adversarial Network (GAN) models to generate unit test code, using the ability of GANs to cover GA’s drawbacks. We perform experiments using four state-of-the-art GAN models to generate basic unit test code and compare the results by analyzing the generated output code using novel metrics proposed in past studies, as well as performing a qualitative evaluation of the generated outputs. The results show that the generated code has satisfactory quality scores (BLEU-2 of around 99%) from the models and adequate diversity scores (NLL-Div and NLL-Gen) in most models. Our study shows positive indications and potential in the use of GANs for automatic unit test code generation and suggests recommendations for future studies in GAN-based unit test code generation systems.

  • Research Article
  • Cite Count Icon 4
  • 10.1145/3728970
STRUT: Structured Seed Case Guided Unit Test Generation for C Programs using LLMs
  • Jun 22, 2025
  • Proceedings of the ACM on Software Engineering
  • Jinwei Liu + 5 more

Unit testing plays a crucial role in bug detection and ensuring software correctness. It helps developers identify errors early in development, thereby reducing software defects. In recent years, large language models (LLMs) have demonstrated significant potential in automating unit test generation. However, using LLMs to generate unit tests faces many challenges. 1) The execution pass rate of the test cases generated by LLMs is low. 2) The test case coverage is inadequate, making it challenging to detect potential risks in the code. 3) Current research methods primarily focus on languages such as Java and Python, while studies on C programming are scarce, despite its importance in the real world. To address these challenges, we propose STRUT, a novel unit test generation method. STRUT utilizes structured test cases as a bridge between complex programming languages and LLMs. Instead of directly generating test code, STRUT guides LLMs to produce structured test cases, thereby alleviating the limitations of LLMs when generating code for programming languages with complex features. First, STRUT analyzes the context of focal methods and constructs structured seed test cases for them. These seed test cases then guide LLMs to generate a set of structured test cases. Subsequently, a rule-based approach is employed to convert the structured set of test cases into executable test code. We conducted a comprehensive evaluation of STRUT, which achieved an impressive execution pass rate of 96.01%, along with 77.67% line coverage and 63.60% branch coverage. This performance significantly surpasses that of the LLMs-based baseline methods and the symbolic execution tool SunwiseAUnit. These results highlight STRUT's superior capability in generating high-quality unit test cases by leveraging the strengths of LLMs while addressing their inherent limitations.

  • Research Article
  • Cite Count Icon 4
  • 10.1145/3763791
CITYWALK: Enhancing LLM-Based C++ Unit Test Generation via Project-Dependency Awareness and Language-Specific Knowledge
  • Aug 26, 2025
  • ACM Transactions on Software Engineering and Methodology
  • Yuwei Zhang + 8 more

Unit testing plays a pivotal role in the software development lifecycle, as it ensures code quality. However, writing high-quality unit tests remains a time-consuming task for developers in practice. More recently, the application of large language models (LLMs) in automated unit test generation has demonstrated promising results. Existing approaches primarily focus on interpreted programming languages (e.g., Java), while mature solutions tailored to compiled programming languages like C++ are yet to be explored. The intricate language features of C++, such as pointers, templates, and virtual functions, pose particular challenges for LLMs in generating both executable and high-coverage unit tests. To tackle the aforementioned problems, this paper introduces CITYWALK, a novel LLM-based framework for C++ unit test generation. CITYWALK enhances LLMs by providing a comprehensive understanding of the dependency relationships within the project under test via program analysis. Furthermore, CITYWALK incorporates language-specific knowledge about C++ derived from project documentation and empirical observations, significantly improving the correctness of the LLM-generated unit tests. We implement CITYWALK by employing the widely popular LLM GPT-4o. The experimental results show that CITYWALK outperforms current state-of-the-art approaches on a collection of ten popular C++ projects. Our findings demonstrate the effectiveness of CITYWALK in generating high-quality C++ unit tests.

  • Conference Article
  • Cite Count Icon 3
  • 10.1145/3593434.3593443
NxtUnit: Automated Unit Test Generation for Go
  • Jun 14, 2023
  • Siwei Wang + 5 more

Automated test generation has been extensively studied for dynamically compiled or typed programming languages like Java and Python. However, Go, a popular statically compiled and typed programming language for server application development, has received limited support from existing tools. To address this gap, we present NxtUnit, an automatic unit test generation tool for Go that uses random testing and is well-suited for microservice architecture. NxtUnit employs a random approach to generate unit tests quickly, making it ideal for smoke testing and providing quick quality feedback. It comes with three types of interfaces: an integrated development environment (IDE) plugin, a command-line interface (CLI), and a browser-based platform. The plugin and CLI tool allow engineers to write unit tests more efficiently, while the platform provides unit test visualization and asynchronous unit test generation. We evaluated NxtUnit by generating unit tests for 13 open-source repositories and 500 ByteDance in-house repositories, resulting in a code coverage of 20.74% for in-house repositories. We conducted a survey among ByteDance engineers and found that NxtUnit can save them 48% of the time spent writing unit tests. We have made the CLI tool available at https://github.com/bytedance/nxt_unit.

  • Dissertation
  • 10.31979/etd.kddt-d7ms
Multi-Model Unit Test Generation Framework With Reinforcement Learning
  • Jan 1, 2025
  • Tasman Kuang

Unit test generation is a critical step in the software development lifecycle to ensure code quality and reduce the likelihood of bugs. Manually writing unit tests can be time-consuming and can require an experienced developer. However, with the emergence of generative AI, large language models (LLMs) in particular have demonstrated their effectiveness in generating code, which naturally raises the question of whether this capability can be applied to automate unit test generation. One of the newer techniques in this field is using Reinforcement Learning (RL) to train a model to generate quality unit tests. RL is the practice of training an agent to take optimal actions to maximize a reward signal. Treating the LLM as an agent and fine-tuning its parameters through feedback from the reward signal offers an adaptive and flexible method for improving LLM performance instead of relying on pre-trained models. This project explores different methodologies to augment a multi-model unit test generation framework, including the use of RL to train its test generation capabilities. Using datasets derived from LeetCode and PyMethods2Test, our tool is evaluated against strong baseline LLMs like Gemini and Claude. The results show that the PPO-trained DeepSeek model consistently outperforms baseline generation, achieving higher test pass rates, fewer syntax errors, and improved coverage and mutation scores across both datasets, demonstrating that our framework presents an effective unit test generation method.

  • Conference Article
  • Cite Count Icon 2
  • 10.1109/coginf.2006.365691
Theoretical Study of the Personal Capability Improvement in Unit Test
  • Jul 1, 2006
  • Yuyu Yuan + 1 more

Unit testing is a major part of software quality assurance, and unit test quality directly affects software quality. Software enterprises are paying more and more attention to unit testing. However, despite mature techniques, tools, and processes for unit testing, its effect is not very good. Based on an overall analysis of unit test work, the paper indicates that people are the core and the link of test jobs. The key to improving the unit test level is improving the personal capability of test personnel. The paper introduces the concepts of capability and personal capability in unit testing. The three characteristics of personal capability for unit testing are test efficiency, test quality, and forecast. Personal capability is reflected and developed by specific activities and practices. Three key practices for capability improvement are the plan practice, the track practice, and the evaluate-and-improve practice. The paper introduces the three key practices in detail. An enterprise case is studied, and the result shows that the experience of the personal capability improvement process will provide a foundation for the next round of improvement activities and act as a reference for software enterprises that want to improve their unit test process.

  • Research Article
  • Cite Count Icon 1
  • 10.21609/jiki.v17i1.1198
Implementation Genetic Algorithm for Optimization of Kotlin Software Unit Test Case Generator
  • Feb 25, 2024
  • Jurnal Ilmu Komputer dan Informasi
  • Mohammad Andiez Satria Permana + 2 more

Unit testing has a significant role in software development, and its impact depends on the quality of the test cases and test data used. To reduce time and effort, unit test generator systems can help automatically generate test cases and test data. However, there is currently no unit test generator for the Kotlin programming language, even though this language is popularly used for Android application development. In this study, we propose and develop a test generator system that utilizes a genetic algorithm (GA) and the ANTLR4 parser. GA is used to obtain the most optimal test cases and data for a given Kotlin code. The ANTLR4 parser is used to optimize the mutation process in GA so that the mutation process is not totally random. Our model results showed that the average value of code coverage in generated unit tests against instruction coverage is 95.64%, with branch coverage of 76.19% and line coverage of 96.87%. In addition, only two out of eight generated classes produced duplicate test cases, with a maximum of one duplication in each class. Therefore, it can be concluded that our optimization with GA on the unit test generator is able to produce unit tests with high code coverage and low duplication.

  • Conference Article
  • Cite Count Icon 36
  • 10.1109/icse43902.2021.00138
Automatic Unit Test Generation for Machine Learning Libraries: How Far Are We?
  • May 1, 2021
  • Song Wang + 5 more

Automatic unit test generation that explores the input space and produces effective test cases for given programs has been studied for decades. Many unit test generation tools that can help generate unit test cases with high structural coverage over a program have been examined. However, the fact that existing test generation tools are mainly evaluated on general software programs calls into question their practical effectiveness and usefulness for machine learning libraries, which are statistically oriented and have a fundamentally different nature and construction from general software projects. In this paper, we set out to investigate the effectiveness of existing unit test generation techniques on machine learning libraries. To investigate this issue, we conducted an empirical study on five widely used machine learning libraries with two popular unit test case generation tools, i.e., EVOSUITE and Randoop. We find that (1) most of the machine learning libraries do not maintain a high-quality unit test suite regarding commonly applied quality metrics such as code coverage (34.1% on average) and mutation score (21.3% on average), (2) unit test case generation tools, i.e., EVOSUITE and Randoop, lead to clear improvements in code coverage and mutation score; however, the improvement is limited, and (3) there exist common patterns in the uncovered code across the five machine learning libraries that can be used to improve unit test case generation tasks.

  • Conference Article
  • 10.5753/sast.2025.14036
On the Energy Footprint of Using a Small Language Model for Unit Test Generation
  • Sep 22, 2025
  • Rafael S Durelli + 2 more

Context. Manual unit test creation is a cognitively intensive and time-consuming activity, prompting researchers and practitioners to increasingly adopt automated testing tools. Recent advancements in language models have expanded automation possibilities, including unit test generation, yet these models raise substantial sustainability concerns due to their energy consumption compared to conventional, specialized tools. Goal. Our research investigates whether the energy overhead associated with employing a small language model (SLM) for unit test generation is justified compared to a conventional, lightweight testing tool. We compare and analyze the energy consumption incurred during test suite generation, as well as the fault-finding effectiveness of the resulting test suites, for an SLM (Phi-3.1 Mini 128k) and Pynguin, a purpose-built tool for unit test generation. Method. We posed two research questions: (i) What is the difference in energy usage between Phi and Pynguin during the generation of unit test suites for Python programs?; and (ii) To what extent do unit test suites generated by Phi and Pynguin differ in their fault-finding effectiveness? To rigorously address the first research question, we employed Bayesian Data Analysis (BDA). For the second research question, we conducted a complementary empirical analysis using descriptive statistics. Results. Our Bayesian analysis provides robust evidence indicating that Phi consistently consumes significantly more energy than Pynguin during test suite generation. Conclusions. These findings underscore significant sustainability concerns associated with employing even SLMs for routine Software Engineering tasks such as unit test generation. The results challenge the assumption of universal energy efficiency benefits from smaller-scale models and emphasize the necessity for careful energy consumption evaluations in the adoption of automated software testing approaches.

  • Research Article
  • Cite Count Icon 10
  • 10.1002/stvr.1838
JUGE: An infrastructure for benchmarking Java unit test generators
  • Dec 20, 2022
  • Software Testing, Verification and Reliability
  • Xavier Devroey + 6 more

Summary: Researchers and practitioners have designed and implemented various automated test case generators to support effective software testing. Such generators exist for various languages (e.g., Java, C#, or Python) and various platforms (e.g., desktop, web, or mobile applications). The generators exhibit varying effectiveness and efficiency, depending on the testing goals they aim to satisfy (e.g., unit-testing of libraries versus system-testing of entire applications) and the underlying techniques they implement. In this context, practitioners need to be able to compare different generators to identify the most suited one for their requirements, while researchers seek to identify future research directions. This can be achieved by systematically executing large-scale evaluations of different generators. However, executing such empirical evaluations is not trivial and requires substantial effort to select appropriate benchmarks, set up the evaluation infrastructure, and collect and analyse the results. In this Software Note, we present our JUnit Generation Benchmarking Infrastructure (JUGE) supporting generators (search-based, random-based, symbolic execution, etc.) seeking to automate the production of unit tests for various purposes (validation, regression testing, fault localization, etc.). The primary goal is to reduce the overall benchmarking effort, ease the comparison of several generators, and enhance the knowledge transfer between academia and industry by standardizing the evaluation and comparison process. Since 2013, several editions of a unit testing tool competition, co-located with the Search-Based Software Testing Workshop, have taken place where JUGE was used and evolved. As a result, an increasing number of tools (over 10) from academia and industry have been evaluated on JUGE, matured over the years, and allowed the identification of future research directions. Based on the experience gained from the competitions, we discuss the expected impact of JUGE in improving the knowledge transfer on tools and approaches for test generation between academia and industry. Indeed, the JUGE infrastructure demonstrated an implementation design that is flexible enough to enable the integration of additional unit test generation tools, which is practical for developers and allows researchers to experiment with new and advanced unit testing tools and approaches.
