Implementation Genetic Algorithm for Optimization of Kotlin Software Unit Test Case Generator

Abstract

Unit testing plays a significant role in software development, and its impact depends on the quality of the test cases and test data used. To reduce time and effort, unit test generator systems can automatically generate test cases and test data. However, there is currently no unit test generator for the Kotlin programming language, even though it is widely used for Android application development. In this study, we propose and develop a test generator system that uses a genetic algorithm (GA) and the ANTLR4 parser. The GA is used to obtain the best test cases and data for a given piece of Kotlin code. The ANTLR4 parser is used to optimize the mutation process in the GA so that mutation is not totally random. Our results show that the generated unit tests achieve an average instruction coverage of 95.64%, branch coverage of 76.19%, and line coverage of 96.87%. In addition, only two out of eight generated classes produced duplicate test cases, with a maximum of one duplication in each class. Therefore, it can be concluded that our GA-based optimization of the unit test generator is able to produce unit tests with high code coverage and low duplication.
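The idea of a coverage-driven GA with non-random mutation can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how ANTLR4 guides mutation, so the sketch assumes one plausible reading, in which constants found by parsing the source are offered to the mutation operator as hints. Python is used purely for illustration (the paper targets Kotlin), and `classify`, `HINTS`, `fitness`, `mutate`, `crossover`, and `evolve` are all hypothetical names.

```python
import random

# Toy function under test (hypothetical). It reports which branch
# outcomes a given input exercises, standing in for a coverage tool.
def classify(a, b):
    covered = set()
    covered.add("a>100:T" if a > 100 else "a>100:F")
    covered.add("b==42:T" if b == 42 else "b==42:F")
    return covered

ALL_BRANCHES = {"a>100:T", "a>100:F", "b==42:T", "b==42:F"}

# Assumption: literals that a parser could extract from the source.
# In this sketch they are simply written out by hand.
HINTS = [100, 101, 42]

def fitness(suite):
    """Fraction of branch outcomes covered by a suite of (a, b) inputs."""
    covered = set()
    for a, b in suite:
        covered |= classify(a, b)
    return len(covered) / len(ALL_BRANCHES)

def mutate(suite, rng):
    """Change one argument of one test; half the time use a hint constant."""
    suite = list(suite)
    i = rng.randrange(len(suite))
    a, b = suite[i]
    new = rng.choice(HINTS) if rng.random() < 0.5 else rng.randint(-1000, 1000)
    suite[i] = (new, b) if rng.random() < 0.5 else (a, new)
    return suite

def crossover(p1, p2, rng):
    """Single-point crossover between two suites of equal length."""
    cut = rng.randrange(1, len(p1))
    return p1[:cut] + p2[cut:]

def evolve(pop_size=20, suite_size=3, generations=200, seed=0):
    rng = random.Random(seed)
    pop = [
        [(rng.randint(-1000, 1000), rng.randint(-1000, 1000))
         for _ in range(suite_size)]
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        if fitness(pop[0]) == 1.0:
            break
        parents = pop[: pop_size // 2]  # elitist truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            children.append(mutate(crossover(p1, p2, rng), rng))
        pop = parents + children
    return max(pop, key=fitness)

best_suite = evolve()
```

Purely random mutation would rarely hit `b == 42` in a range of about two thousand integers, whereas the hinted mutation proposes it directly, which is the intuition behind making mutation "not totally random".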

Similar Papers
  • Research Article
  • 10.3390/e28010074
IGTG&R: An Intent Analysis-Guided Unit Test Generation and Refinement Framework
  • Jan 9, 2026
  • Entropy
  • Xiaojian Liu + 1 more

Code coverage-guided unit test generation (CGTG) and large language model-based test generation (LLMTG) are two principal approaches for the generation of unit tests. Each of these approaches has its inherent advantages and drawbacks. Tests generated by CGTG have been shown to exhibit high code coverage and high executability. However, they lack the capacity to comprehend code intent, which results in an inability to identify deviations between code implementation and design intent (i.e., functional defects). Conversely, although LLMTG demonstrates an advantage in terms of code intent analysis, it is generally characterized by low executability and necessitates iterative debugging. In order to enhance the ability of unit test generation to identify functional defects, a novel framework has been proposed, entitled the intent analysis-guided unit test generation and refinement (IGTG&R) model. The IGTG&R model consists of a two-stage process for test generation. In the first stage, we introduce coverage path entropy to enhance CGTG to achieve high executability and code coverage of test cases. The second stage refines the test cases using LLMs to identify functional defects. We quantify and verify the interference of incorrect code implementation on intent analysis through conditional entropy. In order to reduce this interference, the focal method body is excluded from the code context information during intent analysis. Using this two-stage process, IGTG&R achieves a more profound comprehension of the intent of the code and the identification of functional defects. The IGTG&R model has been demonstrated to achieve an identification rate of functional defects ranging from 65% to 89%, with an execution success rate of 100% and a code coverage rate of 75.8%. This indicates that IGTG&R is superior to the CGTG and LLMTG approaches in multiple aspects.

  • Conference Article
  • Cite Count: 36
  • 10.1109/icse43902.2021.00138
Automatic Unit Test Generation for Machine Learning Libraries: How Far Are We?
  • May 1, 2021
  • Song Wang + 5 more

Automatic unit test generation that explores the input space and produces effective test cases for given programs has been studied for decades. Many unit test generation tools that can help generate unit test cases with high structural coverage over a program have been examined. However, the fact that existing test generation tools are mainly evaluated on general software programs calls into question their practical effectiveness and usefulness for machine learning libraries, which are statistically oriented and have a fundamentally different nature and construction from general software projects. In this paper, we set out to investigate the effectiveness of existing unit test generation techniques on machine learning libraries. To investigate this issue, we conducted an empirical study on five widely used machine learning libraries with two popular unit test case generation tools, i.e., EVOSUITE and Randoop. We find that (1) most of the machine learning libraries do not maintain a high-quality unit test suite regarding commonly applied quality metrics such as code coverage (34.1% on average) and mutation score (21.3% on average), (2) unit test case generation tools, i.e., EVOSUITE and Randoop, lead to clear improvements in code coverage and mutation score, but the improvement is limited, and (3) there exist common patterns in the uncovered code across the five machine learning libraries that can be used to improve unit test case generation tasks.

  • Research Article
  • Cite Count: 5
  • 10.1145/2507288.2507309
Critical components testing using hybrid genetic algorithm
  • Aug 26, 2013
  • ACM SIGSOFT Software Engineering Notes
  • D Jeya Mala + 2 more

As quality of software plays a vital role in real time systems, it is essential to identify the crucial parts in the system and to test them effectively. In the proposed approach, the critical components are identified by means of mutation based impact analysis. The next task is to test the critical components using the Hybrid Genetic Algorithm (HGA) based test case generation and optimization approach. The mutants are automatically generated by seeding faults into each method of all the components in the Software Under Test (SUT). The initial set of test cases is generated using randomized test data. The generated test cases are executed over the original and the mutant to identify whether the test case detects the error or not. Based on the results, the Mutation Score (MS) is calculated, which always lies between 0 and 1. The best test cases are chosen based on having higher mutation scores and are executed on mutants to analyze how each component affects the other components in the SUT. Based on the analysis, the critical components are identified and they need rigorous testing using the test cases generated by the HGA. The algorithm uses the RemoveTop and LocalBest improvement heuristics to achieve near optimal solutions. In unit testing, the test cases are executed against the original and the mutant. The test case optimization is done by evaluating the effectiveness of test suites using the Mutation Score and the Branch Coverage Value (BCV). In pair-wise testing, the effective test cases are selected based on the higher mutation scores and branch coverage values. The components are executed against these test cases and the execution traces are recorded. The traced results are compared against the expected outputs which were previously stored in the repository and the statuses are updated. Based on the statuses, the faulty methods are revealed. 
The efficiency of the proposed approach is compared with Genetic Algorithm (GA) and we concluded that the final test suite size and the total execution time are reduced in the proposed approach. Finally various graphs and PDF reports are generated for visualization purposes.
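The Mutation Score described above can be illustrated with a small sketch, assuming the common definition of killed mutants divided by total mutants, which matches the stated range of 0 to 1. The function under test, the three seeded faults, and the helper names are hypothetical and are not taken from the paper.

```python
# Original function under test (hypothetical example).
def original(x):
    return abs(x)

# Hypothetical mutants, each seeding a single fault into the original.
MUTANTS = [
    lambda x: x,           # drops the absolute value
    lambda x: -x,          # negates instead of taking the absolute value
    lambda x: abs(x) + 1,  # off-by-one fault
]

def kills(test_input, mutant):
    """A test kills a mutant if the mutant's output differs from the original's."""
    return original(test_input) != mutant(test_input)

def mutation_score(test_inputs, mutants=MUTANTS):
    """Killed mutants divided by total mutants, a value between 0 and 1."""
    killed = sum(
        1 for mutant in mutants
        if any(kills(t, mutant) for t in test_inputs)
    )
    return killed / len(mutants)

weak_score = mutation_score([0])        # only the off-by-one mutant is detected
strong_score = mutation_score([5, -3])  # positive and negative inputs together
```

A suite containing only the input `0` detects just one of the three faults, while combining a positive and a negative input detects all of them, which is why higher-scoring test cases are preferred during selection.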

  • Research Article
  • Cite Count: 1
  • 10.1145/3765758
Reference-Based Retrieval-Augmented Unit Test Generation
  • Dec 3, 2025
  • ACM Transactions on Software Engineering and Methodology
  • Zhe Zhang + 5 more

Automated unit test generation has been widely studied, with Large Language Models (LLMs) recently showing significant potential. LLMs like GPT-4, trained on vast text and code data, excel in various code-related tasks, including unit test generation. However, existing LLM-based approaches often focus solely on the context within the code itself, such as referenced variables, while neglecting broader task-specific contexts, such as the utility of referring to existing tests of relevant methods in unit test generation. Moreover, in the context of unit test generation, these tools prioritize high code coverage, often at the expense of practical usability, correctness, and maintainability. In response, we propose Reference-Based Retrieval Augmentation, a novel mechanism that extends LLM-based Retrieval-Augmented Generation (RAG) to retrieve relevant information by considering task-specific context. In the unit test generation task, for a given focal method, the reference relationship is defined as the reusability or referentiality of tests between the focal method and other methods. To generate high-quality unit tests for the focal method, the test reference relationships are then used to retrieve relevant methods and their existing unit tests. Specifically, we account for the unique structure of unit tests by dividing the test generation process into Given, When, and Then phases. When generating unit tests for a focal method, we retrieve pre-existing tests of other relevant methods, which can provide valuable insights for any of the Given, When, and Then phases. We implement this approach in a tool called RefTest, which sequentially performs preprocessing, test reference retrieval, and unit test generation, using an incremental strategy in which newly generated tests guide the creation of subsequent ones.
We evaluated RefTest on 12 open-source projects with 1515 methods, and the results demonstrate that RefTest consistently outperforms existing tools in terms of correctness, completeness, and maintainability of the generated tests.

  • Conference Article
  • Cite Count: 8
  • 10.1109/icodse.2015.7437005
Unit test code generator for lua programming language
  • Nov 1, 2015
  • Junno Tantra Pratama Wibowo + 2 more

Software testing is an important step in the software development lifecycle. One of the main processes that takes a lot of time is developing the test code. We propose automatic unit test code generation to speed up the process and help avoid repetition. We develop the unit test code generator using the Lua programming language. Lua is a fast, lightweight, embeddable scripting language. It has been used in many industrial applications with a focus on embedded systems and games. Unlike other popular scripting languages such as JavaScript, Python, and Ruby, Lua does not have any unit test generator developed to help its software testing process. The final product, the Lua unit test generator (LUTG), is integrated into one of the most popular Lua IDEs, ZeroBrane Studio, as a plugin to seamlessly connect the coding and testing process. The code generator can generate unit test code, save test case data in Lua and XML file formats, and generate the test data automatically using a search-based technique, a genetic algorithm, to achieve the full branch coverage test criterion. Using this generator to test several Lua source code files shows that the developed unit test generator can help the unit testing process. It is expected that the unit test generator can improve the productivity, quality, consistency, and abstraction of the unit testing process.

  • Research Article
  • Cite Count: 5
  • 10.1007/s10664-024-10451-x
Toward granular search-based automatic unit test case generation
  • May 17, 2024
  • Empirical Software Engineering
  • Fabiano Pecorelli + 4 more

Unit testing verifies the presence of faults in individual software components. Previous research has been targeting the automatic generation of unit tests through the adoption of random or search-based algorithms. Despite their effectiveness, these approaches aim at creating tests by solely optimizing metrics like code coverage, without ensuring that the resulting tests have granularities that would allow them to verify both the behavior of individual production methods and the interaction between methods of the class under test. To address this limitation, we propose a two-step systematic approach to the generation of unit tests: we first force search-based algorithms to create tests that cover individual methods of the production code, hence implementing the so-called intra-method tests; then, we relax the constraints to enable the creation of intra-class tests that target the interactions among production code methods. The assessment of our approach is conducted through a mixed-method research design that combines statistical analyses with a user study. The key results report that our approach is able to keep the same level of code and mutation coverage while providing test suites that are more structured, more understandable and aligned to the design principles of unit testing.

  • Research Article
  • Cite Count: 4
  • 10.1145/3728970
STRUT: Structured Seed Case Guided Unit Test Generation for C Programs using LLMs
  • Jun 22, 2025
  • Proceedings of the ACM on Software Engineering
  • Jinwei Liu + 5 more

Unit testing plays a crucial role in bug detection and ensuring software correctness. It helps developers identify errors early in development, thereby reducing software defects. In recent years, large language models (LLMs) have demonstrated significant potential in automating unit test generation. However, using LLMs to generate unit tests faces many challenges. 1) The execution pass rate of the test cases generated by LLMs is low. 2) The test case coverage is inadequate, making it challenging to detect potential risks in the code. 3) Current research methods primarily focus on languages such as Java and Python, while studies on C programming are scarce, despite its importance in the real world. To address these challenges, we propose STRUT, a novel unit test generation method. STRUT utilizes structured test cases as a bridge between complex programming languages and LLMs. Instead of directly generating test code, STRUT guides LLMs to produce structured test cases, thereby alleviating the limitations of LLMs when generating code for programming languages with complex features. First, STRUT analyzes the context of focal methods and constructs structured seed test cases for them. These seed test cases then guide LLMs to generate a set of structured test cases. Subsequently, a rule-based approach is employed to convert the structured set of test cases into executable test code. We conducted a comprehensive evaluation of STRUT, which achieved an impressive execution pass rate of 96.01%, along with 77.67% line coverage and 63.60% branch coverage. This performance significantly surpasses that of the LLMs-based baseline methods and the symbolic execution tool SunwiseAUnit. These results highlight STRUT's superior capability in generating high-quality unit test cases by leveraging the strengths of LLMs while addressing their inherent limitations.

  • Conference Article
  • Cite Count: 227
  • 10.1109/ase.2015.86
Do Automatically Generated Unit Tests Find Real Faults? An Empirical Study of Effectiveness and Challenges (T)
  • Nov 1, 2015
  • Sina Shamshiri + 5 more

Rather than tediously writing unit tests manually, tools can be used to generate them automatically - sometimes even resulting in higher code coverage than manual testing. But how good are these tests at actually finding faults? To answer this question, we applied three state-of-the-art unit test generation tools for Java (Randoop, EvoSuite, and Agitar) to the 357 real faults in the Defects4J dataset and investigated how well the generated test suites perform at detecting these faults. Although the automatically generated test suites detected 55.7% of the faults overall, only 19.9% of all the individual test suites detected a fault. By studying the effectiveness and problems of the individual tools and the tests they generate, we derive insights to support the development of automated unit test generators that achieve a higher fault detection rate. These insights include 1) improving the obtained code coverage so that faulty statements are executed in the first instance, 2) improving the propagation of faulty program states to an observable output, coupled with the generation of more sensitive assertions, and 3) improving the simulation of the execution environment to detect faults that are dependent on external factors such as date and time.

  • Research Article
  • Cite Count: 80
  • 10.1145/3660783
Evaluating and Improving ChatGPT for Unit Test Generation
  • Jul 12, 2024
  • Proceedings of the ACM on Software Engineering
  • Zhiqiang Yuan + 6 more

Unit testing plays an essential role in detecting bugs in functionally-discrete program units (e.g., methods). Manually writing high-quality unit tests is time-consuming and laborious. Although the traditional techniques are able to generate tests with reasonable coverage, they are shown to exhibit low readability and still cannot be directly adopted by developers in practice. Recent work has shown the large potential of large language models (LLMs) in unit test generation. By being pre-trained on a massive developer-written code corpus, the models are capable of generating more human-like and meaningful test code. In this work, we perform the first empirical study to evaluate the capability of ChatGPT (i.e., one of the most representative LLMs with outstanding performance in code generation and comprehension) in unit test generation. In particular, we conduct both a quantitative analysis and a user study to systematically investigate the quality of its generated tests in terms of correctness, sufficiency, readability, and usability. We find that the tests generated by ChatGPT still suffer from correctness issues, including diverse compilation errors and execution failures (mostly caused by incorrect assertions); but the passing tests generated by ChatGPT almost resemble manually-written tests by achieving comparable coverage, readability, and even sometimes developers’ preference. Our findings indicate that generating unit tests with ChatGPT could be very promising if the correctness of its generated tests could be further improved. Inspired by our findings above, we further propose ChatTester, a novel ChatGPT-based unit test generation approach, which leverages ChatGPT itself to improve the quality of its generated tests. ChatTester incorporates an initial test generator and an iterative test refiner.
Our evaluation demonstrates the effectiveness of ChatTester by generating 34.3% more compilable tests and 18.7% more tests with correct assertions than the default ChatGPT. In addition to ChatGPT, we further investigate the generalization capabilities of ChatTester by applying it to two recent open-source LLMs (i.e., CodeLlama-Instruct and CodeFuse) and our results show that ChatTester can also improve the quality of tests generated by these LLMs.

  • Conference Article
  • Cite Count: 49
  • 10.1145/3624032.3624035
An initial investigation of ChatGPT unit test generation capability
  • Sep 25, 2023
  • Vitor Guilherme + 1 more

Context: Software testing ensures software quality, but developers often disregard it. The use of automated testing generation is pursued to reduce the consequences of overlooked test cases in a software project. Problem: In the context of Java programs, several tools can completely automate generating unit test sets. Additionally, studies are conducted to offer evidence regarding the quality of the generated test sets. However, it is worth noting that these tools rely on machine learning and other AI algorithms rather than incorporating the latest advancements in Large Language Models (LLMs). Solution: This work aims to evaluate the quality of Java unit tests generated by an OpenAI LLM algorithm, using metrics like code coverage and mutation test score. Method: For this study, 33 programs used by other researchers in the field of automated test generation were selected. This approach was employed to establish a baseline for comparison purposes. For each program, 33 unit test sets were generated automatically, without human interference, by changing OpenAI API parameters. After executing each test set, metrics such as code line coverage, mutation score, and success rate of test execution were collected to evaluate the efficiency and effectiveness of each set. Summary of Results: Our findings revealed that the OpenAI LLM test set demonstrated similar performance across all evaluated aspects compared to traditional automated Java test generation tools used in the previous research. These results are particularly remarkable considering the simplicity of the experiment and the fact that the generated test code did not undergo human analysis.

  • Conference Article
  • Cite Count: 4
  • 10.1109/icicos56336.2022.9930600
Understandable Automatic Generated Unit Tests using Semantic and Format Improvement
  • Sep 28, 2022
  • Novi Setiani + 2 more

Unit testing is an important yet highly laborious testing activity because the developer must create and execute unit tests for each class that is created. Unit tests can be created manually by the developer or automatically by using code-based test case generation techniques, such as random, search-based, or symbolic execution techniques. Automatically generated test cases reduce the developer effort of writing unit tests for every method and class. However, automated unit test cases are more difficult to understand than manual unit tests. After the unit tests are executed, the results must be validated or maintained by the development team, who must understand the content of the unit tests. This applies not only to the development phase but also to the software maintenance phase. Therefore, understandability is an important aspect that test cases need to have. To explore what kind of test cases are easy for developers to understand, a requirement-gathering activity related to understandability improvement in unit test cases is conducted. This research involved developers and experts in software development by exploring their opinions on the semantics, format, and function of automatic unit test cases. Based on the experts' opinions and the developers' interviews, the requirement list is mapped to the unit test cases generated by EvoSuite and Randoop.

  • Conference Article
  • Cite Count: 3
  • 10.1145/3593434.3593443
NxtUnit: Automated Unit Test Generation for Go
  • Jun 14, 2023
  • Siwei Wang + 5 more

Automated test generation has been extensively studied for dynamically compiled or typed programming languages like Java and Python. However, Go, a popular statically compiled and typed programming language for server application development, has received limited support from existing tools. To address this gap, we present NxtUnit, an automatic unit test generation tool for Go that uses random testing and is well-suited for microservice architecture. NxtUnit employs a random approach to generate unit tests quickly, making it ideal for smoke testing and providing quick quality feedback. It comes with three types of interfaces: an integrated development environment (IDE) plugin, a command-line interface (CLI), and a browser-based platform. The plugin and CLI tool allow engineers to write unit tests more efficiently, while the platform provides unit test visualization and asynchronous unit test generation. We evaluated NxtUnit by generating unit tests for 13 open-source repositories and 500 ByteDance in-house repositories, resulting in a code coverage of 20.74% for in-house repositories. We conducted a survey among Bytedance engineers and found that NxtUnit can save them 48% of the time on writing unit tests. We have made the CLI tool available at https://github.com/bytedance/nxt_unit.

  • Research Article
  • Cite Count: 27
  • 10.1145/1764810.1764824
Quality improvement and optimization of test cases
  • May 11, 2010
  • ACM SIGSOFT Software Engineering Notes
  • D Jeya Mala + 1 more

Software development organizations spend a considerable portion of their budget and time on testing-related activities. The effectiveness of the verification and validation process depends upon the number of errors found and rectified before releasing the software to the customer. This in turn depends upon the quality of the test cases generated. The solution is to choose the most important and effective test cases and remove the redundant and unnecessary ones, which in turn leads to test case optimization. To achieve test case optimization, this paper proposes a heuristics-guided, population-based search approach, namely the Hybrid Genetic Algorithm (HGA), which combines the features of Genetic Algorithm (GA) and Local Search (LS) techniques to reduce the number of test cases by improving the quality of test cases during the solution generation process. Also, to evaluate the performance of the proposed approach, a comparative study is conducted with the Genetic Algorithm and the Bacteriologic Algorithm (BA), and it is concluded that the proposed HGA-based approach produces better results.

  • Research Article
  • Cite Count: 3
  • 10.1016/j.fcij.2018.02.004
An optimization approach for automated unit test generation tools using multi-objective evolutionary algorithms
  • May 30, 2018
  • Future Computing and Informatics Journal
  • Samar Ali Abdallah + 2 more


  • Book Chapter
  • Cite Count: 15
  • 10.1007/978-981-13-0761-4_36
Test Case Optimization and Prioritization Based on Multi-objective Genetic Algorithm
  • Aug 24, 2018
  • Deepti Bala Mishra + 3 more

The validation of modified software depends on the success of regression testing. For this, test cases are selected in such a way that they can detect a maximum number of faults at the earliest stage of software development. The selection process in which the most beneficial test cases are executed first is known as test case prioritization, which improves the performance of test case execution by running the test cases in a specific or appropriate order. Many optimization techniques, such as greedy algorithms, genetic algorithms, and metaheuristic search techniques, have been used by many researchers for test case prioritization and optimization. This research paper presents a test case prioritization and optimization method using a genetic algorithm that takes into account different factors of test cases, such as statement coverage data, requirements factors, risk exposure, and execution time.
