
Related Topics

  • Automatic Test Data Generation
  • Automated Software Testing
  • Automatic Test

Articles published on Automatic test generation

253 Search results
  • Research Article
  • 10.17759/mda.2025150406
Optimizing the Creation of Web Application Autotests Using LLMs and Structural HTML Analysis
  • Dec 28, 2025
  • Моделирование и анализ данных
  • A.M Titeev

Context and relevance. Modern web application development requires continuous testing, but maintaining automated tests is becoming increasingly labor-intensive due to locator instability and growing interface complexity. The emergence of Large Language Models (LLMs) opens new opportunities for automating test creation, but their practical application faces challenges in processing large HTML documents and in producing maintainable code. Objective. To develop and evaluate the effectiveness of a method for automatically generating maintainable web application tests using LLMs, based on HTML structure analysis and the Page Object Model (POM) pattern. Hypotheses. Primary hypothesis: combining LLMs with a two-stage generation approach and the POM pattern will enable the creation of maintainable tests, reducing development time by at least one-third (to 67% of baseline or less) while preserving code readability. Secondary hypothesis: the success rate of automatic generation will be inversely proportional to the complexity of interface components. Methods and materials. The study employed an approach based on Playwright, an LLM, and a two-stage generation procedure with intermediate validation. Testing was conducted on four components of an SPA application for virtual infrastructure management. Results were validated by a team of three testers, who assessed the correctness and readability of the generated tests. Results. The proposed method achieved high success rates in automatic test generation and substantially reduced the time cost of test creation. The two-stage procedure with intermediate validation localized a significant share of errors at the early Page Object creation stage. Automatically generated tests covered most of the required functionality while maintaining code readability. An inverse relationship between generation success and interface component complexity was confirmed: standardized interfaces demonstrated significantly higher success rates. Conclusions. The proposed method provides substantial time savings in creating a baseline test suite while maintaining quality and maintainability. The approach is recommended for early stages of feature development, with expert control retained for validating critical scenarios. The method is particularly effective for projects with frequent interface changes, large volumes of regression testing, and components with standardized interfaces.
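The two-stage POM pipeline above separates locators (stage 1: the Page Object) from test logic (stage 2: the tests). A minimal sketch of the pattern, with a hypothetical LoginPage and a stand-in page class so it runs without a browser (the selectors and class names are invented; Playwright itself is not required):

```python
class FakePage:
    """Stand-in for a Playwright Page, so the sketch runs without a browser."""
    def __init__(self):
        self.actions = []

    def fill(self, selector, value):
        self.actions.append(("fill", selector, value))

    def click(self, selector):
        self.actions.append(("click", selector))


class LoginPage:
    """Stage-1 artifact: locators and actions live here, not in the tests."""
    USER_INPUT = "[data-testid='username']"
    PASS_INPUT = "[data-testid='password']"
    SUBMIT_BTN = "[data-testid='login-submit']"

    def __init__(self, page):
        self.page = page

    def login(self, user, password):
        self.page.fill(self.USER_INPUT, user)
        self.page.fill(self.PASS_INPUT, password)
        self.page.click(self.SUBMIT_BTN)


# Stage-2 artifact: the generated test only talks to the Page Object,
# so a locator change is a one-line fix in LoginPage.
page = FakePage()
LoginPage(page).login("alice", "secret")
```

Because the test never mentions a selector, locator instability (the maintenance problem the paper targets) is confined to the Page Object.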

  • Research Article
  • 10.36548/jucct.2025.4.003
LLM Driven Unit Test Case Generation Using Agentic AI
  • Dec 1, 2025
  • Journal of Ubiquitous Computing and Communication Technologies
  • Baskaran S + 2 more

Unit testing plays a crucial role in application software development by validating module functionality in isolation before system integration. Manually writing and reviewing unit test cases is time-consuming and defect-prone. Complex logic and boundary conditions are not tested thoroughly, leading to higher rework costs. Automated test generation using Large Language Models (LLMs) reduces development effort but faces challenges such as ensuring meaningful test coverage, handling invalid inputs, and addressing missing imports. This study aims to leverage LLMs in combination with the Autogen Agentic AI framework to generate high-quality Python unit tests by effectively prompting them, fixing failed test cases, validating them through test execution, analyzing results, and improving code coverage and mutation score. For experiments conducted on the Insurance Management Application, branch coverage improved from 98% to 99%, and the mutation score improved from 83.9% to 95.8%. The proposed approach significantly reduces manual effort while improving test suite effectiveness and software quality.
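The generate → execute → repair loop described above can be sketched without any LLM or Autogen dependency; here the generator and fixer agents are stubbed with canned outputs, so only the control flow is illustrative:

```python
def llm_generate_test(source):
    """Stub for the generator agent: returns a deliberately wrong first draft."""
    return "assert add(2, 3) == 6"


def llm_fix_test(test_code, error):
    """Stub for the fixer agent: returns a corrected test."""
    return "assert add(2, 3) == 5"


def add(a, b):
    return a + b


def run_with_repair(source, max_rounds=3):
    """Generate a test, execute it, and hand failures back to the fixer."""
    test_code = llm_generate_test(source)
    for _ in range(max_rounds):
        try:
            exec(test_code, {"add": add})   # execute the candidate test
            return test_code, True          # test passes: accept it
        except AssertionError as e:
            test_code = llm_fix_test(test_code, e)  # repair and retry
    return test_code, False                 # gave up after max_rounds


final_test, ok = run_with_repair("def add(a, b): return a + b")
```

In the paper's setup the same loop is driven by real agents, and acceptance is additionally gated on coverage and mutation score rather than a single passing run.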

  • Research Article
  • 10.53360/2788-7995-2025-3(19)-13
SCALING PHYSICS TEST ITEMS FOR COMPUTERIZED ADAPTIVE TESTING BASED ON THE RASCH MODEL
  • Nov 3, 2025
  • Bulletin of Shakarim University. Technical Sciences
  • A Iskakova + 3 more

Adaptive testing is one of the most effective approaches to digital knowledge assessment, providing personalization through the automated selection of test items tailored to the examinee’s proficiency level. The key components of such testing include: a bank of scaled test items, an adaptation algorithm, and specialized software. Developing a high-quality item bank requires preliminary psychometric analysis to evaluate their suitability for use in adaptive systems. This article presents an empirical analysis of a set of physics test items using the Rasch model. The study involved piloting the items on a representative sample of students, followed by scaling using the Winsteps software. For each item, difficulty parameters, model-fit indices, and correlation characteristics were determined. Items that did not meet the requirements of adaptive testing were identified and excluded from the final bank. As a result, a set of items with stable statistical properties was formed, suitable for further use in computerized adaptive knowledge assessment systems. The findings confirm the feasibility of integrating the developed item bank into educational information systems and digital platforms. Future publications will present real-time adaptive testing algorithms and the development of software for automated test generation based on scaled parameters. This study lays the groundwork for creating effective digital tools for assessing learning outcomes.
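The Rasch model behind this scaling expresses the probability of a correct response as a logistic function of the gap between examinee ability θ and item difficulty b. A minimal sketch (the parameter values are illustrative):

```python
import math

def rasch_p(theta, b):
    """Rasch model: P(correct) = 1 / (1 + exp(-(theta - b)))."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# An examinee whose ability equals the item's difficulty succeeds half the time.
p_matched = rasch_p(0.0, 0.0)     # = 0.5

# Ability above difficulty raises the success probability; below lowers it.
p_easy = rasch_p(2.0, 0.0)
p_hard = rasch_p(-2.0, 0.0)
```

An adaptive algorithm uses exactly this curve: it picks the next item whose difficulty b is closest to the current ability estimate θ, where the response is most informative.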

  • Research Article
  • 10.1109/tc.2025.3587515
Automatic Generation of System-Level Test for Un-Core Logic of Large Automotive SoC
  • Sep 1, 2025
  • IEEE Transactions on Computers
  • Francesco Angione + 4 more


  • Research Article
  • 10.3390/wevj16080417
AI-Driven Automated Test Generation Framework for VCU: A Multidimensional Coupling Approach Integrating Requirements, Variables and Logic
  • Jul 24, 2025
  • World Electric Vehicle Journal
  • Guangyao Wu + 2 more

This paper proposes an AI-driven automated test generation framework for vehicle control units (VCUs), integrating natural language processing (NLP) and dynamic variable binding. To address the critical limitation of traditional AI-generated test cases lacking executable variables, the framework establishes a closed-loop transformation from requirements to executable code through a five-layer architecture: (1) structured parsing of PDF requirements using domain-adaptive prompt engineering; (2) construction of a multidimensional variable knowledge graph; (3) semantic atomic decomposition of requirements and logic expression generation; (4) dynamic visualization of cause–effect graphs; (5) path-sensitization-driven optimization of test sequences. Validated on VCU software from a leading OEM, the method achieves 97.3% variable matching accuracy and 100% test case executability, reducing invalid cases by 63% compared to conventional NLP approaches. This framework provides an explainable and traceable automated solution for intelligent vehicle software validation, significantly enhancing efficiency and reliability in automotive testing.

  • Research Article
  • 10.3390/electronics14142835
Automated Test Generation and Marking Using LLMs
  • Jul 15, 2025
  • Electronics
  • Ioannis Papachristou + 2 more

This paper presents an innovative exam-creation and grading system powered by advanced natural language processing and local large language models. The system automatically generates clear, grammatically accurate questions from both short passages and longer documents across different languages, supports multiple formats and difficulty levels, and ensures semantic diversity while minimizing redundancy, thus maximizing the percentage of the material that is covered in the generated exam paper. For grading, it employs a semantic-similarity model to evaluate essays and open-ended responses, awards partial credit, and mitigates bias from phrasing or syntax via named entity recognition. A major advantage of the proposed approach is its ability to run entirely on standard personal computers, without specialized artificial intelligence hardware, promoting privacy and exam security while maintaining low operational and maintenance costs. Moreover, its modular architecture allows the seamless swapping of models with minimal intervention, ensuring adaptability and the easy integration of future improvements. A requirements–compliance evaluation, combined with established performance metrics, was used to review and compare two popular multilingual LLMs and monolingual alternatives, demonstrating the system’s effectiveness and flexibility. The experimental results show that the system achieves a grading accuracy within a 17% normalized error margin compared to that of human experts, with generated questions reaching up to 89.5% semantic similarity to source content. The full exam generation and grading pipeline runs efficiently on consumer-grade hardware, with average inference times under 30 s.
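The semantic-similarity grading step can be illustrated with a dependency-free stand-in: the system described uses embedding models, but a bag-of-words cosine shows the shape of the computation (the reference and answer strings are invented):

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity of two texts over bag-of-words vectors."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    na = math.sqrt(sum(v * v for v in va.values()))
    nb = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

reference = "photosynthesis converts light energy into chemical energy"
answer = "plants use photosynthesis to turn light into chemical energy"
score = cosine(reference, answer)   # partial credit: between 0 and 1
```

A grader then maps the score to partial credit; the paper's system additionally applies named entity recognition so that paraphrased but correct answers are not penalized for phrasing.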

  • Research Article
  • 10.1145/3748505
Advancing Code Coverage: Incorporating Program Analysis with Large Language Models
  • Jul 14, 2025
  • ACM Transactions on Software Engineering and Methodology
  • Chen Yang + 4 more

Automatic test generation plays a critical role in software quality assurance. While the recent advances in Search-Based Software Testing (SBST) and Large Language Models (LLMs) have shown promise in generating useful tests, these techniques still struggle to cover certain branches. Reaching these hard-to-cover branches usually requires constructing complex objects and resolving intricate inter-procedural dependencies in branch conditions, which poses significant challenges for existing techniques. In this work, we propose TELPA, a novel technique aimed at addressing these challenges. Its key insight lies in extracting real usage scenarios of the target method under test to learn how to construct complex objects and extracting methods entailing inter-procedural dependencies with hard-to-cover branches to learn the semantics of branch constraints. To enhance efficiency and effectiveness, TELPA identifies a set of ineffective tests as counter-examples for LLMs and employs a feedback-based process to iteratively refine these counter-examples. Then, TELPA integrates program analysis results and counter-examples into the prompt, guiding LLMs to gain deeper understandings of the semantics of the target method and generate diverse tests that can reach the hard-to-cover branches. Our experimental results on 27 open-source Python projects demonstrate that TELPA significantly outperforms the state-of-the-art SBST and LLM-enhanced techniques, achieving an average improvement of 34.10% and 25.93% in terms of branch coverage.

  • Research Article
  • 10.1145/3747189
Learning by Viewing: Generating Test Inputs for Games by Integrating Human Gameplay Traces in Neuroevolution
  • Jul 4, 2025
  • ACM Transactions on Evolutionary Learning and Optimization
  • Patric Feldmeier + 1 more

Although automated test generation is common in many programming domains, games still challenge test generators due to their heavy randomisation and hard-to-reach program states. Neuroevolution combined with search-based software testing principles has been shown to be a promising approach for testing games, but the co-evolutionary search for optimal network topologies and weights involves unreasonably long search durations. Humans, on the other hand, tend to be quick in picking up basic gameplay. In this paper, we therefore aim to improve the evolutionary search for game input generators by integrating knowledge about human gameplay behaviour. To this end, we propose a novel way of systematically recording human gameplay traces, and integrating these traces into the evolutionary search for networks using traditional gradient descent as a mutation operator. Experiments conducted on ten diverse Scratch games demonstrate that the proposed approach reduces the average search time from five hours down to only 97 minutes and helps the test generator achieve higher program coverage by reaching the winning states of games more often.

  • Research Article
  • 10.62724/202520305
AUTOMATED TESTING USING ARTIFICIAL INTELLIGENCE
  • Jun 30, 2025
  • Батыс Қазақстан инновациялық-технологиялық университетінің Хабаршысы
  • Альбина Кайранбаева

This paper explores the application of artificial intelligence (AI) technologies in the automation of software testing processes. With the increasing complexity of software systems and the shortening of product delivery timelines, traditional testing approaches are becoming less effective. The use of AI opens up new opportunities to optimize testing processes, improve flexibility, and reduce costs. The article analyzes key technologies such as machine learning, neural networks, natural language processing, and intelligent decision-making systems. It also provides an overview of modern AI-based testing tools, including Testim.io, Applitools, Functionize, and Mabl. The main advantages of integrating AI into QA include automatic test generation, adaptability to changes, detection of complex defects, and expanded coverage. At the same time, certain challenges are discussed, such as the "black box" problem, the demand for high-quality data, implementation costs, and the difficulty of supporting unstable interfaces. The conclusion outlines promising directions such as the use of generative models, Explainable AI, and the development of autonomous testing agents. The analysis shows that the integration of AI into testing can make the process more intelligent, efficient, and adaptive, with great potential for the future.

  • Research Article
  • 10.20295/2413-2527-2025-242-93-102
Automatic Testing System for the Technological Software of Computer-Based Interlocking Systems
  • Jun 26, 2025
  • Intellectual Technologies on Transport
  • Oleg Nasedkin + 2 more

Automated testing of technological software for computer-based interlocking (CBI) systems is critically important for ensuring the safety of railway traffic. Introduction: as CBI software components become more complex, manual testing methods are no longer adequate. Purpose: to develop an automated testing system for CBI software based on a scripting approach that ensures overall verification of functional requirements and the correctness of the algorithms. Methods: a hybrid approach combining the Lua scripting language for describing test scenarios, a virtual environment for simulating the operation of outdoor equipment, automatic test generation, and integration with an expert protocol analysis system. Results: a modular testing system has been designed that includes a library of test scripts, an interpreter with a specialized API for interacting with the CBI software, and automatic validation mechanisms. Practical significance: the approach demonstrated its effectiveness on real CBI configurations. Development directions have been outlined as follows: integration with CI/CD and expansion of coverage with fault-tolerance tests. Discussion: the research revealed the advantages of the scripting approach, including the independence of tests from a specific station and the possibility of reusing scripts.

  • Research Article
  • 10.26906/sunz.2025.2.102
AUTOMATED TEST GENERATION TECHNIQUES FOR C++ SOFTWARE
  • Jun 19, 2025
  • Системи управління, навігації та зв’язку. Збірник наукових праць
  • Mykhailo Hulevych + 1 more

Automation of test script generation is critically important in modern software quality assurance, particularly for complex languages such as C++, where manual testing becomes resource-intensive and error-prone. Automated approaches significantly reduce testing effort, improve test effectiveness, and enhance overall software reliability. The CIDER tool, introduced by the author, offers a promising automated solution by generating test scenarios from recorded program executions, using harmony search for input optimization. Despite its benefits, the tool faces the following limitations: incomplete coverage of complex branching logic, inefficient exploration of large input spaces, and difficulty handling semantically rich or contextually complex input data. This overview article aims to systematically explore, compare, and evaluate automated test generation techniques such as symbolic execution, concolic testing, evolutionary algorithms, reinforcement learning, and model-based testing. The objectives are to classify these methods against specified criteria and to identify approaches that could enhance the tool's automated test generation capability in terms of coverage efficiency.

  • Research Article
  • 10.1145/3715741
Understanding and Characterizing Mock Assertions in Unit Tests
  • Jun 19, 2025
  • Proceedings of the ACM on Software Engineering
  • Hengcheng Zhu + 5 more

Mock assertions provide developers with a powerful means to validate program behaviors that are unobservable to test assertions. Despite their significance, they are rarely considered by automated test generation techniques. Effective generation of mock assertions requires understanding how they are used in practice. Although previous studies highlighted the importance of mock assertions, none provide insight into their usages. To bridge this gap, we conducted the first empirical study on mock assertions, examining their adoption, the characteristics of the verified method invocations, and their effectiveness in fault detection. Our analysis of 4,652 test cases from 11 popular Java projects reveals that mock assertions are mostly applied to validating specific kinds of method calls, such as those interacting with external resources and those reflecting whether a certain code path was traversed in systems under test. Additionally, we find that mock assertions complement traditional test assertions by ensuring the desired side effects have been produced, validating control flow logic, and checking internal computation results. Our findings contribute to a better understanding of mock assertion usages and provide a foundation for future related research such as automated test generation that support mock assertions.
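The distinction the study draws between test assertions and mock assertions can be shown with Python's unittest.mock; the archive function and its storage collaborator below are hypothetical:

```python
from unittest.mock import Mock

def archive(record, storage):
    """Save valid records to external storage; the save call is a side effect
    that a plain return-value assertion cannot observe."""
    if record.get("valid"):
        storage.save(record["id"], record)
    return bool(record.get("valid"))

storage = Mock()                      # mocked external resource
result = archive({"id": 7, "valid": True}, storage)

# Test assertion: checks the observable return value.
assert result is True

# Mock assertion: checks the hidden interaction with the external resource,
# i.e. that the intended code path was traversed with the right arguments.
storage.save.assert_called_once_with(7, {"id": 7, "valid": True})
```

This mirrors the study's finding: the mock assertion validates a side effect and control-flow decision that the return value alone would leave unchecked.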

  • Research Article
  • 10.1145/3729359
TerzoN: Human-in-the-Loop Software Testing with a Composite Oracle
  • Jun 19, 2025
  • Proceedings of the ACM on Software Engineering
  • Matthew C Davis + 3 more

Software testing is difficult, tedious, and may consume 28%–50% of software engineering labor. Automatic test generators aim to ease this burden but have important trade-offs. Fuzzers use an implicit oracle that can detect obviously invalid results, but the oracle problem has no general solution, and an implicit oracle cannot automatically evaluate correctness. Test suite generators like EvoSuite use the program under test as the oracle and therefore cannot evaluate correctness. Property-based testing tools evaluate correctness, but users have difficulty coming up with properties to test and understanding whether their properties are correct. Consequently, practitioners create many test suites manually and often use an example-based oracle to tediously specify correct input and output examples. To help bridge the gaps among various oracle and tool types, we present the Composite Oracle, which organizes various oracle types into a hierarchy and renders a single test result per example execution. To understand the Composite Oracle’s practical properties, we built TerzoN, a test suite generator that includes a particular instantiation of the Composite Oracle. TerzoN displays all the test results in an integrated view composed from the results of three types of oracles and finds some types of test assertion inconsistencies that might otherwise lead to misleading test results. We evaluated TerzoN in a randomized controlled trial with 14 professional software engineers with a popular industry tool, fast-check, as the control. Participants using TerzoN elicited 72% more bugs (p < 0.01), accurately described more than twice the number of bugs (p < 0.01) and tested 16% more quickly (p < 0.05) relative to fast-check.
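The Composite Oracle's core move, collapsing several oracle verdicts into a single result per execution, can be sketched as follows; the three oracles and the precedence rule are illustrative simplifications, not TerzoN's actual hierarchy:

```python
def implicit_oracle(fn, x):
    """Fuzzer-style oracle: only obviously invalid results (crashes) fail."""
    try:
        fn(x)
        return "pass"
    except Exception:
        return "fail"

def example_oracle(fn, x, expected):
    """Example-based oracle: user-specified input/output pair."""
    return "pass" if fn(x) == expected else "fail"

def property_oracle(fn, x):
    """Property-based oracle: an invariant, e.g. abs() is non-negative."""
    return "pass" if fn(x) >= 0 else "fail"

def composite(fn, x, expected):
    """Collapse the hierarchy into one verdict: a crash trumps everything,
    then the example oracle, then the property oracle."""
    if implicit_oracle(fn, x) == "fail":
        return "fail"
    if example_oracle(fn, x, expected) == "fail":
        return "fail"
    if property_oracle(fn, x) == "fail":
        return "fail"
    return "pass"

verdict = composite(abs, -3, 3)
```

The point of the composition is that each oracle covers a blind spot of the others: the implicit oracle needs no specification, the example oracle pins down correctness, and the property oracle generalizes beyond the listed examples.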

  • Research Article
  • 10.52783/jisem.v10i52s.10768
Architecting Agentic AI for Modern Software Testing: Capabilities, Foundations, and a Proposed Scalable Multi-Agent System for Automated Test Generation
  • Jun 1, 2025
  • Journal of Information Systems Engineering and Management
  • Twinkle Joshi

The progression of software testing has evolved from manual processes to automated systems. However, the emergence of Agentic AI-driven testing represents the next transformative leap. These intelligent agents autonomously generate, execute, and optimize tests, redefining the quality assurance (QA) landscape. Agentic AI—defined by its capacity to independently perceive, plan, execute, and learn—has emerged as a transformative force in software testing. This article examines the impact of Agentic AI on the software testing lifecycle, highlighting its core capabilities, such as dynamic test generation, autonomous execution, intelligent root-cause analysis, multi-modal command interpretation, and context-aware decision-making. These capabilities enable a significant shift from brittle test scripts and reactive maintenance to proactive, adaptive, and self-optimizing testing systems. We further introduce a novel architectural framework that applies Agentic AI principles to automated test scenario generation. This multi-agent system comprises a Perception Module for requirement and code understanding, a Cognitive Module for strategic planning and intelligent scenario design, and an Action Module for executing, analyzing, and learning from tests. Built on state-of-the-art technologies—including large language models (LLMs), retrieval-augmented generation (RAG), deep learning, and vector databases—our framework enables seamless integration with CI/CD pipelines, supports multi-format output generation, and incorporates continuous learning for test optimization. The proposed architecture demonstrates how Agentic AI can enhance test coverage, improve software reliability, and reduce the cost and effort of maintaining large-scale testing infrastructures. It provides an intelligent, scalable, and future-ready solution for quality assurance in fast-paced, modern development environments.

  • Research Article
  • 10.30574/wjaets.2025.15.1.0215
Ethical implications of AI-driven financial systems
  • Apr 30, 2025
  • World Journal of Advanced Engineering Technology and Sciences
  • Krishna Chaitanya Saride

This article examines how artificial intelligence technologies are revolutionizing the maintenance and modernization of legacy software systems in large organizations. Legacy systems, despite their outdated architectures, continue to power critical business operations while posing significant challenges due to poor documentation, obsolete programming paradigms, and the loss of original developer knowledge. The article demonstrates how AI-driven solutions address these challenges through automated documentation generation and code modernization strategies. These technologies enable comprehensive system understanding through semantic code analysis, facilitate incremental modernization through intelligent refactoring, and reduce risks through automated test generation. By implementing hybrid human-AI workflows and following incremental modernization strategies, organizations can transform aging codebases into well-documented, maintainable systems while avoiding the pitfalls of complete rewrites. The economic benefits include reduced maintenance costs, improved system agility, faster time-to-market, and enhanced developer productivity, making AI-assisted modernization a strategic imperative for organizations seeking to remain competitive in rapidly evolving markets.

  • Research Article
  • 10.30574/wjaets.2025.15.1.0367
Automating documentation and legacy code modernization: Revitalizing legacy systems with AI
  • Apr 30, 2025
  • World Journal of Advanced Engineering Technology and Sciences
  • Praveen Kumar Manchikoni Surendra

This article examines how artificial intelligence technologies are revolutionizing the maintenance and modernization of legacy software systems in large organizations. Legacy systems, despite their outdated architectures, continue to power critical business operations while posing significant challenges due to poor documentation, obsolete programming paradigms, and the loss of original developer knowledge. The article demonstrates how AI-driven solutions address these challenges through automated documentation generation and code modernization strategies. These technologies enable comprehensive system understanding through semantic code analysis, facilitate incremental modernization through intelligent refactoring, and reduce risks through automated test generation. By implementing hybrid human-AI workflows and following incremental modernization strategies, organizations can transform aging codebases into well-documented, maintainable systems while avoiding the pitfalls of complete rewrites. The economic benefits include reduced maintenance costs, improved system agility, faster time-to-market, and enhanced developer productivity, making AI-assisted modernization a strategic imperative for organizations seeking to remain competitive in rapidly evolving markets.

  • Research Article
  • 10.30574/wjarr.2025.26.1.1122
The role of automated testing in scaling global E-commerce operations: A technical deep dive
  • Apr 30, 2025
  • World Journal of Advanced Research and Reviews
  • Ajay Seelamneni

E-commerce platforms face mounting challenges in maintaining reliable operations across global markets, particularly in managing cross-border trade and peak traffic events. AI-enhanced performance testing methodologies are revolutionizing how online retailers handle these challenges by integrating machine learning with traditional testing tools. The evolution spans from automated test generation to predictive analytics, enabling organizations to proactively identify and address potential issues. Through distributed testing architectures and comprehensive monitoring solutions, platforms can now ensure seamless performance across diverse geographic regions while maintaining regulatory compliance and optimal user experience. The integration of artificial intelligence not only transforms technical testing capabilities but also delivers substantial improvements in business metrics, setting new standards for e-commerce operations globally.

  • Research Article
  • 10.1609/aaai.v39i28.35246
QAagent: A Multiagent System for Unit Test Generation via Natural Language Pseudocode (Student Abstract)
  • Apr 11, 2025
  • Proceedings of the AAAI Conference on Artificial Intelligence
  • Akhil Deo

Unit testing is essential for ensuring software quality, but it is often time-consuming and prone to developer oversight. With the rise of large language models (LLMs) in code generation, there is an increasing need for reliable and automated test generation systems. This work presents QAagent, a multi-agent system designed to generate unit tests using natural language pseudocode. QAagent leverages LLMs to create a detailed natural language plan of a function's implementation and then generates a comprehensive suite of test cases covering both base and edge scenarios. Experiments conducted on two widely used benchmarks, HumanEval and MBPP, show that QAagent consistently outperforms existing frameworks in terms of code coverage, although its accuracy varies across datasets, demonstrating the potential of natural language pseudocode to enhance automated test generation in LLM-driven coding environments.

  • Research Article
  • 10.1145/3720448
Metamorph: Synthesizing Large Objects from Dafny Specifications
  • Apr 9, 2025
  • Proceedings of the ACM on Programming Languages
  • Aleksandr Fedchin + 2 more

Program synthesis aims to produce code that adheres to user-provided specifications. In this work, we focus on synthesizing sequences of calls to formally specified APIs to generate objects that satisfy certain properties. This problem is particularly relevant in automated test generation, where a test engine may need an object with specific properties to trigger a given execution path. Constructing instances of complex data structures may require dozens of method calls, but reasoning about consecutive calls is computationally expensive, and existing work typically limits the number of calls in the solution. In this paper, we focus on synthesizing such long sequences of method calls in the Dafny programming language. To that end, we introduce Metamorph, a synthesis tool that uses counterexamples returned by the Dafny verifier to reason about the effects of method calls one at a time, limiting the complexity of solver queries. We also aim to limit the overall number of SMT queries by comparing the counterexamples using two distance metrics we develop for guiding the synthesis process. In particular, we introduce a novel piecewise distance metric, which puts a provably correct lower bound on the number of method calls in the solution and allows us to frame the synthesis problem as weighted A* search. When computing piecewise distance, we view object states as conjunctions of atomic constraints, identify constraints that each method call can satisfy, and combine this information using integer programming. We evaluate Metamorph’s ability to generate large objects on six benchmarks defining key data structures: linked lists, queues, arrays, binary trees, and graphs. Metamorph can successfully construct programs that require up to 57 method calls per instance and compares favorably to an alternative baseline approach. 
Additionally, we integrate Metamorph with DTest, Dafny’s automated test generation toolkit, and show that Metamorph can synthesize test inputs for parts of the AWS Cryptographic Material Providers Library that DTest alone is not able to cover. Finally, we use Metamorph to generate executable bytecode for a simple virtual machine, demonstrating that the techniques described here are more broadly applicable in the context of specification-guided synthesis.
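The abstract's framing of synthesis as weighted A* search, guided by a distance metric that lower-bounds the number of remaining method calls, can be sketched as follows. This is an illustrative toy (a counter reached by repeated "push" calls), not Metamorph's actual implementation; the `weight` parameter, state encoding, and helper names are assumptions for the example.

```python
import heapq

def weighted_a_star(start, is_goal, successors, heuristic, weight=1.5):
    """Weighted A*: expands states in order of f = g + weight * h.

    With an admissible heuristic (a provably correct lower bound on the
    remaining calls, like the piecewise distance described above) and
    weight = 1, the search returns a shortest call sequence; weight > 1
    trades optimality for fewer expansions.
    """
    frontier = [(weight * heuristic(start), 0, start, [])]
    best_g = {}
    while frontier:
        f, g, state, path = heapq.heappop(frontier)
        if is_goal(state):
            return path
        if state in best_g and best_g[state] <= g:
            continue  # already reached this state more cheaply
        best_g[state] = g
        for action, nxt in successors(state):
            ng = g + 1  # each method call has unit cost
            heapq.heappush(
                frontier, (ng + weight * heuristic(nxt), ng, nxt, path + [action])
            )
    return None

# Toy domain: an object needs 5 "push" calls to reach the target state,
# mirroring synthesis of a data structure via a sequence of method calls.
target = 5
path = weighted_a_star(
    start=0,
    is_goal=lambda s: s == target,
    successors=lambda s: [("push", s + 1)],
    heuristic=lambda s: target - s,  # exact lower bound on remaining calls
)
```

In Metamorph the per-state heuristic is computed from SMT counterexamples and integer programming rather than a closed-form expression, but the search skeleton is the same.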

  • Research Article
  • Cited: 1
  • 10.1007/s10664-025-10635-z
Enriching automatic test case generation by extracting relevant test inputs from bug reports
  • Mar 24, 2025
  • Empirical Software Engineering
  • Wendkûuni C Ouédraogo + 6 more

The quality of software is closely tied to the effectiveness of the tests it undergoes. Manual test writing, though crucial for bug detection, is time-consuming, which has driven significant research into automated test case generation. However, current methods often struggle to generate relevant inputs, limiting the effectiveness of the tests produced. To address this, we introduce BRMiner, a novel approach that leverages Large Language Models (LLMs) in combination with traditional techniques to extract relevant inputs from bug reports, thereby enhancing automated test generation tools. In this study, we evaluate BRMiner using the Defects4J benchmark and test generation tools such as EvoSuite and Randoop. Our results demonstrate that BRMiner achieves a Relevant Input Rate (RIR) of 60.03% and a Relevant Input Extraction Accuracy Rate (RIEAR) of 31.71%, significantly outperforming methods that rely on LLMs alone. The integration of BRMiner's inputs enhances EvoSuite's ability to generate more effective tests, leading to increased code coverage, with gains observed in branch, instruction, method, and line coverage across multiple projects. Furthermore, BRMiner facilitated the detection of 58 unique bugs, including bugs missed by traditional baseline approaches. Overall, BRMiner's combination of LLM filtering with traditional input extraction techniques significantly improves the relevance and effectiveness of automated test generation, advancing bug detection and enhancing code coverage, thereby contributing to higher-quality software development.
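The "traditional techniques" side of such a pipeline can be sketched with simple pattern-based extraction of literal values from a bug report; the function name, patterns, and example report below are illustrative assumptions, not BRMiner's actual pipeline. The deliberately noisy result motivates the LLM filtering stage, which ranks candidates for relevance.

```python
import re

def extract_candidate_inputs(bug_report: str):
    """Illustrative sketch: pull literal values out of a bug report text.

    Quoted strings and numeric literals often hold the exact failing
    input; an LLM-based filter would then rank these candidates.
    """
    candidates = set()
    # Quoted strings frequently contain the input that triggered the bug.
    candidates.update(re.findall(r'"([^"]+)"', bug_report))
    # Standalone integer and decimal literals.
    candidates.update(
        re.findall(r'(?<![\w.])-?\d+(?:\.\d+)?(?![\w.])', bug_report)
    )
    return sorted(candidates)

report = 'Calling parse("2020-13-01") throws for month 13; parse("2020-01-01") works.'
candidates = extract_candidate_inputs(report)
```

Here the extraction surfaces both the failing input `"2020-13-01"` and the passing one `"2020-01-01"`, alongside spurious numeric fragments that a relevance filter would discard.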


Copyright 2026 Cactus Communications. All rights reserved.

Privacy PolicyCookies PolicyTerms of UseCareers