CoDefeater: Using LLMs To Find Defeaters in Assurance Cases
Constructing assurance cases is a widely used and sometimes required process toward demonstrating that safety-critical systems will operate safely in their planned environment. To mitigate the risk of errors and missing edge cases, the concept of defeaters - challenges to claims in an assurance case - has been introduced. Defeaters can detect weaknesses in the arguments, prompting further investigation and timely mitigations. However, capturing defeaters relies on expert judgment, experience, and creativity and must be done iteratively due to evolving requirements and regulations. In this paper, we propose CoDefeater, an automated process to leverage large language models (LLMs) for finding defeaters. Initial results on two systems show that LLMs can efficiently find known and unforeseen feasible defeaters to support safety analysts in enhancing the completeness and confidence of assurance cases.
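The abstract does not describe the prompts or model interface CoDefeater uses, so the following is only a minimal sketch of the general idea: ask a language model for challenges to a single claim and collect its numbered answers. The prompt wording, the function names, and the `complete` callable (a stand-in for whatever LLM client is available) are all assumptions made for illustration.

```python
import re
from typing import Callable, List


def build_defeater_prompt(claim: str, context: str = "") -> str:
    """Assemble a prompt asking for defeaters of one claim (hypothetical wording)."""
    parts = [
        "You are assisting a safety analyst reviewing an assurance case.",
        "A defeater is a challenge that casts doubt on a claim.",
    ]
    if context:
        parts.append(f"System context: {context}")
    parts.append(f"Claim: {claim}")
    parts.append("List possible defeaters, one per line, numbered 1., 2., ...")
    return "\n".join(parts)


def parse_defeaters(response: str) -> List[str]:
    """Keep only lines that look like numbered items ('1. ...' or '1) ...')."""
    found = []
    for line in response.splitlines():
        match = re.match(r"\s*\d+[.)]\s*(.+)", line)
        if match:
            found.append(match.group(1).strip())
    return found


def find_defeaters(claim: str, complete: Callable[[str], str], context: str = "") -> List[str]:
    """`complete` is an assumed callable mapping a prompt string to model text."""
    return parse_defeaters(complete(build_defeater_prompt(claim, context)))
```

Passing the model as a plain callable keeps the sketch independent of any particular provider; the candidate defeaters it returns would still need review by an analyst, in line with the supporting role the abstract describes.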
- Research Article
4
- 10.2345/0899-8205-46.3.195
- May 1, 2012
- Biomedical Instrumentation & Technology
pp. 195-200
- Conference Article
- 10.1109/dasc55683.2022.9925789
- Sep 18, 2022
Aviation is a highly sophisticated and complex System-of-Systems (SoS) with equally complex safety oversight. As novel products with autonomous functions and interactions between component systems are adopted, the number of interdependencies within and among the SoS grows. These interactions may not always be obvious. Understanding how proposed products (component systems) fit into the context of a larger SoS is essential to promote the safe use of new as well as conventional technology. UL 4600, a Standard for Safety for the Evaluation of Autonomous Products, was written specifically for fully autonomous road vehicles. The goal-based, technology-neutral features of this standard make it adaptable to other industries and applications. This paper, using the philosophy of UL 4600, gives guidance for creating an assurance case for products in an SoS context. An assurance argument is a cogent structured argument concluding that an autonomous aircraft system possesses all applicable through-life performance and safety properties. The assurance case process can be repeated at each level in the SoS: aircraft, aircraft system, unmodified components, and modified components. The Original Equipment Manufacturer (OEM) develops the assurance case for the whole aircraft envisioned in the type certification process. Assurance cases are continuously validated by collecting and analyzing Safety Performance Indicators (SPIs). SPIs provide predictive safety information, thus offering an opportunity to improve safety by preventing incidents and accidents. Continuous validation is essential for risk-based approval of autonomously evolving (dynamic) systems, learning systems, and new technology. System variants, derivatives, and components are captured in a subordinate assurance case by their developer. These variants of the assurance case inherently reflect the evolution of the vehicle-level derivatives and options in the context of their specific target ecosystem.
These subordinate assurance cases are nested under the argument put forward by the OEM of components and aircraft, for certification credit. It has become a common practice in aviation to address design hazards through operational mitigations. It is also common for hazards noted in an aircraft component system to be mitigated within another component system. Where a component system depends on risk mitigation in another component of the SoS, organizational responsibilities must be stated explicitly in the assurance case. However, current practices do not formalize accounting for these dependencies by the parties responsible for design; consequently, subsequent modifications are made without the benefit of critical safety-related information from the OEMs. The resulting assurance cases, including 3rd party vehicle modifications, must be scrutinized as part of the holistic validation process. When changes are made to a product represented within the assurance case, their impact must be analyzed and reflected in an updated assurance case. An OEM can facilitate this by integrating affected assurance cases across their customers' supply chains to ensure their validity. The OEM is expected to exercise the sphere-of-control over their product even if it includes outsourced components. Any organization that modifies a product (with or without assurance argumentation information from other suppliers) is accountable for validating the conditions for any dependent mitigations. For example, the OEM may manage the assurance argumentation by identifying requirements and supporting SPIs that must be applied in all component assurance cases. For their part, component assurance cases must accommodate all spheres-of-control that mitigate the risks they present in their respective contexts. The assurance case must express how interdependent mitigations will collectively assure the outcome.
These considerations are much more than interface requirements and include explicit hazard mitigation dependencies between SoS components. A properly integrated SoS assurance case reflects a set of interdependent systems that could be independently developed. Even in this extremely interconnected environment, stakeholders must make accommodations for the independent evolution of products in a manner that protects proprietary information, domain knowledge, and safety data. The collective safety outcome for the SoS is based on the interdependence of mitigations by each constituent component and could not be accomplished by any single component. This dependency must be explicit in the assurance case and should include operational mitigations predicated on people and processes. Assurance cases could be used to gain regulatory approval of conventional and new technology. They can also serve to demonstrate consistency with a desired level of safety, especially in SoSs whose existing standards may not be adequate. This paper also provides guidelines for preserving alignment between component assurance cases along a product supply chain, and the respective SoSs that they support. It shows how assurance is a continuous process that spans product evolution through the monitoring of interdependent requirements and SPIs. The interdependency necessary for a successful assurance case encourages stakeholders to identify and formally accept critical interconnections between related organizations. The resulting coordination promotes accountability for safety through increased awareness and the cultivation of a positive safety culture.
- Conference Article
8
- 10.1109/issre.2019.00045
- Oct 1, 2019
An assurance case (AC) captures explicit reasoning associated with assuring critical properties, such as safety. A vital attribute of an AC is that it facilitates the identification of fallacies in the validity of any claim. There is considerable published research related to confidence in ACs, which primarily relates to a measure of soundness of reasoning. Evaluation of an AC is more general than measuring confidence and considers multiple aspects of the quality of an AC. Evaluation criteria thus play a significant role in making the evaluation process more systematic. This paper contributes to the identification of effective evaluation criteria for ACs, the rationale for their use, and initial tests of the criteria on existing ACs. We classify these criteria according to whether they apply to the structure of the AC or to its content. This paper focuses on safety as the critical property to be assured, but only a very small number of the criteria are specific to safety, and these can serve as placeholders for evaluation criteria specific to other critical properties. All of the other evaluation criteria are generic. This separation is useful when evaluating ACs developed using different notations, and when evaluating ACs against safety standards. We explore the rationale for these criteria as well as the way they are used by the developers of the AC and by a third-party evaluator.
- Research Article
10
- 10.47839/ijc.19.4.1995
- Dec 30, 2020
- International Journal of Computing
This paper presents a survey of Assurance Case implementation for applications that are not directly related to the regulatory regime in which Assurance Cases are usually applied. The UK was the first country to develop the theory of the Assurance Case, as a response to major catastrophes, and it applies the Assurance Case regime most widely, across many industrial domains. The USA, Australia and EU countries apply the Assurance Case approach for safety and security regulation and licensing. For the last two decades the Assurance Case has been used mostly for conformance analysis of critical systems with an established set of regulatory requirements. There are proven standards of use, notations and tools to support the Assurance Case methodology. However, many researchers have tried to find approaches to expand Assurance Case application to communicating domains. We group the directions of Assurance Case application as follows: Assurance Case for assessment of attributes such as quality, dependability and, first of all, safety and security; Assurance Case based certification; improvement of argumentation; assurance based development; and Assurance Case for knowledge management. The main challenges and solutions in the development and application of Assurance Case methodology, techniques and tools have been analyzed.
- Conference Article
5
- 10.1109/mcsi.2016.063
- Aug 1, 2016
Assurance (Security and Safety) Case is a proven-in-use methodology to demonstrate a system's compliance with security- and safety-critical requirements. An advanced approach to improving Assurance Case is proposed in the form of Assurance Case Driven Design (AC DD). A practical use of AC DD lies in improving the cost-effectiveness of certification and licensing processes. Assurance Case is based on graphical notations. These graphical notations are a part of formal methods, which were originally developed from classical mathematical models and methods. In this article we propose to turn back to set theory and graph theory, which are the original fundamentals of Assurance Case. That allows us to implement a kind of reverse engineering for a formal notation. We analyze basic mathematical models and methods to improve a known formal notation at the top level. As a result we develop the Claim-Argument-Evidence-Criteria (CAEC) notation as well as the Development-Verification & Validation-Assurance Case (DVA) notation for AC DD implementation.
- Book Chapter
6
- 10.1007/978-3-030-58920-2_1
- Jan 1, 2020
The Structured Assurance Case Metamodel (SACM) is a standard specified by the Object Management Group (OMG) that defines a metamodel for representing structured assurance cases. It is developed to support standardisation and interoperability in assurance case development. SACM provides a richer set of features than existing assurance case frameworks. By providing a standardised metamodel for assurance cases, SACM also provides a foundation for model-based assurance case development. For example, model merging can be used to bind packages in complex assurance cases and model validation can be used to check well-formedness of assurance cases. The uptake in the use of SACM has however been slow. The lack of a visual notation for representing SACM arguments has been a major factor in this. As part of the updates for version 2.1 of the SACM standard, we developed a graphical notation that addresses this need. Additionally, there are very few publicly available examples of how SACM may be used in practice, with the SACM standard providing only very limited examples. Moreover, there exists little literature that discusses the potential benefits that using SACM can bring for assurance cases. This paper provides, for the first time, an explanation and worked examples of how to use the SACM notation. The paper also discusses the potential benefits of using SACM for assurance case development and review and the need for empirically evaluating these benefits.
- Conference Article
8
- 10.1109/issrew.2019.00093
- Oct 1, 2019
Assurance cases are collections of standard-mandated documents that entail the specification of a system's objectives and a collection of process, development or verification evidence regarding the satisfaction of the respective objectives. A considerable amount of work has been done in the direction of modelling assurance cases, to support communication and reasoning regarding the system's safety. In this work, we present a set of features of ExplicitCase - a tool for modeling assurance cases. While there is a plethora of tools for creating and managing model-based assurance cases, the uniqueness of our tool is that it integrates assurance case models with system models created in AutoFOCUS3 (AF3) - an open-source model-based development tool for embedded software systems. While trying to keep up with state-of-the-art assurance case editors, the newly implemented features support assurance case creation using typed patterns, change impact analysis for assurance cases, assessment of the confidence in the created assurance arguments, export of the argumentation diagrams generated in ExplicitCase and integration of assurance case models with system models created in AutoFOCUS3. In particular, based on the integration with AF3 system models, we propose automatic support for detecting the impact of a change within system models on the assurance case model, thus enabling the integrated development of system and assurance case models.
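Change impact analysis of the kind described above can be pictured as a graph traversal. The sketch below is not ExplicitCase's or AF3's implementation or API; it assumes two hypothetical dictionaries, one tracing system-model elements to the assurance case nodes that reference them and one recording which nodes each node supports, and it collects every node reachable from a changed element.

```python
from collections import deque
from typing import Dict, List, Set


def impacted_nodes(
    changed: str,
    traces: Dict[str, List[str]],
    supports: Dict[str, List[str]],
) -> Set[str]:
    """Return assurance case nodes that may need review after `changed` is modified.

    traces:   system-model element -> assurance case nodes referencing it (assumed structure)
    supports: assurance case node  -> parent nodes it supports (assumed structure)
    """
    impacted: Set[str] = set()
    queue = deque(traces.get(changed, []))
    while queue:
        node = queue.popleft()
        if node in impacted:
            continue  # already visited, avoids loops in the graph
        impacted.add(node)
        # A doubt about this node propagates to every claim it supports.
        queue.extend(supports.get(node, []))
    return impacted


traces = {"BrakeController": ["E1"]}
supports = {"E1": ["G2"], "G2": ["G1"], "E2": ["G3"], "G3": ["G1"]}
print(impacted_nodes("BrakeController", traces, supports))
```

In this toy case a change to `BrakeController` flags evidence `E1` and the goals `G2` and `G1` above it, while `E2` and `G3` are left alone.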
- Conference Article
1
- 10.1109/dasc55683.2022.9925731
- Sep 18, 2022
Aviation safety has accrued and applied decades of understanding on known risks and effective mitigations. That knowledge - captured in compliance standards - can be tested for predictable outcomes. Autonomy that involves learning systems tends to be dynamic and may continuously be adapting to its environment. Such continuous adaptation has inherent unknown risks depending upon the guardrails imposed on the learning systems. Design assurance and use of traditional standards are inadequate for these dynamic systems. Assurance Cases are used to present an argument for the assurance of systems. Dynamic systems require that assurance cases be continuously validated. One method of validation is using real-time collection of Safety Performance Indicators (SPIs), which are crafted during the development of the system. This paper presents the need for SPIs and methods for creating and nurturing the SPIs to help all stakeholders. This method shadows regulations and allows risk-based approvals that may be applied for both conventional and novel technology. Aviation is facing enormous growth in autonomous technology, reuse of components of unknown pedigree, and new aircraft designs that do not fit into Type classification. Mitigations are more heavily connected to operations, training, and other components of the ecosystem itself. The challenge is to make an assurance case for vehicles within the ecosystem. The automobile industry, which is similarly challenged by dynamically changing autonomous systems, is finding some possible solutions to build safer systems. UL 4600, a Standard for Safety for the Evaluation of Autonomous Products, applies to fully autonomous road vehicles. The goal-based, technology-neutral features of UL 4600 have been extended to apply to aviation. So applied, the assurance process is adaptable to innovation and discovery while encouraging the current practices of standards compliance and taking a System of Systems (SoS) view.
It proposes an assurance case that is an organized argument that a system is acceptable for its intended use with respect to specified concerns (such as safety, security, correctness). This paper gives guidance for validation of an assurance case through monitoring SPIs within the operational context. Monitoring safety performance indicators in the operational environment provides continued validation even as the ecosystem, components and controls change. For approval of novel systems including UAS and AAM, with features that do not lend themselves to traditional compliance methods, regulators have embraced the Safety Continuum perspective, which focuses on safety performance achieving expected outcomes. The performance-based assurance methods can be used with initially wider performance margins for certification of novel products, components of unknown pedigree, and autonomous vehicles. As the performance range becomes better known, the margins can be decreased. Further, this paper recognizes that a one-time initial approval/acceptance is not adequate for learning systems and novel features. The continued validation through performance supports fast-paced development and product evolution. The initial assurance case for a product can limit risk through a closed environment until the margin for some unknowns is validated. For example, if the performance of a collision avoidance function using new technology is not known, larger alert limits may be implemented until more confidence is gained after validating the assurance case via SPIs. Monitoring SPIs throughout the life of the product is now feasible with the aid of big data processing. The aviation industry is already using similar methods for identifying maintenance problems.
As systems grow more autonomous, more machine-to-machine exchanges are involved, making it easy to extend the monitoring and prediction practices to SPIs. The method also allows for variants and derivatives of the baseline to have their own assurance case within the context of the baseline argument. The key is replacing design approval with through-life assurance that connects continuous operational safety into both the design and airworthiness determinations. The determination is predicated on the monitored SPIs and predicted performance of the product remaining consistent with the assurance argument predictions. This enables even complex automated products to be audited for airworthiness with an evolving ecosystem based on monitored and predictive data. Another advantage of the performance-based assurance case is the public comprehensibility of safety. With SPIs and predictions of performance, the automobile segment has paved the way for public scrutiny of automated vehicles. The use of SPIs in aeronautical product assurance will facilitate transparency. This could be accomplished through appropriate dashboards to aid public perception and explain events and precautions taken during the evolution toward more autonomous aviation vehicles. This could reflect a stepwise evolution of complexity. This paper explores how the aviation industry can apply performance-based assurance case methods to assure new and novel systems as well as systems of unknown pedigree. The same framework could then be extended to autonomous systems and new types of aircraft which do not fit the current Type classification. One of the major benefits of this technology-agnostic method is faster risk-based approval of novel technology within a Safety Continuum.
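The idea of validating an assurance case against monitored SPIs, with margins that start wide and are tightened as confidence grows, can be illustrated with a small sketch. The `SPI` class, its fields, and the example numbers are hypothetical and are not taken from the paper or from UL 4600; the sketch only shows the comparison of observed values against a predicted value plus margin.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class SPI:
    """A hypothetical Safety Performance Indicator tied to one assurance claim."""
    name: str
    predicted: float  # value the assurance argument assumes
    margin: float     # allowed deviation before the claim is questioned

    def violated_by(self, observed: float) -> bool:
        return abs(observed - self.predicted) > self.margin


def violations(spi: SPI, observations: Iterable[float]) -> List[float]:
    """Observations that fall outside the margin and should trigger re-validation."""
    return [x for x in observations if spi.violated_by(x)]


observed = [120.0, 210.0, 95.0, 200.0]

# Initial approval: wide margin while performance is not yet well known.
initial = SPI("miss_distance_m", predicted=150.0, margin=50.0)
print(violations(initial, observed))

# Later: the margin is decreased, so more observations are flagged.
tightened = SPI("miss_distance_m", predicted=150.0, margin=25.0)
print(violations(tightened, observed))
```

A symmetric margin around a single predicted value is a simplification; a real indicator could equally be one-sided or statistical.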
- Conference Article
103
- 10.1109/hase.2015.25
- Jan 1, 2015
Assurance cases are used to demonstrate confidence in properties of interest for a system, e.g., for safety or security. A model-based assurance case seeks to bring the benefits of model-driven engineering, such as automation, transformation and validation, to what is currently a lengthy and informal process. In this paper we develop a model-based assurance approach, based on a weaving model, which allows integration between assurance case, design and process models and meta-models. In our approach, the assurance case itself is treated as a structured model, with the aim that all entities in the assurance case become linked explicitly to the models that represent them. We show how it is possible to exploit the weaving model for automated generation of assurance cases. Building upon these results, we discuss how a seamless model-driven approach to assurance cases can be achieved and examine the utility of increased formality and automation.
- Research Article
- 10.1145/3685936
- Dec 5, 2024
- Formal Aspects of Computing
Assurance cases are structured arguments used to demonstrate specific system properties such as safety or security. They are used in many industrial sectors including automotive, aviation and medical devices. Assurance cases are usually divided into modules which address goals allocated to specific system properties, components, functions, modes of operation or environmental conditions. Depending on the system and assurance process characteristics, assurance case modules may follow shared argument templates. The templates refer to the system, process or environment attributes, described collectively as an assurance case context and stored in external context models. Our goal is to manage all contextual relations at the level of assurance case templates and instantiated arguments with the use of a generic System Assurance Reference Model (SARM). We describe its structure and demonstrate how it can be used to automatically generate assurance case modules, based on templates and context models. The article also presents a prototype tool, SARMER, which implements the SARM model and enables automatic data flow between models and assurance cases. The use of SARM and the SARMER tool is illustrated with an example of a component-based system and a modular assurance case to demonstrate that allocated contracts are satisfied for each component.
- Research Article
- 10.1002/sys.70010
- Sep 14, 2025
- Systems Engineering
Assurance cases (ACs) are structured arguments designed to show that a system is sufficiently reliable to function properly in its operational environment. They are mandated by safety standards and are largely used in industry to support risk management for systems; however, ACs often contain proprietary information and are not publicly available. Therefore, the benefits of AC development are usually not rigorously documented, measured, or assessed. In this paper, we empirically evaluate the effectiveness of using ACs to show that a system is reliable using a case study over the CERN Large Hadron Collider (LHC) Machine Protection System (MPS). We used open‐source documentation to create an AC over the MPS and used the Eliminative Argumentation (EA) methodology for its development. The development involved four authors with considerable experience in AC development, three of whom work for Critical System Labs, a small enterprise specializing in ACs. Our findings show that (a) the cost and time required to develop our AC is negligible compared to the effort needed to develop the system, and (b) EA helped identify defeaters (i.e., doubts in the system's reliability) that were not detailed in the documentation used for creation of the AC.
- Book Chapter
13
- 10.1007/978-3-030-54549-9_3
- Jan 1, 2020
Safety assurance cases (ACs) are structured arguments that assert the safety of cyber-physical systems. ACs use reasoning steps, or strategies, to show how a safety claim is decomposed into subclaims which are then supported by evidence. In practice, ACs are informal, and thus it is difficult to check whether these decompositions are valid and no subclaims are missed. This may lead to the approval of fallacious safety arguments and thus the deployment of unsafe systems. Fully formalizing ACs to facilitate rigorous evaluation is not realistic due to the complexity of creating and comprehending such ACs. We take an intermediate approach by formalizing several types of decomposition strategies, proving the conditions under which they are deductive, and applying them as templates that guard against common errors in ACs. We demonstrate our approach on two scenarios: creation of ACs with deductive reasoning steps and evaluation and improvement of existing ACs.
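One condition under which a decomposition step can be checked mechanically is completeness: if a claim ranges over a set of cases and each subclaim covers part of that set, no case may be left uncovered. The sketch below illustrates that single condition only; it is an assumption about what such a check might look like, not the formalization given in the chapter, and the function name and example domain are invented.

```python
from typing import Iterable, Set


def missing_cases(domain: Set[str], subclaim_domains: Iterable[Set[str]]) -> Set[str]:
    """Cases of the parent claim's domain that no subclaim covers.

    An empty result means the decomposition is complete with respect to
    this (assumed) coverage condition.
    """
    covered: Set[str] = set()
    for part in subclaim_domains:
        covered |= part
    return set(domain) - covered


flight_phases = {"takeoff", "cruise", "landing"}
print(missing_cases(flight_phases, [{"takeoff"}, {"cruise"}]))             # {'landing'}
print(missing_cases(flight_phases, [{"takeoff", "landing"}, {"cruise"}]))  # set()
```

Used as a template guard, a non-empty result points directly at the subclaim that was missed.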
- Research Article
- 10.1145/3796233
- Feb 24, 2026
- Formal Aspects of Computing
In critical software engineering, structured assurance cases (ACs) are used to demonstrate how key system properties are supported by evidence (e.g., test results, proofs). Creating rigorous ACs is particularly challenging in the context of software product lines (SPLs), i.e., sets of software products with overlapping but distinct features and behaviours. Since SPLs can encompass very large numbers of products, developing a rigorous AC for each product individually is infeasible. Moreover, if the SPL evolves, e.g., by the modification or introduction of features, it can be infeasible to assess the impact of this change. Instead, the development and maintenance of ACs ought to be lifted such that a single AC can be developed for the entire SPL simultaneously, and be analyzed for regression in a variability-aware fashion. In this article, we describe a formal approach to lifted AC development and regression analysis. We formalize a language of variability-aware ACs for SPLs and study the lifting of template-based AC development. We also define a regression analysis to determine the effects of SPL evolutions on variability-aware ACs. We describe a model-based assurance management tool which implements these techniques, and illustrate our contributions by developing an AC for a product line of medical devices.
- Conference Article
4
- 10.1109/aero.2013.6496958
- Mar 1, 2013
The current regulatory approach for assuring device safety primarily focuses on compliance with prescriptive safety regulations and relevant safety standards. This approach, however, does not always lead to a safe system design even though safety regulations and standards have been met. In the medical device industry, several high-profile recalls involving infusion pumps have prompted the regulatory agency to reconsider how device safety should be managed, reviewed and approved. An assurance case has been cited as a promising tool to address this growing concern. Assurance cases have been used in safety-critical systems for some time. Most assurance cases, if not all, in the literature today are developed in an ad hoc fashion, independent from risk management and requirement development. An assurance case is a resource-intensive endeavor that requires additional effort and documentation from equipment manufacturers. Without a well-organized requirements infrastructure in place, such “additional effort” can be substantial, to the point where the cost of adoption outweighs the benefit of adoption. In this paper, the authors present a Risk-Based Requirements and Assurance Management (RBRAM) methodology. The RBRAM is an elaborate framework that combines Risk-Based Requirements Management (RBRM) with assurance case methods. Such an integrated framework can help manufacturers leverage an existing risk management process to present a comprehensive assurance case with minimal additional effort while providing a supplementary means to reexamine the integrity of the system design in terms of the mission objective. Although the example used is from the medical industry, the authors believe that the RBRAM methodology underlines the fundamental principle of risk management, and offers a simple yet effective framework applicable to the aerospace industry and perhaps to any industry.
- Conference Article
12
- 10.1109/dsn.2004.1311964
- Jan 1, 2004
Summary form only given. The purpose of this workshop is to promote communication among groups with similar challenges in assurance cases - groups that often are unaware of what is being done in other similar communities. We hope that the workshop will start a process of continuing discussion across disciplines on the central challenges and opportunities for assurance cases, and initiate the development of a standard set of best practices and guidelines for developing and assessing assurance cases. In addition to specific summary outputs from the workshop, we will use the workshop as an opportunity to gauge interest and commitment in forming an IEEE-sponsored planning group to address possible standards activities for assurance cases. At a minimum we would like to make this the first in a series of annual workshops on assurance cases, and begin to form a cross-discipline community of interest. We have structured the workshop in three sessions. The first session will focus on current best practices, experiences, and recurring challenges. The second session will focus on promising research (tools, notations, techniques, etc.) and new opportunities for developing, reviewing, and maintaining assurance cases. The final session will include a summary and review of key points from the first two sessions, moderated discussion on important consensus and disagreements, and identification of next steps. Position papers, presentations, and follow-up material after the workshop will be posted on the workshop web site only.