Automatic Verification of Database-Centric Systems

Alin Deutsch, UC San Diego, deutsch@cs.ucsd.edu
Richard Hull, IBM Yorktown Research Center, hull@us.ibm.com
Victor Vianu, UC San Diego & INRIA Saclay, vianu@cs.ucsd.edu

INTRODUCTION

Software systems centered around a database are pervasive in numerous applications. They are encountered in areas as diverse as electronic commerce, e-government, scientific applications, enterprise information systems, and business process management. Such systems are often very complex and prone to costly bugs, whence the need for verification of critical properties.

Classical software verification techniques that can be applied to such systems include model checking and theorem proving. However, both have serious limitations. Indeed, model checking usually requires performing finite-state abstraction on the data, resulting in loss of semantics for both the system and properties being verified. Theorem proving is incomplete, requiring expert user feedback.

Recently, an alternative approach to verification of database-centric systems has taken shape, at the confluence of the database and computer-aided verification areas. It aims to identify restricted but sufficiently expressive classes of database-driven applications and properties for which sound and complete verification can be performed in a fully automatic way.

This approach leverages another trend in database-driven applications: the emergence of high-level specification tools for database-centered systems, such as interactive web applications and data-driven business processes. We review next a few representative examples.

A commercially successful high-level specification tool for web applications is Web Ratio [1], an outgrowth of the earlier academic prototype WebML [20, 17]. Web Ratio allows one to specify a Web application using an interactive variant of the E-R model augmented with a workflow formalism. Non-interactive variants of Web page specifications had already been proposed in Strudel [39], Araneus [58] and Weave [40], targeting the automatic generation of Web sites from an underlying database.

High-level specification tools have also emerged in the area of business process management, concomitantly with an evolution from the traditional process-centric approach towards data awareness. A notable exponent of this class is the business artifact model pioneered in [63, 51], deployed by IBM in professional services. Business artifacts (or simply “artifacts”) model key business-relevant entities, which are updated by a set of services that implement business process tasks. A collection of artifacts and services is called an artifact system. This modeling approach has been successfully deployed in practice [7, 6, 21, 27, 69], and has been adopted in the OMG standard for Case Management [9].

Tools such as the above automatically generate the database-centric application code from the high-level specification. This not only allows fast prototyping and improves programmer productivity but, as a side effect, provides new opportunities for automatic verification. Indeed, the high-level specification is a natural target for verification, as it addresses the most likely source of errors (the application’s specification, as opposed to the less likely errors in the automatic generator’s implementation).

The theoretical and practical results obtained so far concerning the verification of such systems are quite encouraging. They suggest that, unlike arbitrary software systems, significant classes of data-driven systems may be amenable to automatic verification. This relies on a novel marriage of database and model checking techniques, and is relevant to both the database and the computer-aided verification communities.

In this article, we describe several models and results on automatic verification of database-driven systems, focusing on temporal properties of their underlying workflows. To streamline the presentation, we focus on verification of business artifacts, and use it as a vehicle to introduce the main concepts and results. We then summarize some of the work pertaining to other applications such as data-driven web services.