We rely on a computational infrastructure that is a densely interwined mass of software and hardware: programming languages, network protocols, operating systems, and processors. It has accumulated great complexity, from a combination of engineering design decisions, contingent historical choices, and sheer scale, yet it is defined at best by prose specifications, or, all too often, just by the common implementations. Can we do better? More specifically, can we apply rigorous methods to this mainstream infrastructure, taking the accumulated complexity seriously, and if we do, does it help? My colleagues and I have looked at these questions in several contexts: the TCP/IP network protocols with their Sockets API; programming language design, including the Java module system and the C11/C++11 concurrency model; the hardware concurrency behaviour of x86, IBM POWER, and ARM multiprocessors; and compilation of concurrent code. In this talk I will draw some lessons from what did and did not succeed, looking especially at the empirical nature of some of the work, at the social process of engagement with the various different communities, and at the mathematical and software tools we used. Domain-specific modelling languages (based on functional programming ideas) and proof assistants were invaluable for working with the large and loose specifications involved: idioms within HOL4 for TCP, our Ott tool for programming language specification, and Owens's Lem tool for portable semantic definitions, with HOL4, Isabelle, and Coq, for the relaxed-memory concurrency semantics work. Our experience with these suggests something of what is needed to make full-scale rigorous semantics a commonplace reality.
Read full abstract