The “big red button” is too late: an alternative model for the ethical evaluation of AI systems

Thomas Arnold, Matthias Scheutz

doi:10.1007/s10676-018-9447-7

Abstract

As a way to address both ominous and ordinary threats of artificial intelligence (AI), researchers have started proposing ways to stop an AI system before it has a chance to escape outside control and cause harm. A so-called “big red button” would enable human operators to interrupt or divert a system while preventing the system from learning that such an intervention is a threat. Though an emergency button for AI seems to make intuitive sense, that approach ultimately concentrates on the point when a system has already “gone rogue” and seeks to obstruct interference. A better approach would be to make ongoing self-evaluation and testing an integral part of a system’s operation, to diagnose how the system is in error, and to prevent chaos and risk before they start. In this paper, we describe the demands that recent big red button proposals have not addressed, and we offer a preliminary model of an approach that could better meet them. We argue for an ethical core (EC) that consists of a scenario-generation mechanism and a simulation environment that are used to test a system’s decisions in simulated worlds, rather than the real world. This EC would be kept opaque to the system itself: through careful design of memory and the character of the scenario, the system’s algorithms would be prevented from learning about the EC’s operation, its function, and ultimately its presence. By monitoring and checking for deviant behavior, we conclude, a continual testing approach will be far more effective, responsive, and vigilant toward a system’s learning and action in the world than an emergency button that one might not get to push in time.

Full Text