Çorba: crowdsourcing to obtain requirements from regulations and breaches

Hui Guo,Anne-Liz Jeukeng,Özgür Kafalı,Munindar P. Singh,Laurie Williams

doi:10.1007/s10664-019-09753-2

Abstract

Modern software systems are deployed in sociotechnical settings, combining social entities (humans and organizations) with technical entities (software and devices). In such settings, on top of technical controls that implement security features of software, regulations specify how users should behave in security-critical situations. No matter how carefully the software is designed and how well regulations are enforced, such systems are subject to breaches due to social (user misuse) and technical (vulnerabilities in software) factors. Breach reports, often legally mandated, describe what went wrong during a breach and how the breach was remedied. However, breach reports are not formally investigated in current practice, leading to valuable lessons being lost regarding past failures. Our research aim is to aid security analysts and software developers in obtaining a set of legal, security, and privacy requirements, by developing a crowdsourcing methodology to extract knowledge from regulations and breach reports. We present Corba, a methodology that leverages human intelligence via crowdsourcing, and extracts requirements from textual artifacts in the form of regulatory norms. We evaluate Corba on the US healthcare regulations from the Health Insurance Portability and Accountability Act (HIPAA) and breach reports published by the US Department of Health and Human Services (HHS). Following this methodology, we have conducted a pilot and a final study on the Amazon Mechanical Turk crowdsourcing platform. Corba yields high quality responses from crowd workers, which we analyze to identify requirements for the purpose of complementing HIPAA regulations. We publish a curated dataset of the worker responses and identified requirements. The results show that the instructions and question formats presented to the crowd workers significantly affect the response quality regarding the identification of requirements. We have observed significant improvement from the pilot to the final study by revising the instructions and question formats. Other factors, such as worker types, breach types, or length of reports, do not have notable effect on the workers’ performance. Moreover, we discuss other potential improvements such as breach report restructuring and text highlighting with automated methods.

Full Text