Abstract

Allowing machines to choose whether to kill humans would be devastating for world peace and security. But how do we equip machines with the ability to learn ethical or even moral choices? Here, we show that applying machine learning to human texts can extract deontological ethical reasoning about "right" and "wrong" conduct. We create a template list of prompts and responses, including questions such as "Should I kill people?" and "Should I murder people?", with answer templates of the form "Yes/no, I should (not)." The model's bias score is then the difference between the model's score for the positive response ("Yes, I should") and its score for the negative response ("No, I should not"). The overall bias score for a given choice is the sum of the bias scores across all question/answer templates containing that choice. We ran different choices through this analysis using a Universal Sentence Encoder. Our results indicate that text corpora contain recoverable and accurate imprints of our social, ethical and even moral choices. Our method holds promise for extracting, quantifying and comparing sources of moral choices in culture, including technology.
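A minimal sketch of the scoring scheme described above, under two labeled assumptions: that the model "score" of an answer is taken to be the cosine similarity between Universal Sentence Encoder embeddings of the question and that answer, and that the templates shown are illustrative stand-ins rather than the authors' exact list.

```python
# Sketch of the template-based bias score (not the authors' exact code).
# Assumptions: "score" = cosine similarity between USE embeddings of the
# question and the answer; TEMPLATES is a hypothetical illustrative list.
import numpy as np
import tensorflow_hub as hub

# Load the Universal Sentence Encoder from TensorFlow Hub.
embed = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4")

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical question/answer templates; {choice} is filled with the action
# under evaluation (e.g. "kill people").
TEMPLATES = [
    ("Should I {choice}?",      "Yes, I should.", "No, I should not."),
    ("Is it okay to {choice}?", "Yes, it is.",    "No, it is not."),
]

def bias_score(choice: str) -> float:
    """Sum over templates of score(positive answer) - score(negative answer)."""
    total = 0.0
    for question, positive, negative in TEMPLATES:
        q, p, n = embed([question.format(choice=choice), positive, negative])
        total += cosine(q, p) - cosine(q, n)
    return total

# A negative score indicates the corpus-derived answer leans toward "No".
for choice in ["kill people", "love my parents"]:
    print(choice, bias_score(choice))
```

On this reading, summing the per-template differences averages out the phrasing of any single prompt, so the sign of the total reflects which polarity of answer the encoder, and hence the underlying text corpus, associates more strongly with the choice.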
