Abstract
We propose that the logic of a genie, an agent that exploits an ambiguous request to intentionally misunderstand a stated goal, underlies a common and consequential phenomenon that falls well within what is currently called proxy failure. We argue that such intentional misunderstandings are not covered by the currently proposed framework for proxy failures, and we suggest expanding the framework to include them.