Abstract

I highlight a simple failure mode of state-of-the-art machine reading systems: they fail when contexts do not align with commonly shared beliefs. For example, machine reading systems fail to answer "What did Elizabeth want?" correctly in the context of "My kingdom for a cough drop, cried Queen Elizabeth." Biased by co-occurrence statistics in the training data of pretrained language models, systems predict "my kingdom" rather than "a cough drop". I argue such biases are analogous to human belief biases and present a carefully designed challenge dataset for English machine reading, called AUTO-LOCKE, to quantify such effects. Evaluations of machine reading systems on AUTO-LOCKE show the pervasiveness of belief bias in machine reading.
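To make the probe concrete, below is a minimal sketch of how the Elizabeth example can be posed to an off-the-shelf extractive question-answering model; the Hugging Face pipeline API and the specific checkpoint are illustrative assumptions, not the systems evaluated on AUTO-LOCKE.

```python
# Minimal sketch: query an extractive QA model with a belief-violating
# context and see which span it selects as the answer.
from transformers import pipeline

# Assumed example checkpoint; any extractive QA model could be substituted.
qa = pipeline(
    "question-answering",
    model="distilbert-base-cased-distilled-squad",
)

context = "My kingdom for a cough drop, cried Queen Elizabeth."
question = "What did Elizabeth want?"

result = qa(question=question, context=context)

# A belief-biased reader tends to return "my kingdom" (the span echoing the
# familiar quotation) instead of the contextually correct "a cough drop".
print(result["answer"], result["score"])
```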
