Recent developments in language modeling have enabled large text encoders to derive a wealth of linguistic information from raw text corpora without supervision. Their success across natural language processing (NLP) tasks has called into question the role of man-made computational resources, such as verb lexicons, in supporting modern NLP. Still, probing analyses have concurrently exposed the limits of the knowledge these large neural architectures possess, revealing them to be clever task solvers rather than self-taught linguists. Can human-designed lexical resources still help fill their knowledge gaps? Focusing on verb classification, we discuss approaches to generating verb classes across languages and weigh the relative benefits of undertaking expensive lexicographic work against outsourcing the task to untrained native speakers. We then consider the evidence for the utility of augmenting pretrained language models with external verb knowledge and ponder the ways in which human expertise can continue to benefit multilingual NLP.