Named entity recognition (NER) is a widely used natural language processing technique that plays a key role in extracting information from sentences. Testing the correctness of NER systems is important but expensive, because an automated test oracle is normally unavailable. To address this oracle problem, this study proposes applying metamorphic testing (MT). The authors conduct a case study with Litigant, an industrial NER system of the Ant Group, and show that MT can effectively detect real-life bugs in the absence of an ideal oracle. The authors further investigate the causes of a series of detected entity recognition failures. The outcomes of this research further justify applying MT to the natural language processing domain and provide hints for practitioners to improve the quality assurance process of their NER systems.
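
To illustrate the general idea of metamorphic testing for NER, the following minimal Python sketch checks one metamorphic relation without needing the expected entities for any input. Everything in it is an assumption made for illustration: `toy_ner` is a hypothetical stand-in (not Litigant), the planted `window` bug is invented, and the relation "prepending an entity-free clause should not change the recognized entities" is an example relation, not necessarily one used in the study.

```python
import re
from typing import Callable, Optional, Set, Tuple

Entity = Tuple[str, str]  # (surface text, label)

# Hypothetical gazetteer for the toy recognizer.
KNOWN_PEOPLE = {"Alice", "Bob"}


def toy_ner(sentence: str, window: Optional[int] = 6) -> Set[Entity]:
    """Hypothetical NER: tag known names as PERSON.

    Planted bug (assumption for illustration): only the first
    `window` tokens are scanned. Pass window=None to scan all tokens.
    """
    tokens = re.findall(r"\w+", sentence)
    return {(t, "PERSON") for t in tokens[:window] if t in KNOWN_PEOPLE}


def mr_prefix_invariance(
    ner: Callable[[str], Set[Entity]], source: str, prefix: str
) -> bool:
    """Metamorphic relation: an entity-free prefix must not change the output.

    Returns True if the relation holds, False if it is violated.
    """
    follow_up = f"{prefix} {source}"
    return ner(source) == ner(follow_up)


source = "Alice sued Bob over the contract."
prefix = "According to the court record,"

# The buggy recognizer violates the relation; the unrestricted one satisfies it.
buggy_holds = mr_prefix_invariance(toy_ner, source, prefix)
fixed_holds = mr_prefix_invariance(lambda s: toy_ner(s, window=None), source, prefix)
print("buggy NER satisfies relation:", buggy_holds)
print("fixed NER satisfies relation:", fixed_holds)
```

The check compares two outputs of the same system rather than comparing one output against a known-correct answer, which is why no oracle is required; a violation signals a likely failure that can then be investigated.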