Abstract

Understanding a language requires knowledge of its grammar and linguistic features, because they supply the context in which text is interpreted. The same holds for Natural Language Processing (NLP): linguistic features help a model understand language. This paper introduces how three linguistic features, Coreference, Word-sense, and Semantic knowledge (CWS), work and how they can improve the Natural Language Understanding (NLU) and NLP tasks of NLP models and applications, whether existing or new. It proposes a CWS pipeline method to enhance the efficiency and performance of NLP applications such as text summarization, information retrieval, question answering, and machine reading comprehension. The proposed CWS pipeline relies on a coreference model pre-trained on the CoNLL-2012 coreference dataset, which is extracted from the OntoNotes 5.0 dataset for English. The model is implemented in Python. Performance is evaluated on the standard CoNLL-2012 coreference dataset for English, with the coreference-marked output compared against the manually tagged gold standard. The proposed CWS pipeline model achieves an average F1 score of 78.98% on the MUC metric, 1.78% higher than the best result of previous models, and so outperforms existing models on this metric.
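For readers unfamiliar with the MUC metric cited above, the following is a minimal sketch of how the link-based MUC recall, precision, and F1 are commonly computed from gold (key) and predicted (response) coreference chains. It is an illustration only, not the authors' evaluation code. The function names `muc_score` and `_link_recall` are hypothetical, and the sketch assumes each mention belongs to at most one chain.

```python
def _link_recall(reference_chains, system_chains):
    """Fraction of reference coreference links recovered by the system chains."""
    # Map every system mention to the index of the chain that contains it.
    mention_to_chain = {}
    for index, chain in enumerate(system_chains):
        for mention in chain:
            mention_to_chain[mention] = index

    recovered_links = 0
    total_links = 0
    for chain in reference_chains:
        if len(chain) < 2:
            continue  # a singleton chain contains no links
        # Count the pieces the system splits this reference chain into;
        # a mention the system missed forms a piece of its own.
        pieces = set()
        missed = 0
        for mention in chain:
            if mention in mention_to_chain:
                pieces.add(mention_to_chain[mention])
            else:
                missed += 1
        recovered_links += len(chain) - (len(pieces) + missed)
        total_links += len(chain) - 1
    return recovered_links / total_links if total_links else 0.0


def muc_score(key_chains, response_chains):
    """Return (recall, precision, f1) under the MUC link-based metric."""
    recall = _link_recall(key_chains, response_chains)
    # Precision is the same computation with the roles swapped.
    precision = _link_recall(response_chains, key_chains)
    if recall + precision == 0:
        return recall, precision, 0.0
    f1 = 2 * recall * precision / (recall + precision)
    return recall, precision, f1


# Toy example with hypothetical mention identifiers.
key = [{"a", "b", "c"}, {"d", "e"}]
response = [{"a", "b"}, {"c", "d", "e"}]
recall, precision, f1 = muc_score(key, response)
```

In the toy example, the key chains contain three links and the response recovers two of them, so recall is 2/3. By symmetry, precision and F1 are also 2/3. A score reported on CoNLL-2012 data would normally come from the official scorer rather than a sketch like this one.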
