Abstract

Since noun phrases are the most common phrases in text, noun phrase identification is one of the vital subtasks of natural language processing. Chinese noun phrases generally have hierarchical inner structures. This paper proposes an approach that defines noun phrases at several levels of granularity to meet the demands of different applications. Three levels are proposed: concept noun phrases, base noun phrases, and entire noun phrases. The task of noun phrase identification is to label word sequences with phrase tags. Identification at every level of granularity is cast as a classification problem under certain encoding schemes. The experimental dataset is derived empirically from Chinese Penn Treebank 5.1. The F-measure for concept noun phrase, base noun phrase, and entire noun phrase identification reaches 92.12%, 84.13%, and 85.32%, respectively.
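
To make the idea of casting phrase identification as per-word classification concrete, the following is a minimal sketch. The abstract does not name the encoding scheme used, so the B/I/O tagging shown here is an assumption chosen for illustration, and the function names (`spans_to_bio`, `bio_to_spans`, `span_f1`) and the toy sentence are hypothetical, not taken from the paper.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) word indices, end exclusive


def spans_to_bio(n_words: int, spans: List[Span], label: str = "NP") -> List[str]:
    """Encode phrase spans as one tag per word (assumed B/I/O scheme)."""
    tags = ["O"] * n_words
    for start, end in spans:
        tags[start] = f"B-{label}"          # first word of the phrase
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # remaining words of the phrase
    return tags


def bio_to_spans(tags: List[str], label: str = "NP") -> List[Span]:
    """Decode a tag sequence back into phrase spans."""
    spans: List[Span] = []
    start = None
    for i, tag in enumerate(tags):
        if tag == f"I-{label}" and start is not None:
            continue                        # still inside the open phrase
        if start is not None:
            spans.append((start, i))        # close the open phrase
            start = None
        if tag == f"B-{label}":
            start = i                       # open a new phrase
    if start is not None:
        spans.append((start, len(tags)))
    return spans


def span_f1(gold: List[Span], predicted: List[Span]) -> float:
    """Span-level F-measure: a phrase counts only if both boundaries match."""
    correct = len(set(gold) & set(predicted))
    if correct == 0:
        return 0.0
    precision = correct / len(predicted)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)


# Toy sentence (illustrative only): two noun phrases, words 0 and 2..4.
words = ["我们", "喜欢", "自然", "语言", "处理"]
gold_spans = [(0, 1), (2, 5)]
tags = spans_to_bio(len(words), gold_spans)
```

Under this encoding, a classifier only has to predict one tag per word, and the phrase boundaries are recovered by decoding the tag sequence. Each granularity level would use its own gold spans, so the same encoding and scoring apply to all three.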
