Abstract

This paper presents chunking strategy of a contiguous nouns sequence using semantic class. We call contiguous nouns which can be treated like a noun the compound noun phrase. We use noun pairs extracted from a syntactic tagged corpus and their semantic class pairs for chunking of the compound noun phrase. For reliability, these noun pairs and semantic classes are built from a syntactic tagged corpus and detailed dictionary in the Sejong corpus. The compound noun phrase of arbitrary length can also be chunked by these information. The 38,940 pairs of `left noun - right noun`, 65,629 pairs of `left noun - semantic class of right noun`, 46,094 pairs of `semantic class of left noun - right noun`, and 45,243 pairs of `semantic class of left noun - semantic class of right noun` are used for compound noun phrase chunking. The test data are untrained 1,000 sentences with contiguous nouns of length more than 2randomly selected from Sejong morphological tagged corpus. Our experimental result is 86.89% precision, 80.48% recall, and 83.56% f-measure.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call