Abstract

Subtopic segmentation is a critical task in numerous applications, including information retrieval, automatic summarization, and essay scoring. Although several approaches have been developed, many are ineffective for specific domains with a small corpus because the semantics of words and sentences in such a corpus are fuzzy. This paper addresses the problem of subtopic segmentation by proposing a fuzzy model for the semantics of both words and sentences. The model has three characteristics. First, it handles the uncertainty in the semantics of words and sentences. Second, it measures the fuzzy similarity between the fuzzy semantics of sentences. Third, it supports a fuzzy algorithm for segmenting a text into several subtopic segments. Experiments, especially on short texts with a small corpus in a specific domain, indicate that the method can improve the accuracy of subtopic segmentation over previous methods.
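
To make the idea of fuzzy sentence similarity and similarity-based segmentation concrete, here is a minimal sketch. It is not the model proposed in the paper, whose details the abstract does not give. It assumes each sentence is represented as a fuzzy set of terms (a hypothetical `FuzzySet` mapping from term to membership degree in [0, 1]), uses the standard fuzzy Jaccard measure (sum of minima over sum of maxima) as a stand-in similarity, and places a boundary wherever the similarity of adjacent sentences falls below an arbitrary `threshold`. The names `fuzzy_similarity` and `segment`, the membership values, and the threshold are all illustrative assumptions.

```python
from typing import Dict, List

# Assumption: a sentence is a fuzzy set of terms, term -> membership in [0, 1].
FuzzySet = Dict[str, float]


def fuzzy_similarity(a: FuzzySet, b: FuzzySet) -> float:
    """Fuzzy Jaccard similarity: sum of min memberships / sum of max memberships."""
    terms = set(a) | set(b)
    inter = sum(min(a.get(t, 0.0), b.get(t, 0.0)) for t in terms)
    union = sum(max(a.get(t, 0.0), b.get(t, 0.0)) for t in terms)
    return inter / union if union > 0 else 0.0


def segment(sentences: List[FuzzySet], threshold: float = 0.3) -> List[List[int]]:
    """Group sentence indices into segments.

    A new segment starts whenever the similarity between a sentence and its
    predecessor drops below `threshold` (an illustrative value, not from the paper).
    """
    if not sentences:
        return []
    segments = [[0]]
    for i in range(1, len(sentences)):
        if fuzzy_similarity(sentences[i - 1], sentences[i]) < threshold:
            segments.append([i])      # low similarity: open a new subtopic segment
        else:
            segments[-1].append(i)    # similar enough: extend the current segment
    return segments


# Hypothetical membership values chosen only for illustration.
sents = [
    {"fuzzy": 1.0, "set": 0.8},
    {"fuzzy": 0.9, "set": 0.5},
    {"corpus": 1.0, "domain": 0.7},
    {"corpus": 0.6, "domain": 0.7},
]
print(segment(sents))
```

With these example values, the first two sentences share terms and the last two share terms, while the middle pair shares none, so the sketch places a single boundary between sentences 1 and 2 (zero-indexed).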
