Abstract

Automatic identification of widely known facts not only plays a crucial role in the automatic curation of common-sense knowledge databases, but also benefits many natural language processing applications, such as search, question answering, and common-sense reasoning. To improve such identification, it is important to understand the linguistic characteristics of writing about widely known facts, if such characteristics exist. To investigate them, we juxtapose sentences that report an event with the database entries that contain information about that event. We designed a task of identifying the number of databases in which the event of interest appears. From the experiment, we found that the number of databases in which an event is registered is strongly correlated with the writing style in which the event is described in the literature. This means that writing style reveals individual authors' multi-level understanding of how prominent a piece of knowledge is. To the best of our knowledge, our work is the first to offer extrinsic evidence supporting the existence of such multi-level cognition. (KAIST)
