Abstract

In this paper, we present a document visualization technique for data analysis based on the semantic representation of text in the form of a directed graph, referred to as a semantic graph. It is derived using natural language processing as follows. First, subject-verb-object triplets are automatically extracted from the Penn Treebank parse tree obtained for each sentence in the document. Second, the triplets are enhanced by linking them to their corresponding co-referenced named entities, resolving pronominal anaphors, and attaching the associated WordNet synsets. Starting from the document's semantic graph and the list of extracted triplets, we automatically generate the document summary, for which we also derive the semantic representation.
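As a rough illustration of the graph-construction step only, the sketch below assembles a directed graph from triplets that have already been extracted. It is not the authors' implementation: the `Triplet` type, the `build_semantic_graph` function, the `coref` mapping, and the sample sentences are all hypothetical, and parsing, named-entity linking, and WordNet lookup are assumed to have happened upstream.

```python
from collections import defaultdict
from typing import NamedTuple


class Triplet(NamedTuple):
    """Hypothetical container for one subject-verb-object triplet."""
    subject: str
    verb: str
    obj: str


def build_semantic_graph(triplets, coref=None):
    """Build a directed graph: subject node -> list of (verb, object) edges.

    `coref` is an assumed mapping from a mention (e.g. a pronoun) to the
    entity it refers to; mentions not in the mapping are kept unchanged.
    """
    coref = coref or {}
    graph = defaultdict(list)
    for t in triplets:
        # Merge co-referring mentions into a single node.
        subj = coref.get(t.subject, t.subject)
        obj = coref.get(t.obj, t.obj)
        edge = (t.verb, obj)
        # Skip duplicate edges so repeated statements add no new arcs.
        if edge not in graph[subj]:
            graph[subj].append(edge)
    return dict(graph)


# Illustrative input, as if extracted from three short sentences.
triplets = [
    Triplet("Alice", "founded", "Acme"),
    Triplet("she", "hired", "Bob"),
    Triplet("Bob", "joined", "Acme"),
]
coref = {"she": "Alice"}  # assumed output of pronominal anaphora resolution

graph = build_semantic_graph(triplets, coref)
for node, edges in graph.items():
    for verb, target in edges:
        print(f"{node} --{verb}--> {target}")
```

Resolving mentions before adding edges is what lets "she" and "Alice" collapse into one node, so the graph connects statements made about the same entity across sentences.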
