Abstract

This paper presents CSGM2, a text preprocessing technique for compression purposes. It converts the original text into a word net (graph representation) and can retain the detailed contextual information such as word proximity. Specific directed graph is proposed to model this word net where words are stored in vertices and edges represent word transitions. The word net is fully capable of holding the natural word order in the original text and hence can be used directly for encoding purposes.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call