Abstract
This paper presents CSGM2, a text preprocessing technique for compression purposes. It converts the original text into a word net (graph representation) and can retain the detailed contextual information such as word proximity. Specific directed graph is proposed to model this word net where words are stored in vertices and edges represent word transitions. The word net is fully capable of holding the natural word order in the original text and hence can be used directly for encoding purposes.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have