Abstract

This paper explores the application of arithmetic coding to full-text retrieval systems, which store a large body of text together with a lexicon that lists the words and a concordance that records the exact locations at which each word occurs. A typical query might seek all sentences that contain a particular word or combination of words. The random-access requirement means that many current compression techniques, particularly those using adaptive modelling, are not directly applicable. However, the static nature of the text and the existence of a lexicon provide help that is not available in other compression scenarios. A number of different kinds of model, developed for different parts of a full-text retrieval system, are presented and evaluated.
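To make the lexicon, concordance, and query concrete, the following is a minimal, uncompressed sketch in Python. It is an illustration only and not the system described in the paper: the names `build_index` and `query`, the tokenisation rule, and the (sentence number, word number) location format are all assumptions made for this example, and no arithmetic coding is applied.

```python
import re
from collections import defaultdict


def build_index(sentences):
    """Build a toy lexicon and concordance from a list of sentence strings.

    Assumption for illustration: a "location" is a pair
    (sentence number, word number within that sentence).
    """
    concordance = defaultdict(list)
    for sent_no, sentence in enumerate(sentences):
        # Assumed tokenisation: lower-cased runs of letters, digits, apostrophes.
        words = re.findall(r"[a-z0-9']+", sentence.lower())
        for word_no, word in enumerate(words):
            concordance[word].append((sent_no, word_no))
    lexicon = sorted(concordance)  # the distinct words, in sorted order
    return lexicon, dict(concordance)


def query(concordance, *words):
    """Return the sorted sentence numbers that contain every word in `words`."""
    if not words:
        return []
    # One set of sentence numbers per query word; unknown words give an empty set.
    hits = [{s for s, _ in concordance.get(w.lower(), [])} for w in words]
    return sorted(set.intersection(*hits))


sentences = ["The cat sat.", "A dog sat down.", "The dog barked."]
lexicon, concordance = build_index(sentences)
print(query(concordance, "dog", "sat"))
```

Answering the query only touches the concordance entries for the requested words, after which the matching sentences must be fetched individually. That access pattern is the random-access requirement the abstract refers to: each sentence has to be decodable without first decoding everything that precedes it.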
