Abstract

Both compression and decompression play important roles in a web service system. A high compression ratio saves storage, while fast decompression reduces service response time. Focusing specifically on news web services, this paper proposes a compression mechanism that improves the efficiency of compression and decompression simultaneously by exploiting the semantic relations among webpages. First, webpages are clustered into news topics according to the semantic similarity among them. Webpages belonging to the same topic share much duplicate content, which improves the compression ratio under delta compression. Second, associated news topics are detected with the help of a multiple-semantics link network of news topics. Associated topics are compressed into the same zip file, which may reduce the number of decompression operations given how users typically read news on the Web. The authors apply the proposed compression mechanism to a practical news search engine, and the experimental results show that it achieves both a high compression ratio and fast decompression.
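The two steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: word-set Jaccard similarity stands in for the paper's semantic similarity, zlib with a preset dictionary stands in for delta compression, and the group of associated topics is assumed to be given by the caller rather than detected through a multiple-semantics link network. All function names and the threshold value are hypothetical.

```python
import io
import zipfile
import zlib


def jaccard(a: str, b: str) -> float:
    """Word-set Jaccard similarity (an assumed stand-in for semantic similarity)."""
    sa, sb = set(a.split()), set(b.split())
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def cluster_pages(pages: dict[str, str], threshold: float = 0.5) -> list[list[str]]:
    """Greedy clustering: a page joins the first topic whose first page is similar enough."""
    topics: list[list[str]] = []
    for name, text in pages.items():
        for topic in topics:
            if jaccard(pages[topic[0]], text) >= threshold:
                topic.append(name)
                break
        else:
            topics.append([name])
    return topics


def delta_compress(base: bytes, target: bytes) -> bytes:
    """Compress target against base by using base as a zlib preset dictionary."""
    comp = zlib.compressobj(level=9, zdict=base)
    return comp.compress(target) + comp.flush()


def delta_decompress(base: bytes, delta: bytes) -> bytes:
    """Invert delta_compress; the same base must be supplied."""
    decomp = zlib.decompressobj(zdict=base)
    return decomp.decompress(delta) + decomp.flush()


def pack_topics(pages: dict[str, str], topics: list[list[str]]) -> bytes:
    """Write a group of topics (assumed to be associated) into a single zip archive.

    The first page of each topic is stored as a zlib-compressed base;
    every other page is stored as a delta against that base.
    """
    buf = io.BytesIO()
    # ZIP_STORED: entries are already compressed, the zip is only a container.
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
        for i, topic in enumerate(topics):
            base = pages[topic[0]].encode("utf-8")
            zf.writestr(f"topic{i}/{topic[0]}.base", zlib.compress(base, 9))
            for name in topic[1:]:
                delta = delta_compress(base, pages[name].encode("utf-8"))
                zf.writestr(f"topic{i}/{name}.delta", delta)
    return buf.getvalue()
```

Because all pages of the associated topics sit in one archive, a reader who moves from one topic to a related one needs only a single archive fetch in this sketch; whether that matches the paper's measured gains depends on details the abstract does not give.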

