Abstract

As the Internet grows increasingly popular, it is greatly changing our lives. We can conveniently send and receive information via the Internet. With the explosion of information on the Web, it is becoming crucial to develop means of automatically extracting important sentences from Web articles. In this paper, we propose a method that uses both statistical and structural information for sentence extraction. In addition, following an analysis of human extractions, several heuristic rules are added to filter out unimportant sentences and to prevent similar sentences from being extracted. Our experimental results demonstrate the effectiveness of these means. In particular, a significant improvement was observed once the heuristic rules were added.
