Abstract

Text summarization has become an increasingly established field over the past ten years. We may soon reach a stage where researchers will be able to design robust text summarization systems and provide them to everyday users. Users of text summarization are many, ranging from Internet surfers who lack the time to locate and digest all the latest news available on the web to scientists who cannot keep pace with the burgeoning number of technical publications but must, nonetheless, be familiar with the latest findings in their fields. Given texts to summarize, there are no a priori criteria for determining what is relevant to the summary. When humans summarize texts, they identify the information that they think will be of interest to the readers. Summarization is thus a function not only of the input documents but also of the reader’s mental state: who the reader is, what his knowledge before reading the summary consists of, and why he wants to know about the input texts. This fact has long been acknowledged by both the psycho-linguistic and the computational-linguistic communities. However, both communities agree that trying to model the reader’s mental state is far too complicated, if not entirely impossible. Given this dilemma, most computational-linguistic research in summarization has assumed that the “reader variable” is a constant and has focused on defining a general notion of salience, valid for all readers. In my thesis, I investigate strategies for taking user characteristics into account in the summarization process. Acquiring a user model is in itself a broad subject of research. I do not focus on ways to acquire a user model; instead, I assume that a user model already exists in my framework. My focus is on the challenges entailed in incorporating knowledge about the user into summarization strategies and providing the user with a text relevant to his needs.
In my work, two types of user tailoring are examined: individualized, i.e., the specific facts in which the reader is interested, and class-based, i.e., the degree of expertise of the reader. My research framework consists of PERSIVAL, a digital library that provides tailored access to medical technical literature for both physicians and patients. When treat-
