Abstract

Events are an integral part of our day-to-day search needs. Users search for many kinds of events, such as political events, organizational announcements, policy changes, personal events, and criminal activity. In linguistics, events are often thought of as discourse entities with associated complex structure and attributes. Many professionals look for patterns that involve event occurrences. Journalists, financial analysts, intelligence analysts, attorneys conducting investigations, and auditors examining corporate records are examples of users who may want to find such information and arrange it in ways that help to produce meaningful analyses. The goal of this work is to develop effective information retrieval systems that help users satisfy event-related information needs. My particular interest is in events that are decomposable into subevents in ways that can be anticipated. I am interested in modeling decomposable events, and in automatically recognizing references to subevents, both to help with finding relevant documents and to help with presenting diverse results to the user.

I plan to pursue this broader goal in three stages, each of which involves creating a test collection. I have started by developing information retrieval test collections for news, building on existing collections of news stories from the Text Analysis Conference (TAC) and the Topic Detection and Tracking (TDT) evaluations [1]. Next, I plan to build a new event ontology for email, and to use that ontology as a basis for building an information retrieval test collection from the Avocado email collection. Finally, I plan to extend one or both of these test collections to support research on characterizing event impact, thus perhaps providing an additional basis for ranking. In each stage of my work, the new test collection(s) will enable new research.

In my first stage, which is in progress, I have studied the effect of automatically detected subevents on ranking effectiveness. Using the Rich Entities, Relations and Events (ERE) ontology of event types and subtypes from the TAC Event track [5], and two existing automated event detection systems, I have developed a simple bag-of-words-and-events search system that uses the automatically detected event type information. I also built an information retrieval test collection from a TDT collection for which event-based topics exist. Evaluation results show promise when compared to baseline approaches. I plan to further develop this line of work to identify ways to automatically decompose high-level events into their components.

In my second stage, with the goal of extending the event retrieval work to an organizational setting, I plan to work with the Avocado email collection [3]. While newswire text and other Web pages are easily available for anyone's perusal, email content is interesting because it may contain organizational and personal events that differ from those found in news, and those events may be referred to in less fully contextualized ways. In this line of research, I want to build a retrieval system for events in this genre of personal and organizational content. The retrieval system will provide a diverse ranking covering the full range of subevents for which information is available. To do this, I will need a reusable test collection containing event-related topics and documents, with the event nuggets within those documents annotated for relevance. This annotation at the event nugget level will support computing measures like α-nDCG [2], in which the gain reflects in part the number of different event nuggets in a document.

The third stage of my work will be more exploratory, since there are many ways in which one might conceptualize event impact. My initial approach will be to explore alternative indicators for different types of impact. For example, the societal impact of news stories might be characterized by the number of readers, whereas the personal impact of an email might be characterized by the time before a reply is received. I am particularly interested in how sentiment analysis might be used to characterize event impact, especially with reference to publicly available user-generated content.
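
To illustrate how nugget-level annotation could feed a diversity-aware measure such as α-nDCG [2], the following is a minimal sketch and not the evaluation code used in this work. It assumes each document is represented as the set of event nugget identifiers judged relevant in it; the function names, the nugget identifiers in the usage note, and the default α = 0.5 are illustrative assumptions. Because finding the exact ideal ranking is computationally hard, the sketch uses the common greedy approximation for the normalizer.

```python
import math


def nugget_gain(nuggets, seen, alpha):
    """Gain of one document: each nugget is discounted by (1 - alpha)
    for every earlier document that already contained it."""
    return sum((1 - alpha) ** seen.get(n, 0) for n in nuggets)


def alpha_dcg(ranking, alpha=0.5, depth=None):
    """alpha-DCG of a ranking given as a list of nugget-id sets."""
    seen = {}  # nugget id -> number of higher-ranked documents containing it
    score = 0.0
    for rank, nuggets in enumerate(ranking[:depth], start=1):
        score += nugget_gain(nuggets, seen, alpha) / math.log2(rank + 1)
        for n in nuggets:
            seen[n] = seen.get(n, 0) + 1
    return score


def greedy_ideal(pool, alpha=0.5, depth=None):
    """Greedy approximation of the ideal ranking over the judged pool."""
    remaining = list(pool)
    limit = len(remaining) if depth is None else min(depth, len(remaining))
    seen, ideal = {}, []
    while len(ideal) < limit:
        best = max(remaining, key=lambda d: nugget_gain(d, seen, alpha))
        remaining.remove(best)
        ideal.append(best)
        for n in best:
            seen[n] = seen.get(n, 0) + 1
    return ideal


def alpha_ndcg(ranking, pool, alpha=0.5, depth=None):
    """alpha-DCG of the ranking divided by that of the greedy ideal."""
    ideal_score = alpha_dcg(greedy_ideal(pool, alpha, depth), alpha, depth)
    return alpha_dcg(ranking, alpha, depth) / ideal_score if ideal_score > 0 else 0.0
```

As a usage example under the same assumptions, a ranking such as `[{"hire"}, {"hire"}, {"resign"}]` scores below 1.0 against its own pool, because the second document repeats a nugget that has already been seen, whereas the order `[{"hire"}, {"resign"}, {"hire"}]` reaches the greedy ideal.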
