Abstract

Noise in text can be defined as any difference between the surface form of a coded representation of the text and the intended, correct, or original text. By its very nature, noisy text calls for moving beyond traditional text analytics techniques: noise introduces challenges that need special handling, either through new methods or through improved versions of existing ones. Following the highly successful AND 2007, which was part of IJCAI 07, this second edition, held as part of SIGIR 08, adds the perspective of the Information Retrieval community to the topic. The goal of the AND workshops is to focus on the problems encountered in analyzing noisy documents coming from various sources. This workshop brought together a diverse group of researchers to present current research and development addressing this challenge. We were fortunate to have researchers from the Natural Language Processing, Machine Learning and Knowledge Management communities help us organize the workshop. The call for papers drew a very good response: we received 25 submissions spanning a diverse set of issues relevant to noisy text analytics. Each submission was reviewed by at least three members of the program committee. In the end, twelve papers were selected for oral presentation and four for poster presentation. To encourage discussion, the workshop program was structured into topic-oriented oral and poster sessions. In addition to the contributed papers, the program included a keynote address by Donna Harman (NIST), an invited talk by John Tait (IRF), and discussion sessions spread through the day.
