Abstract
With the rapid increase of textual information available electronically, there is an acute need for automatic textual analysis tools. Two communities have dealt with the problem of automatic textual analysis: information retrieval (IR) and information extraction (IE). Information retrieval has been very successful at the “document level”: locating, categorizing and filtering entire documents from large corpora. Unfortunately, it is very difficult to extend the information retrieval paradigm to more complex tasks such as topic segmentation, summarization and template information extraction. Information extraction has been relatively successful at the “word level”: extracting meaningful terms, finding their relationships, extracting information patterns, abstracting, etc. Unfortunately, information extraction techniques require extensive domain-dependent knowledge and are, for this reason, difficult to apply. In the last decade, artificial intelligence, and in particular machine learning (ML), has been applied to both IR and IE. For the most part, ML techniques have been used to estimate or parameterize certain functions within the standard IR or IE paradigms. This often improves performance and portability, but without a substantial change to the underlying model, the full potential of ML cannot be exploited. Our research concentrates on applying ML techniques with the objective of extending the capabilities of IR models, in particular the range and complexity of the tasks they can handle. (1 page)