Abstract
Shared task evaluation campaigns represent a well-established form of competitive evaluation, an important opportunity to propose and tackle new challenges for a specific research area, and a way to foster the development of benchmarks, tools and resources. The advantages of this approach are evident in any experimental field, including Natural Language Processing. An outlook on state-of-the-art language technologies for Italian can be obtained by reflecting on the results of the recently held workshop “Evaluation of NLP and Speech Tools for Italian”, EVALITA 2014. The motivations underlying individual shared tasks, the level of knowledge and development achieved within each of them, the impact on applications, society and the economy at large, as well as directions for future research, will be discussed from this perspective.
Highlights
Evaluation of achieved results is a crucial part of scientific research
For the DPIE task, the standard evaluation in terms of Labeled Attachment Score (LAS) / Unlabeled Attachment Score (UAS) computed on individual attachments does not always seem to correlate with the evaluation based on semantically oriented relations, which are more relevant for Information Extraction applications, as suggested among others by [79] (a minimal sketch of how LAS/UAS are computed follows these highlights)
It can be observed that, in this case, there is significant overlap among the outlines: low-scoring relations are hard to predict for every participating system, though to different extents
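For readers unfamiliar with the metrics mentioned above, the following minimal Python sketch (not taken from the paper; the function name, token indices and dependency labels are illustrative only) shows how LAS and UAS are conventionally computed over individual attachments: UAS counts tokens whose predicted head matches the gold head, while LAS additionally requires the dependency label to match.

```python
def attachment_scores(gold, predicted):
    """gold and predicted are lists of (head_index, label) pairs,
    one pair per token, aligned by position."""
    assert len(gold) == len(predicted)
    # UAS: predicted head index matches the gold head index
    uas_hits = sum(1 for (gh, _), (ph, _) in zip(gold, predicted) if gh == ph)
    # LAS: both the head index and the dependency label match
    las_hits = sum(1 for (gh, gl), (ph, pl) in zip(gold, predicted)
                   if gh == ph and gl == pl)
    n = len(gold)
    return uas_hits / n, las_hits / n

# Illustrative example (made-up attachments, not EVALITA data):
gold = [(2, "nsubj"), (0, "root"), (2, "obj")]
pred = [(2, "nsubj"), (0, "root"), (2, "iobj")]
uas, las = attachment_scores(gold, pred)
print(f"UAS = {uas:.2f}, LAS = {las:.2f}")  # UAS = 1.00, LAS = 0.67
```

Note that both scores are averaged over individual attachments, which is precisely why, as the highlight above points out, they need not track an evaluation that aggregates over semantically oriented relations.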
Summary
Evaluation of achieved results is a crucial part of scientific research. This applies to the area of Natural Language Processing (NLP): establishing a well-grounded evaluation methodology makes it easier to track advances in the field and to assess the impact of the work done. Comparing the results of different systems is not a trivial task, as many parameters can affect and influence this process. To overcome this issue, over the last ten years shared task evaluation campaigns have become increasingly popular as a competitive form of evaluation. Shared task evaluation campaigns represent an important opportunity to investigate ways to tackle the challenges a specific research area is facing, where different approaches to a well-defined problem are compared based on their performance on the same task with respect to the same dataset. The datasets used within evaluation campaigns become reference resources of the scientific community and are used to assess the effectiveness and performance of a given system or technology with respect to a specific task.