Abstract

Evaluating information access tasks, including textual and multimedia search, question answering, and understanding, has been the core mission of NIST's Retrieval Group since 1989. The TRECVID Evaluations of Multimedia Access began in 2001 with the goal of driving content-based search technology for multimedia, just as their progenitor, the Text Retrieval Conference (TREC), did for text and the web.

Highlights

  • The recent article “Challenges and Prospects in Vision and Language Research” by Kafle et al. (2019) identified several deficiencies in existing research in multimedia understanding

  • Once it became its own separate venue in 2003, TRECVID began with four tasks, each focused on one facet of the multimedia retrieval problem: shot boundary determination, story segmentation, high-level feature extraction, and search

  • Evaluation-driven research, using datasets to measure and improve the quality and effectiveness of algorithms, has grown from the early days of computer science to dominate the development of artificial intelligence

Summary

INTRODUCTION

The recent article “Challenges and Prospects in Vision and Language Research” by Kafle et al. (2019) identified several deficiencies in existing research in multimedia understanding. Existing benchmark tasks exhibit bias, are not robust, and induce spurious correlations that detract from, rather than reveal, advances in vision and language algorithms. These tasks frequently conflate a number of component tasks, such as object identification and entity coreference, which should be evaluated separately. Our group at NIST has found that embedding technology researchers within the process of developing the datasets, metrics, and methods used to evaluate that technology can create a cycle in which the technology advances along with our understanding of its capabilities, of how people might use it to improve their everyday lives, and of how we would know whether that is true. By linking research in visual understanding to the development of methods for measuring the degree of that understanding, we can continually improve our datasets and tasks.

BACKGROUND
TRECVID
Task History
Non-TRECVID Datasets
AUTOMATIC AND MANUAL EVALUATION
DESIGNING EVALUATION TASKS
Findings
CONCLUSION