Evaluating Vision-Language Models on the TriangleCOPA Benchmark

Ankur Chemburkar, Andrew Gordon, Andrew Feng

doi:10.32473/flairs.37.1.135485

Journal: The International FLAIRS Conference Proceedings
Publication Date: May 13, 2024
License type: CC BY-NC 4.0

Abstract

The TriangleCOPA benchmark consists of 100 textual questions with videos depicting the movements of simple shapes in the style of the classic social-psychology film created by Fritz Heider and Marianne Simmel in 1944. In our experiments, we investigate the performance of current vision-language models on this challenging benchmark, assessing the capability of these models for visual anthropomorphism and abstract interpretation.

Full Text