A Black Box Comparison of Machine Learning Reverse Image Search for Cybersecurity OSINT Applications

Esther Nanjala Wekesa, Andy Zhu, Casimer Decusatis

doi:10.3390/electronics12234822


Open Access



Journal: Electronics
Publication Date: Nov 29, 2023
Citations: 1
License type: CC BY 4.0

Affiliation: Marist College, Poughkeepsie Public Library District

Abstract

Machine learning algorithms for reverse image search (a subset of open-source intelligence, or OSINT) provide a free, useful tool for determining the content of an image, where and when it was captured, and, in some cases, whether it has been digitally modified. Using a test data set of 24 images, we compared the reverse image search performance of Google, Bing, and Yandex. Our black box experimental results are presented for three categories of images (uncluttered images, images with significant background clutter, and images used for facial recognition). The proportion of correctly identified images was highest for Google (65%), followed by Bing (55%) and Yandex (50%). Google was also the best at identifying cluttered, uncluttered, and facial images. We compare these results with previous studies and review how relative performance has changed over time. Recognition rates for all reverse search platforms tested were higher for original images not previously uploaded than for images used in earlier studies. We validate our results using exchangeable image file format (EXIF) data and error level analysis (ELA) of selected images. Based on these results, best practices for OSINT image investigation are proposed.
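As an illustration of how a black box comparison of this kind might be scored, the following minimal Python sketch tallies the fraction of correct results per search engine, overall and per image category. The function name, the record layout, and the sample outcomes are all hypothetical placeholders introduced here; they are not the study's data or the authors' tooling.

```python
from collections import defaultdict


def accuracy_by_engine(records):
    """Return the fraction of correct results per (engine, category).

    `records` is an iterable of (engine, category, correct) tuples.
    The pseudo-category "all" aggregates every category for an engine.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for engine, category, correct in records:
        # Count each trial once overall and once in its own category.
        for key in ((engine, "all"), (engine, category)):
            totals[key] += 1
            hits[key] += int(correct)
    return {key: hits[key] / totals[key] for key in totals}


# Hypothetical placeholder outcomes, NOT the results reported in the paper.
sample = [
    ("EngineA", "uncluttered", True),
    ("EngineA", "cluttered", False),
    ("EngineA", "facial", True),
    ("EngineB", "uncluttered", True),
    ("EngineB", "cluttered", False),
    ("EngineB", "facial", False),
]

rates = accuracy_by_engine(sample)
for (engine, category), rate in sorted(rates.items()):
    print(f"{engine:8s} {category:12s} {rate:.0%}")
```

Keeping one record per (engine, image) trial, rather than only running totals, makes it straightforward to break the same data down by category or to add further engines later.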

Full Text