Maritime vessel re-identification: novel VR-VCA dataset and a multi-branch architecture MVR-net

Amir Ghahremani,Egor Bondarev,Peter H N De With,Tunc Alkanat

doi:10.1007/s00138-021-01199-1

Open Access

Abstract

Maritime vessel re-identification (re-ID) is the computer vision task of matching vessel identities across disjoint camera views. Prominent applications of vessel re-ID exist in surveillance and maritime traffic flow analysis. However, the field lacks a large-scale dataset that enables the training of deep learning models. In this study, we present a new dataset of 4614 images of 729 vessels, annotated with 5-bin orientation and 8-class vessel-type labels, to promote further research. A second contribution of this study is a baseline re-ID analysis of the new dataset: the performance of 10 recent deep learning architectures is compared quantitatively to reveal the best practices. Lastly, we propose a novel multi-branch deep learning architecture, the Maritime Vessel Re-ID network (MVR-net), to address the challenging problem of vessel re-ID. Evaluating our approach on the new dataset yields 74.5% mAP and a 77.9% Rank-1 score, an increase of 5.7% mAP and 5.0% Rank-1 over the best-performing baseline. MVR-net also outperforms PRN, a pioneering vehicle re-ID network, by 2.9% mAP and 4.3% Rank-1.
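The abstract reports results as mAP and Rank-1 without defining them. As a rough illustration only, the sketch below computes both metrics from a query-by-gallery distance matrix in the way they are commonly defined for re-ID. The function name, the toy data, and the omission of same-camera filtering are assumptions made for this example; the evaluation protocol actually used for the dataset is not shown in this excerpt.

```python
def rank1_and_map(dist, query_ids, gallery_ids):
    """Return (Rank-1, mAP) for a query-by-gallery distance matrix.

    dist[i][j] is the distance between query i and gallery image j
    (smaller means more similar). The *_ids lists hold identity labels.
    Hypothetical helper for illustration; it does not filter out
    same-camera gallery images, which real protocols often do.
    """
    rank1_hits = 0
    average_precisions = []
    for qi, row in enumerate(dist):
        # Gallery indices ordered from most to least similar.
        order = sorted(range(len(row)), key=lambda j: row[j])
        matches = [gallery_ids[j] == query_ids[qi] for j in order]
        if not any(matches):
            continue  # no true match in the gallery: skip this query
        rank1_hits += int(matches[0])
        # Average precision: mean of the precision at each correct hit.
        hits, precisions = 0, []
        for rank, is_match in enumerate(matches, start=1):
            if is_match:
                hits += 1
                precisions.append(hits / rank)
        average_precisions.append(sum(precisions) / len(precisions))
    if not average_precisions:
        raise ValueError("no query has a matching gallery identity")
    n = len(average_precisions)
    return rank1_hits / n, sum(average_precisions) / n


# Toy data: two queries, three gallery images (all values made up).
dist = [[0.5, 0.1, 0.4],
        [0.8, 0.2, 0.3]]
rank1, mean_ap = rank1_and_map(dist, ["a", "b"], ["a", "b", "a"])
print(f"Rank-1 = {rank1:.3f}, mAP = {mean_ap:.3f}")
```

In the toy data, the first query's nearest gallery image has the wrong identity, so it counts as a Rank-1 miss and its correct matches at ranks 2 and 3 lower its average precision. The second query retrieves its only match first.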

Highlights

In recent years, the demand for automated surveillance systems has grown rapidly
We introduce a new dataset called VR-VCA (VCA: Video Coding and Architectures)
Maritime Vessel Re-ID network (MVR-net) generates 4.3% and 6.6% higher Rank-1 compared to PRN and MGN, respectively

Summary

Introduction

The demand for automated surveillance systems has grown rapidly. This is mainly due to the continuously decreasing cost of cameras and sensors, which has made video material broadly available, and to the inefficiency and high labor cost of having humans process this enormous amount of data. Numerous algorithms have been proposed to automate the analysis of video surveillance material and the subsequent alerting for specific events and dangerous situations. Automated analysis of video surveillance involves detecting objects and classifying them. These objects of interest include people, vehicles and maritime vessels. The camera fields of view are inevitably sparse compared to the full area where objects of interest may be

Results

Discussion

Conclusion