Abstract
Unmanned aerial vehicles (UAVs) must keep track of their location in order to maintain flight plans. Currently, this task is almost entirely performed by a combination of Inertial Measurement Units (IMUs) and reference to Global Navigation Satellite Systems (GNSS). Navigation by GNSS, however, is not always reliable, due to causes both natural (reflection and blockage from objects, technical faults, inclement weather) and artificial (GPS spoofing and denial). In such GPS-denied situations, it is desirable to have additional methods for aerial geolocalization. One such method is visual geolocalization, in which aircraft use their ground-facing cameras to localize and navigate. The state of the art in many ground-level image processing tasks involves Convolutional Neural Networks (CNNs). We present here a study of how effectively a modern CNN designed for visual classification can be applied to the problem of Absolute Visual Geolocalization (AVL, localization without a prior location estimate). An Xception-based architecture is trained from scratch over a >1000 km² section of Washington County, Arkansas to directly regress latitude and longitude from images drawn from different orthorectified high-altitude survey flights. On unseen image sets covering the same region in different years and seasons, it achieves an average localization error as low as 115 m, which localizes to 0.004% of the training area, or about 8% of the width of the 1.5 × 1.5 km input image. This demonstrates that CNNs are expressive enough to encode robust landscape information for geolocalization over large geographic areas. Methods of providing uncertainty estimates for CNN regression outputs are also discussed, along with areas of potential future improvement for the use of deep neural networks in visual geolocalization.
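To make the setup described above concrete, the sketch below shows one way an Xception backbone could be configured to regress latitude and longitude directly from aerial image tiles, along with a Monte Carlo dropout loop as one common way of attaching uncertainty to CNN regression outputs. This is a minimal illustration, not the authors' implementation: the input size, dropout rate, head width, mean-squared-error loss, and the mc_dropout_predict helper are assumptions for the example; the paper only specifies an Xception-based architecture regressing latitude/longitude and a discussion of uncertainty methods.

```python
# Minimal sketch (assumed details, not the paper's exact code): an Xception
# backbone trained from scratch to regress normalized latitude/longitude
# from aerial image tiles, plus Monte Carlo dropout for uncertainty.
import tensorflow as tf

def build_avl_regressor(input_shape=(299, 299, 3)):
    # Xception backbone without pretrained weights ("trained from scratch"),
    # global-average-pooled into a single feature vector per tile.
    backbone = tf.keras.applications.Xception(
        include_top=False, weights=None,
        input_shape=input_shape, pooling="avg")
    x = backbone.output
    # Dropout that can be kept active at inference (training=True) enables a
    # simple Monte Carlo estimate of predictive uncertainty.
    x = tf.keras.layers.Dropout(0.3)(x)
    x = tf.keras.layers.Dense(256, activation="relu")(x)
    # Two linear outputs: normalized latitude and longitude.
    coords = tf.keras.layers.Dense(2, activation="linear")(x)
    model = tf.keras.Model(backbone.input, coords)
    model.compile(optimizer="adam", loss="mse")  # assumed loss; paper may differ
    return model

def mc_dropout_predict(model, images, n_samples=20):
    """Monte Carlo dropout: repeated stochastic forward passes give a mean
    coordinate prediction and a per-coordinate standard deviation that can
    serve as a rough uncertainty proxy."""
    preds = tf.stack([model(images, training=True) for _ in range(n_samples)])
    return tf.reduce_mean(preds, axis=0), tf.math.reduce_std(preds, axis=0)
```

In practice the predicted coordinates would be normalized to the training region's extent and converted back to latitude/longitude afterward; the dropout-based spread is only one of several possible uncertainty measures for regression heads.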
Highlights
Pilotage or piloting is the practice of using vision to navigate an aircraft, usually by reference to terrain and landmarks
An Xception-based architecture is trained from scratch over a >1000 km² section of Washington County, Arkansas to directly regress latitude and longitude from images drawn from different orthorectified high-altitude survey flights. On unseen image sets covering the same region in different years and seasons, it achieves an average localization error as low as 115 m, which localizes to 0.004% of the training area, or about 8% of the width of the 1.5 × 1.5 km input image. This demonstrates that Convolutional Neural Networks (CNNs) are expressive enough to encode robust landscape information for geolocalization over large geographic areas
This paper aims to address the question of how effectively a modern CNN architecture that has been successfully used on ground-level imagery tasks can be repurposed for absolute visual localization (AVL)
Summary
Pilotage or piloting is the practice of using vision to navigate an aircraft, usually by reference to terrain and landmarks. This is in contrast to flying by instrument. Vision and visual flight remain immensely valuable to pilots when available, both for improving navigation quality and as a check against instrument error. Yet this valuable source of navigation capability and backup instrumentation is currently not being utilized. This is of particular concern for security and reliability, since external navigational aids may not always be available or dependable [1].