Abstract

Real-time camera localization is a key enabler for interactive network services, e.g., visualizing network performance with augmented reality (AR) on user devices. We propose a deep learning and sensor fusion approach for real-time camera localization. A multi-input deep neural network is designed to regress the camera pose from a single image and motion sensor measurements. We perform a comprehensive analysis to find the best choices of input features, loss function, convolutional neural network model, and hyperparameters. We show that by adding features extracted from motion sensor data, our approach significantly outperforms state-of-the-art vision-based camera localization approaches. In an indoor environment, where we conduct a proof of concept of the proposed end-to-end AR-supported radio map visualization solution, our camera localization approach achieves an orientation error of 2.5179° and a position error of 0.0222 m, with an inference time below 4 ms per frame.
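For concreteness, below is a minimal PyTorch sketch of a multi-input pose-regression network in the spirit the abstract describes: a CNN branch over the image, an MLP branch over motion sensor features, and fused regression heads for position and orientation. The specific backbone (ResNet-34), the IMU feature dimension, and the head layout are illustrative assumptions, not the paper's reported architecture.

```python
# Sketch of a multi-input camera pose regressor (image + motion sensors).
# Assumptions: ResNet-34 image backbone, 6-d IMU input, quaternion
# orientation output -- all illustrative, not the paper's exact design.
import torch
import torch.nn as nn
from torchvision import models

class FusionPoseNet(nn.Module):
    def __init__(self, imu_dim=6, fused_dim=512):
        super().__init__()
        # Image branch: pretrained CNN with the classifier head removed,
        # leaving a 512-d feature vector per frame.
        backbone = models.resnet34(weights=models.ResNet34_Weights.DEFAULT)
        backbone.fc = nn.Identity()
        self.image_branch = backbone
        # Motion-sensor branch: small MLP over IMU measurements.
        self.imu_branch = nn.Sequential(
            nn.Linear(imu_dim, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
        )
        # Fusion layer plus separate heads for position (x, y, z)
        # and orientation (unit quaternion).
        self.fusion = nn.Sequential(
            nn.Linear(512 + 128, fused_dim), nn.ReLU(),
        )
        self.position_head = nn.Linear(fused_dim, 3)
        self.orientation_head = nn.Linear(fused_dim, 4)

    def forward(self, image, imu):
        f = torch.cat(
            [self.image_branch(image), self.imu_branch(imu)], dim=1
        )
        f = self.fusion(f)
        q = self.orientation_head(f)
        # Normalize so the orientation output is a valid unit quaternion.
        return self.position_head(f), q / q.norm(dim=1, keepdim=True)
```

A common training setup for such a regressor is a weighted sum of position and orientation losses; the abstract notes the loss function was selected through a comparative analysis, but does not specify it here.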
