ABSTRACT In this paper, we propose RFR-DLVT, a hybrid method that combines deep learning (DL) with visual tracking to achieve effective face recognition (FR). First, video sequences are divided into reference frames (RFs) and non-reference frames (NRFs). In the RFs, the target face is identified by a DL-based FR method. In the NRFs, a visual tracking method based on kernelized correlation filters is used to speed up FR. The proposed method is tested on common data sets and achieves better performance. In particular, RFR-DLVT attains an accuracy of 99.6% and a speed of 30 frames per second (FPS) in real-time FR on real-life surveillance videos.
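
The RF/NRF split described above can be illustrated with a minimal sketch. It is not the authors' implementation: the fixed `rf_interval`, the function name `process_video`, and the `recognize` and `make_tracker` callables are all hypothetical placeholders. The abstract does not say how reference frames are selected, so a fixed interval is assumed here purely for illustration. The recognizer stands in for the DL-based FR step and the tracker for the kernelized-correlation-filter step.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Box = Tuple[int, int, int, int]          # (x, y, width, height), an assumed format
Tracker = Callable[[object], Optional[Box]]


def process_video(
    frames: Iterable[object],
    recognize: Callable[[object], Optional[Box]],
    make_tracker: Callable[[object, Box], Tracker],
    rf_interval: int = 10,               # assumption: every 10th frame is an RF
) -> List[Tuple[str, Optional[Box]]]:
    """Label each frame as RF or NRF and return the face box found in it."""
    results: List[Tuple[str, Optional[Box]]] = []
    tracker: Optional[Tracker] = None

    for index, frame in enumerate(frames):
        # A frame is treated as a reference frame on the fixed schedule,
        # or whenever there is no live tracker to fall back on.
        is_rf = index % rf_interval == 0 or tracker is None

        if is_rf:
            # Slow path: run the (placeholder) recognizer, then re-seed the tracker.
            box = recognize(frame)
            tracker = make_tracker(frame, box) if box is not None else None
        else:
            # Fast path: update the (placeholder) tracker only.
            box = tracker(frame)
            if box is None:
                # Target lost: drop the tracker so the next frame becomes an RF.
                tracker = None

        results.append(("RF" if is_rf else "NRF", box))

    return results
```

With `rf_interval=10`, the recognizer runs on roughly one frame in ten and the cheaper tracker handles the rest, which is the kind of trade-off the abstract credits for the speed-up.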