5G is reshaping video surveillance and monitoring services by allowing video streams to be processed at high speed over reliable, high-bandwidth, and secure network connections. It also benefits artificial intelligence, machine learning, and deep learning techniques, which require intensive processing to deliver near-real-time solutions. In video surveillance, person tracking is a crucial but challenging task because of the deformable nature of the human body and environmental factors such as occlusion, illumination, and background conditions. The challenge is greater from a top view perspective, where a person’s visual appearance differs significantly from a frontal or side view. In this work, a multiple people tracking framework that uses 5G infrastructure is presented. A top view perspective is used because it offers broad coverage of the scene or field of view. To perform person tracking, a deep learning-based tracking-by-detection framework is proposed, which combines detection by YOLOv3 with tracking by the Deep SORT algorithm. Although the detection model is pre-trained on frontal view images, it still gives good detection results on top view data. To further enhance the accuracy of the detection model, a transfer learning approach is adopted, in which the detection model takes advantage of the pre-trained model appended with an additional layer trained on a top view data set. To evaluate performance, experiments are carried out on different top view video sequences. Experimental results reveal that transfer learning improves overall performance, increases detection accuracy, and reduces false positives. The YOLOv3 detection model achieves a detection accuracy of 92% with the pre-trained model without transfer learning and 95% with transfer learning. The Deep SORT tracking algorithm also performs well, with a tracking accuracy of 96%.
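The tracking-by-detection idea can be illustrated with a minimal sketch. It is not the Deep SORT algorithm used in this work: Deep SORT combines Kalman-filter motion prediction with appearance features, whereas the sketch below associates detections with tracks by greedy bounding-box overlap (IoU) only. The class name `GreedyIouTracker`, the thresholds, and the hard-coded boxes (standing in for per-frame detector output such as that of YOLOv3) are all illustrative assumptions.

```python
from itertools import count
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


class GreedyIouTracker:
    """Hypothetical, simplified tracker: IoU-only greedy association."""

    def __init__(self, iou_threshold: float = 0.3, max_missed: int = 5):
        self.iou_threshold = iou_threshold  # assumed value, not from the paper
        self.max_missed = max_missed        # frames a track survives unmatched
        self.tracks: Dict[int, Box] = {}    # track id -> last known box
        self.missed: Dict[int, int] = {}    # track id -> consecutive misses
        self._ids = count(1)

    def update(self, detections: List[Box]) -> Dict[int, Box]:
        """Associate one frame's detections; return tracks seen in this frame."""
        # Score every (track, detection) pair and visit the best overlaps first.
        pairs = sorted(
            ((iou(box, det), tid, j)
             for tid, box in self.tracks.items()
             for j, det in enumerate(detections)),
            reverse=True,
        )
        matched_tracks, matched_dets = set(), set()
        for score, tid, j in pairs:
            if score < self.iou_threshold:
                break  # remaining pairs overlap too little to be the same person
            if tid in matched_tracks or j in matched_dets:
                continue
            self.tracks[tid] = detections[j]
            self.missed[tid] = 0
            matched_tracks.add(tid)
            matched_dets.add(j)

        # Age unmatched tracks and drop those missing for too long.
        for tid in list(self.tracks):
            if tid not in matched_tracks:
                self.missed[tid] += 1
                if self.missed[tid] > self.max_missed:
                    del self.tracks[tid], self.missed[tid]

        # Every unmatched detection starts a new track with a fresh identity.
        for j, det in enumerate(detections):
            if j not in matched_dets:
                tid = next(self._ids)
                self.tracks[tid] = det
                self.missed[tid] = 0

        return {tid: box for tid, box in self.tracks.items() if self.missed[tid] == 0}


# Made-up detections for three frames; the second person is missed in frame 3.
frames = [
    [(10, 10, 50, 50), (100, 100, 140, 140)],
    [(12, 11, 52, 51), (103, 102, 143, 142)],
    [(15, 13, 55, 53)],
]
tracker = GreedyIouTracker()
for detections in frames:
    print(tracker.update(detections))
```

In this sketch a person who is briefly undetected keeps the same identity for up to `max_missed` frames. Handling longer occlusions and crossing paths is where the motion and appearance cues of a full tracker such as Deep SORT become important.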