Deep neural networks are rapidly emerging as data analysis tools, often outperforming the conventional techniques used in complex microfluidic systems. One fundamental analysis frequently desired in microfluidic experiments is counting and tracking droplets. In particular, droplet tracking in dense emulsions is challenging because the droplets are inherently small and move in tightly packed configurations. Sometimes, the individual droplets in these dense clusters are hard to resolve, even for a human observer. Here, two state-of-the-art deep learning algorithms, one for object detection [You Only Look Once (YOLO)] and one for object tracking (DeepSORT), are combined into a single image analysis tool, DropTrack, to track droplets in microfluidic experiments. DropTrack analyzes input videos of microfluidic experiments, extracts the droplets' trajectories, and infers other observables of interest, such as the number of droplets. Training an object detector network for droplet recognition with manually annotated images is a labor-intensive task and a persistent bottleneck. In this work, this problem is partly resolved by training multiple object detector networks (YOLOv5) on several hybrid datasets containing both real and synthetic images. We present an analysis of a double emulsion experiment as a case study to measure DropTrack's performance. For our test case, the YOLO network trained on a combination of 40% real images and 60% synthetic images yields the best accuracy in droplet detection and droplet counting in real experimental videos. This strategy also reduces the labor-intensive image annotation work by 60%. DropTrack's performance is measured in terms of the mean average precision of droplet detection, the mean squared error in counting the droplets, and the image analysis speed for inferring droplets' trajectories. The fastest configuration of DropTrack can detect and track the droplets at approximately 30 frames per second, well within the standards for real-time image analysis.
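As a rough illustration of two quantities mentioned above, the following Python sketch shows one way a hybrid training set with a fixed share of real images could be assembled, and how a mean squared error in droplet counting could be computed. The function names (`make_hybrid_dataset`, `count_mse`), the file-name lists, and the sampling scheme are assumptions made for illustration only; they are not taken from DropTrack.

```python
import random


def make_hybrid_dataset(real_images, synthetic_images, total, real_fraction, seed=0):
    """Sample `total` images, a `real_fraction` share of which are real.

    Hypothetical helper: the sampling strategy is an assumption,
    not the procedure used by the authors.
    """
    n_real = round(total * real_fraction)
    n_synth = total - n_real
    if n_real > len(real_images) or n_synth > len(synthetic_images):
        raise ValueError("not enough images in one of the pools")
    rng = random.Random(seed)  # fixed seed for a reproducible split
    hybrid = rng.sample(real_images, n_real) + rng.sample(synthetic_images, n_synth)
    rng.shuffle(hybrid)  # mix real and synthetic images
    return hybrid


def count_mse(predicted_counts, true_counts):
    """Mean squared error between per-frame predicted and true droplet counts."""
    if len(predicted_counts) != len(true_counts) or not true_counts:
        raise ValueError("count lists must be non-empty and of equal length")
    squared_errors = [(p - t) ** 2 for p, t in zip(predicted_counts, true_counts)]
    return sum(squared_errors) / len(squared_errors)


if __name__ == "__main__":
    # Placeholder file names; a 40% real / 60% synthetic mix as in the test case.
    real = [f"real_{i}.png" for i in range(100)]
    synthetic = [f"synth_{i}.png" for i in range(100)]
    hybrid = make_hybrid_dataset(real, synthetic, total=50, real_fraction=0.4)
    print(len(hybrid), sum(name.startswith("real_") for name in hybrid))
    print(count_mse([10, 12, 9], [10, 11, 11]))
```

In such a setup only the real images would need manual annotation, which is the sense in which a larger synthetic share lowers the annotation effort.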