This paper addresses pedestrian tracking with a camera and millimeter-wave radar in autonomous driving. Tracking with a single sensor is inherently limited, because no single sensor captures all the dimensions of information that tracking requires. Existing multi-sensor tracking algorithms, meanwhile, achieve only limited accuracy because they fuse projected positions. To improve tracking accuracy and robustness, a multi-sensor tracking algorithm based on the fused detection of millimeter-wave radar and vision is proposed. The algorithm improves the association of detection results from heterogeneous sensors by means of a newly designed back-projection and an undirected graph, and it improves fused detection by jointly using a pedestrian's appearance together with local and global location information. Field tests were conducted to produce a dataset, and evaluation on this self-produced dataset shows that the proposed algorithm outperforms both conventional single-sensor tracking algorithms and existing multi-sensor tracking algorithms.