In this paper, we consider a composite difference-of-convex (DC) program whose objective is the sum of a smooth convex function with Lipschitz continuous gradient, a proper closed convex function, and a continuous concave function. This problem has many applications in machine learning and data science. It can be solved by the proximal DCA (pDCA), a special case of the classical difference-of-convex algorithm (DCA), as well as by two Nesterov-type extrapolated variants of DCA: ADCA (Phan et al. IJCAI:1369–1375, 2018) and pDCAe (Wen et al. Comput. Optim. Appl. 69:297–324, 2018). The stepsizes of pDCA, pDCAe, and ADCA are fixed and determined by an a priori estimate of the smoothness parameter of the loss function; however, such an estimate may be difficult to obtain, or inaccurate, in real-world applications. Motivated by this difficulty, we propose a variable metric and Nesterov extrapolated proximal DCA with backtracking (SPDCAe), which combines a (not necessarily monotone) backtracking line search with Nesterov extrapolation for potential acceleration; moreover, a variable metric is incorporated for a better local approximation. Numerical simulations on sparse binary logistic regression and compressed sensing with Poisson noise demonstrate the effectiveness of the proposed method.
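To illustrate the two main ingredients the abstract names, backtracking on the smooth part and Nesterov extrapolation, the sketch below shows a generic extrapolated proximal gradient iteration with an ℓ1 regularizer. This is a minimal illustration of the underlying mechanism, not the paper's SPDCAe: the concave DC term, the variable metric, and the nonmonotone line search are omitted, and all function names and parameters here are illustrative choices.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1 (componentwise shrinkage).
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def extrapolated_prox_grad(f, grad_f, lam, x0, L0=1.0, eta=2.0, iters=200):
    """Proximal gradient method with Nesterov extrapolation and a
    backtracking estimate of the smoothness parameter L, applied to
    min_x f(x) + lam * ||x||_1.  Illustrative sketch only."""
    x = x0.copy()
    y = x0.copy()
    t = 1.0
    L = L0  # working estimate of the Lipschitz constant of grad_f
    for _ in range(iters):
        # Backtracking: increase L until the quadratic upper bound
        # f(x_new) <= f(y) + <grad_f(y), d> + (L/2)||d||^2 holds.
        while True:
            x_new = soft_threshold(y - grad_f(y) / L, lam / L)
            d = x_new - y
            bound = f(y) + grad_f(y) @ d + 0.5 * L * (d @ d)
            if f(x_new) <= bound + 1e-12:
                break
            L *= eta
        # Nesterov extrapolation step.
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)
        x, t = x_new, t_new
    return x
```

For example, with f(x) = 0.5‖x − b‖² the minimizer of f(x) + λ‖x‖₁ is the soft-thresholding of b, which the iteration recovers without knowing the Lipschitz constant in advance.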