In large-scale distributed learning, directly applying traditional inference is often infeasible because of challenges such as communication costs, privacy concerns, and Byzantine failures. Modern networks are vulnerable to attacks, and Byzantine failures occur frequently. To cope with Byzantine failures, this paper develops two Byzantine-robust distributed learning algorithms within a communication-efficient surrogate likelihood framework. Our algorithms adopt δ-approximate compressors, including the sign-based operator and top-k sparsification, to improve communication efficiency, and a simple thresholding of local gradient norms to guard against Byzantine failures. To accelerate convergence and achieve an optimal statistical error rate, the second algorithm exploits error feedback. Both algorithms are robust to arbitrary adversaries, even when Byzantine workers do not adhere to the mandated compression mechanism. We explicitly establish statistical error rates, which show that our algorithms do not sacrifice the quality of learning and attain order-optimal rates in certain settings. In addition, we characterize a trade-off between compression and adversarial robustness in the presence of Byzantine worker machines. Extensive numerical experiments validate our theoretical results and demonstrate the good performance of our algorithms.
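To make the building blocks named above concrete, the following is a minimal sketch, not the paper's actual implementation, of the two δ-approximate compressors (scaled sign and top-k sparsification), error feedback on the worker side, and gradient-norm thresholding at the server. All function and class names (`sign_compress`, `topk_compress`, `robust_aggregate`, `Worker`) and the threshold parameter `tau` are illustrative assumptions.

```python
import numpy as np

def sign_compress(x):
    # Scaled sign compressor: transmit signs plus one scalar (||x||_1 / d).
    # A delta-approximate compressor with delta = ||x||_1^2 / (d * ||x||_2^2).
    return (np.linalg.norm(x, 1) / x.size) * np.sign(x)

def topk_compress(x, k):
    # Top-k sparsification: keep the k largest-magnitude entries (delta = k/d).
    out = np.zeros_like(x)
    idx = np.argpartition(np.abs(x), -k)[-k:]
    out[idx] = x[idx]
    return out

def robust_aggregate(msgs, tau):
    # Simple thresholding of local gradient norms: discard any compressed
    # gradient whose norm exceeds tau, then average the survivors.
    kept = [m for m in msgs if np.linalg.norm(m) <= tau]
    return np.mean(kept, axis=0) if kept else np.zeros_like(msgs[0])

class Worker:
    # Honest worker with error feedback: the compression residual is
    # accumulated locally and added back to the gradient next round.
    def __init__(self, dim, compressor):
        self.err = np.zeros(dim)
        self.compress = compressor

    def message(self, grad):
        corrected = grad + self.err
        msg = self.compress(corrected)
        self.err = corrected - msg  # residual carried to the next round
        return msg
```

As a usage illustration, a Byzantine worker may send an arbitrary (e.g. very large) vector rather than a properly compressed gradient; the norm threshold filters such messages out before averaging, which is why robustness holds even when adversaries ignore the compression mechanism.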