The support vector machine (SVM), widely regarded as one of the most effective tools for classification, has received broad attention across many fields. However, its performance degrades on large-scale pattern classification tasks due to high memory requirements and slow training. To address this challenge, we construct a novel sparse and robust SVM based on our newly proposed capped squared loss (named Lcsl-SVM). To solve Lcsl-SVM, we first establish its optimality theory via a proximal stationary point that we define, which enables an efficient characterization of the Lcsl support vectors of Lcsl-SVM. We then demonstrate that the Lcsl support vectors comprise only a small fraction of the entire training data, an observation that leads us to introduce the concept of a working set. Furthermore, we design a fast subspace algorithm with a working set (named Lcsl-ADMM) for solving Lcsl-SVM, and prove that Lcsl-ADMM enjoys both global convergence and relatively low computational complexity. Finally, numerical experiments show that Lcsl-ADMM delivers excellent performance on large-scale pattern classification problems, achieving the best classification accuracy in the shortest time while using the fewest support vectors.
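For concreteness, a capped squared loss is typically obtained by truncating a squared hinge loss at a fixed threshold; the display below is a minimal sketch of that standard construction, in which the cap parameter $a$ and the squared-hinge base loss are illustrative assumptions rather than the exact definition used in this paper:

\[
\ell_{\mathrm{csl}}(t) \;=\; \min\Bigl\{\bigl(\max\{0,\,1-t\}\bigr)^{2},\; a\Bigr\},
\qquad t = y\bigl(\langle w, x\rangle + b\bigr), \quad a > 0.
\]

Under this assumed form, the cap $a$ bounds the loss incurred by any single sample, so heavily misclassified outliers cannot dominate the objective (robustness); moreover, samples whose loss is locally constant (either zero or at the cap) have zero gradient and drop out of the optimality conditions, which is consistent with the small support-vector sets described in the abstract.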