Algorithms for the $l_1$-minimization ($l_1$-min) problem have been widely developed. The main components of most $l_1$-min algorithms are dense matrix-vector multiplications, $Ax$ and $A^Tx$, together with vector operations. We propose a novel warp-based implementation of the matrix-vector multiplication $Ax$ on the graphics processing unit (GPU), called the GEMV kernel, and a novel thread-based implementation of the matrix-vector multiplication $A^Tx$ on the GPU, called the GEMV-T kernel. In the GEMV kernel, a self-adaptive warp allocation strategy assigns the optimal number of warps to each matrix row. Similarly, the GEMV-T kernel uses a self-adaptive thread allocation strategy to assign the optimal number of threads to each matrix row. Two popular $l_1$-min algorithms, the fast iterative shrinkage-thresholding algorithm and the augmented Lagrangian multiplier method, are taken as examples. Based on the GEMV and GEMV-T kernels, we present two highly parallel $l_1$-min solvers on the GPU that exploit the technique of merging kernels and the sparsity of the solution of the $l_1$-min algorithms. Furthermore, we design a concurrent multiple-$l_1$-min solver on the GPU and optimize its performance using new GPU features such as the shuffle instruction and the read-only data cache. The experimental results validate the high efficiency and good performance of our methods.