In this paper, we address the challenging points of binocular disparity estimation: (1) unsatisfactory results in the occluded region when utilizing warping function in unsupervised learning; (2) inefficiency in running time and the number of parameters as adopting a lot of 3D convolutions in the feature matching module. To solve these drawbacks, we propose a patch attention network for semi-supervised stereo matching learning. First, we employ a channel-attention mechanism to aggregate the cost volume by selecting its different surfaces for reducing a large number of 3D convolution, called the patch attention network (PA-Net). Second, we use our proposed PA-Net as a generator and then combine it, traditional unsupervised learning loss, and the adversarial learning model to construct a semi-supervised learning framework for improving performance in the occluded areas. We have trained our PA-Net in supervised learning, semi-supervised learning, and unsupervised learning manners. Extensive experiments show that (1) our semi-supervised learning framework can overcome the drawbacks of unsupervised learning and significantly improve the performance in the ill-posed region by using only a few or inaccurate ground truths; (2) our PA-Net can outperform other state-of-the-art approaches in supervised learning and use fewer parameters.