Abstract This paper proposes a Siamese-network-based approach for estimating stereo disparity. The existing Top-k disparity regression strategy may overlook the true disparity at object edges. To address this issue, an additional maximum disparity estimate is introduced to refine the Top-k strategy, yielding the Top-k+ strategy. Furthermore, the MobileNetV2 (MV2) block is improved by introducing a frequency-domain attention mechanism, resulting in a more efficient FMV2 block that extracts high-frequency information such as textures and edges. Given the strong performance of the gated recurrent unit (GRU) in disparity optimization, the algorithm adopts a GRU-based iterative optimization method. Experiments on the Scene Flow, KITTI 2015, and ETH3D benchmark datasets show an end-point error of 0.58 pixels on Scene Flow, a D1-all error of 1.67% on KITTI 2015, and an average disparity error of 0.22 pixels on ETH3D. Compared with other GRU-based iterative optimization algorithms, the proposed method achieves higher accuracy while remaining lightweight.
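For readers unfamiliar with the reported metrics, the following is a minimal sketch of how an end-point error (mean absolute disparity error, in pixels) and a D1-all outlier rate are commonly computed. It is not the authors' evaluation code. The function names, the toy disparity maps, the use of non-positive ground truth to mark missing pixels, and the strict inequalities in the outlier test (error above 3 pixels and above 5% of the true disparity, following the usual KITTI 2015 convention) are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

DisparityMap = Sequence[Sequence[float]]


def _valid_pairs(pred: DisparityMap, gt: DisparityMap) -> List[Tuple[float, float]]:
    """Pair predicted and ground-truth disparities, skipping pixels without ground truth."""
    pairs = []
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            if g > 0:  # assumption: non-positive ground truth means "no measurement"
                pairs.append((p, g))
    if not pairs:
        raise ValueError("no valid ground-truth pixels")
    return pairs


def end_point_error(pred: DisparityMap, gt: DisparityMap) -> float:
    """Mean absolute disparity error in pixels over valid pixels."""
    pairs = _valid_pairs(pred, gt)
    return sum(abs(p - g) for p, g in pairs) / len(pairs)


def d1_all(pred: DisparityMap, gt: DisparityMap,
           abs_thresh: float = 3.0, rel_thresh: float = 0.05) -> float:
    """Percentage of valid pixels whose error exceeds both thresholds.

    Assumed convention: a pixel is an outlier when its error is larger than
    abs_thresh pixels AND larger than rel_thresh times the true disparity.
    """
    pairs = _valid_pairs(pred, gt)
    outliers = sum(
        1 for p, g in pairs
        if abs(p - g) > abs_thresh and abs(p - g) > rel_thresh * g
    )
    return 100.0 * outliers / len(pairs)


# Toy 2x2 disparity maps (hypothetical values); the pixel with gt == 0 is ignored.
gt = [[10.0, 20.0], [40.0, 0.0]]
pred = [[10.5, 24.0], [41.0, 7.0]]

epe = end_point_error(pred, gt)
d1 = d1_all(pred, gt)
print(f"EPE = {epe:.3f} px, D1-all = {d1:.2f}%")  # EPE = 1.833 px, D1-all = 33.33%
```

In this toy case only the pixel with a 4-pixel error counts as an outlier, so one of the three valid pixels contributes to D1-all while all three contribute to the end-point error.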