Accurate cloud detection is a crucial first step in optical satellite remote sensing. This study proposes a deep learning-based daytime cloud mask model for the Advanced Geostationary Radiation Imager (AGRI) onboard the Fengyun 4A (FY-4A) satellite. The model, named “Convolutional and Attention-based Cloud Mask Net (CACM-Net)”, was trained on 2021 data with CALIPSO data as the truth value. Two CACM-Net models were trained, one for satellite zenith angles (SZA) < 70° and one for SZA > 70°. The National Satellite Meteorological Center (NSMC) cloud mask product was also evaluated and compared with CACM-Net. The results indicate that CACM-Net outperforms the NSMC cloud mask product overall. Specifically, in the SZA < 70° subset, CACM-Net improves accuracy, precision, and F1 score by 4.8%, 7.3%, and 3.6%, respectively, while reducing the false alarm rate (FAR) by approximately 7.3%. In the SZA > 70° subset, accuracy, precision, and F1 score improve by 12.2%, 19.5%, and 8%, respectively, with a 19.5% reduction in FAR compared to NSMC. An independent dataset covering January–June 2023 further validates the performance of CACM-Net, showing improvements of 3.5%, 2.2%, and 2.8% in accuracy, precision, and F1 score for SZA < 70° and of 7.8%, 11.3%, and 4.8% for SZA > 70°, along with reductions in FAR. Cross-comparison with other satellite cloud mask products reveals high levels of agreement, with 88.6% and 86.3% of results matching the MODIS and Himawari-9 products, respectively. These results confirm the reliability of the CACM-Net cloud mask model, which can produce stable and high-quality FY-4A AGRI cloud mask results.
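The evaluation metrics named above can be illustrated with a minimal Python sketch. It is not the authors' evaluation code. It assumes a binary cloudy/clear mask, and it assumes FAR is the false alarm ratio FP / (TP + FP), which equals 1 − precision; that reading is consistent with the precision gains and FAR reductions reported above being identical (7.3% and 19.5%), but the definition itself is not stated in this abstract. The function names `cloud_mask_scores` and `scores_by_sza` are hypothetical, as is the choice to place pixels at exactly 70° in the high-angle subset.

```python
import numpy as np


def cloud_mask_scores(pred, truth):
    """Score a binary cloud mask (True = cloudy) against reference labels."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)

    tp = int(np.sum(pred & truth))    # cloudy predicted, cloudy in reference
    fp = int(np.sum(pred & ~truth))   # cloudy predicted, clear in reference
    fn = int(np.sum(~pred & truth))   # clear predicted, cloudy in reference
    tn = int(np.sum(~pred & ~truth))  # clear predicted, clear in reference

    nan = float("nan")
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else nan
    precision = tp / (tp + fp) if (tp + fp) else nan
    recall = tp / (tp + fn) if (tp + fn) else nan
    if np.isnan(precision) or np.isnan(recall) or (precision + recall) == 0:
        f1 = nan
    else:
        f1 = 2 * precision * recall / (precision + recall)
    # Assumed definition: false alarm ratio, i.e. 1 - precision.
    far = fp / (tp + fp) if (tp + fp) else nan

    return {"accuracy": accuracy, "precision": precision, "f1": f1, "far": far}


def scores_by_sza(pred, truth, sza_deg, threshold_deg=70.0):
    """Score the two satellite zenith angle subsets separately."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    low = np.asarray(sza_deg, dtype=float) < threshold_deg
    return {
        "sza_low": cloud_mask_scores(pred[low], truth[low]),
        # Assumption: pixels at exactly the threshold go to the high subset.
        "sza_high": cloud_mask_scores(pred[~low], truth[~low]),
    }
```

Calling `scores_by_sza(pred, truth, sza_deg)` once for each product (for example CACM-Net and NSMC) against the same reference labels and differencing the resulting dictionaries would give per-subset comparisons of the kind quoted above.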