Detecting and mapping paddy fields is crucial to Taiwan’s agriculture for managing production, predicting yields, and assessing damage. Researchers at the Taiwan Agricultural Research Institute currently identify rice planting areas through site surveys, a time-consuming method. This study aimed to determine the optimal band combinations and vegetation indices for accurately detecting paddy fields at different phenological stages. The Mask R-CNN instance segmentation model in ArcGIS Pro was employed to detect and segment paddy fields in aerial images. The study used aerial images collected from 2018 to 2019 covering Changhua, Yunlin, Chiayi, and Tainan in central and southern Taiwan, with a label file comprising four categories: rice growing stage, rice ripening stage, rice harvested stage, and other crops. To create different image datasets, the pre-processing stage modified the band information using different vegetation indices: NDVI, CMFI, DVI, RVI, and GRVI. The training image chips were cropped to 550 × 550 pixels. After training, the ResNet-50 backbone performed better than ResNet-101, and the RGB + DVI image dataset achieved the highest mean average precision, 74.01%. The model trained on the RGB + CMFI dataset is recommended for detecting paddy fields in the rice growing stage, RGB + NIR for the ripening stage, and RGB + GRVI for the harvested stage; these models achieved Dice coefficients of 79.59%, 89.71%, and 87.94%, respectively. Choosing the band combination according to the phenological stage can therefore improve the efficiency of rice production management. The method can also be applied to large-scale detection of other crops, improving the efficiency of land use surveys and reducing the burden on researchers.
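
As an illustration of the band pre-processing and the Dice metric mentioned above, the following minimal sketch computes three of the listed indices from their commonly used definitions (NDVI = (NIR − R)/(NIR + R), DVI = NIR − R, RVI = NIR/R), appends the result to an RGB chip as a fourth band, and scores a predicted mask against a reference mask. It is not the study's ArcGIS Pro workflow: the function names, the NumPy array layout, and the small `eps` guard against division by zero are assumptions made for this example. CMFI and GRVI are omitted because the abstract does not state their formulas.

```python
import numpy as np


def vegetation_index(nir, red, kind="NDVI", eps=1e-9):
    """Compute a per-pixel vegetation index from NIR and red bands.

    Uses the commonly cited definitions; `eps` is an assumption added
    here only to avoid division by zero on dark pixels.
    """
    nir = np.asarray(nir, dtype=np.float64)
    red = np.asarray(red, dtype=np.float64)
    if kind == "NDVI":
        return (nir - red) / (nir + red + eps)
    if kind == "DVI":
        return nir - red
    if kind == "RVI":
        return nir / (red + eps)
    raise ValueError(f"unsupported index: {kind}")


def stack_rgb_plus_index(rgb, index):
    """Append an index layer to an (H, W, 3) RGB chip, giving (H, W, 4)."""
    rgb = np.asarray(rgb, dtype=np.float64)
    index = np.asarray(index, dtype=np.float64)
    return np.concatenate([rgb, index[..., None]], axis=-1)


def dice_coefficient(pred_mask, true_mask):
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    total = pred.sum() + true.sum()
    if total == 0:
        # Both masks empty: treated as perfect agreement (a convention
        # chosen for this sketch).
        return 1.0
    return 2.0 * np.logical_and(pred, true).sum() / total


if __name__ == "__main__":
    # Hypothetical 2 x 2 chip with reflectance-like values.
    rgb = np.full((2, 2, 3), 0.2)
    nir = np.full((2, 2), 0.6)
    dvi = vegetation_index(nir, rgb[..., 0], kind="DVI")
    chip = stack_rgb_plus_index(rgb, dvi)
    print(chip.shape)
    print(dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0]))
```

A four-band chip of this kind corresponds to what the abstract calls an "RGB + DVI" dataset; swapping `kind` yields the other index variants covered by the sketch.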