Traditional Chinese villages, vital carriers of traditional culture, have been significantly altered by urbanization in recent years, creating an urgent need for AI-assisted updating of village data. This study integrates high-spatial-resolution remote sensing imagery with deep learning techniques and proposes a novel method for identifying the rooftops of traditional Chinese village buildings. Using 0.54 m spatial resolution imagery of traditional village areas as the data source, the method analyzes the geometric and spectral image characteristics of village building rooftops and constructs a deep learning feature sample library tailored to the target types. The study then applies an improved, semantically enhanced Mask R-CNN (Mask Region-based Convolutional Neural Network) to building recognition and conducts experiments on local imagery from different regions. The results demonstrate that the modified Mask R-CNN effectively identifies traditional village building rooftops, achieving an of 0.7520 and an of 0.7400. It mitigates the misidentification and missed detections caused by feature heterogeneity. This method offers a viable and effective approach for industrial-scale data monitoring of traditional villages, contributing to their sustainable development.
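To give a sense of scale for the 0.54 m imagery: assuming square pixels, each pixel covers about 0.54 × 0.54 ≈ 0.29 m² on the ground, so a predicted rooftop mask can be converted to an approximate roof area by counting its pixels. The sketch below illustrates only that arithmetic and is not part of the study's pipeline; the function name `rooftop_area_m2` and the list-of-rows mask format are hypothetical choices made for this example.

```python
# Ground sampling distance in metres, taken from the 0.54 m imagery
# described in the abstract. Square pixels are an assumption.
GSD_M = 0.54


def rooftop_area_m2(mask, gsd_m=GSD_M):
    """Return the approximate ground area (m^2) covered by a binary mask.

    mask: rows of 0/1 values, where 1 marks a rooftop pixel
          (a hypothetical format chosen for this illustration).
    """
    rooftop_pixels = sum(sum(1 for value in row if value) for row in mask)
    return rooftop_pixels * gsd_m ** 2


# Toy 4x5 mask standing in for one predicted instance mask.
toy_mask = [
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

area = rooftop_area_m2(toy_mask)
print(f"{area:.4f} m^2")
```

In practice the mask would come from the network's per-instance output, and the pixel size would be read from the image's georeferencing metadata rather than hard-coded.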