Arnica montana L. is a medicinal plant of significant conservation importance, and monitoring the species is crucial to ensure its sustainable harvesting and management. The aim of this study is to develop a practical system that detects A. montana inflorescences using unmanned aerial vehicles (UAVs) equipped with RGB (red–green–blue, visible light) sensors, in order to improve the monitoring of A. montana habitats during the harvest season. Methodologically, a model was developed based on the ResNet101 convolutional neural network (CNN) architecture. The trained model provides quantitative and qualitative assessments of A. montana inflorescences detected in semi-natural grasslands from low-resolution imagery, with a correctable error rate. The prototype can be used to monitor a larger area in a short time by flying at a higher altitude, which necessarily yields lower-resolution images. Despite the challenges posed by shadow effects, fluctuating ground sampling distance (GSD), and overlapping vegetation, the approach produced encouraging results, particularly when the GSD was below 0.45 cm. This research highlights the importance of image clarity in low-resolution imagery, of matching the training data to the phenophase, and of training across different photoperiods to enhance model flexibility. The approach provides guidelines for mission planning in support of sustainable management goals. The robustness of the model can be attributed to its training on real-world imagery of semi-natural grassland, which makes it practical for fieldwork with accessible portable devices. This study confirms the potential of ResNet CNN models for transfer learning to new plant communities, contributing to the broader effort of using high-resolution RGB sensors, UAVs, and machine-learning technologies for sustainable management and biodiversity conservation.
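
The trade-off between flight altitude and GSD that underlies mission planning can be illustrated with the standard relation for a nadir-pointing camera over flat terrain, GSD = (sensor width × altitude) / (focal length × image width). The following is a minimal sketch only: the function names and the camera parameters (sensor width, focal length, image width) are hypothetical placeholders, not the equipment used in this study; only the 0.45 cm threshold comes from the text.

```python
# Minimal sketch: relation between flight altitude and ground sampling
# distance (GSD) for a nadir-pointing camera over flat terrain.
# All camera parameters below are HYPOTHETICAL, chosen for illustration only.

def ground_sampling_distance_cm(altitude_m, sensor_width_mm,
                                focal_length_mm, image_width_px):
    """Return the GSD in cm per pixel at the given flight altitude."""
    # Sensor width and focal length are both in mm, so their ratio is unitless;
    # multiplying the altitude by 100 converts metres to centimetres.
    return (sensor_width_mm * altitude_m * 100.0) / (focal_length_mm * image_width_px)


def max_altitude_m(gsd_limit_cm, sensor_width_mm,
                   focal_length_mm, image_width_px):
    """Return the highest altitude (m) at which the GSD stays at or below the limit."""
    # Same relation, solved for altitude.
    return (gsd_limit_cm * focal_length_mm * image_width_px) / (sensor_width_mm * 100.0)


# Hypothetical camera (assumption, not from the study).
SENSOR_WIDTH_MM = 13.2
FOCAL_LENGTH_MM = 8.8
IMAGE_WIDTH_PX = 5472

# GSD threshold reported in the text.
GSD_LIMIT_CM = 0.45

if __name__ == "__main__":
    ceiling = max_altitude_m(GSD_LIMIT_CM, SENSOR_WIDTH_MM,
                             FOCAL_LENGTH_MM, IMAGE_WIDTH_PX)
    print(f"Altitude ceiling for GSD <= {GSD_LIMIT_CM} cm: {ceiling:.1f} m")
    for altitude in (10.0, 15.0, 20.0):
        gsd = ground_sampling_distance_cm(altitude, SENSOR_WIDTH_MM,
                                          FOCAL_LENGTH_MM, IMAGE_WIDTH_PX)
        print(f"{altitude:5.1f} m -> GSD {gsd:.3f} cm/px")
```

Under these assumed parameters, GSD grows linearly with altitude, so a GSD limit translates directly into an altitude ceiling for a given camera; terrain relief and off-nadir viewing would cause the actual GSD to fluctuate around this value.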