Automatic identification and mapping of tree species is an essential task in forestry and conservation. However, applications that can geolocate individual trees and identify their species in heterogeneous forests on a large scale are lacking. Here, we assessed the potential of the Convolutional Neural Network algorithm, Faster R-CNN, which is an efficient end-to-end object detection approach, combined with open-source aerial RGB imagery for the identification and geolocation of tree species in the upper canopy layer of heterogeneous temperate forests. We studied four tree species, i.e., Norway spruce (Picea abies (L.) H. Karst.), silver fir (Abies alba Mill.), Scots pine (Pinus sylvestris L.), and European beech (Fagus sylvatica L.), growing in heterogeneous temperate forests. To fully explore the potential of the approach for tree species identification, we trained single-species and multi-species models. For the single-species models, the average detection accuracy (F1 score) was 0.76. Picea abies was detected with the highest accuracy, with an average F1 of 0.86, followed by A. alba (F1 = 0.84), F. sylvatica (F1 = 0.75), and Pinus sylvestris (F1 = 0.59). Detection accuracy increased in multi-species models for Pinus sylvestris (F1 = 0.92), while it remained the same or decreased slightly for the other species. Model performance was more influenced by site conditions, such as forest stand structure, and less by illumination. Moreover, the misidentification of tree species decreased as the number of species included in the models increased. In conclusion, the presented method can accurately map the location of four individual tree species in heterogeneous forests and may serve as a basis for future inventories and targeted management actions to support more resilient forests.