BackgroundThe manual study of root dynamics using images requires huge investments of time and resources and is prone to previously poorly quantified annotator bias. Artificial intelligence (AI) image-processing tools have been successful in overcoming limitations of manual annotation in homogeneous soils, but their efficiency and accuracy is yet to be widely tested on less homogenous, non-agricultural soil profiles, e.g., that of forests, from which data on root dynamics are key to understanding the carbon cycle. Here, we quantify variance in root length measured by human annotators with varying experience levels. We evaluate the application of a convolutional neural network (CNN) model, trained on a software accessible to researchers without a machine learning background, on a heterogeneous minirhizotron image dataset taken in a multispecies, mature, deciduous temperate forest.ResultsLess experienced annotators consistently identified more root length than experienced annotators. Root length annotation also varied between experienced annotators. The CNN root length results were neither precise nor accurate, taking ~ 10% of the time but significantly overestimating root length compared to expert manual annotation (p = 0.01). The CNN net root length change results were closer to manual (p = 0.08) but there remained substantial variation.ConclusionsManual root length annotation is contingent on the individual annotator. The only accessible CNN model cannot yet produce root data of sufficient accuracy and precision for ecological applications when applied to a complex, heterogeneous forest image dataset. A continuing evaluation and development of accessible CNNs for natural ecosystems is required.