DNA barcoding helps to identify species, especially when only parts of organisms or particular life stages are available, such as seeds, pollen, wood, roots or juveniles. However, the approach depends strongly on complete reference libraries of DNA sequences: if the library is incomplete, DNA-based identification will be inefficient. Here, we assess whether DNA barcoding can already be implemented in species-rich tropical regions. We focus on the tree flora of São Paulo state, Brazil, which contains more than 2000 tree species. Combining new DNA sequence data with curated GenBank accessions, we compiled 12,113 sequences from ten DNA regions. Among the available sequences for the São Paulo tree flora, the ITS, rbcL, psbA-trnH, matK and trnL regions were the best represented. Currently, only 58% of the São Paulo tree species have at least one barcoding sequence available. However, these species represent on average 89% of the trees in São Paulo state forests. Therefore, conservation-oriented and ecological studies can already benefit from DNA barcoding to obtain more accurate species identifications. We identify the taxa that remain underrepresented in the São Paulo tree flora and discuss the implications of this result for other species-rich tropical regions.
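The gap between the two coverage figures (58% of species, but on average 89% of trees) arises because the species with reference sequences tend to be the abundant ones. The following minimal Python sketch illustrates the two measures on invented toy data. The species names, plot counts, and function names are hypothetical, and averaging per-plot coverage is an assumption about how the "on average" figure might be computed, not the authors' documented method.

```python
def species_coverage(flora, barcoded):
    """Fraction of species in the flora with at least one reference sequence."""
    flora = set(flora)
    return len(flora & set(barcoded)) / len(flora)


def individual_coverage(plot_counts, barcoded):
    """Fraction of individual trees in one plot that belong to barcoded species."""
    barcoded = set(barcoded)
    total = sum(plot_counts.values())
    covered = sum(n for species, n in plot_counts.items() if species in barcoded)
    return covered / total


# Hypothetical toy data (not from the study): five species, three with barcodes.
flora = ["sp_a", "sp_b", "sp_c", "sp_d", "sp_e"]
barcoded = {"sp_a", "sp_b", "sp_c"}

# Hypothetical stem counts per species in two forest plots.
plots = [
    {"sp_a": 40, "sp_b": 30, "sp_d": 5},
    {"sp_a": 20, "sp_c": 25, "sp_e": 5},
]

sp_cov = species_coverage(flora, barcoded)
per_plot = [individual_coverage(p, barcoded) for p in plots]
mean_cov = sum(per_plot) / len(per_plot)

print(f"species with a barcode: {sp_cov:.0%}")
print(f"mean share of trees covered per plot: {mean_cov:.0%}")
```

In this toy case the rare species lack barcodes, so the share of individual trees covered is much higher than the share of species covered, which is the same pattern the abstract reports.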