Cotton leafroll dwarf virus (CLRDV), a threat to the cotton industry, was first reported in the United States (US) as an emergent pathogen in 2017. Phylogenetic analysis supports the hypothesis that US CLRDV strains are genetically distinct from strains in South America and elsewhere, which is inconsistent with a recent introduction of the virus into the country. Using database mining, we evaluated the timeline and geographic distribution of CLRDV in the US and uncovered evidence that the virus had been present for over a decade before its first official report. CLRDV sequences were detected in datasets derived from Mississippi in 2006, Louisiana in 2015, and California in 2018. Additionally, through field surveys of upland cotton in 2023, we confirmed that CLRDV is present in California, which had no prior reports of the virus. Viral sequences from these old and new datasets shared high nucleotide identity (>98%) with recently characterized US isolates, and phylogenetic analyses with their homologs placed them within a US-specific clade, further supporting the earlier presence of CLRDV in the country. Data mining also identified potential new hosts, including another fiber crop, flax. Our findings challenge the current understanding of the arrival and spread of CLRDV in the US, highlight the power of data mining for virus discovery, and underscore the need for further investigation into CLRDV's impact on US cotton.