Coffea arabica L. (C. arabica) is an economically important crop and the source of one of the most popular beverages worldwide. To analyze genetic diversity and provide genetic resources for the selection and breeding of superior C. arabica varieties, this study analyzed 61 cultivated Arabica coffee accessions: 12 with resequencing data from previous research and 49 resequenced in this study. Single nucleotide polymorphisms (SNPs) and insertion–deletions (InDels) were characterized statistically. Based on the SNP variation, genetic structure analysis, phylogenetic tree construction, and principal component analysis were performed for the 61 accessions. A total of 805.46 Gb of raw whole-genome resequencing data was obtained from the 61 accessions, of which 781.29 Gb remained as high-quality data after filtering. In total, 7,013,820 SNP sites and 1,074,329 InDel sites were detected. Across accessions, the average sequencing depth ranged from 6.69× to 19.35×, and the coverage ranged from 85.49% to 96.43%. The population genetic structure and phylogenetic analyses resolved the 61 accessions into four lineages, suggesting at least four ancestral genetic components. Catimor exhibited the highest genetic diversity and Geisha the lowest. Selective sweep analysis indicated that the genes under selection in Catimor included significantly more disease-resistance genes than those in the other coffee varieties. The genome resequencing data and genetic markers identified from these 61 cultivated Arabica coffee accessions provide insight into the genetic variation in Arabica coffee germplasm and will facilitate further genetic research.
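
As a rough illustration of the SNP-based principal component analysis mentioned above, the following sketch runs a PCA on a small genotype matrix. It is not the pipeline used in the study: the function name `genotype_pca`, the 0/1/2 alternate-allele coding, and the randomly generated toy genotypes are all assumptions made for illustration, and no real coffee data are involved.

```python
import numpy as np


def genotype_pca(genotypes, n_components=2):
    """PCA on a (samples x SNPs) matrix of alternate-allele counts.

    Hypothetical helper: assumes genotypes are coded 0, 1, or 2 and that
    there are no missing calls.
    """
    g = np.asarray(genotypes, dtype=float)

    # Drop monomorphic sites: a column with zero variance cannot be scaled
    # and carries no information about sample differences.
    polymorphic = g.std(axis=0) > 0
    g = g[:, polymorphic]

    # Centre and scale each SNP column to zero mean and unit variance.
    z = (g - g.mean(axis=0)) / g.std(axis=0)

    # Singular value decomposition of the standardised matrix.
    u, s, _ = np.linalg.svd(z, full_matrices=False)

    # Sample coordinates on the leading components, and the fraction of
    # total variance each of those components explains.
    coords = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return coords, explained[:n_components]


# Toy data only: two simulated groups of six samples each, with
# independently drawn allele frequencies at 200 SNPs.
rng = np.random.default_rng(0)
freq_a = rng.uniform(0.05, 0.95, size=200)
freq_b = rng.uniform(0.05, 0.95, size=200)
group_a = rng.binomial(2, freq_a, size=(6, 200))
group_b = rng.binomial(2, freq_b, size=(6, 200))
toy = np.vstack([group_a, group_b])

coords, explained = genotype_pca(toy)
print(coords.shape)  # one row per sample, one column per component
```

In a real analysis the matrix would be built from filtered variant calls, and samples clustering along the leading components would then be compared with the lineages suggested by the structure and phylogenetic analyses.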