Abstract

The chloroplast genome sequence of Coffea arabica L., the first sequenced member of the fourth largest family of angiosperms, Rubiaceae, is reported. The genome is 155 189 bp in length, including a pair of inverted repeats of 25,943 bp. Of the 130 genes present, 112 are distinct and 18 are duplicated in the inverted repeat. The coding region comprises 79 protein genes, 29 transfer RNA genes, four ribosomal RNA genes and 18 genes containing introns (three with three exons). Repeat analysis revealed five direct and three inverted repeats of 30 bp or longer with a sequence identity of 90% or more. Comparisons of the coffee chloroplast genome with sequenced genomes of the closely related family Solanaceae indicated that coffee has a portion of rps19 duplicated in the inverted repeat and an intact copy of infA. Furthermore, whole-genome comparisons identified large indels (> 500 bp) in several intergenic spacer regions and introns in the Solanaceae, including trnE (UUC)-trnT (GGU) spacer, ycf4-cemA spacer, trnI (GAU) intron and rrn5-trnR (ACG) spacer. Phylogenetic analyses based on the DNA sequences of 61 protein-coding genes for 35 taxa, performed using both maximum parsimony and maximum likelihood methods, strongly supported the monophyly of several major clades of angiosperms, including monocots, eudicots, rosids, asterids, eurosids II, and euasterids I and II. Coffea (Rubiaceae, Gentianales) is only the second order sampled from the euasterid I clade. The availability of the complete chloroplast genome of coffee provides regulatory and intergenic spacer sequences for utilization in chloroplast genetic engineering to improve this important crop.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call