Abstract

A pan-genome describes the full complement of genes in a species. It is the superset of all genes found in all individuals of that species, and it is composed of a 'core genome', containing genes present in all individuals, and a 'dispensable genome', containing genes present in only some individuals as well as individual-specific genes. In this study, 30 finished Escherichia coli genomes were analyzed from a pan-genome perspective to examine their gene and genome composition and evolution. The results indicated that core genes accounted for about 50% of the total number of genes, while each strain tested carried about 146 strain-specific genes. The data suggest that the E. coli pan-genome is vast and that unique genes will continue to be identified as more E. coli genomes are sequenced. After analyzing the relationships among gene conservation, GC content and selection pressure in the strains tested, we found that more conserved genes had a narrower range of GC content and were under stronger selection pressure. These results will help in better understanding the evolutionary profile of the E. coli genome and the dynamic changes in its gene composition. The E. coli pan-genome provides useful information for the prevention and control of diseases caused by pathogenic E. coli, and also provides a paradigm for the large-scale analysis of pathogenic bacterial genomes.
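The core/dispensable partition defined above can be illustrated with a minimal Python sketch. It assumes that genes have already been grouped into gene families and that each strain is represented by the set of families it carries. The function name, strain names and gene identifiers are hypothetical toy values, not data or methods from this study.

```python
from collections import Counter


def partition_pan_genome(strain_genes):
    """Split a pan-genome into core, dispensable and strain-specific genes.

    strain_genes maps a strain name to the set of gene-family identifiers
    found in that strain (hypothetical input format, assumed for this sketch).
    """
    if not strain_genes:
        raise ValueError("at least one strain is required")

    n_strains = len(strain_genes)
    # Count in how many strains each gene family occurs.
    counts = Counter(g for genes in strain_genes.values() for g in set(genes))

    pan = set(counts)                                    # every gene seen in any strain
    core = {g for g, c in counts.items() if c == n_strains}  # present in all strains
    dispensable = pan - core                             # present in only some strains
    # Strain-specific genes are the dispensable genes found in exactly one strain.
    strain_specific = {g for g in dispensable if counts[g] == 1}
    return pan, core, dispensable, strain_specific


# Toy example with three hypothetical strains.
toy_strains = {
    "strain_A": {"g1", "g2", "g3", "g4"},
    "strain_B": {"g1", "g2", "g5"},
    "strain_C": {"g1", "g2", "g3", "g6"},
}

pan, core, dispensable, strain_specific = partition_pan_genome(toy_strains)
print("pan-genome size:", len(pan))
print("core genes:", sorted(core))
print("dispensable genes:", sorted(dispensable))
print("strain-specific genes:", sorted(strain_specific))
```

In this toy input, g1 and g2 occur in every strain and form the core, while g3 is dispensable without being strain-specific because it occurs in two of the three strains.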
