Abstract
Accuracy of SNP-based whole-genome phylogeny reconstruction relies heavily on the quality of sequence alignment, which is particularly hindered by poorly assembled genomes. Alignment-free methods might provide additional insights. Here, we constructed a whole-genome phylogeny of 10 isolates from the current German E. coli outbreak against 30 existing E. coli genomes, as well as that of a historical EHEC isolate, using the alignment-free feature frequency profile method. Our results revealed a high similarity among E. coli isolates from the current outbreak, with the historical EHEC being the most closely related isolate sequenced thus far.
Highlights
Accuracy of SNP-based whole-genome phylogeny reconstruction relies heavily on the quality of sequence alignment, which is hindered by poorly assembled genomes
Our tree generally agrees with that reported by Konrad Paszkiewicz & Kat Holt, built using a SNP-based approach (15-06-2011, http://bacpathgenomics.wordpress.com/2011/06/15/snpbase-phylogeny-confirms-similarity-of-e-coli-outbreak-to-eaec-ec55989/), both revealing a high similarity among the outbreak isolates and 55989 being the most closely related isolate sequenced thus far
We should note that the genetic difference between 55989 and the outbreak isolates in our feature frequency profile (FFP) tree is greater than that in the tree built based on SNPs
Summary
Accuracy of SNP-based whole-genome phylogeny reconstruction relies heavily on the quality of sequence alignment, which is hindered by poorly assembled genomes. Genome sequences of 30 E. coli isolates were obtained from NCBI [...]. Features occurring more than 3 times in any of the isolates were removed.
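The alignment-free idea behind a feature frequency profile can be illustrated with a minimal Python sketch. It assumes that features are overlapping k-mers and that profiles are compared with the Jensen-Shannon divergence; the source states neither the feature length nor the distance measure, and all function names here are hypothetical. The filtering helper mirrors the removal of features occurring more than 3 times in any isolate.

```python
from collections import Counter
from math import log2


def kmer_counts(seq, k):
    """Count every overlapping k-mer (feature) in a sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def drop_frequent_features(count_tables, max_count=3):
    """Remove features whose count exceeds max_count in ANY table.

    The threshold of 3 follows the summary text; applying it to raw
    k-mer counts is an assumption of this sketch.
    """
    frequent = {f for table in count_tables
                for f, c in table.items() if c > max_count}
    return [Counter({f: c for f, c in table.items() if f not in frequent})
            for table in count_tables]


def frequency_profile(counts):
    """Normalise raw counts so that the frequencies sum to 1."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {f: c / total for f, c in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two profiles.

    Ranges from 0 (identical profiles) to 1 (no shared features).
    """
    div = 0.0
    for f in set(p) | set(q):
        pf, qf = p.get(f, 0.0), q.get(f, 0.0)
        m = (pf + qf) / 2
        if pf > 0:
            div += 0.5 * pf * log2(pf / m)
        if qf > 0:
            div += 0.5 * qf * log2(qf / m)
    return div


if __name__ == "__main__":
    # Toy sequences, not real genome data.
    genomes = {"a": "ACGTACGTTGCA", "b": "ACGTTCGTAGCA"}
    tables = drop_frequent_features([kmer_counts(s, 3) for s in genomes.values()])
    profiles = [frequency_profile(t) for t in tables]
    print(js_divergence(profiles[0], profiles[1]))
```

Pairwise divergences computed this way form a distance matrix, which a standard method such as neighbor joining could turn into a tree; that step is omitted here.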