So, you want to sequence a genome...

Derek L Stemple

doi:10.1186/gb-2013-14-7-128

Abstract

Anyone who has attempted to identify the gene or mutation responsible for a disease or mutant phenotype will know the importance of an accurate reference genome assembly. For complex vertebrate genomes, however, generating such an assembly is not trivial, even with new sequencing technologies. In 2007, at a mid-point in the zebrafish genome sequencing project, I was asked to lead the project to completion. At that point we were faced with a highly fragmented physical assembly and lacked genetic maps of sufficient density and resolution to produce an accurate assembly. A high-quality reference genome assembly is generally made up of a large set of minimally overlapping large-insert genomic clones, each of which has been sequenced to completion, with a minimal number of gaps and with no artificially duplicated regions. These high-quality reference genome assemblies, such as the current human reference genome (http://www.genomereference.org), are essential for modern molecular genetic studies. For many species, however, only lower-quality whole-genome shotgun assemblies are available. Even when one considers only the protein-coding genes, for example, this quality of genome sequence is often not sufficient to determine the complete gene count or a comprehensive set of accurate gene models. For the best application of the genomic information, it is important that the reference genome be complete and accurately assembled. While high-throughput short-read sequencing using the current generation of machines will yield high-quality assemblies for bacterial and other small genomes, it is not possible to completely and accurately assemble the large, complex genomes of vertebrates without other long-range contiguity information. Experience with the zebrafish genome [1] may provide some useful guidance for anyone embarking on a genome-sequencing project for a new species with a complex genome.
The human, mouse and zebrafish reference genomes were assembled using old-school approaches, in which the long-range contiguity was derived from genetic or genomic mapping rather than directly from sequencing reads or read-pairs. The maps used were accurate physical maps of overlapping genomic DNA fragments or high-resolution genetic maps with a high density of short sequence markers, but such maps are expensive and time-consuming to generate. There are some good possibilities for cheaper, easier ways to generate accurate maps, but there are several issues to consider.

Journal: Genome Biology	Publication Date: Jan 1, 2013
Citations: 6	License type: NO-CC CODE
