Finding Anchors for Genomic Sequence Comparison

Ross A Lippert, Sorin Istrail, Clark Mobarry, Liliana Florea, Xiaoyue Zhao

doi:10.1089/cmb.2005.12.762

Abstract

Recent sequencing of the human and other mammalian genomes has made it necessary to align them in order to identify and characterize their commonalities and differences. Programs that align whole genomes generally use a seed-and-extend technique: they start from exact or near-exact matches, select a reliable subset of these, called anchors, and then fill in the remaining portions between the anchors using a combination of local and global alignment algorithms. Their parameter choices, however, have so far been primarily heuristic. We present a statistical framework and practical methods for selecting a set of matches that is both sensitive and specific and can constitute a reliable set of anchors for a one-to-one mapping of two genomes, from which a whole-genome alignment can be built. Starting from exact matches, we introduce a novel per-base repeat annotation, the Z-score, from which noise and repeat filtering conditions are explored. Dynamic programming-based chaining algorithms are also evaluated as context-based filters. We apply the methods described here to the comparison of two progressive assemblies of the human genome, NCBI build 28 and build 34 (www.genome.ucsc.edu), and show that a significant portion of the two genomes can be found in selected exact matches, with a very limited amount of sequence duplication.
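As a rough illustration of the seed-and-chain idea mentioned in the abstract, the following Python sketch finds exact k-mer matches between two short sequences and then keeps the longest collinear, non-overlapping chain using a simple quadratic dynamic program. It is not the authors' method: the function names `exact_kmer_matches` and `chain_matches`, the fixed seed length `k`, and the chain-length scoring are all assumptions made for illustration, and the Z-score repeat annotation is not modeled because the abstract does not define it.

```python
from collections import defaultdict


def exact_kmer_matches(a, b, k):
    """Return sorted (i, j) pairs such that a[i:i+k] == b[j:j+k].

    Hypothetical helper: a plain hash index over the k-mers of `a`.
    """
    index = defaultdict(list)
    for i in range(len(a) - k + 1):
        index[a[i:i + k]].append(i)
    matches = []
    for j in range(len(b) - k + 1):
        # .get() avoids creating empty entries in the defaultdict.
        for i in index.get(b[j:j + k], ()):
            matches.append((i, j))
    matches.sort()
    return matches


def chain_matches(matches, k):
    """Return the longest chain of collinear, non-overlapping matches.

    Assumes `matches` is sorted by (i, j). Uses an O(n^2) dynamic
    program that scores a chain by its number of matches (an
    illustrative choice, not taken from the paper).
    """
    n = len(matches)
    if n == 0:
        return []
    best = [1] * n    # best[t]: length of the best chain ending at match t
    prev = [-1] * n   # prev[t]: predecessor of match t in that chain
    for t in range(n):
        i_t, j_t = matches[t]
        for s in range(t):
            i_s, j_s = matches[s]
            # s may precede t only if it ends before t starts in both sequences.
            if i_s + k <= i_t and j_s + k <= j_t and best[s] + 1 > best[t]:
                best[t] = best[s] + 1
                prev[t] = s
    end = max(range(n), key=best.__getitem__)
    chain = []
    while end != -1:
        chain.append(matches[end])
        end = prev[end]
    chain.reverse()
    return chain


if __name__ == "__main__":
    seeds = exact_kmer_matches("AAACCCGGG", "AAATCCCTGGG", 3)
    print(chain_matches(seeds, 3))
```

In this sketch the chaining step acts as a context-based filter: a match that is inconsistent with the order of its neighbors, such as one arising from a rearrangement or a repeat, is left out of the chain.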

Journal: Journal of Computational Biology
Publication Date: Jul 1, 2005
Citations: 14
