Abstract

DNA fragment assembly has become an important computational problem because of the structure and volume of the data involved. It is therefore important to develop algorithms that produce high-quality results while using computing resources efficiently. This article introduces such an algorithm, based on graph theory. We first determine the overlaps between DNA fragments, which give the edges of a directed graph; we then use this information to build an adjacency list with some particularities. From the adjacency list, we obtain the DNA contigs (groups of assembled fragments that form a contiguous element). We performed a set of experiments on real DNA data and compared our results with those of two common assemblers, Edena and Velvet. Finally, we searched the original genome for the contigs produced by our algorithm and by Edena and Velvet.
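The pipeline described above (overlaps, directed graph, adjacency list, contigs) can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes exact suffix-prefix overlaps, error-free fragments from a single strand, and a greedy walk that always follows the longest unused overlap. The names `overlap`, `build_adjacency`, `assemble_contigs` and the `min_len` threshold are hypothetical.

```python
from typing import Dict, List, Tuple


def overlap(a: str, b: str, min_len: int) -> int:
    """Longest suffix of `a` equal to a prefix of `b`; 0 if shorter than min_len."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def build_adjacency(fragments: List[str], min_len: int) -> Dict[int, List[Tuple[int, int]]]:
    """Adjacency list: node i -> [(j, overlap length)], longest overlap first."""
    adj: Dict[int, List[Tuple[int, int]]] = {i: [] for i in range(len(fragments))}
    for i, a in enumerate(fragments):
        for j, b in enumerate(fragments):
            if i != j:
                k = overlap(a, b, min_len)
                if k:
                    adj[i].append((j, k))
        adj[i].sort(key=lambda edge: -edge[1])
    return adj


def assemble_contigs(fragments: List[str], min_len: int = 3) -> List[str]:
    """Greedy walk (an assumption of this sketch): follow the best unused edge."""
    adj = build_adjacency(fragments, min_len)
    has_incoming = {j for edges in adj.values() for j, _ in edges}
    # Start from nodes with no incoming edge, then cover any remaining nodes.
    starts = [i for i in adj if i not in has_incoming]
    order = starts + [i for i in adj if i in has_incoming]
    used = set()
    contigs = []
    for start in order:
        if start in used:
            continue
        used.add(start)
        contig, node = fragments[start], start
        while True:
            nxt = next(((j, k) for j, k in adj[node] if j not in used), None)
            if nxt is None:
                break
            j, k = nxt
            contig += fragments[j][k:]  # append only the non-overlapping tail
            used.add(j)
            node = j
        contigs.append(contig)
    return contigs


contigs = assemble_contigs(["CGGAT", "ACCGTC", "GTCGGA"], min_len=3)
```

The all-pairs overlap search is quadratic in the number of fragments, so a sketch like this is only suitable for toy inputs; real assemblers index the fragments first.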

Highlights

  • Each monomer comprising the DNA polymer is formed with a pentose, a phosphate group and one of four nitrogenous bases: adenine, guanine, cytosine and thymine

  • The direction of the polymer chain is determined by the pentose carbon atoms 5′ and 3′

  • One paradigm for DNA fragment assembly using overlapping fragments is based on a graph


Summary

Introduction

Each monomer of the DNA polymer consists of a pentose, a phosphate group and one of four nitrogenous bases: adenine, guanine, cytosine and thymine. James Watson and Francis Crick [1] discovered the double-helix spatial structure of the DNA molecule. This double chain is coiled around a single axis, and the strands are held together by hydrogen bonds between pairs of opposite bases. Sanger et al. [4,5] proposed splitting the DNA sequences at random points, a method known as the shotgun technique. Its disadvantage is that the order of the fragments is unknown, which makes reassembly an NP-complete (NP: nondeterministic polynomial time) [6] computational problem.
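To make the loss of ordering concrete, the following sketch simulates the shotgun idea on a toy sequence. It assumes that several identical copies of the sequence are each cut at independent random points and that the resulting fragments are pooled in arbitrary order; the function name `shotgun` and its parameters are hypothetical, and sequencing errors and the complementary strand are ignored.

```python
import random
from typing import List


def shotgun(sequence: str, copies: int, cuts_per_copy: int, seed: int = 0) -> List[str]:
    """Cut `copies` copies of `sequence` at random points and pool the pieces."""
    rng = random.Random(seed)  # seeded so the example is reproducible
    fragments: List[str] = []
    for _ in range(copies):
        # Distinct interior cut positions for this copy.
        points = sorted(rng.sample(range(1, len(sequence)), cuts_per_copy))
        bounds = [0] + points + [len(sequence)]
        fragments.extend(sequence[a:b] for a, b in zip(bounds, bounds[1:]))
    rng.shuffle(fragments)  # positional information is lost at this step
    return fragments


genome = "ACCGTCGGATTACGGCTA"
reads = shotgun(genome, copies=3, cuts_per_copy=2)
```

Because each copy is cut at different positions, fragments from different copies overlap, and those overlaps are the only information left for recovering the original order.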

Generalities
The Shotgun Technique
Pair Generation
Adjacency List
Specific Characteristics
Contigs
Contig Assembly
Experiments
Findings
Conclusions