Abstract

A data set consisting of DNA sequences from a large-scale shotgun DNA cloning and sequencing project has been collected and posted for public release. The purpose is to propose a standard genomic DNA sequencing data set against which various algorithms and implementations can be tested. The data set is divided into two subsets: one containing raw DNA sequence data (1023 clones) and the other containing the corresponding partially refined or edited DNA sequence data (820 clones). Suggested criteria or guidelines for this data refinement are presented so that algorithms for preprocessing and screening raw sequences may be developed. Development of such preprocessing, screening, aligning, and assembling algorithms will expedite large-scale DNA sequencing projects, so that complete, unambiguous consensus DNA sequences can be made available to the general research community more quickly. Smaller-scale routine DNA sequencing projects will also benefit greatly from such computational efforts.
