Abstract

High-throughput sequencing continues to produce an immense volume of information that is processed and assembled into mature sequence data. Data analysis tools are urgently needed that use the embedded DNA sequence polymorphisms, and the consequent changes to restriction sites or sequence motifs, in a high-throughput manner to enable biological experimentation. CisSERS was developed as a standalone open-source tool to analyze sequence datasets and provide biologists with individual or comparative genome organization information in terms of the presence and frequency of patterns or motifs such as restriction enzyme sites. Predicted agarose gel visualization of the custom analysis results was also integrated to enhance the usefulness of the software. CisSERS offers several novel functionalities, such as handling of large and multiple datasets in parallel, multiple restriction enzyme site detection, and custom motif detection, all of which are integrated with real-time agarose gel visualization. CisSERS takes a simple FASTA-formatted file as input and draws on the REBASE enzyme database. Results from CisSERS enable the user to make decisions when designing genotyping-by-sequencing experiments, reduced representation sequencing, 3’UTR sequencing, and cleaved amplified polymorphic sequence (CAPS) molecular markers for large sample sets. CisSERS is a Java-based graphical user interface built around a Perl backbone. Several applications of CisSERS, including CAPS molecular marker development, were successfully validated using wet-lab experimentation. Here, we present CisSERS together with results from in-silico and corresponding wet-lab analyses demonstrating that CisSERS is a technology platform that facilitates efficient data utilization in genomics and genetics studies.
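
To make the restriction-site and CAPS idea concrete, the following is a minimal Python sketch and not CisSERS code. It locates a recognition site in a linear sequence and predicts fragment lengths in the order they would be read from the top of a gel (largest first). The function names `find_sites` and `digest` and the two toy alleles are hypothetical; the EcoRI site GAATTC, cut after the first G, is used as the example enzyme.

```python
def find_sites(sequence, site):
    """Return 0-based start positions of every occurrence of `site` (overlaps allowed)."""
    sequence = sequence.upper()
    site = site.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


def digest(sequence, site, cut_offset):
    """Cut a linear sequence at each site; return fragment lengths, largest first.

    `cut_offset` is the number of bases from the start of the site to the cut
    on the top strand (1 for EcoRI, G^AATTC).
    """
    cuts = [p + cut_offset for p in find_sites(sequence, site)]
    bounds = [0] + cuts + [len(sequence)]
    lengths = [b - a for a, b in zip(bounds, bounds[1:])]
    return sorted(lengths, reverse=True)


# Two toy alleles (assumed data): a single C -> T change removes the EcoRI site
allele_a = "ATGCGTACGAATTCGGCTAAGCTTACGT"
allele_b = "ATGCGTACGAATTTGGCTAAGCTTACGT"

fragments_a = digest(allele_a, "GAATTC", 1)  # site present: two fragments
fragments_b = digest(allele_b, "GAATTC", 1)  # site lost: one uncut fragment
print(fragments_a, fragments_b)
```

The differing fragment patterns of the two alleles are what a CAPS marker exploits: a polymorphism that creates or destroys a restriction site changes the banding pattern after digestion.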

Highlights

  • High-throughput sequencing technologies continue to generate vast amounts of information

  • The DNA sequence information is processed for quality and assembled into contigs resulting in the generation of mature sequence data that is subsequently utilized by biologists in wet lab experiments

  • While there are few canonical transcriptional start sites (TATAAT sites) associated with Nostoc genes (Nos7107_0081 hypothetical protein and Nos7101_1087 group 1 glycosyl transferase on the forward strand, and Nos7107_3714 hypothetical protein on the reverse strand), CisSERS can also search with degenerate nucleotide base codes, which increases the number of possible cis-element Pribnow box motifs identified (doi:10.1371/journal.pone.0152404.g003).
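
The degenerate-code search mentioned in the last highlight can be sketched as follows. This is an illustrative Python example, not the CisSERS implementation: it expands standard IUPAC nucleotide codes into a regular expression and scans both strands. The function names, the toy sequence, and the relaxed pattern `TANNNT` are assumptions chosen only for demonstration.

```python
import re

# Standard IUPAC nucleotide codes expressed as regex character classes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_regex(motif):
    """Compile a degenerate motif; the lookahead lets overlapping hits be found."""
    body = "".join(IUPAC[base] for base in motif.upper())
    return re.compile("(?=(" + body + "))")


def reverse_complement(seq):
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_motif(sequence, motif):
    """Return (strand, 0-based start in forward coordinates, matched text) tuples."""
    pattern = motif_to_regex(motif)
    seq = sequence.upper()
    hits = [("+", m.start(), m.group(1)) for m in pattern.finditer(seq)]
    rc = reverse_complement(seq)
    n, k = len(seq), len(motif)
    # Convert reverse-strand match positions back to forward-strand coordinates
    hits += [("-", n - m.start() - k, m.group(1)) for m in pattern.finditer(rc)]
    return hits


toy = "GGCTATAATGCCGTACATTGTAGC"  # assumed toy sequence, not Nostoc data
exact_hits = find_motif(toy, "TATAAT")       # canonical Pribnow box only
degenerate_hits = find_motif(toy, "TANNNT")  # relaxed pattern, illustrative
print(exact_hits)
print(degenerate_hits)
```

On this toy sequence the exact motif matches once, while the relaxed pattern also reports near-consensus candidates on both strands, which is the trade-off the highlight describes: more candidate sites at the cost of lower specificity.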

Introduction

High-throughput sequencing technologies continue to generate vast amounts of information. The DNA sequence information is processed for quality and assembled into contigs, yielding mature sequence data that biologists subsequently use in wet-lab experiments. Existing sequence data can be harnessed for nucleotide polymorphism information, for ascertaining genetic diversity in a population, and for reduced representation sequencing. The consequences of nucleotide polymorphisms are diverse. They may alter the phenotype if they change an amino acid or modify a regulatory region. In genomics and genetics approaches, biologists first endeavor to identify the polymorphic information and then use it to establish causal relationships between genotype and phenotype.
