Abstract

Simple Sequence Repeats (SSRs) are abundant in genome sequences and have become popular biomarkers for genetic studies. Several SSRs have been shown to be essential for gene regulation, and abnormal repeat patterns in these critical SSRs may cause lethal diseases. Next Generation Sequencing (NGS) technologies provide efficient approaches for detecting SSR polymorphism. However, previous approaches to identifying SSR markers could not avoid inefficient, manually curated processes. An automatic and efficient system is proposed for detecting polymorphic SSRs at genomic scale without manual curation or examination. The workflow accepts multiple NGS datasets and starts with assembly by either de novo or reference mapping approaches. Consensus sequences are then obtained from the assembled contigs, and the coordinates of each individual contig are calibrated by alignment to the selected reference sequences. Next, an SSR mining mechanism retrieves all potential polymorphic SSRs, that is, SSRs whose repeat patterns differ between samples because of insertions or deletions. Datasets from the 1000 Genomes Project trios were used as test sequences, and the CODIS SSR markers and 9 well-known disease-related SSR motifs served as verification targets. The results show that the proposed method can identify known polymorphic SSRs as well as novel SSR markers, provided that the consensus sequences contain no sequencing or mapping errors. By using NGS technologies to identify SSR polymorphism, the proposed method accelerates related research and facilitates novel SSR biomarker selection and regulatory element discovery.
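
The SSR mining step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the regular-expression scan, the function names `find_ssrs` and `polymorphic_ssrs`, the thresholds (motif length 2 to 6, at least 3 repeats), and the use of the left flanking bases as a stand-in for calibrated reference coordinates are all assumptions made for illustration. The sketch also assumes error-free consensus sequences for the same locus, one per sample.

```python
import re


def find_ssrs(seq, min_unit=2, max_unit=6, min_repeats=3):
    """Return (start, motif, repeat_count) for perfect tandem repeats in seq.

    Hypothetical helper: a lazy capture group finds the shortest repeat unit,
    and the backreference requires it to recur at least min_repeats times.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        hits.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return hits


def polymorphic_ssrs(consensus_by_sample, flank=5):
    """Report SSRs whose repeat count differs between samples.

    Assumption: an SSR locus is identified by its motif plus the `flank`
    bases to its left, instead of by calibrated reference coordinates.
    """
    counts = {}
    for sample, seq in consensus_by_sample.items():
        for start, motif, n in find_ssrs(seq):
            key = (seq[max(0, start - flank):start].upper(), motif)
            counts.setdefault(key, {})[sample] = n
    # Keep only loci where at least two samples disagree on the repeat count.
    return {k: v for k, v in counts.items() if len(set(v.values())) > 1}
```

For a trio, `consensus_by_sample` would map the sample names (for example `"father"`, `"mother"`, `"child"`) to their consensus sequences. A sample in which the repeat falls below `min_repeats` is simply absent from the reported counts, so a production pipeline would need coordinate-based matching to handle that case.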
