Abstract

Cone snails (genus Conus) have attracted scientific interest because their venoms, which consist of complex mixtures of peptides known as conotoxins, have great neuropharmacological potential for treating chronic pain. For discovery purposes, we have carried out a survey of the venom ducts of 22 Conus species using next-generation high-throughput RNAseq (NGS). In silico analyses of these data are complicated because paralogous conotoxin precursors display both highly conserved and hypervariable regions. As a result, NGS-based discovery involves an inherent trade-off between fidelity of transcript assembly and sensitivity to novel sequences. On the one hand, overly lenient assembly parameters produce a few long but misassembled chimeric transcripts, which lessen the true discovery potential of NGS. On the other hand, overly stringent assembly parameters can mistake sequencing artifacts for novel discoveries. Moreover, many conotoxins likely remain undiscovered. This complicates homology-based discovery efforts using tools such as BLAST, because reference databases may lack homologous peptides, leading to false negative results. With these problems in mind, I developed a comprehensive pipeline for discovering conotoxins and their modification enzymes from high-throughput RNAseq data. My pipeline includes (1) simulation software for benchmarking purposes, (2) a ‘partial extension pipeline’ that employs a novel kmerization tool called Taxonomer to rapidly cluster and taxonomically classify reads prior to assembly, and (3) a discovery engine that can identify novel conotoxins even when they lack significant homologs. Collectively, my pipeline maximizes the discovery potential of Conus RNAseq data, identifying on average ~30% more full-length toxins per sample than any other approach in use today.
