Abstract
RNA aptamers are small oligonucleotide molecules whose composition and resulting folded structure enable them to bind target ligands with high affinity and selectivity; they therefore hold great promise as therapeutic drugs. Functional aptamers are selected from a large, randomized initial library in a process known as SELEX (systematic evolution of ligands by exponential enrichment), an iterative process involving numerous rounds of binding, elution, and amplification against a specific target substrate. Each iteration, or round of selection, enriches for the species with the highest binding affinity to the target. After multiple rounds, we ideally have an enriched aptamer library suitable for subsequent investigation. Modern techniques employ massively parallel sequencing, enabling the generation of large libraries (~10^6 sequences) in a matter of hours for each round of selection. As RNA is single-stranded, covariance models (CMs) are ideal for representing motifs in its secondary structure, allowing us to discover patterns within functional aptamer populations following each round. CMs are implemented in Infernal, a program that infers RNA alignments based on RNA sequence and structure. Calibrating a single CM in Infernal can take several hours and is a significant performance bottleneck for our work. However, because each CM calculation is independent and requires defined processing and memory resources, computing them in parallel offers a potential solution to this problem. In this paper, we describe using the Open Science Grid (OSG) to facilitate the identification of aptamer motifs by running CM calibrations and refinements in parallel across up to ten OSG clients. We use the Simple API for Grid Applications (SAGA) to interface with OSG and manage job submissions and file transfers.
Our results show a significant speedup when the calibrations are run in parallel, constrained by the typical latencies and quality of service (QoS) associated with nominal OSG usage. Our work demonstrates the ability of SAGA and the OSG to assist in parallelizing solutions to complex sequencing-based biomedical challenges.
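The fan-out pattern described above, one independent calibration per CM with at most ten running at once, can be illustrated with a minimal local sketch. This is not the SAGA/OSG workflow from the paper: a Python thread pool stands in for the grid clients, and the names `build_command`, `run_local`, `calibrate_all`, and `MAX_WORKERS` are hypothetical. The only external piece assumed is Infernal's `cmcalibrate` command, invoked on one CM file at a time.

```python
from concurrent.futures import ThreadPoolExecutor
import subprocess

# Assumption: mirrors the "up to ten OSG clients" mentioned in the abstract.
MAX_WORKERS = 10


def build_command(cm_path):
    """Return the Infernal calibration command for a single CM file."""
    return ["cmcalibrate", cm_path]


def run_local(command):
    """Hypothetical runner: execute one calibration locally.

    Returns the process exit code. Requires Infernal to be installed;
    in a grid setting this would be replaced by a job submission.
    """
    return subprocess.run(command, check=False).returncode


def calibrate_all(cm_paths, runner=run_local, max_workers=MAX_WORKERS):
    """Run one independent calibration per CM, at most max_workers at a time.

    Returns a dict mapping each CM path to the value returned by runner.
    """
    if not cm_paths:
        return {}
    commands = [build_command(path) for path in cm_paths]
    workers = min(max_workers, len(commands))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with cm_paths.
        exit_codes = list(pool.map(runner, commands))
    return dict(zip(cm_paths, exit_codes))
```

Because the calibrations share no state, the runner can be swapped for a remote submission function without changing `calibrate_all`. The achievable speedup then depends on queueing and transfer latencies rather than on this scheduling logic.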