Abstract

Summary: The Sun Grid Engine (SGE) high-performance computing batch queueing system is commonly used in bioinformatics analysis. Creating re-usable scripts for the SGE is a common challenge. The qsubsec template language and interpreter described here allow researchers to easily create generic template definitions that encapsulate a particular computational job, effectively separating the process logic from the specific run details. At submission time, the generic template is filled in with specific values. This system provides an intermediate level between simple scripting and complete workflow management tools.

Availability and implementation: Qsubsec is open-source and is available at https://github.com/alastair-droop/qsubsec.

Contact: a.p.droop@leeds.ac.uk

Supplementary information: Supplementary data are available at Bioinformatics online.

Highlights

  • High-performance computing (HPC) is fast becoming an essential part of all but the smallest bioinformatics analyses

  • Workflow management systems attempt to be complete data management and integration tools. They allow users to search online data repositories, download relevant data and provide sets of common analysis tools (Deelman et al., 2009). These systems attempt to simplify common bioinformatics analyses, but there can be several major hurdles: installation and maintenance are not trivial, workflow definitions often require knowledge of languages such as CWL or SCUFL (Oinn et al., 2004), and the user is conceptually removed from the running code

  • I here describe qsubsec, a Python-based mini-language that separates the core logic of a computational task from the specific data for a single instance without the overhead of a workflow management system

Introduction

High-performance computing (HPC) is fast becoming an essential part of all but the smallest bioinformatics analyses. Workflow management systems attempt to be complete data management and integration tools: they allow users to search online data repositories, download relevant data and provide sets of common analysis tools (Deelman et al., 2009). These systems attempt to simplify common bioinformatics analyses, but there can be several major hurdles: installation and maintenance are not trivial, workflow definitions often require knowledge of languages such as CWL (https://github.com/common-workflow-language/common-workflow-language) or SCUFL (Oinn et al., 2004), and the user is conceptually removed from the running code. I here describe qsubsec, a Python-based mini-language that separates the core logic of a computational task from the specific data for a single instance without the overhead of a workflow management system. This enables users to write SGE job scripts in a generic form that is processed at submission time into a specific computational task.
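The separation of generic job logic from run-specific values can be illustrated with a minimal sketch in plain Python. This is not qsubsec syntax and does not use the qsubsec interpreter; it only uses the built-in `str.format` to show the general idea. The template text, the `render_job` helper, the sample file names and the `logs` directory are all hypothetical names chosen for illustration.

```python
# Illustrative only: plain Python string formatting, NOT qsubsec syntax.
# The template, helper name, and sample values below are assumptions.

# Generic job definition: the process logic, with named placeholders
# standing in for the run-specific details.
JOB_TEMPLATE = """#!/bin/bash
#$ -N {job_name}
#$ -cwd
#$ -o {log_dir}
#$ -e {log_dir}
gzip -c {input_file} > {input_file}.gz
"""


def render_job(template: str, **values: str) -> str:
    """Fill every {placeholder} in the template.

    Raises KeyError if a placeholder has no matching value, so an
    incomplete job is caught before it would be submitted.
    """
    return template.format(**values)


# Specific run details, supplied only at "submission time".
samples = ["sampleA.fastq", "sampleB.fastq"]
scripts = [
    render_job(
        JOB_TEMPLATE,
        job_name=f"compress_{index}",
        log_dir="logs",
        input_file=name,
    )
    for index, name in enumerate(samples)
]

for script in scripts:
    print(script)
```

Each rendered string is a complete SGE job script for one input file, while the template itself is written once and never mentions a specific sample.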
