Abstract

Background: Automated bioinformatics workflows are more robust, easier to maintain, and yield more reproducible results when built with command-line utilities than with custom-coded scripts. Command-line utilities have the further benefit of relieving bioinformatics developers of the need to learn, or to interact directly with, biological software libraries. There is, however, a lack of command-line utilities that leverage popular Open Source biological software toolkits such as BioPerl (http://bioperl.org) to make many of the well-designed, robust, and routinely used biological classes available to a wider base of end users.

Results: Designed as standard utilities for UNIX-family operating systems, BpWrapper makes the functionality of some of the most popular BioPerl modules readily accessible on the command line to novice as well as experienced bioinformatics practitioners. The initial release of BpWrapper includes four utilities with concise command-line user interfaces, bioseq, bioaln, biotree, and biopop, specialized for manipulation of molecular sequences, sequence alignments, phylogenetic trees, and DNA polymorphisms, respectively. Over a hundred methods are currently available as command-line options, and new methods are easily incorporated. Performance of BpWrapper utilities lags behind that of precompiled utilities but is equivalent to that of other utilities based on BioPerl. BpWrapper has been tested on BioPerl Release 1.6, Perl versions 5.10.1 to 5.25.10, and operating systems including Apple macOS, Microsoft Windows, and GNU/Linux. Release code is available from the Comprehensive Perl Archive Network (CPAN) at https://metacpan.org/pod/Bio::BPWrapper. Source code is available on GitHub at https://github.com/bioperl/p5-bpwrapper.

Conclusions: BpWrapper improves on existing sequence utilities by following the design principles of Unix text utilities, such as a concise user interface, extensive command-line options, and standard input/output for serialized operations. Further, dozens of novel methods for manipulation of sequences, alignments, and phylogenetic trees, unavailable in existing utilities (e.g., EMBOSS, Newick Utilities, and FAST), are provided. Bioinformaticians should find BpWrapper useful for rapid prototyping of workflows on the command line, without creating custom scripts, for comparative genomics and other bioinformatics applications.

Highlights

  • Automated bioinformatics workflows are more robust, easier to maintain, and yield more reproducible results when built with command-line utilities than with custom-coded scripts

  • There are two distinct approaches to developing a command-line pipeline for this purpose, both relying on biological Application Programming Interfaces (APIs) such as BioPerl and BioPython [3,4,5]

  • Whereas existing sequence utilities (e.g., European Molecular Biology Open Software Suite (EMBOSS), Newick Utilities, and Fast Analysis of Sequences Toolbox (FAST)) offer methods for manipulating sequences or Newick trees but not both, BpWrapper includes over a hundred methods for manipulating sequences, alignments, and phylogenetic trees, many of which are novel and not found in existing utilities

Introduction

Automated bioinformatics workflows are more robust, easier to maintain, and yield more reproducible results when built with command-line utilities than with custom-coded scripts. In one approach based on the BioPerl toolkit, one may compose a custom Perl script that calls the Bio::SeqIO module to read the nucleotide sequences and store them as Bio::Seq objects. The script then calls an external program (e.g., MUSCLE [6]) to align the protein sequences and produce a second temporary file, which is subsequently read back and turned into a Bio::SimpleAlign object. Finally, the script invokes the “aa_to_dna_aln()” method of the Bio::Align::Utilities module to produce the codon-based alignment from the nucleotide sequences stored as Bio::Seq objects and the protein alignment stored as a Bio::SimpleAlign object. Alternatively, one may use existing (or design new) command-line utilities for each of the above steps and accomplish the same task exclusively with commands on a Unix-like operating system, such as GNU/Linux or macOS.
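
The last step of the scripted approach, threading nucleotide sequences onto a protein alignment to obtain a codon-based alignment, can be illustrated with a minimal sketch. This is plain Python rather than BioPerl, and it is not the implementation of “aa_to_dna_aln()”; the function name `thread_codons`, the dictionary-based data layout, and the toy sequences are all assumptions made for illustration. It also assumes that each coding sequence is in frame and that `-` is the gap character.

```python
def thread_codons(protein_aln, cds):
    """Map each aligned protein row back onto its ungapped coding sequence.

    protein_aln: dict of sequence name -> gapped protein string ('-' = gap)
    cds:         dict of sequence name -> ungapped, in-frame nucleotide string
    Returns a dict of sequence name -> gapped codon-based alignment row.
    (Hypothetical helper; names and layout are illustrative only.)
    """
    codon_aln = {}
    for name, aligned_protein in protein_aln.items():
        dna = cds[name]
        n_residues = len(aligned_protein.replace("-", ""))
        if len(dna) < 3 * n_residues:
            raise ValueError(f"{name}: coding sequence too short for its protein")
        pos = 0  # current offset into the ungapped coding sequence
        row = []
        for residue in aligned_protein:
            if residue == "-":
                row.append("---")            # one protein gap becomes three nucleotide gaps
            else:
                row.append(dna[pos:pos + 3])  # copy the codon for this residue
                pos += 3
        codon_aln[name] = "".join(row)
    return codon_aln


# Toy input: two coding sequences and their (already aligned) translations.
cds = {"seq1": "ATGAAACCC", "seq2": "ATGCCC"}
protein_aln = {"seq1": "MKP", "seq2": "M-P"}

codon_aln = thread_codons(protein_aln, cds)
for name, row in codon_aln.items():
    print(f">{name}\n{row}")
```

In a real workflow the protein alignment would come from an aligner such as MUSCLE, and the sequences would be parsed from files rather than hard-coded; the sketch only shows why both the original nucleotide sequences and the protein alignment are needed as inputs.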

