Abstract

Developments in high-throughput sequencing (HTS) have produced an exponential increase in the amount of data generated by sequencing experiments, along with increases in the complexity of bioinformatics analysis reporting and in the types of data generated. These increases in the volume, diversity and complexity of the data and their analysis expose the need for a structured and standardized reporting template. BioCompute Objects (BCOs) provide the requisite support for communicating HTS data analysis, including support for workflow and data curation, accessibility and reproducibility. BCOs standardize how researchers report provenance and the established verification and validation protocols used in workflows, while also being robust enough to convey content integration or curation in knowledge bases. BCOs that encapsulate tools, platforms, datasets and workflows are FAIR (findable, accessible, interoperable and reusable) compliant. Providing operational workflow and data information facilitates interoperability between platforms and the incorporation of future datasets within an HTS analysis for use in industrial, academic and regulatory settings. Cloud-based platforms, including High-performance Integrated Virtual Environment (HIVE), Cancer Genomics Cloud (CGC) and Galaxy, support BCO generation for users. Given the combined userbase of 100K+ across these platforms, BioCompute can be leveraged for workflow documentation. In this paper, we report the availability of platform-dependent and platform-independent BCO tools: HIVE BCO App, CGC BCO App, Galaxy BCO API Extension and BCO Portal. Community engagement was used to evaluate tool efficacy. We demonstrate that these tools advance BCO creation beyond the text-editing approaches used in earlier releases of the standard.
Moreover, we demonstrate that integrating BCO generation within existing analysis platforms greatly streamlines BCO creation while capturing granular workflow details. We also demonstrate that the BCO tools described in the paper provide an approach to solve the long-standing challenge of standardizing workflow descriptions that are both human and machine readable while accommodating manual and automated curation with evidence tagging. Database URL: https://www.biocomputeobject.org/resources

Highlights

  • The availability of high-throughput sequencing (HTS) data, referred to as next-generation sequencing (NGS) data, is growing at exponential rates due to decreasing costs to generate, store and analyze NGS data

  • This paper reports the availability of platform-independent and platform-dependent BioCompute Object (BCO) creation tools and the evaluation of these tools through a novel research initiative, with novice bioinformaticians as the initial audience for the platform-independent tool (BCO Portal) and the platform-dependent tools (HIVE BCO App and Cancer Genomics Cloud (CGC) BCO App)

  • We report on the efficacy with which novice bioinformaticians can generate BCOs after completing platform-specific trainings (HIVE and CGC)

Introduction

The availability of high-throughput sequencing (HTS) data, referred to as next-generation sequencing (NGS) data, is growing at exponential rates due to decreasing costs to generate, store and analyze NGS data. Bioinformatics methods in support of NGS analysis are evolving rapidly: every day novel algorithms are published, researchers generate new interpretations and applications for existing workflows, and regulatory sponsors submit data and analyses to regulatory bodies as evidence for maintenance and review. Platforms such as the High-performance Integrated Virtual Environment (HIVE) [1, 2], Cancer Genomics Cloud (CGC) [3] and Galaxy [4] contain robust infrastructure to support this research, from the start of an analysis through summarizing results to the validation checks that ensure workflow reproducibility.
