The HTPmod Shiny application enables modeling and visualization of large-scale biological data

Dijun Chen, Kerstin Kaufmann, Ming Chen, Christian Klukas, Liang-Yu Fu, Dahui Hu

doi:10.1038/s42003-018-0091-x

Abstract

The wave of high-throughput technologies in genomics and phenomics is enabling data to be generated on an unprecedented scale and at a reasonable cost. Exploring the large-scale data sets generated by these technologies to derive biological insights requires efficient bioinformatic tools. Here we introduce an interactive, open-source web application (HTPmod) for high-throughput biological data modeling and visualization. HTPmod is implemented with the Shiny framework by integrating the computational power and professional visualization of R and including various machine-learning approaches. We demonstrate that HTPmod can be used for modeling and visualizing large-scale, high-dimensional data sets (such as multiple omics data) in a broad context. By reinvestigating example data sets from recent studies, we find not only that HTPmod can reproduce results from the original studies in a straightforward fashion and within a reasonable time, but also that novel insights may be gained from fast reinvestigation of existing data by HTPmod.

Highlights

The wave of high-throughput technologies in genomics and phenomics is enabling data to be generated on an unprecedented scale and at a reasonable cost
By reinvestigating example data sets from recent studies, we demonstrate that HTPmod can be used for modeling and visualizing multiple types of omics data in a broad context in a straightforward and efficient fashion
By integrating existing machine-learning approaches applied in high-throughput experiments[1,25,26], HTPmod was implemented with the Shiny framework, which combines the computational power of R with friendly and interactive web interfaces

Summary

Introduction

The wave of high-throughput technologies in genomics and phenomics is enabling data to be generated on an unprecedented scale and at a reasonable cost. The immense volume, variety, velocity, and veracity of the high-throughput biological data generated by these technologies make their analysis a big data problem[11,12,13]. Exploring the large-scale data sets generated by these technologies to derive biological insights requires efficient bioinformatic tools. We introduce an interactive, open-source web application (HTPmod) for high-throughput biological data modeling and visualization. By reinvestigating example data sets from recent studies, we demonstrate that HTPmod can be used for modeling and visualizing large-scale, high-dimensional data sets of multiple omics types (such as phenomics, transcriptomics, metabolomics, and epigenomics data) in a broad context in a straightforward and efficient fashion.

Methods

Results

Conclusion