F1000Research | VOL. 10

Sherlock: an open-source data platform to store, analyze and integrate Big Data for computational biologists

Publication Date Aug 10, 2022

Abstract

In the era of Big Data, data collection underpins biological research more than ever before. In many cases, this can be as time-consuming as the analysis itself. It requires downloading multiple public databases with differing data structures and, in general, spending days preparing the data before any biological question can be answered. Here, we introduce Sherlock, an open-source, cloud-based big data platform (https://earlham-sherlock.github.io/) to solve this problem. Sherlock fills this gap by giving computational biologists a way to store, convert, query, share and generate biological data, streamlining bioinformatics data management. The Sherlock platform offers a simple interface to big data technologies such as Docker and PrestoDB. Sherlock is designed to enable users to analyze, process, query and extract information from extremely complex and large data sets. Furthermore, Sherlock can handle differently structured data (interaction, localization, or genomic sequence) from several sources and convert them to a common optimized storage format, for example, the Optimized Row Columnar (ORC) format. This format lets Sherlock quickly and efficiently execute distributed analytical queries on extremely large data files and share datasets between teams. The Sherlock platform is freely available on GitHub, and contains specific loader scripts for structured data sources of genomics, interaction and expression databases. With these loader scripts, users can easily and quickly create and work with s...
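To illustrate why a columnar layout suits analytical queries, the following is a minimal sketch in plain Python. It is not Sherlock's loader code and does not write ORC files; it only mimics the idea of turning row-oriented records into per-column arrays so that a query reads just the columns it needs. The sample table, the protein identifiers (`protA`, `protB`, `protC`), the column names and the helper functions `rows_to_columns` and `partners_above` are all hypothetical.

```python
import csv
import io

# Hypothetical tab-separated interaction table; all values are made up.
RAW_TSV = """source\ttarget\tdatabase\tscore
protA\tprotB\tdb_a\t0.98
protA\tprotC\tdb_b\t0.71
protB\tprotC\tdb_a\t0.40
"""


def rows_to_columns(text):
    """Parse row-oriented TSV text into a column-oriented dict of lists."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    columns = {name: [] for name in reader.fieldnames}
    for row in reader:
        for name, value in row.items():
            columns[name].append(value)
    # Store scores as floats so numeric filters do not re-parse strings.
    columns["score"] = [float(v) for v in columns["score"]]
    return columns


def partners_above(columns, protein, min_score):
    """Return interaction partners of `protein` with score >= min_score.

    Only the 'source', 'target' and 'score' columns are read;
    the 'database' column is never touched by this query.
    """
    hits = []
    for src, tgt, score in zip(
        columns["source"], columns["target"], columns["score"]
    ):
        if score >= min_score:
            if src == protein:
                hits.append(tgt)
            elif tgt == protein:
                hits.append(src)
    return hits


columns = rows_to_columns(RAW_TSV)
print(partners_above(columns, "protA", 0.5))  # prints ['protB', 'protC']
```

In a real columnar format such as ORC, the same principle applies at the file level: a query engine can skip columns (and blocks of rows) that a query does not reference, which is one reason such formats are used for distributed analytical queries on large files.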

Concepts

JavaScript Object Notation; Specific File Formats; Big Data; Era Of Big Data; Big Data Technologies; Expression Databases; Data Platform; Open-source Platform; Large Data Sets; Extract Information
