Abstract
In the past few decades, we have witnessed tremendous advancements in biology, life sciences and healthcare. These advancements are due in no small part to the big data made available by various high-throughput technologies, ever-advancing computing power, and algorithmic advances in machine learning. In particular, big data analytics, including statistical and machine learning, has become an essential tool in these rapidly developing fields. As a result, the subject has drawn increased attention, and many review papers have been published in just the past few years. Unlike existing reviews, this work focuses on the application of systems engineering principles and techniques in addressing some of the common challenges in big data analytics for biological, biomedical and healthcare applications. Specifically, this review focuses on three key areas in biological big data analytics where systems engineering principles and techniques have played important roles: the principle of parsimony in addressing overfitting, the dynamic analysis of biological data, and the role of domain knowledge in biological data analytics.
Highlights
Massive quantities of data are being generated in biology, the life sciences, and healthcare industries and institutions. These data hold the promise of advancing our understanding of various biological systems and diseases, developing new biocatalysts and drugs, and delivering more affordable and effective patient care
We propose a knowledge-guided machine learning (ML) approach that defines model structures based on domain knowledge and hypotheses
Although there are applications where domain knowledge may appear to play a lesser role, we demonstrate with ample examples that, for analyzing biological big data, domain knowledge has played and will continue to play a significant role
Summary
Massive quantities of data are being generated in biology, the life sciences, and healthcare industries and institutions. These data hold the promise of advancing our understanding of various biological systems and diseases, developing new biocatalysts and drugs, and delivering more affordable and effective patient care. To get a big picture of the research in the biological big data analytics field, we conducted a search on the Web of Science using the exact phrase “big data” and any of the following words or phrases: biology, “life science”, healthcare, “health care”, biomedical, disease, and cancer. The search yields the journal articles on big data analytics in biology, life sciences and healthcare, and their citation numbers in the past decade. Given the fast-growing nature of the field, many review papers have been published in just the past few years. For ease of reading, these distinctions are largely ignored in this work.