Abstract

Among buzzwords, “big data” is a term and concept that is being hotly debated and is rapidly becoming an essential tool in the care of our patients. The idea of big data has been discussed for more than a decade, and its use is continuously being redefined. Basically, in health care, big data is the use of data that are too big and too cumbersome for health care providers to process with existing tools and technologies.

The following 6 Vs are attributes that are commonly used to define, explain, and describe the concept of big data:

- value (relevance of the data);
- variability (evolution and seasonality of diseases);
- variety (data from different categories, taxonomies, and data sources);
- volume (quantity of data and high-throughput technologies);
- velocity (speed of processing and generation of new data);
- veracity (quality of data).

For example, an ever-growing number of companies are offering genetic testing to both health care providers and the public, and it is important to put such output into the perspective of big data. What is the value, variability, variety, volume, velocity, and veracity of available genetic testing?

Making sense of available health care data that may soon reach an output measured in zettabytes (10^21 bytes) or even yottabytes (10^24 bytes) is an impossible task unless we develop and embrace new data management technologies. Data are continuously generated by real-time imaging (for example, cardiovascular magnetic resonance imaging), point-of-care devices, and various and sundry mobile and wearable devices. Advances in technology, including the ability to detect even minute processes such as metabolic signaling, will generate data that have never been seen before and that will result in the development of yet unheard of therapeutic agents (ref. 1: Kim T, Hyeon T. Applications of inorganic nanoparticles as therapeutic agents. Nanotechnology. 2014;25:012001).

Health care professionals soon will be able to decode and interpret real-time patient data that may include an oral microbiome that will denote a state of health or disease; provide genomic, proteomic, transcriptomic, and metabolomic data to be used in pharmacogenomics, as well as for precision or personalized oral health care; and suggest specific dental materials and other treatment modalities that can interact directly with a patient’s own tissues.

The reason for using big data in health care is to provide better, more efficient, and more evidence-based clinical care (care that answers clinical questions that are supported by observational evidence). Does big data create a hypothesis or will a hypothesis create big data? Having large data sets invites searches for statistically significant findings, which can result in a retrospective hypothesis or a post hoc analysis, one created after analyzing results. Unfortunately, commonly used statistical methods are not good at delineating significant findings from large amounts of data, as large amounts of data almost always will result in some kind of statistical significance. Thus, health care professionals need to be able to sift through and be selective when choosing which particular data set to use. Present algorithms may not be sufficient.

The advantage of using big data is the generation of predictive disease models for both chronic and acute conditions that can be made on the basis of voluminous patient information, sometimes even in real time. One pitfall of using big data is not being mindful of the gravitational pull of larger data sets, which will overwhelm significant and important information from smaller data sets.

Translational genomics already have helped better identify subtypes of different cancers and subsequently improved treatment.
For example, targeted therapies (treatments that take advantage of gene changes associated with the development of specific cancers) have shown great promise for better outcomes in patients with breast cancer (ref. 2: Murphy CG, Morris PG. Recent advances in novel targeted therapies for HER2-positive breast cancer. Anticancer Drugs. 2012;23:765-776). In other areas, pharmacogenetic-guided anticoagulation dosing with warfarin has shown greater effectiveness and safety (ref. 3: Maitland-van der Zee AH, Daly AK, Kamali F, et al. Patients benefit from genetics-guided coumarin anticoagulant therapy. Clin Pharmacol Ther. 2014;96:15-17).

The Human Genome Project (ref. 4: National Human Genome Research Institute. All about the Human Genome Project. Available at: www.genome.gov/10001772. Accessed September 11, 2015), the Research Collaboratory for Structural Bioinformatics Protein Data Bank (ref. 5: Research Collaboratory for Structural Bioinformatics (RCSB) Protein Data Bank. Available at: www.rcsb.org/pdb/home/home.do. Accessed September 11, 2015), and the Human Metabolome Database (ref. 6: The Human Metabolome Database. Available at: www.hmdb.ca. Accessed September 11, 2015) are 3 large databases that provide new insights into a person’s susceptibility to specific diseases and conditions and may be able to help clinicians discern in more detail disease etiology, prevention, treatment, and cures (ref. 7: Taylor JC, Martin HC, Lise S, et al. Factors influencing success of clinical genome sequencing across a broad spectrum of disorders. Nat Genet. 2015;47:717-726). Such enormous data sets enable better understanding of complex disease patterns and facilitate the discovery of novel and clinically useful biomarkers.

Big data also will change the way we define and diagnose oral diseases, including periodontal diseases, inflammatory and immunologic pathologies, and even cancers. Using different -omic markers will result in the recognition that oral and oropharyngeal cancers actually are several different diseases with different causes, treatments, and cure rates (ref. 8: Glick M, Johnson NW. Oral and oropharyngeal cancer: what are the next steps? JADA. 2011;142:892-894). The availability of better and more data also eventually will result in more insight into the pathogenesis and biological pathways of many commonly occurring diseases, such as diabetes and cardiovascular diseases, which will assist in better surveillance and health outcomes. The use of genomic, transcriptomic, proteomic, and metabolomic data, together with physiologic monitoring, will create an integrative personal -omics profile that can enhance our understanding of a person’s overall health and disease status far beyond today’s commonly used screening and diagnostic tools (ref. 9: Chen R, Mias GI, Li-Pook-Than J, et al. Personal omics profiling reveals dynamic molecular and medical phenotypes. Cell. 2012;148:613-624; ref. 10: Glick M. Personalized oral health care: providing “-omic” answers to oral health care queries. JADA. 2012;143:102-104). How to effectively use this enormous volume of data for clinical care poses an interesting and ambitious challenge.
Already, a third term used to describe an experimental model has joined the commonly used terms in vivo and in vitro: in silico, meaning “performed on computer or via computer simulation” (ref. 11: In silico. Available at: https://en.wikipedia.org/wiki/In_silico. Accessed September 11, 2015).

One opportunity that has not been fully realized is the use and data mining of multiple electronic health records (EHRs), a source of big data, that can communicate with each other. Unfortunately, although there are numerous appropriate choices for dental EHRs, few can interact in a meaningful way with medical databases. The inability to effectively gather and exchange information within the entire health care system results in inefficiencies and much higher medical costs. The proliferation and breaches of EHRs have highlighted the need to develop better and more robust policies that will protect personal medical records. Another important issue that needs better delineation is the ownership and appropriate use of personal health-related data, especially with the growth in personal health apps and devices that can record a person’s real-time data and send that data to the user’s health care providers. In this era of big data, this issue will become an even more onerous task.

Opportunities to improve health and better serve patients through the use of big data are rapidly emerging. Dentistry needs to increase its involvement, and oral health care professionals need to proactively contribute information to databases that are being used to determine and assess health outcomes. If not, as the saying goes, “If you are not at the table, you will be part of the menu.”

Dr. Glick is a professor and the William M. Feagans Chair, School of Dental Medicine, University at Buffalo, The State University of New York, Buffalo, NY. He also is the editor of The Journal of the American Dental Association.
