Abstract

The rise of data mining and the flood of big data have ushered in a new era, an informatics age that will probably be the next revolution for humankind in the management of information and the generation of new knowledge in diverse fields. A special issue of Nature on big data storage, management and analysis raised the curtain on the big data era in 2008.1 Big data demonstrated its utility in the successful prediction of influenza via Google in 2009.2 Cancer is a heterogeneous disease per se, and although tumors arise through clonal expansion, heterogeneity is also known to be present within a single patient, which is understandable given that somatic mutations accumulate across individual cancer cells. Current technology for cancer biomarker discovery usually captures a snapshot, but not the dynamic and longitudinal changes of the cancer landscape. Big data can tackle this hurdle by gathering molecular features at the DNA, RNA, protein and metabolite levels. Biodata mining has led those working in the life sciences to embrace informatics in order to accommodate the deluge of data generated by next-generation biotechnologies, such as next-generation sequencing, proteomics and metabolomics, as well as the structured and unstructured biomedical and healthcare data from electronic health records. Life scientists are bringing together tremendous volumes of information from various high-throughput studies. A holistic approach to data informatics may facilitate the emergence of an era of precision medicine, which ultimately provides treatments tailored to each individual’s needs. Big data certainly challenges cancer research in various respects. The trend of integrating different platforms and their findings creates an intersection among different areas and dimensionalities, which is ultimately a step towards a holistic understanding of the disease.
The omics approach to the study of cancer is keeping up its fast pace.3 Next-generation sequencing, proteomics and metabolomics are developing rapidly in various directions. However, they also raise a difficult question about the existing practice of measuring cancer along a single dimension. A multi-dimensional approach to cancer research is urgently needed. On the other hand, there is a sea of data, not only huge in volume and complex in structure, but also vast in dynamic scale and depth. We foresee that a large amount of loosely connected, inherently noisy and heterogeneous data may also be gathered in the databases. Do all the data contain useful information? How can we judge whether the information is useful or junk? Certain standards, guidelines or harmonization may therefore be needed to consolidate useful data in the same database. Some international consortia and institutions have made efforts to set rules that guarantee transparency and traceability so as to ensure the usefulness of the data.4 Electronic health records are becoming a better resource for databases. The information can be exchanged and gathered via telemedicine and mobile connectivity. The security of patients’ information and data then becomes an issue, and more security measures need to be in place. We are in the precision medicine era, and meticulous decision making based on individual, detailed molecular profiles is a pressing need. Tumor molecular profiling has enabled the subclassification of cancer to redefine treatment regimens.
Big data and precision medicine are certainly sailing in the same boat.5 More and more databases are being established, e.g., the Catalogue of Somatic Mutations in Cancer (COSMIC), which is the largest database of somatic mutations and their effects on human cancer.4 Another example is the DREAM (Dialogue for Reverse Engineering Assessments and Methods) Consortium, which uses open-innovation crowdsourcing to identify top-performing computational methods for inferring genetic heterogeneity from next-generation sequencing data provided by a large multi-institutional community of cancer genomics projects, including the International Cancer Genome Consortium and The Cancer Genome Atlas.7 These huge databases may lead to a paradigm shift in cancer research, because researchers from any corner of the world can share and use the data for further research and analysis. It is envisioned that in the years to come, we will step into a revolutionized era of utilizing big data to benefit our cancer patients.
