Abstract

Background: Use of big data is becoming increasingly popular in medical research. Since big data-based projects differ notably from classical research studies, both in terms of scope and quality, a debate is apt as to whether big data require new approaches to scientific reasoning different from those established in statistics and philosophy of science.

Main text: The progressing digitalization of our societies generates vast amounts of data that also become available for medical research. Here, the big promise of big data is to facilitate major improvements in the treatment, diagnosis and prevention of diseases. An ongoing examination of the idiosyncrasies of big data is therefore essential to ensure that the field stays congruent with the principles of evidence-based medicine. We discuss the inherent challenges and opportunities of big data in medicine from a methodological point of view, particularly highlighting the relative importance of causality and correlation in commercial and medical research settings. We make a strong case for upholding the distinction between exploratory data analysis facilitating hypothesis generation and confirmatory approaches involving hypothesis validation. An independent verification of research results will be ever more important in the context of big data, where data quality is often hampered by a lack of standardization and structuring.

Conclusions: We argue that it would be both unnecessary and dangerous to discard long-established principles of data generation, analysis and interpretation in the age of big data. While many medical research areas may reasonably benefit from big data analyses, they should nevertheless be complemented by carefully designed (prospective) studies.

Highlights

  • The progressing digitalization of our societies generates vast amounts of data that become available for medical research

  • Regardless of the legitimacy of the latter assertions, it is indisputable that big data has pushed the boundaries in terms of data quality, analysis, manageability and interpretability as well

  • Since medical research classically proceeds through studies designed to answer specific questions, it is often hampered by high costs, long timescales, and insufficient sample sizes

Summary

Main text

Big data analysis may facilitate hypothesis generation

Scientific hypotheses arise through either deduction or induction.

Every hypothesis requires evaluation

Regardless of its origin, every newly derived hypothesis must stand up to empirical scrutiny. Even though this requirement applies to scientific research in general, it appears particularly appropriate for hypotheses arising from big data analysis because “a research finding is less likely to be true [...] when effect sizes are smaller; when there is a greater number and lesser preselection of tested relationships; where there is greater flexibility in designs, definitions, outcomes, and analytical modes”.

Well-designed observational studies are the best option for the evaluation of hypotheses

Both approaches are fundamentally different from the analysis of big data, which originate outside the researcher's control and lack approved quality.
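The quoted point about "a greater number and lesser preselection of tested relationships" can be illustrated numerically. The following is a minimal sketch, not taken from the article, and it rests on simplifying assumptions: all tested relationships are truly null, the tests are independent, and each uses the same significance level. The function names `family_wise_error_rate` and `simulate_false_positive_rate` are hypothetical and introduced here for illustration only.

```python
import random


def family_wise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive among n_tests independent
    tests of true null hypotheses, each run at significance level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


def simulate_false_positive_rate(alpha: float, n_tests: int,
                                 n_studies: int = 2000, seed: int = 0) -> float:
    """Monte Carlo estimate of the same quantity.

    Assumption: under a true null hypothesis a p-value is uniformly
    distributed on [0, 1), so each p-value is drawn with rng.random().
    """
    rng = random.Random(seed)
    studies_with_false_positive = 0
    for _ in range(n_studies):
        # A study "finds something" if any of its null tests is significant.
        if any(rng.random() < alpha for _ in range(n_tests)):
            studies_with_false_positive += 1
    return studies_with_false_positive / n_studies


if __name__ == "__main__":
    for n_tests in (1, 20, 100):
        exact = family_wise_error_rate(0.05, n_tests)
        simulated = simulate_false_positive_rate(0.05, n_tests)
        print(f"{n_tests:>3} tests: exact={exact:.3f}, simulated={simulated:.3f}")
```

Under these assumptions, the chance of at least one spurious "finding" grows quickly with the number of relationships tested, which is one reason hypotheses generated from exploratory analysis of large data sets call for independent confirmation.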

