"Talkin’ ‘Bout a Revolution": Considering Data-Driven Theorizing

Abstract

Most of the quantitative work in management and organizational psychology emphasizes a deductive theory testing approach. In this paper, we focus on data-driven theorizing as an alternative, complementary approach to deduction. The increasing availability of (big) data and sophisticated methods provides opportunities for data-driven theory building and refinement as another way to build knowledge and advance the field. We explain how using data-driven theorizing in responsible and transparent ways can inform knowledge creation and theorizing, and we discuss opportunities and challenges. We also give recommendations for authors, reviewers, editors, and the discipline aimed at increasing transparency and stimulating responsible data-driven theorizing as well as increasing openness to the explorative use of quantitative data in our field.

Similar Papers
  • Research Article
  • Cite count: 3
  • 10.52214/vib.v7i.8403
Legal Governance of Brain Data Derived from Artificial Intelligence
  • Jun 2, 2021
  • Voices in Bioethics
  • Mahika Ahluwalia

 Introduction
 With the rapid advancements in neurotechnological machinery and improved analytical insights from machine learning in neuroscience, the availability of big brain data has increased tremendously. Neurological health research is done using digitized brain data.[1] There must be adequate data governance to secure the privacy of subjects participating in brain research and treatments. If not properly regulated, the research methods could lead to significant breaches of the subject’s autonomy and privacy. This paper will address the necessity for neuroprotection laws, which effectively govern the use of big brain data to ensure respect for patient privacy and autonomy.
 Background
 Artificial intelligence and machine learning can be integrated with neuroscience big brain data to drive research studies. This integrative technology allows patterns of electrical activity in neurons to be studied in detail.[2] Specifically, it uses a robotic system which can reason, plan, and exhibit biologically intelligent behavior. Machine learning is a method of computer programming where the code can adapt its behavior based on big brain data.[3] Big brain data is the collection of large amounts of information for the purpose of deciphering patterns through computer analysis using machine learning.[4] The information that these technologies provide is extensive enough to allow a researcher to read a patient’s mind. AI and machine learning technologies work by finding the underlying structure of brain data, which is then described by patterns known as latent factors, eventually resulting in an understanding of the brain’s temporal dynamics.[5]
 Through these technologies, researchers are able to decipher how the human brain computes its performances and thoughts. However, due to the extensive and complex nature of the data processed through AI and machine learning, researchers may gain access to personal information a patient may not wish to reveal. From a bioethical lens, tensions arise in the realm of patient autonomy. Patients are not able to control the transmission of data from their brains that is analyzed by researchers. Governing brain data through laws may enhance the extent of patient privacy in the case where brain data is being used through AI technologies.[6] A responsible approach to governing brain data would require a sophisticated legal structure.
 Analysis
 Impact on Patient Autonomy and Privacy 
 In research pertaining to big brain data, the consent forms do not fully cover the vast amounts of information that are collected. According to research, personal data has become the most sought-after commodity to provide content to corporations and the web-based service industry. Unfortunately, data leaks that release private information frequently occur.[7] The storage of an individual’s data on technologies accessible on the internet during research studies makes it vulnerable to leaks, jeopardizing an individual’s privacy. These data leaks may cause the patient to be identified easily, as the information provided by AI technologies is personalized and may be decoded through brain fingerprinting methods.[8]
 There has been an extensive growth in the development and use of AI. It is efficient in providing information to radiologists who diagnose various diseases including brain cancer and psychiatric disease, and AI assists in the delivery of telemedicine.[9] However, the ethical pitfall of reduced patient autonomy must be addressed by analyzing current AI technologies and creating more options for patient preference in how the data may be used. For instance, facial recognition technology[10] commonly used in health care produces more information than listed in common consent forms, threatening to undermine informed consent. Facial recognition software collects extensive data and may disclose more information than a person would prefer to provide despite being a useful tool for diagnosing medical and genetic conditions.[11] In addition, people may not be aware that their images are being used to generate more clinical data for other purposes. It is difficult to guarantee the data is anonymized. Consent requirements must include informing people about the complexity of the potential uses of the data; software developers should maximize patient privacy.[12] Furthermore, there is a “human element” in the use of AI technologies as medical providers control the use and the extent to which data is captured or accessed through the AI technologies.[13] People must understand the scope of the technology and have clear communication with the physician or health care provider about how the medical information will be used. 
 Existing Laws for Brain Data Governance 
 A strict system of defined legal responsibilities of medical providers will ensure a higher degree of patient privacy and autonomy when AI technologies and data from machine learning are used. Governing specific algorithmic data is crucial in safeguarding a patient’s privacy and developing a gold standard treatment protocol following the procurement of the information.[14] Certain AI technologies provide more data than others, and legal boundaries should be established to ensure strong performance, quality control, and scope for patient privacy and autonomy. For instance, currently AI technologies are being used in the realm of intensive neurological care. However, there is a significant level of patient uncertainty about how much control patients have over the data’s uses.[15] Calibrated legal and ethical standards will allow important brain data to be securely governed and monitored.
 Once brain signals are recorded and processed from one individual, the data may be merged with other data in Brain Computer Interface Technology (BCI).[16] To ensure a right and ability to retrieve personal data or pull it from the collection, specific regulations for varying types of data are needed.[17] The importance of consent and patient privacy must be considered through giving patients a transparent view of how brain data is governed.[18] The legal system must address discriminatory issues and risks to patients whose data is used in studies. Laws like the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA) can serve as effective models to protect aggregated data. These laws govern consumer information and ensure compliance when personal data is collected.[19] California voters recently approved expansion of the CCPA to health data. The Washington Privacy Act, which would have provided rights to access, change, and withdraw personal data, failed to pass. Other states should improve privacy as well,[20] although a federal bill would be preferable. Scientists at the Heidelberg Academy of Sciences argue for data security to be governed in a manner that balances patient privacy and autonomy with the commercial interests of researchers.[21] The balance could be achieved through privacy protections like those in the Washington Privacy Act. 
Although the Health Insurance Portability and Accountability Act (HIPAA) provides an overall framework to reduce the likelihood of dangers to patient protection and privacy, more thorough laws are warranted to combat the pervasive data transfer and analysis that technology has brought to the health care industry.[22] Breaches of patient privacy under current HIPAA regulations include releasing patient information to a reporter without the patient’s consent and sending HIV data to a patient’s employer without consent.[23] HIPAA does not cover information being shared with outside contractors who do not have an agreement with technology companies to keep patient data confidential. HIPAA regulations also do not always address blatant breaches of patient data confidentiality.[24] Patients must be provided with methods to monitor the data being analyzed to be able to view the extent of private information being generated via AI technologies. In health research, the medical purposes of better diagnosis, earlier detection of diseases, or prevention are ethical justifications for the use of the data if it was collected with permission, the person understood and approved the uses of the data, and the data was deidentified.
 A standard governance framework is required to provide the fairest system of care to patients who allow their brain data to be examined. Informed consent in the neuroscience field could reaffirm the privacy and autonomy of patients by ensuring that they understand the type of information collected. Laws also could protect data after a patient’s death. Malpractice in the scope of brain data could give people a cause of action critical in safeguarding patients’ rights. Data breach lawsuits will become common but generally do not cover deidentified data that becomes part of big data collection. A more synchronized approach to the collection and consent process will encourage an understanding of how big data is used to diagnose and treat patients. Some altruistic people may even be more likely to consent if they know the large-scale data collection is helpful to treat and diagnose people. Others should have the ability to opt out of sharing neurological data, especially when there is no certainty surrounding deidentification.[25]
 Conclusion
 Artificial intelligence and machine learning technologies have the potential to aid in the diagnosis and treatment of people globally by extracting and aggregating brain data specific to individuals. However, the secure use of the data is necessary to build trust between care providers and patients, as well as in balancing the bioethical principles of beneficence and patient autonomy. We must ensure the highest quality of care to patients, while protecting their privacy, informed consent, and clinical trust. More sophis

  • Front Matter
  • Cite count: 47
  • 10.1016/j.ijinfomgt.2023.102661
Guest Editorial: Big data-driven theory building: Philosophies, guiding principles, and common traps
  • May 19, 2023
  • International Journal of Information Management
  • Arpan Kumar Kar + 2 more

  • Research Article
  • Cite count: 24
  • 10.1161/circoutcomes.116.003081
Data Science in Healthcare: Implications for Early Career Investigators.
  • Nov 1, 2016
  • Circulation: Cardiovascular Quality and Outcomes
  • Sanjeev P Bhavnani + 2 more

The confluence of science, technology, and medicine in our dynamic digital era has spawned new data applications to develop prescriptive analytics, to improve healthcare personalization and precision medicine, and to automate the reporting of health data for clinical decisions.[1] Data science in health care has seen recent and rapid progress along 3 paths: (1) through big data via the aggregation of large and complex data sets including electronic medical records, social media, genomic databases, and digitized physiological data from wireless mobile health devices[2]; (2) through new open-access initiatives that seek to leverage the availability of clinical trial, research, and citizen science data sources for data sharing[3]; and (3) in analytic techniques particularly for big data, including machine learning and artificial intelligence that may enhance the analyses of both structured and unstructured data.[4] As new data sets are created, analyzed, and become increasingly available, several key questions emerge including the following: What is the quality of unstructured data generation? Will the use of nonstandardized methods in data processing with traditional software and hardware lead to data fragmentation and analyses that are nonreproducible? Will healthcare systems incorporate and use big data especially from new publicly and patient-generated sources? How will physicians and researchers learn from new open-sourced data and big-data analytics? 
And ultimately, how can they acquire the skills to create a knowledge translation in data sciences?[5] Practicing in an era of continuous payment reform and decline in research funding, early career investigators are challenged to keep up with the accelerating pace of change in medicine, all while being expected to provide meaningful contributions through productive clinical, educational, and research experiences.[6] In this perspective, we aim to highlight how data science can catalyze professional advancement and discuss the implications of big data, open access, …

  • Research Article
  • Cite count: 9
  • 10.1089/big.2014.1522
Why Big Data = Big Deal.
  • Jun 1, 2014
  • Big Data
  • Vasant Dhar

  • Research Article
  • Cite count: 1
  • 10.30574/gjeta.2023.15.3.0114
Big data and data analytics in 5G mobile networks
  • Jun 30, 2023
  • Global Journal of Engineering and Technology Advances
  • Panagiotis Leliopoulos + 1 more

In this paper, we study the features of big data and data analytics and examine how big data contributes to mobile networks. We define big data, in general terms, as a large amount of digital data, and we estimate that the amount processed by big data systems will double every two years. Hence, big data on mobile networks needs to be analyzed in depth in order to retrieve interesting and useful information. Big data provides unprecedented opportunities for internet service providers to understand the behavior and requirements of their users, which in turn enables real-time decision making across a wide range of applications. We then describe the dimensions commonly known as the 4Vs of big data and continue with a study of the use of big data analytics in mobile networks. New technologies for managing big data in a highly scalable, cost-effective, and damage-resistant manner are required. Beyond 2020, the system capacity and data rates in mobile networks must support thousands of times more traffic than 2010 levels. Furthermore, we discuss end-to-end latency, the massive number of connections, cost, Quality of Experience, open issues, and, finally, big data management. We continue with a study of big data analytics in 5G. The standardization of 5G networks and 5G mobile optimization are crucial areas. New research areas are exploring new analytics techniques for big data in order to discover new patterns and extract knowledge from the collected data. Big data analytics can provide organizations with the ability to profile and segment customers based on distinct socioeconomic characteristics and increase customer satisfaction and retention levels. Also, big data analytics techniques can provide telecom providers with in-depth knowledge of networks before making informed decisions. 
These analytics techniques can also help telecommunication providers monitor and analyze various types of data as well as event messages on networks. Important information, such as business intelligence, can be extracted from momentary and stored data. The mass adoption of smartphones, mobile broadband modems, tablets, and mobile data applications has been overwhelmingly wireless. Operators bend under the pressure and cost of continuously adding capacity and improving coverage while maximizing the use of the existing components of their range. Advanced radio access technologies and all-Internet-Protocol, open internet network architectures must evolve smoothly from 4G systems. These needs lead us to study heterogeneous networks (HetNets) for 5G networks. We continue with the challenges, namely the curse of modularity, dimensions procedure, feature engineering, non-linearity, Bonferroni's principle, category report, variance and bias, data locality, data heterogeneity, noisy data, data availability, real-time processing and streaming, data provenance, and data security.

  • Research Article
  • Cite count: 1
  • 10.34172/doh.2023.34
Big Data and Pharmaceutical Industry: Applications and Priorities
  • Dec 2, 2023
  • Depiction of Health
  • Mohammad Hossein Ronaghi + 1 more

Background. Hospitals, patients, researchers, and healthcare organizations are producing enormous amounts of data in both the healthcare and drug detection sectors. With continued development of cheap data storage and availability of smart devices in the world, the influence of big data (BD) will continue to grow. This influence has also carried over to healthcare. The volumes of data available in the fields of pharmacology, toxicology, and pharmaceutics are constantly increasing. Therefore, the present study aimed to identify the applications of big data technology in the field of pharmaceutics. Methods. Using the mixed methods approach, this study was conducted in two phases in winter 2023. In the first phase, the applications of big data technology were identified by library search and assessed by content analysis. In the second phase, applications were ranked by a panel of experts, including 17 experts who worked in the Iranian pharmaceutical industry. The stepwise weight assessment ratio analysis (SWARA) method was used for ranking the application of big data in pharmaceutics. Results. The present study examined the importance of big data applications in pharmaceutics. The results showed that drug discovery (0.263) and clinical study analysis (0.224) had the highest importance, followed by drug efficacy and performance (0.197), drug safety (0.170) and drug personalization (0.146). Conclusion. The present study detailed a substantial attempt to review the existing literature regarding the implementation of big data and to rank big data applications in the pharmaceutical industry. Big data can help researchers to better discover and develop drugs and understand the effects of drugs and other chemicals on the human body. As a result, it can help improve the safety and efficacy of drugs and other chemicals. 
Also, big data can help to improve the accuracy of predictions regarding the effects of drugs and chemicals, which can improve safety in drug development and help to avoid potential adverse drug interactions. The applications considered in this study are ultimately necessary for humanity, and big data may significantly impact the betterment of these domains. Big data has a revolutionary potential, providing new ways to understand and predict the effects of drugs. BD will possibly play an important role in pharmaceutics in the future, critically helping drug discovery and improving drug safety and efficacy.
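
The abstract above names SWARA as the ranking method but does not show the computation. The following is a minimal sketch of the commonly described SWARA steps, assuming the criteria are already sorted from most to least important. The comparative-importance values in `hypothetical_s` are invented for illustration and are not the expert judgments from the study; only the `reported` weights come from the abstract.

```python
# Sketch of the commonly described SWARA weighting steps.
# The s_j inputs below are made up for illustration; they are NOT the
# expert judgments collected in the study.

def swara_weights(s):
    """Return normalized weights for criteria sorted by descending importance.

    s[i] is the comparative importance between criterion i and criterion
    i + 1 (0-indexed), i.e. how much more important the higher-ranked one is.
    """
    k = [1.0] + [1.0 + sj for sj in s]   # k_1 = 1, k_j = s_j + 1
    q = [1.0]                            # q_1 = 1
    for kj in k[1:]:
        q.append(q[-1] / kj)             # q_j = q_(j-1) / k_j
    total = sum(q)
    return [qj / total for qj in q]      # w_j = q_j / sum of all q


criteria = [
    "drug discovery",
    "clinical study analysis",
    "drug efficacy and performance",
    "drug safety",
    "drug personalization",
]

hypothetical_s = [0.15, 0.10, 0.20, 0.15]  # assumed values, illustration only
weights = swara_weights(hypothetical_s)

# Weights as reported in the abstract; a valid weight vector sums to 1.
reported = [0.263, 0.224, 0.197, 0.170, 0.146]
reported_total = sum(reported)

if __name__ == "__main__":
    for name, w in zip(criteria, weights):
        print(f"{name}: {w:.3f}")
    print(f"sum of reported weights: {reported_total:.3f}")
```

Because each q value divides the previous one by a factor greater than 1, the resulting weights decrease strictly with rank, which matches the ordering of the reported results.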

  • Research Article
  • Cite count: 56
  • 10.1111/gcb.15317
Research challenges and opportunities for using big data in global change biology.
  • Sep 13, 2020
  • Global Change Biology
  • Jianyang Xia + 2 more

Global change biology has been entering a big data era due to the vast increase in availability of both environmental and biological data. Big data refers to large data volume, complex data sets, and multiple data sources. The recent use of such big data is improving our understanding of interactions between biological systems and global environmental changes. In this review, we first explore how big data has been analyzed to identify the general patterns of biological responses to global changes at scales from gene to ecosystem. After that, we investigate how observational networks and space-based big data have facilitated the discovery of emergent mechanisms and phenomena on the regional and global scales. Then, we evaluate the predictions of the terrestrial biosphere under global changes by big modeling data. Finally, we introduce some methods to extract knowledge from big data, such as meta-analysis, machine learning, traceability analysis, and data assimilation. Big data has opened new research opportunities, especially for developing new data-driven theories for improving biological predictions in Earth system models, tracing global change impacts across different organismic levels, and constructing cyberinfrastructure tools to accelerate the pace of model-data integrations. These efforts will uncork the bottleneck of using big data to understand biological responses and adaptations to future global changes.

  • Research Article
  • Cite count: 25
  • 10.1002/1944-2866.poi326
Addressing the policy challenges and opportunities of “Big data”
  • Jun 1, 2013
  • Policy & Internet
  • Helen Margetts + 1 more

  • Conference Article
  • Cite count: 8
  • 10.1109/bdcloud.2018.00172
On the Usability of Big (Social) Data
  • Dec 1, 2018
  • Sunil Choenni + 3 more

Due to the growing availability of huge amounts of data of different types and the growing capabilities to analyze these data, the expectations of big data applications are high. In this paper, we argue that the usability of big data in the social domain is far from trivial. If the outcomes of big data are wrongly interpreted, this may shape the development of our society in a wrong direction. Therefore, care should be taken to interpret big data outcomes and their real-life applications properly. To support such an interpretation, we distinguish three major building blocks in big data: the data as input for analyses, the algorithms to analyze the data, and the models as output of the analyses. We show that each of the building blocks entails different complications for a proper interpretation of big data outcomes in practice. Therefore, well thought-through strategies are required for using big data outcomes in a responsible way. We discuss a framework for such strategies.

  • Front Matter
  • Cite count: 4
  • 10.1007/s12021-014-9239-0
Data persistence insurance.
  • Jul 1, 2014
  • Neuroinformatics
  • David N Kennedy

  • Research Article
  • Cite count: 11
  • 10.1007/s11524-021-00562-x
Food, Big Data, and Decision-making: a Scoping Review-the 3-D Commission.
  • Aug 1, 2021
  • Journal of urban health : bulletin of the New York Academy of Medicine
  • Olivia Biermann + 4 more

Food is an important determinant of health, featuring prominently in the Sustainable Development Goals. The term "big data" is seldom used in relation to food, partly because food data are scattered across different sectors. The increasing availability of food-related data presents an opportunity to glean new insights on food and food systems. These insights may enhance the quality of products and services and improve decision-making on optimizing food availability, all to the end of producing better health. Yet, knowledge gaps remain about the unique opportunities and challenges linked to big data on food and their use in decision-making. This scoping review explored the available literature linking food with big data and decision-making, using the following research question: What is the current literature on data about food, and how are these data used in decision-making? We searched PubMed until 29 February 2020 and Embase, Web of Science, and the Cochrane Database of Systematic Reviews until 8 March 2020. We included studies written in English and conducted narrative analyses to identify relevant themes from included studies. Sixteen studies fulfilled our eligibility criteria, including big data analyses, modelling studies, and reviews. These studies described the added value of using big data and how evidence from big data has been or can be used for decision-making, as well as challenges and opportunities for such use. The majority of the included studies examined the link between food and big data, while hypothesizing how these insights could inform decision-making, including policies, interventions, programs, and financing. There were only two examples wherein big data on food informed decision-making directly. 
The review highlights several false dichotomies in how the subject is approached in the literature and the importance of context, both between and within countries, in shaping the availability and types of data that can be used as meaningful evidence to inform decision-making. This review shows the paucity of research around the intersection of food, big data, and decision-making, as well as the potential in using big data on food systems to the end of informing decisions to improve the health of populations. Future research and decision-making around health systems can benefit from examining the full spectrum of perspectives on the subject. Future research and decision-making around health systems can also employ the steadfast embrace of technology, which will potentially reduce disparities in big data availability, to the end of improving the health of populations.

  • Research Article
  • 10.3760/cma.j.issn.1000-6672.2018.07.010
Status analysis and practical experiments of regional medical big data sharing and open access
  • Jul 2, 2018
  • Chinese Journal of Hospital Administration
  • Shishi Ma + 1 more

Following an overview of the current state of big medical data sharing abroad, the paper identified the problems of the regional health information platform in data sharing and utilization, namely poor data integration, low data availability, poor data security and privacy, an unclear data sharing model, and poor data management accountability. On this basis, the authors studied data quality management, information security and privacy protection, and the data sharing model. These efforts provide useful references for big health data integration, sharing, and open access. Key words: Information services; Regional health informatization; Big data resource; Sharing model; Data quality

  • Book Chapter
  • Cite count: 9
  • 10.1007/978-3-319-61043-6_12
Why Big Data Needs the Virtues
  • Jan 1, 2017
  • Frances S Grodzinsky

In this paper I offer a critical reflection on Big Data through the lens of “the virtues” in an attempt to separate much of the “hype” from reality. Part 1 defines what is meant by Big Data and describes why it is valuable. I examine its ethical issues in the context of the characteristics of Big Data as exemplified by the 4V’s: volume, velocity, variety (traditional 3V’s) and a 4th, veracity. Part 2 considers whether Big Data Science is really a “theory free” science based only on statistical correlations. In Part 3, I explore the role of the “Big Data Scientist” and her responsibilities as virtuous epistemic agent. Part 4 applies both virtue ethics and virtue epistemology to Big Data, focusing on how it can be used in an ethically responsible way to benefit society. Finally, I will explain why thinking in terms of the virtues is helpful in the analysis of Big Data because when a Data Scientist habitually acts in accordance with the virtues, she will be better able to cope with the “messiness” and dynamic flux of Big Data with open-mindedness and intellectual courage.

  • Research Article
  • 10.22032/dbt.37905
Semantic big biodiversity data integration tool
  • Jan 1, 2018
  • Thüringer Universitäts- und Landesbibliothek
  • Taysir Hassan A Soliman + 4 more

Our planet is facing huge effects of global climate change that are threatening the survival of biodiversity data. Biodiversity data have very complex characteristics, such as high volume, variety, veracity, velocity, and value, like big data. The variety or heterogeneity of biodiversity data poses a highly challenging research problem, since the data are unstructured, semi-structured, or quasi-structured and are generated as XML, EML, Excel sheets, videos, images, or ontologies. In addition, the available biodiversity data include trait measurements, species distribution, species’ morphology, genetic sequences, phylogenetic trees, spatial data, and ecological niches; data are collected and uploaded to Bio Portals via citizen scientists, museums’ collections, ecological surveys, and environmental studies. These data collections generate big data, which is an important topic of current research. The first phase of the Big data analytics life cycle determines whether the data are sufficient to perform the analytics process, and it takes more time than any other phase. In addition, the Big biodiversity data management life cycle includes data integration as a main phase, affecting storage, indexing, and querying. In the data integration phase, we apply semantic data integration in order to combine data from different sources and consolidate them into valuable information that depends on semantic technologies. A number of research efforts have addressed semantic big data integration. For example, Ontology-Based Data Access (OBDA) has been proposed for relational schemas and NoSQL [1,2] databases since it provides a semantically conceptual schema over the data repository. Another example is the Semantic Extract Transform Load (ETL) framework [3], which integrates and publishes data from multiple sources as open linked data through semantic technologies. Moreover, Semantic MongoDB-based has been developed where researchers represented as an OWL ontology. 
However, semantic big data integration tools are increasingly needed because of the growth of biodiversity big data. In the current work, a semantic big data integration system is developed, which handles the following features: 1) Data heterogeneity, 2) NoSQL databases, 3) Ontology-based integration, and 4) User interaction, where data integration components can be chosen. A proof-of-concept will be developed based on biodiversity data in various data formats. In addition, related ontologies from BioPortal will be used.

  • Book Chapter
  • Cite count: 2
  • 10.1007/978-981-16-4177-0_13
Design and Development of an Algorithm to Secure Big Data on Cloud Computing
  • Dec 6, 2021
  • Ankur N Shah + 1 more

In any organization, data comes from various online and offline sources, and it is very difficult to handle such a huge amount of data. Big data is one of the technologies that can tackle huge amounts of data easily, and cloud computing is a platform for big data. Although big data and cloud computing have a vast number of advantages, they have failed to win people over because people are anxious about the security of their data. Moving big data to cloud computing brings many advantages, such as on-demand service availability, availability of data and information on the Internet, resource pooling, ease of management, ease of analyzing large volumes of big data, and, most importantly, cost effectiveness. Despite these advantages, we cannot avoid the challenges facing big data and cloud computing. The challenges for big data include cost, data quality, rapid change, the need for skilled manpower, infrastructure needs, moving data onto a big data platform, the need for synchronization across data sources, and, most importantly, data security. These challenges will not be solved easily, and various research efforts are under way to address them. In this paper, we review research done in the area of security of big data on cloud computing, and finally, we provide a research plan for our project on the security of big data on cloud computing. Keywords: Big data; Cloud computing; Security; AES; RSA; MongoDB
