Abstract

The term “data science” refers to an emerging field of research and practice that focuses on obtaining, processing, visualizing, analyzing, preserving, and re-using large collections of information. A related term, “big data,” has been used to refer to one of the important challenges faced by data scientists in many applied environments: the need to analyze large data sources, in certain cases using high-speed, real-time data analysis techniques. Data science encompasses much more than big data, however, as a result of many advancements in cognate fields such as computer science and statistics. Data science has also benefited from the widespread availability of inexpensive computing hardware—a development that has enabled “cloud-based” services for the storage and analysis of large data sets. The techniques and tools of data science have broad applicability in the sciences. Within the field of psychology, data science offers new opportunities for data collection and data analysis that have begun to streamline and augment efforts to investigate the brain and behavior. The tools of data science also enable new areas of research, such as computational neuroscience. As an example of the impact of data science, psychologists frequently use predictive analysis as an investigative tool to probe the relationships between a set of independent variables and one or more dependent variables. While predictive analysis has traditionally been accomplished with techniques such as multiple regression, recent developments in the area of machine learning have put new predictive tools in the hands of psychologists. These machine learning tools relax distributional assumptions and facilitate exploration of non-linear relationships among variables. These tools also enable the analysis of large data sets by opening options for parallel processing. In this article, a range of relevant areas from data science is reviewed for applicability to key research problems in psychology including large-scale data collection, exploratory data analysis, confirmatory data analysis, and visualization. This bibliography covers data mining, machine learning, deep learning, natural language processing, Bayesian data analysis, visualization, crowdsourcing, web scraping, open source software, application programming interfaces, and research resources such as journals and textbooks.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call