Data science education is an interdisciplinary and multidisciplinary field whose curricula continually evolve to meet societal needs. This paper reports a bibliometric analysis of the pedagogical aspects and teaching/learning strategies employed in data science curriculum design, examining key authors, publication sources, affiliations, content, and cited documents. The analysis draws on metadata from 1245 documents indexed in the Scopus scientific database and published over a 20-year period (2005–2024). Additionally, a scoping review of 20 articles was conducted to identify key skills, topics, and courses in data science education. The findings reveal growing interest in the field and an increasingly multidisciplinary and interdisciplinary approach. Advances in artificial intelligence and related topics, such as linked data, the semantic web, ontologies, and machine learning, are shaping the development of data science curricula. The main challenges in data science education include creating up-to-date and competitive curricula, integrating data science training at early educational stages (K-12, secondary schools, pre-collegiate), leveraging data-driven technologies, and defining the profile of a data scientist. Furthermore, the availability of vast amounts of open, linked, and restricted data, along with advancements in data-driven technologies, is significantly influencing research on data science education.