Abstract
Updating curricula in new computer science domains is a critical challenge faced by many instructors and programs. In this paper, we present an approach for identifying emerging topics and issues in data science by using Question and Answer (Q&A) sites as a resource. Q&A sites provide a useful online platform for discussing topics, and through the sharing of information they become a valuable corpus of knowledge. We applied latent Dirichlet allocation (LDA), a statistical topic modeling technique, to analyze data science-related threads from two popular Q&A communities, Stack Exchange and Reddit. We uncovered both important topics and useful examples that can be incorporated into teaching. In addition to technical topics, our analysis also identified topics related to professional development. We believe that approaches such as this are critical for updating curricula and bridging the workplace-school divide in the teaching of newer topics such as data science. Given the pace of technical development and frequent changes in the field, this is an inventive and effective method for keeping teaching up to date. We also discuss a limitation of this approach: important topics such as data ethics are largely missing from online discussions.