Abstract
Software engineering is a data-driven discipline and an integral part of data science. The introduction of big data systems has transformed the architecture, methodologies, knowledge domains, and skills related to software engineering. Accordingly, education programs must adapt to these developments by first identifying the competencies required for big data software engineering, so that they meet industrial needs and follow the latest trends. This paper aims to reveal the knowledge domains and skill sets required for big data software engineering and to develop a taxonomy by mapping these competencies. A semi-automatic methodology is proposed for the semantic analysis of the textual contents of online job advertisements related to big data software engineering. This methodology uses latent Dirichlet allocation (LDA), a probabilistic topic-modeling technique, to discover the hidden semantic structures in a given textual corpus. The output of this paper is a systematic competency map comprising the essential knowledge domains, skills, and tools for big data software engineering. The findings are expected to help evaluate and improve IT professionals' vocational knowledge and skills, identify professional roles and competencies in companies' personnel recruitment processes, and meet the skill requirements of the industry through software engineering education programs. Additionally, the proposed model can be extended to blogs, social networks, forums, and other online communities to allow automatic identification of emerging trends and to generate contextual tags.
Highlights
Today’s digital world is called the era of big data
This study aimed to identify the knowledge domains and skill sets required for big data software engineering (BDSE)
The methodology of this study was based on a content analysis of the textual content of BDSE job ads using probabilistic topic models
Summary
Big data systems are causing a transformation in the architecture and methodologies of software engineering [1]. The increasing demands and challenges related to BDSAs, which have emerged as a natural consequence of the widespread use of big data, have been a fundamental issue frequently highlighted in the literature, especially in the last few decades. Recent studies indicate that products and services using BDSAs require more advanced and specific professional knowledge and skills in terms of software engineering principles, procedures, and processes [8], [13], [18], [20]. Because software engineering is a data-driven discipline, software development processes concerning big data systems require more advanced knowledge and skills, such as scalable architecture, real-time data processing, real-time coding, integration, and testing [3], [7], [8], [10], [11], [22], [25], [26].