Abstract

The hype around data sciences in general and big data in particular, together with the focus either on the potential commercial value of data analytics or on promoting its adoption as a new paradigm in conducting research, may crowd out important discussions that need to take place about the theoretical foundations of this ‘emerging’ discipline. In South Africa, discussions around (or the mere mention of) big data, especially within the National System of Innovation, often go hand in glove either with the Square Kilometre Array project and astrophysics, or with eResearch or cyberinfrastructure.

Highlights

  • The views expressed in this commentary are based on a desktop analysis of four different types of documents

  • The third type of document considered comprises reports produced by committees of the National Research Council of the National Academies; two worth highlighting are ‘The Mathematical Sciences in 2025’ and ‘Frontiers in Massive Data Analysis’

  • It is generally accepted that one has to be mindful of the entire ‘big data analysis pipeline’

Summary

Mathematical and statistical foundations of data sciences

The hype around data sciences in general and big data in particular, together with the focus either on the potential commercial value of data analytics or on promoting its adoption as a new paradigm in conducting research, may crowd out important discussions that need to take place about the theoretical foundations of this ‘emerging’ discipline. In South Africa, discussions around (or the mere mention of) big data, especially within the National System of Innovation, often go hand in glove either with the Square Kilometre Array project and astrophysics, or with eResearch or cyberinfrastructure. In his excellent essay ‘50 Years of data science’, David Donoho of Stanford University remarks: The now-contemplated field of data science amounts to a superset of the fields of statistics and machine learning which adds some technology for ‘scaling’ up to ‘big data’. The main objective in this commentary is to argue for the positioning of computational, mathematical and statistical sciences in South Africa at the centre of the heralded big data revolution. These disciplines are strategically important because they provide a solid intellectual and academic foundation upon which to build a vibrant and successful project in big data analysis, especially in South Africa. This positioning provides an opportunity to produce graduates with breadth and depth of knowledge in all three major disciplines, versatile enough either to pursue fundamental research in these disciplines or to work in areas (in the public or private sector) focusing on applications of (big) data sciences.

Background and terminology
High dimensionality
Heterogeneity and incompleteness
Security and privacy
Big data techniques
Mathematical and statistical challenges
Future directions
Conclusion
