Abstract

Introduction. This review draws mainly on business intelligence (BI) solutions designed for working with corporate data, but the main aspects of data handling it discusses also apply to data processing platforms (data science platforms). Many BI vendors have extended their systems to support more advanced analytics, including data science. They have added the phrase “Data Science” to their marketing, and the term “advanced analytics” has lost some popularity in the context of corporate data. A data science platform provides a comprehensive set of tools for advanced users who traditionally work with data. The ability to connect to multi-structured data across different types of storage platforms, both on-premises and in the cloud, together with the infrastructure architecture of a modern BI analytics platform, enables high-performance workloads, including business intelligence. Such a platform relies on a distributed architecture, massively parallel processing, data virtualization, in-memory computing, etc. Combining traditional relational data processing with computation on the well-known Apache Hadoop software infrastructure, which integrates a number of components of the Hadoop ecosystem (Apache Hive, HBase, Spark, Solr, etc.) with the necessary target functions, makes it possible to build a fully functional platform for storing and processing structured and unstructured data.

Purpose. To review data processing problems and to analyze the use of world-class mathematical methods and tools for extracting knowledge from information.

Methods. The paper describes the use of Data Mining methods in big data processing tasks, as well as methods of business, recommendation, and predictive analytics.

Result. The study suggests that machine learning-enhanced master data management (MDM), data quality, data preparation, and data catalogs will converge into a single, modern Enterprise Information Management (EIM) platform applicable to most new analytics projects. The analysis of the process of identifying useful data may be of use to researchers and developers of modern platforms for processing and exploring data in various spheres of society.

Conclusion. A review of data processing problems and an analysis of the use of world-class mathematical methods and tools for extracting knowledge from information were carried out. It is shown that the problems of working with first-level data identified in this review can be solved with high quality through data research on modern analytical platforms. Gaining deeper insight into these data, at the level of knowledge extraction with machine learning and artificial intelligence algorithms, will make it possible to predict future results in managed objects (processes) and to make informed decisions.

