Abstract

Big data processing often relies on parallelism, computing directly on top of distributed data storage. Existing big data workflows unify data processing practices and use the cloud’s native computational capacity to offer advanced machine learning and business intelligence (BI) capabilities. Spark, an open-source, massively parallel, in-memory data processing framework, is the current state of the art. Its primary approach is to break a job down into fine-grained tasks that can be executed in parallel. In the case study discussed, IoT-cloud solutions convert plant data into an analyzable form so that downstream machine learning modules can produce added value. Cloud-based components are introduced to maximize the efficiency of data processing and accumulation. Based on the resulting data insights, appropriate operational actions can be taken. The study also discusses cost and performance optimization methods. By achieving a higher degree of digitalization, control over production increased.

