Abstract
The number of sensors in the process industry is continuously increasing as they become faster, better and cheaper. Because the amount of available data is rising, the processing of generated data has to be automated in a computationally efficient manner. Such a solution should also be easy to implement and reproduce, independently of the details of the application domain. This paper provides a versatile infrastructure that deals with Big Data in the process industry on various platforms, using efficient, fast and modern technologies for data gathering, processing, storing and visualization. Contrary to prior work, we provide an easy-to-use, easily reproducible, adaptable and configurable Big Data management solution with a detailed implementation description that does not require expert or domain-specific knowledge. In addition to the infrastructure implementation, we focus on monitoring both infrastructure inputs and outputs, including incoming process data as well as model predictions and model performance, thus allowing for early intervention if problems occur.
Highlights
It has recently been recognized that machine learning and data analytics play a critical role in realizing long-term sustainability goals in the process industry
Our goal in this paper is to present a versatile, scalable and deployable Big Data infrastructure that is capable of running data processing pipelines as a basis for advanced machine learning and process mining algorithms
We identified key necessities for data analytics in the process industry and developed an infrastructure that covers the full stack of data analytics, i.e., data gathering, preprocessing, exploring, visualizing, persisting, model building, and model deploying on both real-time and historical data
Summary
It has recently been recognized that machine learning and data analytics play a critical role in realizing long-term sustainability goals in the process industry. Such goals require the active participation and involvement of the whole industrial sector, whose energy consumption reached 24.6% of the total energy consumption in Europe in 2017, according to the European Environment Agency (2017). These goals cannot be achieved without large-scale digitization of the process industry and a subsequent data management analysis for efficient decision making, which pose significant challenges for industrial manufacturers, as observed by Zeng and Yin (2017). To enable Big Data processing in terms of Industry 4.0, further development of the necessary infrastructures for loading, processing, storing, and visualizing the data is required. Such a processing cycle should be performed automatically, in near real time, and in a versatile, scalable and deployable manner.