The term smart manufacturing refers to a future state of manufacturing in which the real-time transmission and processing of information across the factory is used to produce advanced manufacturing intelligence that can optimize every aspect of factory operation. In recent years, initiatives and groups such as the Smart Manufacturing Leadership Coalition (SMLC), Industry 4.0, and the Industrial Internet Consortium (IIC) have led the way by bringing together industry, academia and government to establish policies, roadmaps and platforms to support smart manufacturing. Although many characteristics can be associated with smart manufacturing across these initiatives, a common theme is the emphasis on transitioning operations from reactive and responsive to predictive and preventative. The research presented in this paper focuses on a data pipeline that supports the development of data-driven Prognostics and Health Management (PHM) applications. In the context of smart manufacturing, PHM enables facilities to transition from preventative and reactive maintenance strategies to predictive, preventative and condition-based strategies. The benefits that can be derived from PHM are aligned with those of smart manufacturing, which include the opportunity to decrease costs, increase machine availability, reduce energy consumption, and improve production yield.
However, the process of ingesting, cleaning and transforming real-time data streams for data-driven PHM is a difficult, complex and time-consuming task, with estimates from business intelligence projects putting it at 80% to 90% of total project effort. This effort may be exacerbated further in manufacturing environments by additional technology challenges, such as low levels of standardization, disparate protocols and interfaces, and ad hoc data management. While emerging technologies such as Cyber Physical Systems (CPS) and the Internet of Things (IoT) can overcome many of these challenges and provide an open platform for transmitting data, existing large-scale manufacturing facilities that are subject to compliance, regulation, and stringent quality assurance policies may not be able to adopt these technologies in the short term due to the associated cost, risk and effort. Therefore, PHM applications that need to access data streams in large-scale manufacturing facilities must do so using transparent data integration that does not discriminate between emerging and legacy technologies in the factory. To this end, this research presents a real-time, scalable, robust, and fault-tolerant data pipeline for ingesting, cleaning, transforming, processing and contextualizing time-series data from a wide range of sources in the factory.
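The stages named above (ingesting, cleaning, transforming and contextualizing time-series data) can be pictured with a minimal Python sketch. It is an illustration under stated assumptions and not the pipeline presented in this research: the record fields (`source`, `tag`, `epoch_s`, `value`), the `Reading` type, the stage functions and the asset lookup table are all hypothetical names chosen for the example. The processing stage is omitted, and nothing here addresses scalability or fault tolerance.

```python
import math
from dataclasses import dataclass, field, replace
from datetime import datetime, timezone
from typing import Dict, Iterable, Iterator


@dataclass(frozen=True)
class Reading:
    """One cleaned time-series sample (hypothetical schema)."""
    source: str
    tag: str
    timestamp: datetime
    value: float
    context: dict = field(default_factory=dict)


def ingest(raw_records: Iterable[dict]) -> Iterator[dict]:
    # Assumption: records from legacy and emerging sources have already
    # been mapped to one dict layout; a real stage would read from them.
    yield from raw_records


def clean(records: Iterable[dict]) -> Iterator[dict]:
    # Drop records with a missing timestamp or a missing, non-numeric
    # or NaN value.
    for record in records:
        if "epoch_s" not in record:
            continue
        try:
            value = float(record["value"])
        except (KeyError, TypeError, ValueError):
            continue
        if math.isnan(value):
            continue
        yield {**record, "value": value}


def transform(records: Iterable[dict]) -> Iterator[Reading]:
    # Convert epoch seconds to timezone-aware UTC timestamps.
    for record in records:
        timestamp = datetime.fromtimestamp(record["epoch_s"], tz=timezone.utc)
        yield Reading(record["source"], record["tag"], timestamp, record["value"])


def contextualize(readings: Iterable[Reading],
                  asset_lookup: Dict[str, dict]) -> Iterator[Reading]:
    # Attach asset metadata keyed by tag; unknown tags get an empty context.
    for reading in readings:
        yield replace(reading, context=asset_lookup.get(reading.tag, {}))


def run_pipeline(raw_records: Iterable[dict],
                 asset_lookup: Dict[str, dict]) -> list:
    # Chain the stages lazily and materialize the result.
    return list(contextualize(transform(clean(ingest(raw_records))), asset_lookup))


if __name__ == "__main__":
    raw = [
        {"source": "legacy_plc", "tag": "AHU1.temp", "epoch_s": 0, "value": "21.5"},
        {"source": "iot_gateway", "tag": "AHU1.temp", "epoch_s": 60, "value": None},
        {"source": "iot_gateway", "tag": "PUMP2.vib", "epoch_s": 120, "value": 0.8},
    ]
    assets = {"AHU1.temp": {"asset": "AHU1", "unit": "degC"}}
    for reading in run_pipeline(raw, assets):
        print(reading)
```

Each stage is a generator, so records flow through one at a time and a legacy source and an IoT source are treated identically once they share the assumed record layout.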
 
 