Abstract

The design and implementation of Data-Driven Fuzzy Models (DDFMs) for learning from balanced industrial/manufacturing data has proven to be a popular machine learning methodology. However, DDFMs have also been shown to perform poorly when learning from heavily imbalanced data. In this study we propose a DDFM to tackle the challenge of a two-class imbalanced case study for rail quality. We integrate a number of machine learning methods, namely Granular Computing (GrC), RBF Neural Networks (RBF-NN) and Feature Selection (FS), to create a DDFM framework which is sensitive to imbalanced data. The rationale behind the DDFM framework can be divided into three main stages. In the first stage, a Fast Correlation-Based Filter (FCBF) is employed to select the most representative features. Subsequently, the concept of iterative granulation is applied to group (cluster) the rail data set. Granulation provides a number of information granules which can be viewed as fuzzy constraints. In the second stage, an RBF-NN is used as a Neural Fuzzy Model (NFM) whose initial parameters are the parameters of the fuzzy sets created during the granulation process. Finally, a twofold bootstrapping strategy is performed. On the one hand, bootstrapping is used to balance the ratio between the majority and minority classes. On the other hand, bootstrapping is used to estimate the appropriate number of fuzzy linguistic rules in the NFM. The proposed modelling framework is tested against a manufacturing case study provided by TATA Steel, UK.
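As an illustration of the class-balancing role of bootstrapping mentioned above, the following is a minimal sketch only. The abstract does not specify the resampling scheme, so this assumes the simplest variant: the minority class is resampled with replacement until it matches the majority class size. The function name `bootstrap_balance` and the toy data are hypothetical and are not taken from the study.

```python
import random
from collections import Counter


def bootstrap_balance(samples, labels, seed=0):
    """Balance classes by drawing extra rows with replacement.

    Assumption: every class is topped up to the size of the largest class.
    This is one common scheme, not necessarily the one used in the study.
    """
    rng = random.Random(seed)

    # Group the rows by class label.
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)

    # The target size is the size of the majority class.
    target = max(len(rows) for rows in by_class.values())

    balanced_x, balanced_y = [], []
    for y, rows in by_class.items():
        # Keep the original rows, then bootstrap the shortfall.
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for x in rows + extra:
            balanced_x.append(x)
            balanced_y.append(y)
    return balanced_x, balanced_y


# Hypothetical toy data: 8 majority rows (label 0), 2 minority rows (label 1).
samples = [[float(i)] for i in range(10)]
labels = [0] * 8 + [1] * 2

bx, by = bootstrap_balance(samples, labels)
counts = Counter(by)
```

After the call, `counts` holds the same number of rows for each class, and every added minority row is a copy of an original minority row. In practice the balanced set would only be used for training, with evaluation kept on the original, imbalanced distribution.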
