Abstract

Missing values are unavoidable in lubricant formulation data in the chemical industry owing to the complexity of lubricant manufacturing. Imputing these values is therefore essential before statistical analysis and data mining can yield meaningful information such as correlations and patterns. Traditional methods, such as random forest (RF), k-nearest neighbors (k-NN), support vector machine (SVM), and deep neural networks (DNNs), have been employed for imputing missing values. However, these methods neglect the latent structure of the data because they consider only the feature information of each data point. To address this limitation, this study proposes a novel graph-based imputation method (GBIM) that considers both the feature information and the relations between data points to improve imputation performance. GBIM expresses the relations between data points as a graph constructed through dependency modeling and imputes missing values using a graph convolutional network (GCN). Experiments were performed on four physical properties in a lubricant formulation dataset. The results of GBIM were compared with those of the traditional imputation methods (RF, k-NN, SVM, and DNN) at missing rates from 5% to 50% in 5% increments. GBIM achieved 4–7% higher imputation accuracy than the other methods. The proposed GBIM can be applied in various industries as a powerful method for imputing missing values.
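
To make the idea of graph-based imputation concrete, the following is a minimal sketch and not the GBIM implementation, whose details the abstract does not give. It rests on several assumptions: the graph is built with a simple k-nearest-neighbor rule over shared observed features (a stand-in for the dependency modeling used in the study), and the GCN is reduced to a single linear graph-convolution layer, X_hat = A_norm · Z · W, fitted by gradient descent on the observed entries only. The function names `normalized_adjacency` and `gcn_impute` and all parameter values are hypothetical.

```python
import numpy as np

def normalized_adjacency(Z, mask, k=3):
    """Return D^-1/2 (A + I) D^-1/2 for a symmetric k-NN graph over rows.

    Assumption: k-NN on shared observed features replaces the paper's
    dependency modeling, which is not specified in the abstract.
    """
    n = Z.shape[0]
    dist = np.full((n, n), np.inf)
    for i in range(n):
        for j in range(n):
            shared = mask[i] & mask[j]  # features observed in both rows
            if i != j and shared.any():
                diff = Z[i, shared] - Z[j, shared]
                dist[i, j] = np.sqrt(np.mean(diff ** 2))
    A = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(dist[i])[:k]
        nbrs = nbrs[np.isfinite(dist[i, nbrs])]  # drop unreachable rows
        A[i, nbrs] = 1.0
    A = np.maximum(A, A.T)   # symmetrize the graph
    A_hat = A + np.eye(n)    # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_impute(X, k=3, lr=0.1, epochs=500):
    """Impute NaNs in X with one linear graph-convolution layer.

    Assumes every column has at least one observed value.
    """
    X = np.asarray(X, dtype=float)
    mask = ~np.isnan(X)
    # Standardize with observed statistics; missing entries start at 0.
    mean = np.nanmean(X, axis=0)
    std = np.nanstd(X, axis=0)
    std[std == 0] = 1.0
    Z = np.where(mask, (X - mean) / std, 0.0)

    A_norm = normalized_adjacency(Z, mask, k)
    H = A_norm @ Z               # neighborhood aggregation
    W = np.eye(X.shape[1])       # layer weights, initialized to identity
    n_obs = mask.sum()
    for _ in range(epochs):
        # Squared-error gradient, computed on observed entries only.
        residual = np.where(mask, H @ W - Z, 0.0)
        W -= lr * 2.0 * H.T @ residual / n_obs

    X_hat = (H @ W) * std + mean  # back to the original scale
    return np.where(mask, X, X_hat)  # keep observed values untouched
```

Calling `gcn_impute(X, k=2)` on a small array containing `np.nan` entries returns an array of the same shape in which only the missing entries have been filled. A practical model would add nonlinear layers and hold out observed entries during training to measure imputation accuracy.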
