Abstract

Sunflower oil is widely used as an edible oil. It is commonly obtained from sunflower seeds by solvent extraction, which yields crude sunflower oil. Crude sunflower oil contains undesirable impurities and dark colors that must be removed. The bleaching process, in which bleaching earth is used during refining, removes the color. The bleaching output color value is affected by the specifications of the crude sunflower oil, such as impurity, free fatty acid ratio, wax and color index, and by the process conditions: the temperature, the vacuum and the amount of bleaching earth used. In this study, machine learning algorithms are used to predict the bleaching output color. For the predictions, the Waikato Environment for Knowledge Analysis (WEKA), an open-source data mining workbench, is used. Fifteen well-known machine learning classifier algorithms suitable for our data, such as k-nearest neighbors, multilayer perceptron and random forest, are run. Each algorithm is tested on a real dataset using 10-fold cross-validation. The correlation coefficient, mean absolute error and root mean squared error are calculated for each algorithm and benchmarked. The results show that the Random Forest Classifier is the most effective classifier for our data. Additionally, a Wilcoxon signed-rank statistical test is conducted to check whether the Random Forest Classifier remains the most effective classifier for some k-fold cross-validation settings.
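The Wilcoxon signed-rank comparison mentioned in the abstract can be illustrated with a minimal sketch. It assumes the test is applied to paired per-fold errors of two models, which the abstract does not spell out. The function names and the per-fold RMSE values are hypothetical placeholders, not results from the study, and the exact p-value is obtained by enumerating sign patterns rather than by whatever procedure the authors used.

```python
from itertools import product


def average_ranks(values):
    """Rank values ascending (1-based); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions in the run
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def wilcoxon_signed_rank(a, b):
    """Return (W, exact two-sided p-value) for paired samples a and b."""
    diffs = [x - y for x, y in zip(a, b) if x != y]  # zero differences are dropped
    if not diffs:
        raise ValueError("all paired differences are zero")
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w_obs = min(w_plus, w_minus)
    total = sum(ranks)
    # Under the null hypothesis every rank is equally likely to carry either
    # sign, so all 2^n sign patterns are enumerated (cheap for about 10 folds).
    hits = 0
    for signs in product((0, 1), repeat=len(ranks)):
        plus = sum(r for r, s in zip(ranks, signs) if s)
        if min(plus, total - plus) <= w_obs:
            hits += 1
    return w_obs, hits / 2 ** len(ranks)


# Hypothetical per-fold RMSE values for two models (illustration only).
rf_rmse = [0.41, 0.39, 0.44, 0.40, 0.42, 0.38, 0.43, 0.40, 0.41, 0.39]
knn_rmse = [0.52, 0.47, 0.55, 0.49, 0.50, 0.46, 0.51, 0.53, 0.48, 0.50]
w, p = wilcoxon_signed_rank(rf_rmse, knn_rmse)
print(f"W = {w}, two-sided p = {p:.4f}")
```

When every fold favors the same model, as in these placeholder numbers, W is 0 and the two-sided p-value reaches its minimum for 10 pairs, 2/1024 (about 0.002).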

Highlights

  • Today, the problem is not a lack of data but extracting meaningful information from large volumes of data

  • Benchmarking of machine learning algorithms is used in many real-life applications, for example to recognize handwritten digits (Bottou et al., 1994), to classify clinical samples (Sampson et al., 2011), to predict heart diseases (Austin et al., 2013; Abdar et al., 2015; Pouriyeh et al., 2017; Tougui et al., 2020), to detect

  • Machine learning algorithms that are suitable for our data are run and the results are discussed: K-Nearest Neighbors classifier (KNN), Simple Linear Regression (SLR), Gaussian Processes (GP), KStar classifier (KS), Decision Table Classifier (DTC), Decision Stump Classifier (DSC), ZeroR Classifier (ZR), Random Tree Classifier (RTC), M5Rules classifier (M5R), REPTree classifier (REPT), Locally Weighted Learning classifier (LWL), M5 model trees classifier (M5P), Random Forest Classifier (RFC) and Multilayer Perceptron (MP)

Summary

Introduction

Today, the problem is not a lack of data but extracting meaningful information from large volumes of data. Data mining involves the use of complex data analysis tools to detect previously unknown, valid patterns and relationships in large data sets (Karasozen et al., 2006). These tools can include mathematical algorithms, statistical models and machine learning methods. In this study, 15 well-known machine learning classifier algorithms are used to predict the bleaching output color of sunflower oil from the aforementioned specifications. The 15 algorithms are compared by calculating the correlation coefficient, mean absolute error and root mean squared error for each. With the best algorithm obtained, the output color can be predicted for changes that may occur in the input parameters.
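The comparison described above can be sketched in plain Python. This is a minimal illustration rather than the WEKA workflow used in the study: the data are synthetic, the two input features and their relation to the output color are invented for the example, only two of the listed algorithm families appear (a ZeroR-style mean baseline and a 1-nearest-neighbor regressor), and pooling the out-of-fold predictions before scoring is an assumption.

```python
import math
import random


def correlation(actual, predicted):
    """Pearson correlation coefficient; 0.0 if either series is constant."""
    n = len(actual)
    ma, mp = sum(actual) / n, sum(predicted) / n
    cov = sum((a - ma) * (p - mp) for a, p in zip(actual, predicted))
    va = sum((a - ma) ** 2 for a in actual)
    vp = sum((p - mp) ** 2 for p in predicted)
    return cov / math.sqrt(va * vp) if va > 0 and vp > 0 else 0.0


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def zero_r(train_x, train_y):
    """ZeroR-style baseline: always predict the mean of the training targets."""
    mean = sum(train_y) / len(train_y)
    return lambda x: mean


def one_nn(train_x, train_y):
    """1-nearest-neighbor regressor using squared Euclidean distance."""
    def predict(x):
        best = min(range(len(train_x)),
                   key=lambda i: sum((u - v) ** 2 for u, v in zip(train_x[i], x)))
        return train_y[best]
    return predict


def cross_validate(fit, xs, ys, k=10, seed=0):
    """Pool out-of-fold predictions over k folds, then score them once."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    actual, predicted = [], []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in fold:
            actual.append(ys[i])
            predicted.append(model(xs[i]))
    return {"r": correlation(actual, predicted),
            "mae": mae(actual, predicted),
            "rmse": rmse(actual, predicted)}


# Synthetic stand-in data: (earth dose, input color index) -> output color.
# The relation below is invented purely so the example has something to learn.
rng = random.Random(1)
xs = [(rng.uniform(0.5, 2.0), rng.uniform(5.0, 15.0)) for _ in range(60)]
ys = [0.6 * color - 1.5 * dose + rng.gauss(0, 0.2) for dose, color in xs]

for name, fit in (("ZeroR", zero_r), ("1-NN", one_nn)):
    scores = cross_validate(fit, xs, ys)
    print(name, {key: round(value, 3) for key, value in scores.items()})
```

A higher correlation coefficient together with lower MAE and RMSE indicates the better model, which is the basis on which the algorithms are ranked.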

Evaluating classifiers to detect the best algorithm
Evaluation
Conclusion
