An Enhanced Optimize Outlier Detection Using Different Machine Learning Classifier

Himanee Mishra, Chetan Gupta

doi:10.1007/978-981-99-0550-8_6

Publication Date: Jan 1, 2023

Abstract

Data mining (DM) is an efficient tool for extracting hidden information from databases rich in historical data. The mined information gives decision makers the knowledge they need to make suitable decisions. The knowledge required differs from one application to another and therefore calls for different mining techniques. Hence, a wide range of mining techniques, such as classification, clustering, association mining, regression analysis, and outlier analysis, are used in practice for knowledge discovery. These mining techniques rely on various machine learning (ML) algorithms. ML algorithms treat normal objects as highly probable and outliers as improbable. Global outliers, which occur very rarely, deviate completely from the normal objects and can be easily distinguished by unsupervised ML algorithms. Collective outliers, which occur rarely and in groups, also deviate from the normal objects and can be distinguished by ML algorithms. This paper analyzes outliers and class imbalance in diabetes prediction for different ML algorithms, namely logistic regression (LR), decision tree (DT), random forest (RF), K-nearest neighbors (K-NN), and XGBoost (XGB).
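
The probabilistic view of outliers mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it assumes a single numeric feature that is roughly Gaussian, and the function names, the 3-standard-deviation cutoff, and the glucose-like readings are all hypothetical choices made for illustration. A value is flagged as a global outlier when its estimated density falls below the density of a point lying `threshold_z` standard deviations from the mean.

```python
import math
import statistics


def gaussian_log_density(x, mean, stdev):
    """Log of the normal probability density at x."""
    z = (x - mean) / stdev
    return -0.5 * z * z - math.log(stdev * math.sqrt(2.0 * math.pi))


def flag_low_probability(values, threshold_z=3.0):
    """Return indices of values that are less probable than a point
    threshold_z standard deviations away from the mean.

    Assumption: the data are modelled by one Gaussian fitted to all values.
    """
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        # All values are identical, so nothing can be called an outlier.
        return []
    cutoff = gaussian_log_density(mean + threshold_z * stdev, mean, stdev)
    return [
        i for i, v in enumerate(values)
        if gaussian_log_density(v, mean, stdev) < cutoff
    ]


# Hypothetical glucose-like readings with one extreme value (illustrative only).
readings = [98, 102, 95, 110, 105, 99, 101, 97, 104, 100, 103, 96, 400]
outliers = flag_low_probability(readings, threshold_z=3.0)
print(outliers)
```

Because the mean and standard deviation are estimated from data that include the extreme value, a large outlier inflates the spread and can mask smaller ones; robust estimates such as the median and interquartile range are a common alternative.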

Full Text