Abstract

The aim of this research is to find the best performance of both logistic regression and linear discriminant classifiers when their thresholds are varied over a range of values. The tools used to evaluate the classifier models are the confusion matrix, precision-recall, the F1 score, and the receiver operating characteristic (ROC) curve. The Audit-risk data set is used to implement the proposed method. Data screening and dimension reduction by principal component analysis (PCA) are carried out first, before the data are divided into training and testing sets. Once training has produced the classifier model parameters, the performance measures are calculated only on the testing set, with various constants added to the threshold value of both classifier models. The logistic regression classifier achieves its best performance of 94% on precision-recall, 91.7% on the F1 score, and 0.906 on the area under the curve (AUC) for threshold values in the interval between 0.002 and 0.018. The linear discriminant classifier achieves its best performance at a threshold value of 0.035, with a precision-recall of 94%, an F1 score of 91.7%, and an AUC of 0.846.
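
The evaluation step described above (adding constants to a classifier's threshold and recomputing the performance measures on the testing set) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the base threshold of 0.5, the offsets, and the labels and scores are all hypothetical, chosen only to show the mechanics. The sketch assumes each classifier yields a score per test observation and predicts the positive class when the score is at least the base threshold plus the added constant.

```python
from typing import Dict, List, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count the four cells of a binary confusion matrix (positive class = 1)."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == 1:
            counts["tp" if truth == 1 else "fp"] += 1
        else:
            counts["fn" if truth == 1 else "tn"] += 1
    return counts


def precision_recall_f1(counts: Dict[str, int]) -> Dict[str, float]:
    """Derive precision, recall and F1 from confusion-matrix counts."""
    tp, fp, fn = counts["tp"], counts["fp"], counts["fn"]
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def sweep_offsets(y_true: Sequence[int], scores: Sequence[float],
                  base_threshold: float, offsets: Sequence[float]) -> List[Dict[str, float]]:
    """Re-evaluate the test set once per constant added to the threshold."""
    results = []
    for offset in offsets:
        threshold = base_threshold + offset
        y_pred = [1 if s >= threshold else 0 for s in scores]
        metrics = precision_recall_f1(confusion_counts(y_true, y_pred))
        metrics["offset"] = offset
        results.append(metrics)
    return results


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve as the share of positive/negative pairs
    ranked correctly (ties count one half). Threshold-independent."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical test-set labels and classifier scores, for illustration only.
y_test = [1, 1, 1, 0, 0, 1, 0, 0]
test_scores = [0.90, 0.70, 0.53, 0.51, 0.30, 0.49, 0.20, 0.10]

results = sweep_offsets(y_test, test_scores, base_threshold=0.5,
                        offsets=[-0.02, 0.0, 0.02])
best = max(results, key=lambda r: r["f1"])  # offset with the highest F1
test_auc = auc(y_test, test_scores)
```

In this toy data the choice of offset changes precision, recall and F1, while the AUC is unaffected because it depends only on how the scores rank the observations. Any comparison of thresholds in this way should use the testing set only, as the abstract specifies.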

Highlights

  • Machine learning (ML) plays a central role in turning data into information, and even into knowledge

  • Regression techniques based on fuzzy logic have been implemented for predicting time series data by Handoyo and Marji [1], Handoyo et al. [2], Handoyo and Chen [3], and Efendy et al. [4]

  • Principal component analysis (PCA) is applied to all of the screened input variables, and the transformed predictor variables together with the original response variable are divided into training and testing sets

Summary

Introduction

Machine learning (ML) plays a central role in turning data into information, and even into knowledge. The type of the response (target) variable determines which kind of method is suitable. When the response variable is measured on an interval or ratio scale, the matching analysis method is a regression technique. If the response variable is categorical (nominal or ordinal), the suitable analysis method is a classification technique. Regression techniques based on fuzzy logic have been implemented for predicting time series data by Handoyo and Marji [1], Handoyo et al. [2], Handoyo and Chen [3], and Efendy et al. [4]. Kusdarwati and Handoyo [5] implemented a regression technique based on a hybrid of neural networks and wavelets. The regression models based on fuzzy logic cited above perform very satisfactorily, but classification models based on fuzzy logic perform noticeably worse, so the trade-off between their computational complexity and the performance they yield is not balanced.

