Abstract

Cluster analysis is commonly used to group objects. When the resulting groups are carried into a further analysis to obtain a model, the procedure is called cluster integration. Cluster integration can be followed by various analyses, including path analysis, discriminant analysis, and logistic regression. In this chapter, the author discusses why dummy variables are used in this type of cluster analysis. Dummy variables are the main way that categorical variables are included as predictors in modeling. In statistical models such as linear regression, one of the dummy variables must be excluded; otherwise the predictor variables are perfectly collinear. Thus, if a categorical variable can take k values, only k-1 dummy variables are needed: the k-th is redundant and brings no new information. Using more dummy variables than needed is known as the dummy variable trap. The advantage of dummy variables is that they are simple to use and make the decision-making process easier to manage. The novelty of this chapter is its perspective on the dummy variable technique when cluster analysis is used in statistical modeling. The data used in this study come from an assessment of credit risk at a bank in Indonesia. All analyses were carried out in R.

Highlights

  • Cluster analysis is commonly used to group objects

  • We explain the technical perspective of dummy variables derived from cluster analysis in statistical modeling, such as regression analysis, path analysis, and discriminant analysis

  • The total coefficient of determination of the cluster-integrated logistic regression model with 3 groups is 0.8923, so the model explains 89.23% of the variation in the data, while the remaining 10.77% is attributable to variables outside the model

Summary

Introduction

Cluster analysis is commonly used to group objects. When the groups are carried into a further analysis to obtain a model, the procedure is called cluster integration. Cluster integration can be followed by various analyses, including path analysis, discriminant analysis, and logistic regression. In cluster integration with path analysis, the aim is to place homogeneous objects in the same group so that the residual variance is homogeneous and the adjusted R2 value is maximized. In cluster integration with discriminant analysis, the resulting clusters can help maximize the accuracy, sensitivity, and specificity of the model. We explain the technical perspective of dummy variables derived from cluster analysis in statistical modeling, such as regression analysis, path analysis, and discriminant analysis.
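The k-1 rule stated in the abstract can be checked on a toy example. The sketch below is an illustration only, not the chapter's code (the chapter's analyses were run in R; Python is used here for convenience). The cluster labels are hypothetical, and the helper names dummy_columns, matrix_rank, and design_matrix are made up for this example. It encodes three cluster labels as dummy variables and compares the rank of the design matrix with and without the redundant k-th dummy.

```python
from fractions import Fraction


def dummy_columns(labels, drop_first=True):
    """Return (levels_used, rows) of 0/1 indicators for a list of labels."""
    levels = sorted(set(labels))
    used = levels[1:] if drop_first else levels
    rows = [[1 if lab == lev else 0 for lev in used] for lab in labels]
    return used, rows


def matrix_rank(rows):
    """Rank via Gaussian elimination on exact fractions (no rounding)."""
    m = [[Fraction(v) for v in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Find a row at or below the current rank with a nonzero entry.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def design_matrix(labels, drop_first):
    """Intercept column followed by the dummy columns."""
    _, dummies = dummy_columns(labels, drop_first)
    return [[1] + row for row in dummies]


# Hypothetical cluster memberships for six objects (k = 3 clusters).
clusters = [1, 2, 3, 1, 2, 3]

full = design_matrix(clusters, drop_first=False)    # intercept + 3 dummies
reduced = design_matrix(clusters, drop_first=True)  # intercept + 2 dummies

rank_full = matrix_rank(full)
rank_reduced = matrix_rank(reduced)

print(len(full[0]), rank_full)        # 4 columns, rank 3: dummy variable trap
print(len(reduced[0]), rank_reduced)  # 3 columns, rank 3: full column rank
```

With all k = 3 dummies plus an intercept, the three dummy columns sum to the intercept column, so the matrix has four columns but rank three and the regression coefficients are not uniquely determined. Dropping one dummy restores full column rank; the dropped cluster becomes the reference group against which the other coefficients are interpreted.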

Why use dummy variables
Hierarchical cluster
Integrated cluster equation model with logistic regression analysis
Logistic regression analysis assumptions
Integrated cluster analysis method with logistic regression analysis
Groups 2 Groups
Discriminant analysis
Integration of cluster analysis with discriminant analysis of dummy variable approach
Model efficiency
Implementation of integrated cluster with discriminant analysis
Regression analysis
Regression analysis with dummy variables
Application of regression analysis with dummy variables
Non multicollinearity
Normality error
Non autocorrelation
Homoscedasticity
Parameter significance test
Model interpretation
Findings
Conclusion
