Abstract

A Bayesian network (BN), or belief network, is a probabilistic graphical model that represents a set of variables and their probabilistic independencies. Research often involves continuous random variables. To use such variables in BN models, they must first be converted into discrete variables with a limited number of states, often two. A problem researchers face during discretization is deciding how many states to use. Does the number of states chosen for discretization affect a model's predictive power? This study examines the question empirically in the field of financial distress prediction. The sample consists of 144 firms listed on the Tehran Stock Exchange from 1997 to 2007. Two methods of selecting variables were used to develop naïve Bayes models: the first is based on conditional correlation between variables and the second on conditional likelihood. The first naïve Bayes model, based on conditional correlation, predicts financial distress with 90% accuracy, and the second, based on conditional likelihood, with 93% accuracy. Overall, the model based on conditional likelihood performs better than the one based on conditional correlation. Further analyses showed that the number of states chosen for discretization affects model performance. When continuous variables are discretized into two, three, four and five states, the naïve Bayes model's performance increases as the number of states rises from two to three and from three to four, but decreases when the number of states rises from four to five.
Key words: Bayesian networks, naïve Bayes, selection of predictors, discretization, continuous variables, financial distress predictors, firms, Tehran Stock Exchange (TSE).

Highlights

  • Reasoning with incomplete and unreliable information is a central characteristic of decision making in some industries, such as medicine and finance

  • When continuous variables are discretized into two, three, four and five states, the naïve Bayes model's performance increases as the number of states rises from two to three and from three to four, but decreases when the number of states rises from four to five

  • The aim of this study is to examine the effect of discretization on a naïve Bayes model's performance


Summary

Introduction

Reasoning with incomplete and unreliable information is a central characteristic of decision making in some industries, such as medicine and finance. Bayesian networks provide a theoretical framework for dealing with this uncertainty using an underlying graphical structure and the probability calculus. Bayesian networks have been successfully implemented in areas as diverse as medical diagnosis and finance (Holmes and Jain, 2008). In this study, Bayesian networks are used to develop two naïve Bayes models for predicting financial distress, and the effect of discretization on the naïve Bayes model's performance is investigated. A Bayesian network (or belief network) is a probabilistic graphical model that represents a set of variables and their probabilistic independencies. Naïve Bayes models work only with discrete variables and to ...

Abbreviations: EP-T, Extended Pearson-Tukey; BN, Bayesian network; TSE, Tehran Stock Exchange; DAG, directed acyclic graph.
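The effect of the number of discretization states on a naïve Bayes classifier can be illustrated with a minimal sketch. It is not the study's procedure: it assumes equal-frequency binning (the source does not state the binning rule used here), Laplace smoothing with a hypothetical parameter `alpha`, and synthetic two-ratio "firm" data generated by the hypothetical helper `make_synthetic_firms`. All function names are introduced for illustration only, and the resulting accuracies say nothing about the Tehran Stock Exchange sample.

```python
import math
import random
from bisect import bisect_right
from collections import Counter


def equal_frequency_cutpoints(values, n_states):
    """Return n_states - 1 cutpoints splitting `values` into roughly equal-sized bins."""
    ordered = sorted(values)
    return [ordered[(len(ordered) * k) // n_states] for k in range(1, n_states)]


def discretize(value, cutpoints):
    """Map a continuous value to a state index in 0 .. len(cutpoints)."""
    return bisect_right(cutpoints, value)


def fit_naive_bayes(rows, labels, n_states, alpha=1.0):
    """Estimate class priors and per-feature state likelihoods (Laplace smoothing)."""
    class_counts = Counter(labels)
    state_counts = Counter()  # key: (class, feature index, state)
    for row, label in zip(rows, labels):
        for j, state in enumerate(row):
            state_counts[(label, j, state)] += 1

    def log_likelihood(label, j, state):
        num = state_counts[(label, j, state)] + alpha
        den = class_counts[label] + alpha * n_states
        return math.log(num / den)

    log_prior = {c: math.log(n / len(labels)) for c, n in class_counts.items()}
    return log_prior, log_likelihood


def predict(model, row):
    """Pick the class with the highest log posterior under the naive independence assumption."""
    log_prior, log_likelihood = model
    scores = {
        c: lp + sum(log_likelihood(c, j, s) for j, s in enumerate(row))
        for c, lp in log_prior.items()
    }
    return max(scores, key=scores.get)


def make_synthetic_firms(n, rng):
    """Hypothetical data: two financial ratios per firm; label 1 = distressed."""
    rows, labels = [], []
    for i in range(n):
        label = i % 2  # alternate labels so both classes are balanced
        mean = 0.0 if label == 1 else 2.0
        rows.append([rng.gauss(mean, 1.0), rng.gauss(mean, 1.0)])
        labels.append(label)
    return rows, labels


def accuracy_by_states(state_options=(2, 3, 4, 5), seed=0):
    """Hold-out accuracy of the sketch model for each number of discretization states."""
    rng = random.Random(seed)
    rows, labels = make_synthetic_firms(400, rng)
    train_x, test_x = rows[:300], rows[300:]
    train_y, test_y = labels[:300], labels[300:]
    results = {}
    for n_states in state_options:
        # Cutpoints are learned from the training rows only.
        cuts = [
            equal_frequency_cutpoints([r[j] for r in train_x], n_states)
            for j in range(2)
        ]

        def encode(row, cuts=cuts):
            return [discretize(v, cuts[j]) for j, v in enumerate(row)]

        model = fit_naive_bayes([encode(r) for r in train_x], train_y, n_states)
        hits = sum(predict(model, encode(r)) == y for r, y in zip(test_x, test_y))
        results[n_states] = hits / len(test_y)
    return results


if __name__ == "__main__":
    for n_states, acc in accuracy_by_states().items():
        print(f"{n_states} states: hold-out accuracy = {acc:.2f}")
```

Learning the cutpoints from the training rows alone keeps the held-out firms out of the discretization step, so the comparison across two to five states is not influenced by the test data.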
