Abstract
The notion of uncertain reasoning has grown in importance alongside the power and intelligence of computers. Where sources provide uncertain information and/or imprecise data, what matters is the ability to represent uncertainty and reason about it (Shafer & Pearl, 1990). A very general problem of uncertain reasoning is how to combine information from independent and partially reliable sources (Haenni & Hartmann, forthcoming). In data mining, the understanding of confirming and/or conflicting information from the characteristics that describe objects classified to given hypotheses is affected by the reliability of those characteristics. Further, the presence of missing values compounds the problem, since the reasons for their presence may be external to the incumbent reliability issues (Olinsky, Chen, & Harlow, 2003; West, 2001). These issues are demonstrated here using the classification technique Classification and Ranking Belief Simplex (CaRBS), introduced in Beynon and Buchanan (2004) and Beynon (2005). CaRBS operates within the domain of uncertain reasoning, specifically in its accommodation of ignorance, which follows from its mathematical structure based on the Dempster-Shafer theory of evidence (DST) (Srivastava & Mock, 2002). The ignorance here encapsulates the incompleteness of the data set (the presence of missing values), as well as uncertainty in the evidential support that characteristics give to the final classification of the objects. This chapter demonstrates that a technique such as CaRBS, through uncertain reasoning, is able to uniquely manage the presence of missing values by considering them as a manifestation of ignorance, while also allowing the possible unreliability of characteristics to be inherent. Importantly, the described process removes the need to falsely transform the data set in any way, such as through imputation (Huisman, 2000).
Credit ratings, the example issue considered here, have become increasingly influential since their introduction around 1900 with the Manual of Industrial and Miscellaneous Securities (Levich, Majnoni, & Reinhart, 2002). The rating agencies shroud their operations in secrecy, stating that statistical models cannot be used to replicate their ratings (Singleton & Surkan, 1991). This motivates the need for alternative analyses, including those utilising uncertain reasoning.