Information Veins and Resampling with Rough Set Theory

Benjamin Griffiths

doi:10.4018/978-1-60566-010-3.ch160

Abstract

Rough Set Theory (RST), since its introduction by Pawlak (1982), has continued to develop as an effective tool in data mining. Within a set-theoretical structure, its remit is the classification of objects to decision attribute values, based on their description by a number of condition attributes. In RST, this classification is achieved through the construction of ‘if ... then ...’ decision rules. RST has developed in many directions. Among the earliest was the allowance for misclassification in the constructed decision rules, namely the Variable Precision Rough Sets model (VPRS) (Ziarko, 1993); recent references for this include Beynon (2001), Mi et al. (2004), and Slezak and Ziarko (2005). Further developments of RST include its operation within a fuzzy environment (Greco et al., 2006) and a dominance relation based approach (Greco et al., 2004). The regular major international conferences ‘International Conference on Rough Sets and Current Trends in Computing’ (RSCTC, 2004) and ‘International Conference on Rough Sets, Fuzzy Sets, Data Mining and Granular Computing’ (RSFDGrC, 2005) continue to include RST research covering the varying directions of its development. The same is true of the associated book series ‘Transactions on Rough Sets’ (Peters and Skowron, 2005), which also includes doctoral theses on the subject. RST is still evolving, and the eclectic attitude to its development means that the definitive RST data mining techniques are still to be realised. Grzymala-Busse and Ziarko (2000), in a defence of RST, discussed a number of points relevant to data mining and made comparisons between RST and other techniques.
Within data mining, where the aim is to identify relationships between condition attributes, RST is particularly pertinent because RST-type methodologies are inherently intended for data reduction and feature selection (Jensen and Shen, 2005). That is, subsets of condition attributes are identified that perform the same role as all the condition attributes in a considered data set (termed β-reducts in VPRS, see later). Chen (2001) addresses this when discussing the original RST, stating that it follows a reductionist approach and is lenient to inconsistent data (contradicting condition attributes, one aspect of underlying uncertainty). This encyclopaedia article describes and demonstrates the practical application of an RST-type methodology in data mining, namely VPRS, using nascent software initially described in Griffiths and Beynon (2005). VPRS, through its relatively simple structure, illustrates many of the rudiments of RST-based methodologies. The software utilised is oriented towards ‘hands on’ data mining, with graphs presented that clearly elucidate ‘veins’ of possible information identified from β-reducts, over different allowed levels of misclassification associated with the constructed decision rules (Beynon and Griffiths, 2004). Further findings are briefly reported from undertaking VPRS in a resampling environment, with leave-one-out and bootstrapping approaches adopted (Wisnowski et al., 2003). The importance of these results lies in the identification of the more influential condition attributes, pertinent to accruing the most effective data mining results.
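As an illustration of the idea of β-reducts described above, the following is a minimal sketch and not the software of Griffiths and Beynon (2005). It assumes one common formulation of VPRS in which β is the minimum proportion of correct classification, taken in (0.5, 1], and in which a β-reduct is a minimal subset of condition attributes giving the same quality of classification as the full set. The function names, the toy data set, and the brute-force search are all hypothetical choices made for this example.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Toy data (invented for illustration): three condition attributes, one decision.
ATTRS = ("c1", "c2", "c3")
ROWS = [
    {"c1": 0, "c2": 0, "c3": 0, "d": 0},
    {"c1": 0, "c2": 0, "c3": 1, "d": 0},
    {"c1": 0, "c2": 1, "c3": 0, "d": 0},
    {"c1": 0, "c2": 1, "c3": 1, "d": 0},
    {"c1": 1, "c2": 0, "c3": 0, "d": 1},
    {"c1": 1, "c2": 0, "c3": 1, "d": 1},
    {"c1": 1, "c2": 1, "c3": 0, "d": 1},
    {"c1": 1, "c2": 1, "c3": 1, "d": 0},  # disagrees with the other c1 = 1 objects
]


def condition_classes(rows, attrs):
    """Group objects that are indiscernible on the given condition attributes."""
    classes = defaultdict(list)
    for row in rows:
        classes[tuple(row[a] for a in attrs)].append(row)
    return list(classes.values())


def quality(rows, attrs, decision, beta):
    """Proportion of objects in condition classes whose majority decision
    value accounts for at least a fraction beta of the class."""
    covered = 0
    for cls in condition_classes(rows, attrs):
        majority = Counter(r[decision] for r in cls).most_common(1)[0][1]
        if majority / len(cls) >= beta:
            covered += len(cls)
    return covered / len(rows)


def beta_reducts(rows, attrs, decision, beta):
    """Brute-force search for minimal attribute subsets whose quality of
    classification equals that of the full attribute set (assumed definition)."""
    target = quality(rows, attrs, decision, beta)
    found = []
    for size in range(1, len(attrs) + 1):
        for subset in combinations(attrs, size):
            # Skip supersets of an already found reduct: they are not minimal.
            if any(set(f) <= set(subset) for f in found):
                continue
            if quality(rows, subset, decision, beta) == target:
                found.append(subset)
    return found


for beta in (0.7, 1.0):
    print(beta, beta_reducts(ROWS, ATTRS, "d", beta))
```

On this toy data, allowing some misclassification (β = 0.7) lets the single attribute c1 stand in for all three, whereas requiring no misclassification (β = 1.0) leaves only the full attribute set. The search is exponential in the number of attributes, so it is suitable only for small examples.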
