In real-time data mining applications discrete values play vital role in knowledge representation as they are easy to handle and very close to knowledge level representation than continuous attributes. Discretization is a major step in data mining process where continuous attributes are transformed into discrete values. However, most of the classifications algorithms are require discrete values as the input. Even though some data mining algorithms directly contract with continuous attributes, the learning process yields low quality results. In this paper, we introduce a new discretization method based on standard deviation technique called ‘z-score’ for continuous attributes on biomedical datasets. We compare performance of the proposed algorithm with the state-of- the-art discretization techniques. The experiment results show the efficiency in terms of accuracy and also minimize the classifier confusion for decision making process.
Read full abstract