Abstract

Discretization, the transformation of continuous attributes into discrete values, plays a vital role in preprocessing for data mining algorithms and machine learning classifiers. Real-world data often come in mixed form, i.e., both continuous and discrete, while many machine learning algorithms require discrete data only. Over the past few decades, numerous discretization techniques for continuous data have been proposed, yet they often fail to deliver better accuracy and model understandability, or to reduce the classifier confusion caused by continuous data. In this paper, we propose a new discretization technique called ‘ZDisc’, based on the standard deviation normalization statistical technique, which generates fewer cut points for continuous attributes. The proposed method is compared with other state-of-the-art discretization techniques on benchmark continuous datasets. The experiments show that the proposed method is competitive with existing discretization schemes in terms of classifier accuracy and reduces classifier confusion.
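The abstract does not spell out the ZDisc algorithm, so the following is only a minimal sketch of the general idea of discretizing through standard deviation normalization (z-scores). It is not the authors' method. The function name `zscore_discretize`, the parameter `max_abs_bin`, and the choice of cut points at whole standard deviations are all assumptions made for illustration.

```python
import math
import statistics


def zscore_discretize(values, max_abs_bin=2):
    """Assign each value an integer bin from its z-score.

    Assumed scheme (not taken from the paper): cut points sit at whole
    standard deviations from the mean, and z-scores beyond
    +/- max_abs_bin are merged into the outermost bins.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        # A constant attribute carries no information: use a single bin.
        return [0] * len(values)

    bins = []
    for v in values:
        z = (v - mean) / std
        b = math.floor(z)  # bin k covers z in [k, k + 1)
        b = max(-max_abs_bin, min(max_abs_bin - 1, b))  # clamp the tails
        bins.append(b)
    return bins


if __name__ == "__main__":
    print(zscore_discretize([1, 2, 3, 4, 5]))
```

With `max_abs_bin=2` this sketch produces at most four bins, i.e. three cut points (at z = -1, 0 and 1), regardless of how many distinct values the attribute takes. That illustrates why a normalization-based scheme can keep the number of cut points small.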
