Abstract
In medical data classification, improving classification performance is an important issue when the data set is small and contains multiple missing attribute values. A foremost objective of machine learning research is to improve the classification performance of classifiers. To achieve this, the number of training instances must be sufficiently large. In the proposed algorithm, we substitute missing attribute values with values from the attribute's available domain and generate additional training tuples that supplement the original training tuples. Together, the additional and original training samples provide sufficient data for learning. A neuro-fuzzy classifier is trained on this dataset, and its classification performance on test data is obtained using the k-fold cross-validation method. The proposed method attains improvements of around 2.8% and 3.61% in classification accuracy for this classifier.
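The substitution step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a missing value is marked with `None`, that an attribute's "available domain" is the set of values observed for that attribute in the data, and that each incomplete tuple is replaced by one filled tuple per combination of domain values. The names `expand_missing` and `observed_domains` are hypothetical.

```python
from itertools import product

MISSING = None  # assumed marker for a missing attribute value


def observed_domains(rows):
    """Collect, per attribute, the distinct values that actually occur."""
    n_attrs = len(rows[0][0])
    return [
        sorted({feats[i] for feats, _ in rows if feats[i] is not MISSING})
        for i in range(n_attrs)
    ]


def expand_missing(rows, domains):
    """Keep complete tuples and expand each incomplete tuple into one
    filled tuple per combination of domain values (same class label)."""
    expanded = []
    for feats, label in rows:
        missing_idx = [i for i, v in enumerate(feats) if v is MISSING]
        if not missing_idx:
            expanded.append((tuple(feats), label))
            continue
        for combo in product(*(domains[i] for i in missing_idx)):
            filled = list(feats)
            for i, value in zip(missing_idx, combo):
                filled[i] = value
            expanded.append((tuple(filled), label))
    return expanded


# Toy data: (attribute tuple, class label); values are illustrative only.
rows = [((1, "a"), 0), ((2, None), 1), ((None, "b"), 1)]
augmented = expand_missing(rows, observed_domains(rows))
```

In this toy example the three original tuples become five training tuples. The number of generated tuples grows with the product of the domain sizes, so a practical version would likely need to cap or sample the combinations.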
Highlights
Data mining and machine learning methods are effectively applied to various medical data classification problems [1]
The proposed method improves the classification performance of the classifier
The input values, which are the extracted features or attributes, are received by the structure as input, and these input attribute values are fuzzified based on the membership functions (MF)
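The fuzzification step in the last highlight can be sketched as follows. The source does not state which membership function is used, so the Gaussian form, the fuzzy set names, and the centers and widths below are all assumptions chosen for illustration.

```python
import math


def gaussian_mf(x, center, sigma):
    """Gaussian membership degree of x in a fuzzy set, a value in (0, 1]."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def fuzzify(value, fuzzy_sets):
    """Map one crisp attribute value to a membership degree per fuzzy set."""
    return {
        name: gaussian_mf(value, center, sigma)
        for name, (center, sigma) in fuzzy_sets.items()
    }


# Hypothetical fuzzy sets (center, sigma) for one attribute scaled to [0, 1].
sets = {"low": (0.0, 0.2), "medium": (0.5, 0.2), "high": (1.0, 0.2)}
degrees = fuzzify(0.5, sets)
```

Here an input of 0.5 belongs fully to the "medium" set and only weakly to "low" and "high". In a neuro-fuzzy classifier, the centers and widths are typically tuned during training rather than fixed by hand.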
Summary
Data mining and machine learning methods are effectively applied to various medical data classification problems [1]. The training dataset may contain few data samples from its inception. Another reason for a small number of training tuples is that the training data may contain tuples with multiple missing attribute values. There are two basic methods for discarding data instances with missing values [6]: complete case analysis and dropping. Alternatively, missing values can be imputed with reasonably probable values; these imputation-based procedures are applied instead of complete deletion. An objective of this method is to use known associations from a valid range of values of the data set [8]. Section four covers the experimentation and results, and the last section is a conclusion.
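The contrast between deletion and imputation can be made concrete with a small sketch. It assumes missing values are marked with `None` and uses most-frequent-value imputation as one example of a "reasonably probable" value. The source does not specify the imputation rule, and the function names are hypothetical.

```python
from collections import Counter


def complete_cases(rows):
    """Complete case analysis: keep only tuples with no missing value."""
    return [row for row in rows if None not in row]


def impute_most_frequent(rows):
    """Fill each missing value with the most frequent observed value of
    that attribute (assumes every attribute is observed at least once)."""
    n_attrs = len(rows[0])
    fill = []
    for i in range(n_attrs):
        observed = [row[i] for row in rows if row[i] is not None]
        fill.append(Counter(observed).most_common(1)[0][0])
    return [
        tuple(fill[i] if value is None else value for i, value in enumerate(row))
        for row in rows
    ]


# Toy records; values are illustrative only.
records = [(1, "a"), (1, None), (None, "b"), (2, "b")]
kept = complete_cases(records)
imputed = impute_most_frequent(records)
```

Deletion keeps two of the four records, while imputation keeps all four. This is the trade-off that matters when the data set is already small.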