Abstract

This paper studies the effectiveness of data imputation methods in dealing with missing data in the data mining phase of Knowledge Discovery in Databases (KDD). Applying data mining techniques without careful consideration of missing data can produce biased results and skewed conclusions. This research explores the impact of missing data at various levels in KDD models that employ neural networks as the primary data mining algorithm. Four of the most commonly used data imputation methods (case deletion, mean substitution, regression imputation, and multiple imputation) were evaluated using root mean square (RMS) values, ANOVA testing, t-tests, and Tukey's Honestly Significant Difference test to assess the differences in performance between various knowledge discovery and neural network models, both in the presence and in the absence of missing data.

Keywords: KDD; Data Mining; Data Imputation; Missing Data; Neural Networks
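
As a rough illustration of two of the imputation methods named above, the following Python sketch applies case deletion and mean substitution to a small array and computes an RMS error. The toy data, the function names, and the choice to measure RMS between the imputed and the complete data are assumptions made for this example only; the abstract does not describe the paper's actual procedure, which evaluates neural network models.

```python
import numpy as np

def case_deletion(X):
    """Drop every row (case) that contains at least one missing value (NaN)."""
    return X[~np.isnan(X).any(axis=1)]

def mean_substitution(X):
    """Replace each NaN with the mean of the observed values in its column."""
    col_means = np.nanmean(X, axis=0)
    return np.where(np.isnan(X), col_means, X)

def rms_error(y_true, y_pred):
    """Root mean square error between two equally shaped arrays."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

# Hypothetical toy data: 4 cases, 2 attributes (not taken from the paper).
X_full = np.array([[1.0, 10.0],
                   [2.0, 20.0],
                   [3.0, 30.0],
                   [4.0, 40.0]])

# Knock out two entries to simulate missing data.
X_missing = X_full.copy()
X_missing[1, 1] = np.nan
X_missing[3, 0] = np.nan

X_deleted = case_deletion(X_missing)      # only the complete cases remain
X_imputed = mean_substitution(X_missing)  # same shape as the input

# Assumed evaluation: RMS distance between imputed and complete data.
imputation_rms = rms_error(X_full, X_imputed)
print(X_deleted.shape, X_imputed.shape, imputation_rms)
```

Case deletion shrinks the data set, while mean substitution keeps every case but pulls the filled entries toward the column average. This is the kind of trade-off the comparison in the paper is concerned with.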
