Abstract

As increasing the massive amount of data demands effective and efficient mining strategies, practitioners and researchers are trying to develop scalable mining algorithms, machine learning algorithms and strategies to be successful data mining in turning mountains of data into nuggets. Data of high dimension significantly increases the memory storage requirements and computational costs for data analytics. Therefore, reducing dimension can mainly improve three data mining performance: speed of learning, predictive accuracy and simplicity and comprehensibility of mined result. Feature selection, data preprocessing technique, is effective and efficient in data mining, data analytics and machine learning problems particularly in high dimension reduction. Most feature selection algorithms can eliminate only irrelevant features but redundant features. Not only irrelevant features but also redundant features can degrade learning performance. Mutual information measured feature selection is proposed in this work to remove both irrelevant and redundant features.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call