Data mining has actively contributed to solving many real-world problems with a variety of techniques. Traditional approaches in this field are classification, clustering and regression. During the last few years a number of challenges have emerged, such as imbalanced data, multi-label and multi-instance problems, low-quality and/or noisy data, and semi-supervised learning, among others [item 1) in the Appendix]. How to handle these non-standard scenarios in the realm of big data remains largely uncharted research territory, although a growing effort has been made to push past current limits. The current trend is to address both the classical and the newly emerging data mining problems in big data and knowledge processing. Granular computing provides a powerful tool for multiple-granularity and multiple-view data analysis, and it has demonstrated strong capabilities and advantages in intelligent data analysis, pattern recognition, machine learning and uncertain reasoning [item 2) in the Appendix]. Big data often contains a significant amount of unstructured, uncertain and imprecise data. Scaling granular computing to very large data sets poses new challenges [item 3) in the Appendix]. Big data mining relies on distributed computational strategies because it is often impossible to store and process the data on a single computing node. The exploration of data mining and granular computing in big data and knowledge processing is an emerging field that crosses multiple research disciplines and industry domains, including transportation, communications, social networks and medical health, among others.