Abstract
Associative classification is a promising data mining methodology that uses association rule discovery procedures to build a classifier. However, existing approaches have limitations: they cannot handle big data because of memory constraints, high time complexity, load imbalance, and data skewness. Data skewness arises invariably in big data analytics and degrades the efficiency of an approach. This paper presents a MapReduce solution for associative classification based on a vertical data layout. To address these problems, we propose two algorithms: MR-MCAR-F (MapReduce-Multi Class Associative Classifier-MapReduce fast algorithm) and MR-MCAR-L (MapReduce-Multi Class Associative Classifier Load parallel frequent pattern growth algorithm). We also propose MapReduce solutions for Tid List and database coverage. We use three types of pruning techniques: database coverage, global pruning, and distributed pruning. The proposed approaches are compared with the latest approach from the literature in terms of accuracy, computation time, and data skewness. Existing scalable approaches cannot handle skewness, whereas our proposed method handles it effectively. All experiments were performed on the Hadoop framework using six datasets from the UCI repository. The proposed algorithms are scalable solutions for associative classification that handle big data and data skewness.
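To make the vertical data layout concrete, the following is a minimal single-machine sketch in plain Python, not the MR-MCAR-F or MR-MCAR-L algorithms themselves. It assumes a toy training set and uses hypothetical helper names (`map_row`, `reduce_tids`, `rule_stats`, `database_coverage`) that do not come from the paper. The map and reduce functions only imitate the MapReduce pattern for building Tid Lists, and the pruning step follows one common variant of database coverage, in which a rule is kept if it correctly classifies at least one uncovered row and all rows matching its body are then discarded.

```python
from collections import defaultdict

# Hypothetical toy training set: (tid, items, class_label).
rows = [
    (1, {"a", "b"}, "yes"),
    (2, {"a", "c"}, "yes"),
    (3, {"b", "c"}, "no"),
    (4, {"a", "b"}, "yes"),
    (5, {"c"}, "no"),
]

def map_row(tid, items, label):
    """Map step: emit (key, tid) for every item and for the class label."""
    for item in items:
        yield ("item", item), tid
    yield ("class", label), tid

def reduce_tids(pairs):
    """Reduce step: group transaction ids by key into Tid Lists."""
    tidlists = defaultdict(set)
    for key, tid in pairs:
        tidlists[key].add(tid)
    return tidlists

def rule_stats(tidlists, itemset, label):
    """Support count and confidence of (itemset -> label).

    Both values come from Tid List intersections, so the original
    rows are never rescanned. The itemset must be non-empty.
    """
    body = set.intersection(*(tidlists[("item", i)] for i in itemset))
    hits = body & tidlists[("class", label)]
    confidence = len(hits) / len(body) if body else 0.0
    return len(hits), confidence

def database_coverage(ranked_rules, rows):
    """Assumed variant: keep a rule if it correctly classifies at least
    one uncovered row, then discard every row its body matches."""
    uncovered = {tid: (items, label) for tid, items, label in rows}
    kept = []
    for itemset, label in ranked_rules:
        matched = [tid for tid, (items, _) in uncovered.items()
                   if itemset <= items]
        if any(uncovered[tid][1] == label for tid in matched):
            kept.append((itemset, label))
            for tid in matched:
                del uncovered[tid]
    return kept

pairs = [kv for row in rows for kv in map_row(*row)]
tidlists = reduce_tids(pairs)
```

With this toy data, `rule_stats(tidlists, {"a", "b"}, "yes")` intersects the Tid Lists of `a` and `b` and then the class Tid List, giving the rule's support count and confidence without touching `rows` again. In a real Hadoop job the grouping done by `reduce_tids` would be handled by the shuffle phase; how the paper partitions this work to counter skewness is not shown here.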