Abstract

A primary issue in data analysis is the scalability of data mining methods. Prior research has explored various scaling options to overcome this problem. In this research, several scaling strategies are explored and tested on various datasets, and a cascade scaling method is proposed to improve the efficacy of existing methods. The proposed method starts by gathering a large dataset, which is then pre-processed. Once pre-processed, the dataset is split into smaller subsets of equal size, and a data mining method is applied to each subset. The outcomes from all subsets are pooled and aggregated to produce the final results. Performance is measured by the accuracy of the algorithm. The proposed method and existing methods are evaluated on two health care datasets: PIMA Indian Diabetes and Heart Disease. Across the data mining methods tested, the proposed scaling approach yields better results than the existing scaling approaches. On both datasets, the proposed method is also compared with results published by other authors in earlier studies, and it outperforms that previous work. For a few data mining methods, the proposed method achieves 100 percent accuracy.
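
The pipeline described above (split into equal-sized subsets, mine each subset, pool the outcomes, score by accuracy) can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not say which data mining methods are used or how outcomes are aggregated, so a nearest-centroid classifier stands in for the data mining method and majority voting stands in for the aggregation step. The synthetic two-class data and all function names (`split_equal`, `fit_centroids`, `aggregate_predict`, `make_rows`) are hypothetical, and the pre-processing step is omitted.

```python
import random
from collections import Counter


def split_equal(rows, k):
    """Split rows into k subsets of equal size (leftover rows are dropped)."""
    size = len(rows) // k
    return [rows[i * size:(i + 1) * size] for i in range(k)]


def fit_centroids(subset):
    """Stand-in data mining method (assumption): per-class feature means."""
    sums, counts = {}, Counter()
    for features, label in subset:
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, value in enumerate(features):
            acc[j] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Return the label whose centroid is closest to the feature vector."""
    def dist2(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: dist2(centroids[label]))


def aggregate_predict(models, features):
    """Pool the per-subset predictions by majority vote (assumption)."""
    votes = Counter(predict(model, features) for model in models)
    return votes.most_common(1)[0][0]


def accuracy(models, rows):
    """Fraction of rows whose pooled prediction matches the true label."""
    correct = sum(aggregate_predict(models, x) == y for x, y in rows)
    return correct / len(rows)


# Synthetic, well-separated two-class data (not the PIMA or Heart Disease data).
rng = random.Random(0)


def make_rows(n):
    rows = []
    for i in range(n):
        label = i % 2
        centre = 0.0 if label == 0 else 5.0
        rows.append(([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)], label))
    return rows


train, test = make_rows(300), make_rows(100)
rng.shuffle(train)

subsets = split_equal(train, 5)                       # equal-sized subsets
models = [fit_centroids(subset) for subset in subsets]  # one model per subset
acc = accuracy(models, test)                          # pooled accuracy
print(f"subsets: {len(subsets)}, pooled accuracy: {acc:.3f}")
```

An odd number of subsets is used here so that a two-class majority vote cannot tie; a different aggregation rule would need its own tie-handling.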
