Abstract
Conventional clustering methods have an assumption that data is stored centrally and are memory resident which made it tough to arrive at solutions when dealing with large data. Centralizing huge data from multiple locations are always a challenging task owing to the large memory space and computational time required by traditional mining methods. Traditional k-means type of clustering were used for the identification of clusters’ prototype that can serve as a representative point in a large dataset and the major setback is that the cluster centers tend to distort the distribution of the underlying data making the representative points incapable of handling the complete distribution of the data leading to poor pattern generation. With the aim to resolve this issue, this paper proposes an empirical model (EM) that ensures the centers of the cluster for capturing the data distribution which lies under. In the proposed methodology, the asymptotic convergence is centered on the data which is distributed. Secondly, an efficient mechanism for measuring the cluster centers in practice. Finally, a methodology for distributive convergence and center optimization is proposed. The model is compared with that of other methods in the literature and the results are discussed.
Published Version
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have