Abstract
Data mining and big data analytics are approaches for analyzing data and extracting hidden information. Because big data is complicated and large in volume, traditional techniques to analysis and extraction do not function effectively. Data clustering is a common data mining approach that divides data into groups and makes it simple to extract information from them. Big data can include both organized and semi structured information, and it's becoming increasingly beneficial for companies. Examples include old organized database of inventory level, transactions, and consumer information, as well as non - structured comprehension from the internet, social media platforms, and embedded systems. Numerous schemes have been developed to reach the needed in relation to efficiency and effectiveness, and much study has been committed to Big Data analytics. Nevertheless, a few methodologies, such as clustering algorithms, require further research in regards to performance, usefulness, and other factors, leading to the development of a model which gives proper Big Data Analytics assessment and the impactful use of this methodology to retrieve relevant knowledge. We recorded and analyzed several big data sets in our proposed work, as well as discovered relevant current approaches. In this paper we proposed a new clustering technique using dimensionality reduction approach. For implementation of this work, we used real time streaming data in unstructured form and noisy sometimes. The proposed hybrid clustering techniques that improve the clustering accuracy as well as time for generate effectives clusters on large unstructured data. We confirm the findings by testing the suggested methodology on available information sets and comparing and analyzing the effectiveness of the developed system with that of current systems.
Published Version
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have