Abstract
Data mining is the technique by which one can extract efficient and effective data from huge amount of raw data. There are various techniques for extracting useful data. Attribute Selection is one of the effective methods for this operation. To obtain useful data from enormous amount of data is not a simple task. It contains several phases like preprocessing, classification, analysis etc. Attribute Selection is an important topic in Data Mining, as it is the effective way to reduce dimensionality, to remove irrelevant data, to remove redundant data, & to increase accuracy of the data. It is the process of identifying a subset of the most useful attributes that produces attuned results as the original entire set of attribute. In last few years, there were different techniques for attribute selection. These techniques were judged on two measures i.e. efficiency is time to find the clusters and effectiveness is quality of subset attributes. Some of the techniques are Wrapper approach, Filter Approach, Relief Algorithm, Distributional clustering etc. But each of one having some drawbacks like unable to handle large volumes of data, computational complexity, accuracy is not guaranteed, difficult to evaluate and redundancy detection etc. To overcome some of these problems in attribute selection method this paper proposes technique that aims to provide an effective clustering based attribute selection method for high dimensional data. Initially this technique removes irrelevant attributes depending on some threshold value. Afterwards, using Kruskal’s algorithm minimum spanning tree is constructed from these attributes. From that tree some representative attributes are selected by partitioning. This is nothing but the final set of attributes.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.