Abstract

The need for clustering is continually growing, owing to the huge amount of information stored daily on websites. This information must be classified in order to facilitate its processing. There are several methods for classifying a set of data; those based on machine learning are more efficient. Among them, the Kohonen algorithm is useful because of its well-known properties, some of which are presented in this paper. Unfortunately, like other clustering algorithms, it suffers from two problems: the result depends on the initialization phase, which is performed randomly, and the number of classes is unknown at the outset. Overcoming these problems is a major challenge in the clustering domain. In this paper we present an approach that allows the initialization phase to be performed suitably. The approach consists of a pre-processing phase in which a parameter r is used to obtain an idea of the distribution of the examples. The initial weight vectors are then chosen from the areas of high density. This avoids initializing with isolated examples, which decrease the performance of the system, and it also allows the number of classes to be determined approximately. After measuring the quality of the clustering obtained by the Kohonen algorithm, we update the parameter r and repeat the same process, which stops when a suitable clustering quality is obtained. Experiments are conducted to show the performance of this approach.
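The procedure outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how r is used, how quality is measured, or how r is updated. The sketch assumes that r is a neighborhood radius, that the density of an example is the number of other examples within distance r, that quality is the mean quantization error, and that r is updated by walking through a caller-supplied list of values. All function names and the `min_neighbors` threshold are hypothetical.

```python
import math
import random


def density_seeds(points, r, min_neighbors=3):
    """Choose initial weight vectors from dense areas (hypothetical helper)."""
    n = len(points)
    # Assumed density measure: number of other examples within distance r.
    counts = [
        sum(1 for j in range(n) if j != i and math.dist(points[i], points[j]) <= r)
        for i in range(n)
    ]
    seeds = []
    # Visit examples from densest to sparsest.
    for i in sorted(range(n), key=lambda j: -counts[j]):
        if counts[i] < min_neighbors:
            break  # everything left is isolated and is never used as a seed
        # Keep one seed per dense area: skip examples within r of a chosen seed.
        if all(math.dist(points[i], s) > r for s in seeds):
            seeds.append(list(points[i]))
    return seeds  # len(seeds) is a rough estimate of the number of classes


def train_kohonen(points, weights, epochs=30, lr0=0.5, sigma0=0.5, seed=0):
    """Minimal 1-D Kohonen map whose units are arranged on a chain."""
    rng = random.Random(seed)
    weights = [list(w) for w in weights]
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs
        lr, sigma = lr0 * decay, max(sigma0 * decay, 1e-2)
        for p in rng.sample(points, len(points)):  # shuffled pass over the data
            win = min(range(len(weights)), key=lambda k: math.dist(p, weights[k]))
            for k, w in enumerate(weights):
                # Gaussian neighborhood on the chain, centred on the winner.
                h = math.exp(-((k - win) ** 2) / (2 * sigma ** 2))
                for d in range(len(w)):
                    w[d] += lr * h * (p[d] - w[d])
    return weights


def quantization_error(points, weights):
    """Assumed quality measure: mean distance to the nearest weight vector."""
    return sum(min(math.dist(p, w) for w in weights) for p in points) / len(points)


def cluster_with_r_search(points, r_values, max_error):
    """Try each r in turn and stop at the first acceptable clustering."""
    result = None
    for r in r_values:
        seeds = density_seeds(points, r)
        if not seeds:
            continue  # no dense area at this radius, try the next r
        weights = train_kohonen(points, seeds)
        result = (r, weights, quantization_error(points, weights))
        if result[2] <= max_error:
            break
    return result
```

A call such as `cluster_with_r_search(points, r_values=[2.0, 1.0, 0.5], max_error=0.5)` would return the first radius whose clustering meets the error threshold, together with the trained weight vectors and the error. The number of seeds found for that radius serves as the approximate number of classes.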
