
The Growing Self-Organizing Map (GSOM) has proven benefits in text clustering. Latent Semantic Analysis (LSA) has also been used in text clustering to capture latent concepts in text. This paper presents a novel combination of GSOM and LSA that improves text clustering results compared with using GSOM on its own. LSA is an inherently global algorithm that captures corpus-wide trends and patterns, whereas GSOM is a nearest-neighborhood-based algorithm that captures local patterns. Combining the two makes it possible to discover both global and local patterns. In the proposed model, the initial text corpus is converted into its vector space representation using the traditional Term Frequency - Inverse Document Frequency (TF-IDF) technique. Singular Value Decomposition (SVD), followed by the Frobenius norm, is then applied to the resulting high-dimensional vectors to obtain new vectors with an optimal number of dimensions. Experiments using the proposed model were conducted and compared with the original GSOM under the same conditions. The experimental results demonstrate that the new combination of these well-known techniques improves both the accuracy of the clustering results and the computational time compared with GSOM alone.
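
The preprocessing pipeline described above (TF-IDF, then SVD with a Frobenius-norm criterion) can be illustrated with a minimal sketch. The abstract does not say how the Frobenius norm selects the number of dimensions, so the rule used here is an assumption: keep the smallest rank k whose singular values retain a chosen fraction of the squared Frobenius norm of the TF-IDF matrix. The function names `tfidf_matrix` and `reduce_with_svd`, the `energy` threshold of 0.9, the IDF variant, and the toy corpus are all hypothetical, and the GSOM clustering step itself is not shown.

```python
import numpy as np

def tfidf_matrix(docs):
    """Build a dense TF-IDF matrix (documents x terms) from whitespace tokens."""
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    index = {t: j for j, t in enumerate(vocab)}
    tf = np.zeros((len(docs), len(vocab)))
    for i, doc in enumerate(tokenized):
        for t in doc:
            tf[i, index[t]] += 1.0
    df = np.count_nonzero(tf, axis=0)      # number of documents containing each term
    idf = np.log(len(docs) / df) + 1.0     # assumed IDF variant; others are common
    return tf * idf, vocab

def reduce_with_svd(X, energy=0.9):
    """Project X onto the top-k singular directions.

    Assumption: k is the smallest rank whose singular values retain at least
    `energy` of the squared Frobenius norm of X (the sum of squared singular
    values). The paper's exact criterion may differ.
    """
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    cumulative = np.cumsum(s ** 2) / np.sum(s ** 2)
    k = min(int(np.searchsorted(cumulative, energy)) + 1, len(s))
    return U[:, :k] * s[:k], k             # reduced document vectors, chosen rank

# Toy corpus (hypothetical); real input would be the full text collection.
docs = [
    "neural networks learn features",
    "self organizing maps cluster documents",
    "maps cluster text documents",
    "neural networks cluster text",
]
X, vocab = tfidf_matrix(docs)
Z, k = reduce_with_svd(X, energy=0.9)
print(X.shape, "->", Z.shape)
```

In this sketch, the rows of `Z` would be the inputs to GSOM in place of the raw TF-IDF rows. Because they have fewer dimensions, each distance computation during map growth is cheaper, which is consistent with the reduction in computational time reported above.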
