Abstract

When the self-organizing map (SOM) is applied to the mapping of documents, the documents can be represented statistically by their weighted word-frequency histograms, or by reduced representations of those histograms, which are treated as data vectors. One SOM has been constructed from about seven million documents, namely all patent abstracts in the world that have been written in English and are available in electronic form. The map consists of about one million models. Keywords or key texts can be used to search for the most relevant documents first. New, effective coding and computational schemes for the mapping are described. The document organization, searching, and browsing system, called WEBSOM, is described in this chapter. The original WEBSOM was a two-level SOM architecture, but it was later simplified.
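
As a rough illustration of the representation described above, the following sketch turns a few toy documents into weighted word-frequency vectors and finds the closest model vector for a keyword query. It is a minimal sketch under stated assumptions, not the WEBSOM implementation: the inverse-document-frequency weighting, the toy documents, and the function names (weighted_histogram, idf_weights, best_matching_unit) are all hypothetical choices made for this example, and no SOM training or dimensionality reduction is performed.

```python
import math
from collections import Counter


def idf_weights(documents, vocabulary):
    """Illustrative inverse-document-frequency weights (an assumption,
    not necessarily the weighting used in the chapter)."""
    n = len(documents)
    weights = {}
    for word in vocabulary:
        df = sum(1 for doc in documents if word in doc.lower().split())
        weights[word] = math.log((1 + n) / (1 + df)) + 1.0
    return weights


def weighted_histogram(text, vocabulary, weights):
    """Weighted word-frequency vector of one document over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [counts[word] * weights[word] for word in vocabulary]


def best_matching_unit(vector, models):
    """Index of the model vector closest to `vector` (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(models)), key=lambda i: sq_dist(vector, models[i]))


# Toy corpus standing in for patent abstracts (hypothetical data).
documents = [
    "engine piston valve",
    "antenna signal receiver",
    "piston valve seal",
]
vocabulary = sorted({w for d in documents for w in d.split()})
weights = idf_weights(documents, vocabulary)
vectors = [weighted_histogram(d, vocabulary, weights) for d in documents]

# Toy "map" with two model vectors; a trained SOM would learn these instead.
models = [vectors[0], vectors[1]]

# A keyword query is encoded the same way as a document and matched to a model.
query = weighted_histogram("piston valve", vocabulary, weights)
bmu = best_matching_unit(query, models)
```

In this toy setup the query lands on the model derived from the engine-related document, which mirrors the idea of using keywords to locate the map region holding the most relevant documents.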
