Abstract
Since the advent of Jordan's recurrent network [Jordan, M. I. (1986) Serial Order: A Parallel Distributed Processing Approach. Tech. Rep. No. 8604. Institute for Cognitive Science, University of California, San Diego.], which allows the processing of data with a temporal component, neural networks have been used routinely for sequence processing. This paper analyses this type of network for its ability to discriminate between languages on the basis of a small sample of text. The model was developed for potential use in the on-line version of the Trinity College 1872 Printed Catalogue, a library catalogue with entries in 14 different languages spanning more than five centuries. It was thought that neural networks would perform well where the entries to be analysed comprise only a few words. The neural network's performance was compared with that of trigrams and of a suffix/morphology analysis. The trigrams proved superior, classifying over 92% of the entries correctly, compared with 88% for the neural network and 85% for the suffix/morphology analysis. Trigrams were also far superior in the speed at which statistics were compiled and the rate at which text was processed.
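To make the trigram baseline concrete, the following is a minimal sketch of character-trigram language identification. It is not the implementation evaluated in the paper: the scoring rule (add-one smoothed log-likelihood), the function names `trigrams`, `train` and `classify`, and the toy training sentences in `SAMPLES` are all assumptions introduced here for illustration.

```python
from collections import Counter
import math


def trigrams(text):
    """Return the overlapping character trigrams of lower-cased, space-padded text."""
    padded = f"  {text.lower()} "
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


def train(samples):
    """Build one trigram frequency table per language from {language: [texts]}."""
    return {
        lang: Counter(t for text in texts for t in trigrams(text))
        for lang, texts in samples.items()
    }


def classify(text, models):
    """Return the language whose trigram table gives the highest log-likelihood."""
    # Number of distinct trigrams seen in any language, used for smoothing.
    vocab = len(set().union(*models.values()))
    best_lang, best_score = None, -math.inf
    for lang, counts in models.items():
        total = sum(counts.values())
        # Add-one smoothing so unseen trigrams do not give log(0).
        score = sum(
            math.log((counts[t] + 1) / (total + vocab + 1))
            for t in trigrams(text)
        )
        if score > best_score:
            best_lang, best_score = lang, score
    return best_lang


# Toy training data, invented for this sketch (not taken from the catalogue).
SAMPLES = {
    "english": [
        "the catalogue of printed books in the library",
        "a history of the english language",
    ],
    "french": [
        "le catalogue des livres imprimés de la bibliothèque",
        "histoire de la langue française",
    ],
}

models = train(SAMPLES)
print(classify("the history of books", models))
print(classify("histoire des livres", models))
```

Compiling the statistics takes a single counting pass over the training text and classifying an entry takes one table lookup per trigram, which is consistent with the speed advantage reported for trigrams.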