Abstract

In this paper, we try to identify analogical proportions, i.e., statements of the form “a is to b as c is to d”, expressed in linguistic terms. An algebraic model can conceivably be used to test proportions such as “2 is to 4 as 5 is to 10”, or even “read is to reader as lecture is to lecturer”. However, no algebraic framework supports statements such as “engine is to car as heart is to human” or “wine is to France as beer is to England”, or helps to recognize them as meaningful analogical proportions. The idea is therefore to rely on text corpora, or even on the Web itself, where one may expect to find the pragmatics and the semantics of words in their common use. In that context, in order to attach a numerical value to the “analogical ratio” corresponding to the phrase “a is to b”, we start from Kolmogorov's work on complexity theory. This provides the basis for a universal measure of the information content of a word a, or of a word a with respect to another word b, which, in practice, is estimated statistically. We investigate the link between a purely logical, recently introduced view of analogical proportions and its counterpart based on Kolmogorov theory. The criteria proposed for testing candidate proportions agree with the expected properties of analogical proportions (symmetry and central permutation). This leads to a new computational method for defining, and ultimately for trying to detect, analogical proportions in natural language. Experiments with classifiers based on these ideas are reported, and the results are rather encouraging with respect to the recognition of common-sense linguistic analogies. The approach is also compared with existing work on similar problems.
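
To make the two properties named above concrete, the following minimal sketch checks symmetry ("a is to b as c is to d" implies "c is to d as a is to b") and central permutation (it also implies "a is to c as b is to d") on Boolean values. The abstract does not state the logical definition, so the formula used here, a common Boolean reading of analogical proportion, is an assumption and may differ from the one in the paper. The function name `is_proportion` is hypothetical.

```python
from itertools import product


def is_proportion(a: bool, b: bool, c: bool, d: bool) -> bool:
    """Assumed Boolean reading of 'a is to b as c is to d'.

    a differs from b in the same way that c differs from d:
    (a and not b) == (c and not d)  and  (not a and b) == (not c and d).
    This definition is an assumption, not taken from the abstract.
    """
    return ((a and not b) == (c and not d)) and ((not a and b) == (not c and d))


# Enumerate all 16 Boolean 4-tuples and keep those that form a proportion.
valid = [t for t in product([False, True], repeat=4) if is_proportion(*t)]

# Symmetry: a:b::c:d implies c:d::a:b.
symmetry_holds = all(is_proportion(c, d, a, b) for a, b, c, d in valid)

# Central permutation: a:b::c:d implies a:c::b:d.
central_permutation_holds = all(is_proportion(a, c, b, d) for a, b, c, d in valid)

if __name__ == "__main__":
    print(len(valid), symmetry_holds, central_permutation_holds)
```

Under this assumed definition, both properties hold on every valid Boolean tuple. The Kolmogorov-based criteria for words described in the abstract are not reproduced here.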
