Abstract
In this work we carried out an idiom type identification task on a set of 90 Italian V-NP and V-PP constructions comprising both idioms and non-idioms. Lexical variants were generated from these expressions by replacing their components with semantically related words extracted distributionally and from the Italian section of MultiWordNet. In distributional semantic spaces, idiomatic phrases turned out to be less similar to their lexical variants than non-idiomatic ones. Different variant-based distributional measures of idiomaticity were tested. Our indices also proved reliable in identifying those idioms whose lexical variants are poorly attested or not attested at all in our corpus.
Highlights
Extensive corpus studies have supported Sinclair's (1991) claim that speakers tend to favor an idiom principle over an open-choice principle in linguistic production, resorting, where possible, to (semi-)preconstructed phrases rather than using compositional combinatorial expressions.
Psycholinguistic studies investigating the comprehension of idiom lexical variants have found such alternative forms to be more acceptable when the idiom parts independently contribute to the idiomatic meaning than when they do not (Gibbs et al., 1989), or when the idioms are more familiar to the speakers (McGlone et al., 1994).
In the present work we propose a method for idiom type classification that starts from a set of V-NP and V-PP constructions, generates a series of lexical variants for each target by replacing the verb and the argument with semantically related words, and compares the semantic similarity between the initial constructions and their respective variants.
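The variant-based procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the neighbour lists, the three-dimensional vectors and the function names (`generate_variants`, `mean_variant_similarity`) are all hypothetical, and skipping unattested variants is just one assumed way of handling them.

```python
from itertools import product
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is null)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def generate_variants(verb, noun, neighbours):
    """Replace the verb, the argument, or both with semantically related words."""
    verbs = [verb] + neighbours.get(verb, [])
    nouns = [noun] + neighbours.get(noun, [])
    return [(v, n) for v, n in product(verbs, nouns) if (v, n) != (verb, noun)]


def mean_variant_similarity(target, variants, vectors):
    """Mean cosine between a target and its attested variants.

    Assumption: variants without a vector (unattested in the corpus) are skipped.
    Returns None when no variant is attested.
    """
    sims = [cosine(vectors[target], vectors[v]) for v in variants if v in vectors]
    return sum(sims) / len(sims) if sims else None


# Hypothetical neighbours (in practice: distributional neighbours or MultiWordNet).
neighbours = {
    "tirare": ["lanciare"], "cuoia": ["pelle"],
    "leggere": ["sfogliare"], "libro": ["romanzo"],
}

# Toy phrase vectors, invented purely for illustration.
vectors = {
    ("tirare", "cuoia"): [1.0, 0.0, 0.0],      # idiom: "tirare le cuoia"
    ("tirare", "pelle"): [0.0, 1.0, 0.0],
    ("lanciare", "pelle"): [0.0, 1.0, 1.0],    # ("lanciare", "cuoia") is unattested
    ("leggere", "libro"): [0.0, 1.0, 1.0],     # literal: "leggere un libro"
    ("leggere", "romanzo"): [0.0, 1.0, 1.0],
    ("sfogliare", "libro"): [0.0, 1.0, 0.0],
    ("sfogliare", "romanzo"): [0.0, 0.0, 1.0],
}

idiom, literal = ("tirare", "cuoia"), ("leggere", "libro")
idiom_score = mean_variant_similarity(idiom, generate_variants(*idiom, neighbours), vectors)
literal_score = mean_variant_similarity(literal, generate_variants(*literal, neighbours), vectors)
print(idiom_score, literal_score)
```

With these toy vectors the idiom scores lower than the literal phrase, mirroring the direction of the effect reported in the abstract. The numbers themselves carry no empirical weight.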
Summary
Extensive corpus studies have supported Sinclair's (1991) claim that speakers tend to favor an idiom principle over an open-choice principle in linguistic production, resorting, where possible, to (semi-)preconstructed phrases rather than using compositional combinatorial expressions. These multiword expressions (MWEs), and idioms in particular (Nunberg et al., 1994; Sag et al., 2002; Cacciari, 2014; Siyanova-Chanturia and Martinez, 2014), exhibit idiosyncratic behavior that makes them troublesome to account for in most grammar models (Chomsky, 1980; Jackendoff, 1997; Hoffmann and Trousdale, 2013), including restricted semantic compositionality and transparency, low morphosyntactic versatility and, crucially for the study at hand, a considerable degree of lexical fixedness. The lexical flexibility that idioms do allow is not as widespread, systematic and predictable as in literal constructions.