Abstract

Dialectology often encounters irreducible variation in its data, i.e., multiple responses to its probes about the form of a word or phrase. Dialectometry seeks to measure the differences between dialects and has developed several ways to measure the difference between responses when one or both of them are non-unique. We introduce here BILBAO DISTANCE, in which the cardinality of a response is unimportant and which may be combined with various weighting functions such as edit distance or inverse frequency weighting. It yields intuitively appealing measures: e.g., applied to a singleton set {a} and a set containing the same element plus a second, it yields d({a},{a,b}) = 0.5. It overcomes flaws in earlier proposals and is conceptually simpler and computationally more efficient to apply than earlier measures. We suspect that it satisfies the metric axioms, as it is certainly symmetric and measures the difference between identical sets as zero.
