Abstract
Words categorize the semantic fields they refer to in ways that maximize communication accuracy while minimizing complexity. Focusing on the well-studied color domain, we show that artificial neural networks trained with deep-learning techniques to play a discrimination game develop communication systems whose distribution on the accuracy/complexity plane closely matches that of human languages. The observed variation among emergent color-naming systems is explained by different degrees of discriminative need, of the sort that might also characterize different human communities. Like human languages, emergent systems show a preference for relatively low-complexity solutions, even at the cost of imperfect communication. We demonstrate next that the nature of the emergent systems crucially depends on communication being discrete (as is human word usage). When continuous message passing is allowed, emergent systems become more complex and eventually less efficient. Our study suggests that efficient semantic categorization is a general property of discrete communication systems, not limited to human language. It suggests moreover that it is exactly the discrete nature of such systems that, acting as a bottleneck, pushes them toward low complexity and optimal efficiency.
Highlights
Words categorize the semantic fields they refer to in ways that maximize communication accuracy while minimizing complexity
We show that artificial neural networks trained with generic deep-learning methods to play a color-discrimination game develop color-naming systems whose distribution on the accuracy/complexity plane is strikingly similar to that of human languages
We show that, when two deep learning-trained neural networks (NNs) play a simple color discrimination game, they develop naming systems that closely match the distribution of human languages on the Information Bottleneck (IB) plane, showing both efficiency maximization and complexity control (Fig. 3)
Summary
Words categorize the semantic fields they refer to in ways that maximize communication accuracy while minimizing complexity. Our study suggests that efficient semantic categorization is a general property of discrete communication systems, not limited to human language. Humans develop naming systems to talk about their experience under two competing pressures: “accuracy maximization” (words should encode precise information about their referents) and “complexity avoidance” (preventing unwieldy languages). Actual human naming systems are efficient in the sense that they optimize the accuracy/complexity trade-off. According to the Information Bottleneck (IB) framework, a system is optimal if it lies on the curve. Equipped with this framework, Zaslavsky et al. [7] demonstrated that color-naming systems [4, 10, 11] are notably close to the theoretical limit and efficient in a quantifiable way.
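To make the accuracy/complexity trade-off concrete, the sketch below scores three toy naming systems. It assumes one common information-theoretic reading of the two pressures: complexity as the mutual information between the speaker's meanings and the words, and accuracy as the mutual information between the words and the underlying referents. The four-color domain, the uniform prior, and all function and variable names are illustrative assumptions, not the setup or code used in the study.

```python
import math

def mutual_information(joint):
    """Mutual information I(X;Y) in bits, from a joint table joint[x][y]."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:
                mi += p * math.log2(p / (px[i] * py[j]))
    return mi

def complexity(prior, encoder):
    """I(M;W): how much the words reveal about the speaker's meaning.
    prior[m] = p(m), encoder[m][w] = q(w|m)."""
    joint = [[prior[m] * q for q in encoder[m]] for m in range(len(prior))]
    return mutual_information(joint)

def accuracy(prior, encoder, meanings):
    """I(W;U): how much the words reveal about the referent u.
    meanings[m][u] = probability that meaning m refers to color u."""
    n_m, n_w, n_u = len(prior), len(encoder[0]), len(meanings[0])
    joint = [
        [sum(prior[m] * encoder[m][w] * meanings[m][u] for m in range(n_m))
         for u in range(n_u)]
        for w in range(n_w)
    ]
    return mutual_information(joint)

# Toy domain (assumption): four colors, each meaning points at exactly one.
prior = [0.25, 0.25, 0.25, 0.25]
meanings = [[1.0 if u == m else 0.0 for u in range(4)] for m in range(4)]

# Three hypothetical naming systems, given as q(w|m).
one_word = [[1.0], [1.0], [1.0], [1.0]]            # everything shares a name
two_words = [[1, 0], [1, 0], [0, 1], [0, 1]]       # colors split in two groups
four_words = [[1.0 if w == m else 0.0 for w in range(4)] for m in range(4)]

for name, enc in [("one", one_word), ("two", two_words), ("four", four_words)]:
    print(name, complexity(prior, enc), accuracy(prior, enc, meanings))
```

In this toy setting each extra bit of complexity buys exactly one bit of accuracy, because the meanings are noiseless. With uncertain meanings the two quantities diverge, which is what makes the trade-off nontrivial.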