Abstract

In this paper we extract the topology of the semantic space in its encyclopedic sense by measuring the semantic flow between the entries of the largest modern encyclopedia, Wikipedia, thus creating a directed complex network of semantic flows. Notably, at the percolation threshold the semantic space is characterised by scale-free behaviour at different levels of complexity, which relates it to a wide range of biological, social and linguistic phenomena. In particular, we find that the cluster size distribution, representing the size of different semantic areas, is scale-free. Moreover, the topology of the resulting semantic space is scale-free in its connectivity distribution and displays small-world properties. However, its statistical properties do not allow a classical interpretation via a generative model based on a simple multiplicative process. After giving a detailed description and interpretation of the topological properties of the semantic space, we introduce a stochastic model of a content-based network, based on a copy-and-mutation algorithm and on Heaps' law, that captures the main statistical properties of the analysed semantic space, including Zipf's law for the word frequency distribution.
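The cluster size distribution mentioned in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that the network is given as weighted directed edges, that edges weaker than a chosen threshold are dropped, and that clusters are taken as weakly connected components (edge direction ignored). The function name `cluster_sizes`, the toy edge list and the threshold values are hypothetical; the paper's actual flow measure and thresholding procedure are not specified here.

```python
from collections import Counter


def cluster_sizes(num_nodes, weighted_edges, threshold):
    """Return cluster sizes (largest first) after dropping weak edges.

    Assumption: clusters are weakly connected components, so the
    direction of each (src, dst, weight) edge is ignored.
    """
    parent = list(range(num_nodes))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for src, dst, weight in weighted_edges:
        if weight >= threshold:
            # Merge the clusters of the two endpoints.
            parent[find(src)] = find(dst)

    sizes = Counter(find(node) for node in range(num_nodes))
    return sorted(sizes.values(), reverse=True)


# Hypothetical toy network: six entries, four weighted flows.
toy_edges = [(0, 1, 0.9), (1, 2, 0.4), (2, 3, 0.8), (4, 5, 0.2)]
for t in (0.5, 0.3, 0.1):
    print(t, cluster_sizes(6, toy_edges, t))
```

Sweeping the threshold and recording how the size distribution changes is one simple way to look for the point at which a giant cluster emerges; on real data the distribution near that point would be the quantity of interest.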

Highlights

  • The meaning of a word can be defined as an indefinite set of interpretants, which are other words that circumscribe the semantic content of the word they represent [1]

  • Topology of the Semantic Space: to build the network, the directional semantic flow is measured between all pairs of entries

  • Understanding the topology of the semantic space (SS) and the dynamics of meaning is now a fundamental issue in many fields of knowledge and technology [32]

Summary

Introduction

The meaning of a word can be defined as an indefinite set of interpretants, which are other words that circumscribe the semantic content of the word they represent [1]. A common understanding among existing approaches is that the semantic space (SS) is made up of words or concepts that are connected by certain relationships. Depending on the nature of these relationships, different semantic webs have already been considered. It has been argued that networks derived from dictionaries, which represent the so-called dictionary semantics and are characterised by a scale-free connectivity distribution with exponents smaller than -2, reflect the properties of language use more than the properties of the SS [8,9].
