Abstract

An important aspect of using entropy-based models and proposed “synthetic languages” is the seemingly simple task of knowing how to identify the probabilistic symbols. If the system has discrete features, this task may be trivial; however, for observed analog behaviors described by continuous values, the question arises of how such symbols should be determined. This task of symbolization extends the concept of scalar and vector quantization to consider explicit linguistic properties. Unlike previous quantization algorithms, where the aim is primarily data compression and fidelity, the goal in this case is to produce a symbolic output sequence which incorporates some linguistic properties and is hence useful in forming language-based models. In this paper, we therefore present methods for symbolization which take such properties into account in the form of probabilistic constraints. In particular, we propose new symbolization algorithms which constrain the symbols to have a Zipf–Mandelbrot–Li distribution, which approximates the behavior of language elements. We introduce a novel constrained EM algorithm which is shown to effectively learn to produce symbols which approximate a Zipfian distribution. We demonstrate the efficacy of the proposed approaches on examples using real-world data in different tasks, including the translation of animal behavior into a possible human-understandable language equivalent.
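For reference, the Zipf–Mandelbrot family referred to here has the rank-frequency form below. The abstract does not spell out the exact parameterization of the Zipf–Mandelbrot–Li variant, so this is the standard form on which it is based, with symbols of our own choosing:

$$
p(k) = \frac{(k+q)^{-s}}{\sum_{i=1}^{N} (i+q)^{-s}}, \qquad k = 1, \dots, N,
$$

where $k$ is a symbol's frequency rank, $N$ is the number of symbols, $q \ge 0$ is the Mandelbrot shift, and $s > 0$ is the exponent. Classical Zipf corresponds to $q = 0$, $s = 1$; Li's random-text analysis ties the parameters to the size of the underlying alphabet.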

Highlights

  • Language is the primary way in which humans function intelligently in the world. Without language, it is almost inconceivable that we as a species could survive.

  • In contrast to classical value-based models, such as those employed in signal processing, or even quantized models employing discrete values, such as those found in classifiers, we propose that the next phase of AI systems may be based on the concept of synthetic languages.

  • It should be noted that our intention is not to validate symbolization using simulations; rather, we present some potential applications which show that useful results can be obtained.

Summary

Introduction

Language is the primary way in which humans function intelligently in the world. Without language, it is almost inconceivable that we as a species could survive. Instead of modeling systems through hard classifications of some measured features, a synthetic language approach could be useful for developing an understanding of meaning using behavioral models based on sequences of probabilistic events. These events might be captured as simple language elements. The goal of symbolization can be differentiated from quantization in that the properties required of language primitives may be very different from those required for efficient data compression or even fidelity of reconstruction. These properties may include metrics of robustness, intelligibility, identifiability, and learnability. We demonstrate the efficacy of the proposed approaches on examples using real-world data in quite different tasks, including the translation of the movement of a biological agent into a potential human language equivalent.
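As an illustration of how such a distributional constraint can be imposed at the quantizer level, the sketch below is our own minimal construction, not the paper's algorithm; the function names and parameter values are illustrative assumptions. It places quantizer thresholds at the data quantiles implied by a Zipf–Mandelbrot target, so the resulting symbol frequencies follow that target by construction.

```python
# Minimal sketch (not the paper's algorithm): a quantile-based symbolizer
# that forces the empirical symbol frequencies of a continuous signal to
# follow a Zipf-Mandelbrot target distribution.
import numpy as np

def zipf_mandelbrot(n_symbols, s=1.0, q=2.7):
    """Target rank-frequency distribution p(k) proportional to 1/(k+q)^s."""
    ranks = np.arange(1, n_symbols + 1)
    weights = 1.0 / (ranks + q) ** s
    return weights / weights.sum()

def symbolize(signal, n_symbols=16, s=1.0, q=2.7):
    """Map a 1-D continuous signal to symbol ids whose frequencies
    approximate the Zipf-Mandelbrot target, by placing the quantizer
    thresholds at the corresponding data quantiles."""
    p = zipf_mandelbrot(n_symbols, s, q)
    # Cumulative target mass gives the quantile levels for the bin edges.
    edges = np.quantile(signal, np.cumsum(p)[:-1])
    return np.digitize(signal, edges)  # symbol ids 0 .. n_symbols-1

# Example: symbolize a noisy trajectory and inspect the rank-frequency law.
x = np.cumsum(np.random.randn(10000))       # stand-in for observed behavior
symbols = symbolize(x, n_symbols=16)
counts = np.sort(np.bincount(symbols))[::-1]
print(counts / counts.sum())                # should roughly follow p(k)
```

Sorting the symbol counts in descending order recovers the target rank-frequency curve; the learned, EM-based approaches described in the paper go further by also adapting the symbol regions to the structure of the data.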

Aspects of Symbolization
Zipf–Mandelbrot–Li Symbolization
Maximum Intelligibility Symbolization
Learning Synthetic Language Symbols
Authorship Classification
Symbol Learning Using an LCEM Algorithm (see the sketch after this outline)
Potential Translation of Animal Behavior into Human Language
Conclusions
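To make the "Symbol Learning Using an LCEM Algorithm" entry above concrete, here is a minimal, hypothetical sketch of a linguistically constrained EM update: standard EM for a one-dimensional Gaussian mixture, with an extra step that blends the rank-ordered mixing weights toward a Zipf–Mandelbrot target. The paper's LCEM is not specified in this summary, so the blending scheme and all parameter values here are assumptions.

```python
# Minimal sketch (an assumption, not the authors' LCEM): EM for a 1-D
# Gaussian mixture in which, after each M-step, the mixing weights are
# blended toward a Zipf-Mandelbrot target over rank-ordered components,
# so the learned symbols approximate a Zipfian usage distribution.
import numpy as np

def lcem(x, n_symbols=8, n_iter=50, s=1.0, q=2.7, lam=0.5):
    ranks = np.arange(1, n_symbols + 1)
    target = 1.0 / (ranks + q) ** s
    target /= target.sum()                      # Zipf-Mandelbrot weights

    # Initialize means from data quantiles, unit variances, uniform weights.
    mu = np.quantile(x, np.linspace(0.05, 0.95, n_symbols))
    var = np.full(n_symbols, x.var())
    w = np.full(n_symbols, 1.0 / n_symbols)

    for _ in range(n_iter):
        # E-step: responsibilities under the current mixture.
        dens = (w / np.sqrt(2 * np.pi * var)) * \
               np.exp(-0.5 * (x[:, None] - mu) ** 2 / var)
        r = dens / dens.sum(axis=1, keepdims=True)

        # M-step: standard Gaussian mixture updates.
        nk = r.sum(axis=0)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
        w = nk / nk.sum()

        # Constraint step: pull the rank-ordered weights toward the target.
        order = np.argsort(w)[::-1]
        w[order] = (1 - lam) * w[order] + lam * target

    return mu, var, w

x = np.concatenate([np.random.randn(3000), 3 + 0.5 * np.random.randn(1000)])
mu, var, w = lcem(x)
print(np.sort(w)[::-1])   # mixing weights, roughly Zipfian by construction
```

The blending parameter lam trades off fit to the data against adherence to the Zipfian constraint; lam = 0 reduces the sketch to ordinary EM.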