The Lorenz Curve: A Proper Framework to Define Satisfactory Measures of Symbol Dominance, Symbol Diversity, and Information Entropy.

Julio A Camargo

doi:10.3390/e22050542

Journal: Entropy (Basel, Switzerland)
Publication Date: May 13, 2020
License type: CC BY 4.0

Affiliation: University of Alcalá

Abstract

Novel measures of symbol dominance (dC1 and dC2), symbol diversity (DC1 = N (1 − dC1) and DC2 = N (1 − dC2)), and information entropy (HC1 = log2 DC1 and HC2 = log2 DC2) are derived from Lorenz-consistent statistics that I had previously proposed to quantify dominance and diversity in ecology. Here, dC1 refers to the average absolute difference between the relative abundances of dominant and subordinate symbols, with its value being equivalent to the maximum vertical distance from the Lorenz curve to the 45-degree line of equiprobability; dC2 refers to the average absolute difference between all pairs of relative symbol abundances, with its value being equivalent to twice the area between the Lorenz curve and the 45-degree line of equiprobability; N is the number of different symbols or maximum expected diversity. These Lorenz-consistent statistics are compared with statistics based on Shannon’s entropy and Rényi’s second-order entropy to show that the former have better mathematical behavior than the latter. The use of dC1, DC1, and HC1 is particularly recommended, as only changes in the allocation of relative abundance between dominant (pd > 1/N) and subordinate (ps < 1/N) symbols are of real relevance for probability distributions to achieve the reference distribution (pi = 1/N) or to deviate from it.
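
As a rough illustration of the definitions above, the following Python sketch computes the six statistics for a probability vector. It assumes closed forms that match the geometric descriptions in the abstract: dC1 is taken as the sum of the excesses of dominant probabilities over 1/N (the maximum vertical distance from the piecewise-linear Lorenz curve to the 45-degree line), and dC2 as the sum of absolute differences over all unordered pairs divided by N (twice the area between that curve and the line). The function name `lorenz_statistics` and the dictionary keys are hypothetical; the full paper should be consulted for the author's exact expressions.

```python
import math


def lorenz_statistics(p):
    """Return dominance, diversity, and entropy statistics for probabilities p.

    Assumed closed forms (not quoted from the paper):
      dC1 = sum over dominant symbols of (p_i - 1/N)
      dC2 = sum over unordered pairs of |p_i - p_j|, divided by N
    """
    n = len(p)
    if n == 0 or any(x < 0 for x in p) or not math.isclose(sum(p), 1.0, abs_tol=1e-9):
        raise ValueError("p must be a non-empty probability distribution")

    # Dominance 1: excess of dominant symbols (p_i > 1/N) over equiprobability.
    d_c1 = sum(max(x - 1.0 / n, 0.0) for x in p)
    # Dominance 2: absolute differences over all unordered pairs, scaled by 1/N.
    d_c2 = sum(abs(p[i] - p[j]) for i in range(n) for j in range(i + 1, n)) / n

    stats = {}
    for label, d in (("1", d_c1), ("2", d_c2)):
        diversity = n * (1.0 - d)  # D = N (1 - d), ranging from 1 to N
        stats["dC" + label] = d
        stats["DC" + label] = diversity
        stats["HC" + label] = math.log2(diversity)  # H = log2 D
    return stats


example = lorenz_statistics([0.5, 0.25, 0.25])
print(example)
```

Under these assumed formulas, the distribution (0.5, 0.25, 0.25) gives both dominance measures a value of about 1/6 and a diversity of about 2.5 symbols, while a uniform distribution gives zero dominance and a diversity of N.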

Highlights

Following the early use of Shannon’s [1] entropy (HS) by some theoretical ecologists during the 1950s [2,3,4], HS has been extensively used in community ecology to quantify species diversity. Ecologists have treated the relative abundance or probability of the ith symbol in a message or sequence of N different symbols whose meaning is irrelevant [1,5,6] as the relative abundance or probability of the ith species in a community or assemblage of S different species whose phylogeny is irrelevant [4,7,8].
If we assume that symbol dominance characterizes the extent of relative abundance inequality among different symbols, particularly between dominant and subordinate symbols, then the Lorenz-curve-based graphical representation of symbol dominance is given by the separation of the Lorenz curve from the 45-degree line of equiprobability, on which every symbol i has the same relative abundance.
This theoretical analysis has shown that the Lorenz curve is a proper framework for defining satisfactory measures of symbol dominance, symbol diversity, and information entropy (Figure 1 and Tables 3 and 4).

Summary

Introduction

Following the early use of Shannon’s [1] entropy (HS) by some theoretical ecologists during the 1950s [2,3,4], HS has been extensively used in community ecology to quantify species diversity. Ecologists have treated the relative abundance or probability of the ith symbol in a message or sequence of N different symbols whose meaning is irrelevant [1,5,6] as the relative abundance or probability of the ith species in a community or assemblage of S different species whose phylogeny is irrelevant (i.e., all species are considered taxonomically distinct) [4,7,8]. Several ecologists have claimed that HS is an unsatisfactory diversity index because species diversity takes values from 1 to S and is ideally expressed in units of species (i.e., in the same units as S). Keeping this perspective in mind, and only considering the number of different symbols as the number of different species and the relative abundances of symbols as the relative abundances of species, Hill [9] proposed the exponential form of Shannon’s [1] entropy (HS).
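
To make the point about units concrete, the short Python sketch below computes Shannon's entropy and its exponential form for a probability vector. It assumes entropy is measured in bits, so the exponential is taken as 2 raised to HS; with natural logarithms the equivalent would be exp(HS). The function names `shannon_entropy_bits` and `exponential_shannon` are hypothetical.

```python
import math


def shannon_entropy_bits(p):
    """Shannon entropy HS in bits; terms with p_i = 0 contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def exponential_shannon(p):
    """Exponential form of HS, assuming base 2 to match the bit-valued entropy."""
    return 2.0 ** shannon_entropy_bits(p)


uniform = [0.25, 0.25, 0.25, 0.25]
print(shannon_entropy_bits(uniform), exponential_shannon(uniform))
```

For a uniform distribution over S species the exponential form equals S, and for a single species with probability 1 it equals 1, so the index runs from 1 to S in units of species rather than in bits.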