Abstract

One of the fundamental problems in information science is to distinguish various objects, such as books or journal articles, on the basis of associated values, such as authors and titles. Where the values fail to distinguish two distinct objects, we say that the objects are ambiguous under the given value assignment. To obtain a measure of ambiguity, it is only necessary to count, for each set of ambiguous objects, the number of ways that the objects can be arranged, multiply these counts, and take the logarithm. It is shown that such an approach leads to a measure in the formal sense and that the measure depends only on the definition of equality of values, so that it can be simply extended to sets of values and ordered sets of values. It is also shown that it is possible to construct a function of ambiguity that one can call information, and that the information loss that occurs when distinct values are grouped into equivalence classes, as in the use of search and sort keys, is also a measure. Finally, it is shown that ambiguity and information as here defined are directly related to Shannon's definition of information, thus tying this approach to that portion of information theory associated with the derivation of optimal distributions frequently used in information science models.

“… if one is concerned with messages to be transmitted, and if there is reasonable freedom in coding the message for transmission, entropy is clearly a quantity of interest and importance. But this fact makes it no easier to describe. One may find (its) description … to be simple, completely perspicuous, and immediately intuitive. But neither the writer nor any of his friends did.” (John W. Tukey, 1963).
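
The counting procedure described in the abstract can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything below is an assumption made for illustration: "the number of ways that the objects can be arranged" is read as n! for a group of n objects sharing a value, "information" is taken as ln(N!) minus the ambiguity, and the function names (`ambiguity`, `information`, `information_loss`, `shannon_entropy`) are hypothetical. The full text may define these quantities differently.

```python
from collections import Counter
from math import lgamma, log


def ambiguity(values):
    """ln of the product of n! over each group of objects sharing a value."""
    # Assumption: a group of n indistinguishable objects has n! arrangements.
    # lgamma(n + 1) == ln(n!), which avoids overflow for large groups.
    return sum(lgamma(n + 1) for n in Counter(values).values())


def information(values):
    """Assumed definition: ln(N!) minus the ambiguity of the assignment."""
    return lgamma(len(values) + 1) - ambiguity(values)


def information_loss(values, key):
    """Ambiguity added when values are grouped into classes by `key`."""
    return ambiguity([key(v) for v in values]) - ambiguity(values)


def shannon_entropy(values):
    """Shannon entropy (in nats) of the empirical value distribution."""
    total = len(values)
    return -sum(c / total * log(c / total) for c in Counter(values).values())


authors = ["Smith", "Smith", "Jones", "Stone"]
print(ambiguity(authors))                         # ln(2!) for the two Smiths
print(information(authors))                       # ln(4!) - ln(2!)
print(information_loss(authors, lambda a: a[0]))  # grouping by first letter
print(len(authors) * shannon_entropy(authors))    # N * H, for comparison
```

Under these assumptions, Stirling's approximation suggests that `information(values)` approaches N times the Shannon entropy as the group sizes grow, which is one plausible reading of the relation to Shannon's definition mentioned in the abstract. Grouping by a key, as in `information_loss`, can only merge groups, so the returned loss is never negative.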
