Abstract

Censoring of statistical data involves the suppression of part of the desired information concerning a sample, within a certain range of values. The missing data normally belong to one or both tails, but might conceivably be taken from the middle. In most cases the number of suppressed classes is known, or the total number of suppressed items is known, but measurements representing the individual items are not recoverable. Censorship is a relatively mild form of restriction, or selection, and for many purposes in practical statistics does not handicap the analyst at all. Sediment samples, of interest to geologists, are almost invariably censored at the fine (rather than the coarse) end, because of the nature of the measuring process (sieving); no loss of significant information is involved within the size range being studied. Truncation involves the total removal of information concerning a continuous set of values in part of the distribution. This may occur in one or both tails. No data are available as to the number of missing items, or the number of missing classes. For this reason, truncation is a more severe form of selection than is censorship. Each of these restrictions has been handled in some detail in the statistical literature (for a general bibliography, see Mendenhall, 1958). In geomorphological sampling, even more severe restrictions appear. The land surface over a given area can be represented by a series of elevation determinations (perhaps at grid corners), which can be classified in the usual manner, and represented by various standard parameters. Where a very simple surface exists, a Gaussian distribution (with or without a suitable transformation) may be expected. But not all areas are represented by simple surfaces. Tectonic uplift of a region, or a drop in sea-level, or a climate change, or a forest fire, or some other event, may initiate a new cycle of gully- and eventually valley-cutting.
Under such rather common circumstances, a new land surface develops at a lower elevation, specifically at the expense of the older, higher, surface. This rejuvenation process might be spread over many thousands of years. At any given instant (such as today), the area may be covered by parts of both surfaces, intimately intermixed. The complex nature of the drainage pattern, and the fact that the younger surface develops along the drainage lines, can be expected to produce a situation where higher parts of the younger, lower, surface are higher than lower portions of the older, higher, surface. The two distributions overlap badly. Because destruction of the older, higher, surface occurs along narrow drainage channels rather than in a broad, sweeping, bulldozer-like attack, the lower portion of that surface is destroyed, at first, only partially. For example, of 20 relief classes, at a given moment, the 14 highest (above sea-level) may be preserved intact, whereas each of the lowest six may have been reduced, percentagewise, either a little or a great deal. Because the bounds of an area can, in few cases, be specified with great assurance, it is impossible to reconstruct, directly, the original distribution. It is here proposed that the partially suppressed distribution be referred to as a filtered distribution, and that the effective agency be noted as a statistical filter. Filtering differs from truncation in that there is no clearly specified line between known and unknown data: missing items may have been taken out of the original distribution, here and there, according to the characteristics of the filter. Filtering differs from censorship in that some of the information about only some of the items is missing, and further that it is not known how many items have been so affected. Suppose one wished to experiment by tossing beans from a fixed location into square bins, arranged immediately adjacent to each other, in a line.
When the experiment is completed, the beans in the bins will represent a distribution of some kind. If an assistant should cap the first r bins, at one end of the line, without the knowledge of the operator, and then collect (and count in a single category) all of the beans which bounced off of the caps, a censored distribution would result. If, however, he capped the first r bins, without collecting the beans which bounced off of the caps, a truncated distribution would result. For a filtered distribution, the assistant would be instructed to cap only certain bins, in some definite order: every second bin, of the first r bins, or every third bin, or according to some other pattern in space or time. Furthermore, those beans which bounced off of the capped bins would be considered lost.
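The bean-and-bin experiment can be imitated numerically. The following is a minimal sketch, not the author's procedure: the function names (`toss_beans`, `censored`, `truncated`, `filtered`), the Gaussian scatter of the tosses, and the choice of 20 bins with r = 6 are all assumptions made for illustration. Only the three capping rules are taken from the description above.

```python
import random
from collections import Counter


def toss_beans(n_beans, n_bins, seed=0):
    """Return the 0-based bin index in which each bean would land.

    Assumption (not in the source): tosses scatter as a Gaussian centred
    on the middle of the line; rare strays past either end are clamped
    into the end bin.
    """
    rng = random.Random(seed)
    centre = (n_bins - 1) / 2
    spread = n_bins / 6
    landings = []
    for _ in range(n_beans):
        b = round(rng.gauss(centre, spread))
        landings.append(min(max(b, 0), n_bins - 1))
    return landings


def censored(landings, r):
    """Cap the first r bins; bounced beans are collected and counted as one lump."""
    counts = Counter(b for b in landings if b >= r)
    bounced = sum(1 for b in landings if b < r)
    return counts, bounced


def truncated(landings, r):
    """Cap the first r bins; bounced beans are not collected at all."""
    return Counter(b for b in landings if b >= r)


def filtered(landings, r, step=2):
    """Cap every step-th bin among the first r bins; bounced beans are lost."""
    capped = set(range(0, r, step))
    return Counter(b for b in landings if b not in capped)


if __name__ == "__main__":
    landings = toss_beans(n_beans=1000, n_bins=20, seed=1)
    cens_counts, bounced = censored(landings, r=6)
    trunc_counts = truncated(landings, r=6)
    filt_counts = filtered(landings, r=6, step=2)
    print("censored: ", sorted(cens_counts.items()), "lumped:", bounced)
    print("truncated:", sorted(trunc_counts.items()))
    print("filtered: ", sorted(filt_counts.items()))
```

In this sketch the censored and truncated cases yield identical bin counts; they differ only in whether the lumped total of bounced beans is available. The filtered case keeps the uncapped bins among the first r, so the boundary between observed and missing data is no longer a single cut.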

