Abstract

This paper discusses factors to consider when making decisions about the properties of stored data. These factors extend beyond the properties of the data files themselves to include the context in which the data are used. Decisions about the stored length of numeric variables in SAS® data sets serve as an example of the decision-making process. Although the LENGTH statement in SAS is simple to use, what happens behind the scenes is more complex, especially for numeric variables, and understanding it is essential for making informed decisions. SAS stores the values of all numeric variables in floating-point representation. The paper begins with a brief, practical overview of floating-point representation and how it relates to programming questions about length, precision, and efficient use of disk space. We discuss situations in which numeric length should not be reduced, even if the range of integer values in the data set appears to permit it. We argue, in particular, that decisions about numeric length require considering the larger context of data use. This is important because (i) length is a variable attribute that can be passed on to other data sets through merges or concatenation, and (ii) basing attribute decisions on the properties of a single data set ignores how that data set will be used and updated later. Specific examples illustrate these points. For saving disk space, we show the advantages of the COMPRESS=BINARY option in SAS. We also show that saving disk space is the only reason to reduce numeric length, because SAS uses the full 8-byte representation of numbers in all DATA steps and PROCs, regardless of a variable's specified length.
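The link between stored length and precision can be sketched outside SAS. The Python fragment below is an illustration only, not code from the paper. It assumes an IEEE 754 platform and assumes that a shortened numeric is formed by keeping the most significant bytes of the 8-byte double and zero-filling the rest when the value is read back. The function names `store_with_length` and `largest_exact_integer` are hypothetical.

```python
import struct


def store_with_length(value: float, length: int) -> float:
    """Simulate writing a double with only `length` bytes and reading it back.

    Assumption: the low-order bytes are dropped (truncated, not rounded)
    and are replaced with zeros when the value is expanded to 8 bytes again.
    """
    if not 3 <= length <= 8:
        raise ValueError("length must be between 3 and 8 bytes")
    raw = struct.pack(">d", value)  # big-endian: sign and exponent come first
    truncated = raw[:length] + b"\x00" * (8 - length)
    return struct.unpack(">d", truncated)[0]


def largest_exact_integer(length: int) -> int:
    """Largest n such that every integer from 0 to n survives storage.

    A double has 1 sign bit and 11 exponent bits, so `length` bytes keep
    8 * length - 12 mantissa bits; the implicit leading bit adds one more.
    """
    if not 3 <= length <= 8:
        raise ValueError("length must be between 3 and 8 bytes")
    return 2 ** (8 * length - 11)


if __name__ == "__main__":
    for n in range(3, 9):
        print(n, largest_exact_integer(n))
    # An integer just past the 3-byte limit, and a common decimal fraction.
    print(store_with_length(8193.0, 3))
    print(store_with_length(0.1, 3) == 0.1)
```

Under these assumptions, integers stored in 3 bytes are exact only up to 8,192, and a fractional value such as 0.1 does not survive any reduction in length. A variable whose current values fit a short length can therefore be silently corrupted once later updates or merges bring in larger or non-integer values.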

