In a relational database management system, allowing tuples to be of variable length can complicate storage management considerably. As an example, consider an employee relation containing the employee's name, social security number, and salary. We could store the names of the employee's dependents in a separate relation, but accessing them would then require an expensive join. As an alternative, we could add a set-valued attribute, called dependents, to the employee relation, with each tuple containing zero, one, or more values for that attribute. In non-first normal form relational databases, set-valued attributes are commonplace [3, 10, 11, 12, 13]. Variable-length tuples also arise in temporal databases, where sets of time intervals are associated with attributes or with tuples [1, 2, 5, 6, 8, 9].

One approach to representing such tuples on disk is to store the set-valued attribute values separately from the rest of the (now fixed-length) tuple. Each set can be stored as a linked list of fixed-size blocks, each block holding one or several attribute values. A linked list is an appropriate storage structure when the sets exhibit a wide range of cardinalities. The alternative of storing the set of attribute values in a hash table was rejected because such a table is itself of variable size, which complicates page space management.

The question we address is: what is the optimal number of attribute values per block? With one attribute value per block, space may be wasted on pointers. With large blocks, on the other hand, the last block of each list may be largely unfilled. The objective of this paper is to obtain an expression for the optimal block size, that is, the block size that achieves the best space utilization. In the next section, an analytical model is presented that can help determine the optimal block size. An example from temporal databases that exploits this model is described in Section 3. Section 4 provides a summary.
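
The trade-off between pointer overhead and a partly empty last block can be illustrated with a small numeric sketch. This is not the analytical model of the next section; it is a brute-force illustration under stated assumptions: every block holds `block_size` value slots plus one next-block pointer, an empty set occupies no blocks, and the byte sizes (`value_bytes`, `pointer_bytes`) and all function names are hypothetical choices made for the example.

```python
import math

def list_space(cardinality, block_size, value_bytes=8, pointer_bytes=4):
    """Bytes used by one set stored as a linked list of fixed-size blocks.

    Assumption: each block has `block_size` value slots and one pointer;
    the last block is allocated in full even if only partly filled.
    """
    blocks = math.ceil(cardinality / block_size)
    return blocks * (block_size * value_bytes + pointer_bytes)

def total_space(cardinalities, block_size, value_bytes=8, pointer_bytes=4):
    """Total bytes needed to store every set in `cardinalities`."""
    return sum(
        list_space(c, block_size, value_bytes, pointer_bytes)
        for c in cardinalities
    )

def best_block_size(cardinalities, max_block_size=32,
                    value_bytes=8, pointer_bytes=4):
    """Block size (in values per block) that minimizes total space.

    Exhaustive search over 1..max_block_size; ties go to the smaller size.
    """
    return min(
        range(1, max_block_size + 1),
        key=lambda b: total_space(cardinalities, b, value_bytes, pointer_bytes),
    )
```

With these assumed sizes, a set of five values stored two per block needs three blocks of 20 bytes each, the last one half empty. Searching over block sizes for an observed list of set cardinalities gives an empirical optimum that a closed-form expression can be compared against.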