Introduction

In several database (and other) applications, structures such as trees, graphs, and networks play a prominent role; see, for instance, Aho, Hopcroft, and Ullman (1983), Houtsma and Apers (1992), Kung, Wagner, and Woa (1995), and Ullman (1989), as well as their references, among many others. There are many well-known examples of such application areas: bills of material, genealogical trees, organization charts, holding structures, networks of railroads, networks of conduit pipes, and telecom networks, to name a few. Many other application areas also contain such structures, often implicitly. Within computing science itself we encounter these structures quite frequently as well, for example in data dictionaries, in deductive databases, in software configuration management, and in CASE tools (e.g., ER diagrams). Consequently, these structures occur in several courses in a CS curriculum. We will show how this can be used as an educational opportunity and combine some classical themes of programming with databases.

A tree, graph, or network can be considered, informally speaking, as a picture with nodes and edges. Usually, the nodes and the edges are labeled with additional data as well. For some classes of examples we indicate in Table 1 what the pictures, the nodes, and the edges represent in those cases. (The students could be invited to add some classes of examples themselves.)

As an illustration, Figure 1 gives a concrete instance of a bill of material (or BOM) for an imaginary manufacturer of office furniture, taken from De Brock (1995). We note that a bill of material constitutes a central part of MRP systems, for instance. The nodes represent products (with product number and description), the edges indicate which products occur as a direct part in which products, and the edge labels tell us in which numbers they occur as a direct part, e.g.
product 11297 (bolt + nut) occurs 24 times as a direct part in desk 87384 (and via the six drawers with number 44660 it also occurs 6 x 6 times as an indirect part in desk 87384).

[FIGURE 1 OMITTED]

We can represent our bill of material from Figure 1 by means of two tables: the table PROD (products, or nodes) with (at least) the attributes PNR (product number) and DESCR (description), and the table EDGES with (at least) the attributes BNODE (begin node), ENODE (end node), and NUM (number of pieces). The key of the table PROD is {PNR} and the (composite) key of the table EDGES is {BNODE, ENODE}. In the table EDGES, both BNODE and ENODE are foreign keys, each referring to PNR in the table PROD. In Figure 2 we represent a part of each table.

Here we already note that ad hoc querying for management applications in areas like production management is often laborious due to the recursive character of many queries (e.g., computing pack lists, total assembly times, and the like). But before we return to this problem, we will first give an example of an even more general network. This network, represented in Figure 3, consists of 23 nodes and 37 edges.

[FIGURE 3 OMITTED]

Sometimes, network structures are only implicitly present in databases. This raises the question of how students can recognize in a general way whether such network structures are hidden in their data. A simple criterion, illustrated by Figure 4, is the following: a data model with two different referential integrity constraints from a given entity E to a given entity N can be an indication that the N-occurrences can be considered as nodes and the E-occurrences as edges between those nodes, and hence that N together with E in fact represent a network. (It would be an interesting exercise to check existing database schemes for these situations; one might gain surprising new insights into those data structures.)
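The criterion above can be tried out mechanically. The following is a minimal sketch in Python, not taken from the paper, that assumes the schema lives in an SQLite database and uses SQLite's `PRAGMA foreign_key_list` to look for tables with two or more distinct foreign keys to the same table. The column types and the helper name `network_candidates` are illustrative assumptions; only the table names, attribute names, and keys come from the text.

```python
import sqlite3
from collections import defaultdict

# PROD and EDGES as described in the text; the column types are assumptions.
SCHEMA = """
CREATE TABLE PROD (
    PNR   INTEGER PRIMARY KEY,
    DESCR TEXT NOT NULL
);
CREATE TABLE EDGES (
    BNODE INTEGER NOT NULL REFERENCES PROD(PNR),
    ENODE INTEGER NOT NULL REFERENCES PROD(PNR),
    NUM   INTEGER NOT NULL,
    PRIMARY KEY (BNODE, ENODE)
);
"""


def network_candidates(conn):
    """Return (edge_table, node_table, fk_columns) for every table that has
    at least two distinct foreign keys referring to the same table."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    found = []
    for table in tables:
        # referenced table -> {foreign key id: [referencing columns]}
        fks = defaultdict(dict)
        rows = conn.execute(f'PRAGMA foreign_key_list("{table}")')
        for fk_id, _seq, ref_table, from_col, *_rest in rows:
            fks[ref_table].setdefault(fk_id, []).append(from_col)
        for ref_table, by_id in fks.items():
            if len(by_id) >= 2:  # two different referential constraints
                cols = sorted(c for cs in by_id.values() for c in cs)
                found.append((table, ref_table, cols))
    return found


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
candidates = network_candidates(conn)
print(candidates)
```

A hit is only an indication, as the text says: whether the referenced table really plays the role of N (nodes) and the referencing table the role of E (edges) still has to be judged from the meaning of the data.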
[FIGURE 4 OMITTED]

Algorithms on such data structures often call for recursion or iteration with an unknown number of repetitions because the depth of the tree or the diameter of the network can be arbitrarily large. …
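To make the need for recursion concrete, here is a small Python sketch, not the authors' algorithm, that computes the total quantity of every direct and indirect part of a product from EDGES-style rows. It rests on two assumptions that the text does not state explicitly: that BNODE is the containing product and ENODE the direct part, and that each drawer 44660 contains six pieces of 11297 (inferred from the "6 x 6" in the example). The function name `total_parts` is hypothetical.

```python
from collections import defaultdict

# Rows of EDGES as (BNODE, ENODE, NUM).
# Assumption: BNODE is the containing product, ENODE the direct part.
EDGES = [
    (87384, 11297, 24),  # desk directly contains 24 x bolt + nut
    (87384, 44660, 6),   # desk directly contains 6 drawers
    (44660, 11297, 6),   # assumed: each drawer contains 6 x bolt + nut
]


def total_parts(edges, product):
    """Return {part: total quantity} over all levels below `product`.

    The recursion depth equals the depth of the structure, which is not
    known in advance. The sketch assumes the structure has no cycles.
    """
    children = defaultdict(list)
    for bnode, enode, num in edges:
        children[bnode].append((enode, num))

    totals = defaultdict(int)

    def explode(node, multiplier):
        # Add each direct part, then descend into that part's own parts.
        for part, num in children[node]:
            totals[part] += multiplier * num
            explode(part, multiplier * num)

    explode(product, 1)
    return dict(totals)


result = total_parts(EDGES, 87384)
print(result)  # {11297: 60, 44660: 6}
```

Under these assumptions, the 60 pieces of product 11297 in desk 87384 are the 24 direct occurrences plus the 6 x 6 indirect occurrences via the drawers.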