Abstract Interpretation [CC77] is used to compute invariants about cache contents. How the behavior of programs on processor pipelines is predicted is described in Section 0.3.

0.2.1 Cache Memories

A cache can be characterized by three major parameters:

• capacity is the number of bytes it may contain.
• line size (also called block size) is the number of contiguous bytes that are transferred from memory on a cache miss. The cache can hold at most n = capacity/line size blocks.
• associativity is the number of cache locations where a particular block may reside; n/associativity is the number of sets of the cache. (For example, a 32 KiB cache with a line size of 64 bytes holds n = 512 blocks; with associativity 8 it has 64 sets.)

If a block can reside in any cache location, the cache is called fully associative. If a block can reside in exactly one location, it is called direct mapped. If a block can reside in exactly A locations, the cache is called A-way set associative. The fully associative and the direct mapped caches are special cases of the A-way set associative cache with A = n and A = 1, respectively.

In the case of an associative cache, a cache line has to be selected for replacement when the cache is full and the processor requests further data. This is done according to a replacement strategy. Common strategies are LRU (Least Recently Used), FIFO (First In First Out), and random.

The set in which a memory block may reside in the cache is uniquely determined by the address of the memory block, i.e., the sets behave independently of each other. The behavior of an A-way set associative cache is therefore completely described by the behavior of its n/A fully associative sets; this also covers direct mapped caches, where A = 1. For reasons of space, we restrict our description to the semantics of fully associative caches with LRU replacement strategy. More complete descriptions that explicitly treat direct mapped and A-way set associative caches can be found in [Fer97, FMW99].

0.2.2 Cache Semantics

In the following, we consider a (fully associative) cache as a set of cache lines L = {l1, . . . , ln} and the store as a set of memory blocks S = {s1, . . . , sm}. To indicate the absence of any memory block in a cache line, we introduce a new element I; S′ = S ∪ {I}.

Definition 2 (concrete cache state) A (concrete) cache state is a function c : L → S′. Cc denotes the set of all concrete cache states.

The initial cache state cI maps all cache lines to I. If c(li) = sy for a concrete cache state c, then i is the relative age of the memory block according to the LRU replacement strategy, not necessarily its physical position in the cache hardware.

The update function describes the effect on the cache of referencing a block in memory. The referenced memory block sx moves into l1 if it was already in the cache. All memory blocks in the cache that had been used more recently than sx increase their relative age by one, i.e., they are shifted by one position towards the next cache line. If the referenced memory block was not yet in the cache, it is loaded into l1 after all memory blocks in the cache have been shifted and the 'oldest', i.e., least recently used, memory block has been removed from the cache if the cache was full.

Definition 3 (cache update) A cache update function U : Cc × S → Cc determines the new cache state for a given cache state and a referenced memory block.

Updates of fully associative caches with LRU replacement strategy are depicted in Figure 4.
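To make the LRU semantics concrete, the following is a minimal executable sketch of Definitions 2 and 3 in Python. The names (CacheState, update) are ours, not from the original text; the element I is modeled as None, and a cache line's list index plays the role of the relative age i.

```python
from typing import List, Optional

I = None  # the special element I: no memory block in this cache line

class CacheState:
    """Concrete cache state c : L -> S'; list index i is the relative age."""

    def __init__(self, num_lines: int):
        # The initial cache state c_I maps all cache lines to I.
        self.lines: List[Optional[str]] = [I] * num_lines

    def update(self, s: str) -> None:
        """Cache update U for a reference to memory block s (Definition 3)."""
        if s in self.lines:
            # Hit: s moves into l1; only blocks used more recently
            # than s are shifted by one position (aged by one).
            self.lines.remove(s)
        else:
            # Miss: all blocks are shifted; the last line holds either I
            # or the least recently used block, which is evicted.
            self.lines.pop()
        self.lines.insert(0, s)

# Example: a fully associative cache with four lines.
c = CacheState(4)
for block in ["s1", "s2", "s3", "s1", "s4", "s5"]:
    c.update(block)
print(c.lines)  # ['s5', 's4', 's1', 's3'] -- s2 was least recently used and evicted
```

In the trace above, the second reference to s1 is a hit, so s1 returns to l1 and only s2 and s3 age; the reference to s5 then evicts s2, the block with the highest relative age.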
Control Flow Representation

We represent programs by control flow graphs consisting of nodes and typed edges. The nodes represent basic blocks. A basic block is a sequence (of fragments) of instructions in which control flow enters at the beginning and leaves at the end, without halting or the possibility of branching except at the end. For cache analysis, it is most convenient to have one memory reference per control flow node. Therefore, our nodes may represent the different fragments of machine instructions that access memory. For imprecisely determined addresses of data references, one can use a set of possibly referenced memory blocks. We assume that for each basic block the sequence of references to memory is known. (This is appropriate for instruction caches and can be too restrictive for data caches and combined caches. See
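As an illustration of this representation, here is a small sketch in the same style as the cache example above. All names are our own assumptions: each node models one instruction fragment and carries the set of memory blocks it may reference, a singleton when the address is precisely known.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List

@dataclass
class CFGNode:
    """One control flow node: a fragment with at most one memory reference."""
    label: str
    # Possibly referenced memory blocks; a singleton set models a
    # precisely known address (the usual case for instruction caches).
    refs: FrozenSet[str] = frozenset()
    successors: List["CFGNode"] = field(default_factory=list)

# A basic block with two memory-accessing fragments is split into
# one node per reference:
n2 = CFGNode("bb1.2", frozenset({"s7"}))               # precise reference
n1 = CFGNode("bb1.1", frozenset({"s3", "s4"}), [n2])   # imprecise data reference
```

Feeding each node's reference set to the update function U then drives the cache analysis along the paths of the graph.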
