Abstract

Distributed-control coarse-grained reconfigurable arrays (CGRAs) enable efficient execution of irregular control flows by reconciling divergence in the processing elements (PEs). To further improve performance by better exploiting spatial parallelism, the triggered instruction architecture (TIA) eliminates the program counter and branch instructions by converting control flows into predicate dependencies that act as triggers. However, pipeline stalls, which occur both within a PE (intra-PE) and between PEs (inter-PE), remain a major obstacle to overall performance. Moreover, the stalls in distributed-control CGRAs pose a unique problem that is difficult to resolve with previous techniques. This work presents a triggered-issuance and triggered-execution (TITE) paradigm in which the issuance and execution of instructions are triggered separately to further relax the predicate dependencies in TIA. In this paradigm, instructions are paired as dual instructions to eliminate stalls caused by control divergence, and tags that identify the data transmitted between PEs are forwarded for acceleration. As a result, both intra-PE and inter-PE pipeline stalls can be significantly reduced. Experiments show that TITE improves performance by 21 percent, energy efficiency by 17 percent, and area efficiency by 12 percent compared with a baseline TIA.
