Abstract

The scheduled dataflow (SDF) architecture departs from the current trend of building complex hardware to exploit instruction-level parallelism (ILP). Instead, it explores a simpler yet powerful execution paradigm based on dataflow, multithreading, and the decoupling of memory accesses from execution. A program is partitioned into non-blocking threads, and all memory accesses are decoupled from each thread's execution: data is pre-loaded into the thread's context (registers), and all results are post-stored after the thread completes execution. This paper presents an efficient way of storing data directly into a thread's register context rather than into frame memory. This approach eliminates the need to create thread frames when sufficient register contexts are available in the system, which makes it possible to explore how the SDF architecture's performance scales as more register contexts become available on chip. All benchmarks run with both methods show a performance improvement of at least about 20%. This method of passing data to a consecutive thread could be applied to multithreaded architectures in general.
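
The difference between the two storage paths can be illustrated with a toy software model. The Python sketch below is only an illustration under stated assumptions: the names `Scheduler`, `ThreadSlot`, `post_store`, and `run` are hypothetical and do not come from the paper, and the model ignores timing, pipelines, and how the real hardware assigns a register context to a thread that was started from a frame. It shows only that a result written straight into a free register context skips both the frame store and the later pre-load.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ThreadSlot:
    """Inputs of one pending non-blocking thread (hypothetical model)."""
    registers: Optional[dict] = None  # register context, if one was free
    frame: Optional[dict] = None      # frame in memory, used as a fallback


class Scheduler:
    """Toy model: prefer a register context, fall back to frame memory."""

    def __init__(self, num_register_contexts):
        self.free_contexts = num_register_contexts
        self.slots = {}
        self.frame_stores = 0  # values written to frame memory
        self.preloads = 0      # values later copied from frame to registers

    def post_store(self, consumer, name, value):
        """A producer thread hands one result to a consumer thread."""
        slot = self.slots.setdefault(consumer, ThreadSlot())
        if slot.registers is None and slot.frame is None:
            if self.free_contexts > 0:
                # Direct path: claim a register context, no frame is created.
                self.free_contexts -= 1
                slot.registers = {}
            else:
                # Fallback path: create a frame in memory.
                slot.frame = {}
        if slot.registers is not None:
            slot.registers[name] = value
        else:
            slot.frame[name] = value
            self.frame_stores += 1

    def run(self, consumer, body):
        """Execute the consumer thread once its inputs are present."""
        slot = self.slots.pop(consumer)
        if slot.registers is None:
            # Frame path: inputs must first be pre-loaded into registers.
            # Simplification: context allocation for this case is not modelled.
            registers = dict(slot.frame)
            self.preloads += len(slot.frame)
        else:
            registers = slot.registers
            self.free_contexts += 1  # release the context after execution
        return body(registers)


# Usage: one register context is available, so the first consumer receives
# its inputs directly and the second falls back to frame memory.
sched = Scheduler(num_register_contexts=1)
sched.post_store("t1", "a", 2)
sched.post_store("t1", "b", 3)
sched.post_store("t2", "a", 4)
sum_t1 = sched.run("t1", lambda r: r["a"] + r["b"])
double_t2 = sched.run("t2", lambda r: r["a"] * 2)
```

In this model, raising `num_register_contexts` reduces `frame_stores` and `preloads` toward zero, which mirrors the scalability question the abstract raises; it says nothing about the actual speedup measured in the paper.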


