Abstract
Aggressive embedded processors are often equipped with general-purpose cores and special-purpose acceleration logic. In this paper, we consider a reconfigurable processor that consists of very long instruction word (VLIW) cores and coarse-grained reconfigurable arrays (CGRAs). CGRAs are used primarily to improve performance by exploiting loop-level parallelism, while VLIW cores rely on discovering instruction-level parallelism. CGRAs can accelerate time-consuming loops through powerful pipeline scheduling. However, not all loops can be accelerated by CGRAs: outer loops and loops containing function calls are not candidates for CGRA acceleration. We therefore adopt instruction extensions that convert code fragments in outer loops and simple functions into single instructions. With these extended instructions available in the CGRAs, more loops can be accelerated. Our experiment with mpeg2dec from MediaBench shows a 32% performance increase.
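To illustrate the idea, the following is a minimal sketch and is not taken from the paper's implementation. It shows a loop whose body calls a simple function, which would rule it out as a CGRA candidate, next to the same loop with the call replaced by a single operation. The names `clip_u8`, `add_block_call`, `add_block_ext`, and `EXT_CLIP_U8` are hypothetical. The macro only stands in for an extended instruction and is emulated in plain C here.

```c
#include <stdint.h>
#include <stddef.h>

/* Simple helper of the kind found in video decoders: saturate to [0, 255]. */
static int clip_u8(int v) {
    if (v < 0) return 0;
    if (v > 255) return 255;
    return v;
}

/* Original form: the function call in the loop body prevents the loop
   from being a candidate for CGRA acceleration. */
void add_block_call(uint8_t *dst, const int16_t *res, size_t n) {
    for (size_t i = 0; i < n; i++)
        dst[i] = (uint8_t)clip_u8(dst[i] + res[i]);
}

/* Hypothetical extended instruction, emulated as a C macro (assumption:
   on real hardware this would map to one operation inside the CGRA). */
#define EXT_CLIP_U8(v) ((v) < 0 ? 0 : ((v) > 255 ? 255 : (v)))

/* Rewritten form: the body contains no call, only single operations,
   so the loop could in principle be pipeline scheduled. */
void add_block_ext(uint8_t *dst, const int16_t *res, size_t n) {
    for (size_t i = 0; i < n; i++) {
        int sum = dst[i] + res[i];
        dst[i] = (uint8_t)EXT_CLIP_U8(sum);
    }
}
```

Both functions compute the same result. The difference that matters for this discussion is that the second loop body is free of calls.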