Abstract

Statically scheduled scientific computing problems represent a large set of problems which require intensive amount of computation. The common feature characteristics of this set of problems could be used to optimize an architecture, where the utilization exceeds 90% of the peak performance. The proposed architecture is an array of reconfigurable NISC (No Instruction Set Computer) processing elements (PE) connected by a reconfigurable NOC (Network On Chip). An optimized data path for a group of problems is suggested. The control of each PE is reconfigurable to customize for each application so as the NOC. The architecture is simulated using a tile of 64 PEs to run LU decomposition algorithm of a dense matrix, and the results show a performance of 177 GFLOPS, which outperforms the GPU NVIDIA 6800 & 7800 implementations and the OpenMP parallel programming multicore solution using an Intel core 2 quad cpu with four processors cores.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.