Abstract
The newly emerging many-core-on-a-chip designs have renewed an intense interest in parallel processing. By applying Amdahl's formulation to the programs in the PARSEC and SPLASH-2 benchmark suites, we find that most applications may not have sufficient parallelism to efficiently utilize modern parallel machines. The long sequential portions in these application programs are caused by computation as well as communication latency. However, value prediction techniques may allow the ldquoparallelizationrdquo of the sequential portion by predicting values before they are produced. In conventional superscalar architectures, the computation latency dominates the sequential sections. Thus, value prediction techniques may be used to predict the computation result before it is produced. In many-core architectures, since the communication latency increases with the number of cores, value prediction techniques may be used to reduce both the communication and computation latency. In this paper, we extend Amdahl's formulation to model the data redundancy inherent to each benchmark, thereby identifying the potential of value prediction techniques. Our analysis shows that the performance of PARSEC benchmarks may improve by a factor of 180 and 230 percent for the SPLASH-2 suite, compared to when only the intrinsic parallelism is considered. This demonstrates the immense potential of fine-grained value prediction in reducing the communication latency in many-core architectures.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.