Abstract
The effectiveness of large-scale computing depends to a great extent on the performance of the memory system. As shared-memory multiprocessors grow in size, their memory hierarchy deepens, resulting in a design with non-uniform latencies. In this paper, we explore the implications of multi-valued memory latencies. In particular, we study the effect of a non-uniform traffic distribution on a hierarchical large-scale NUMA multiprocessor named Hector. Memory analysis is of interest because memory is a frequent source of poor performance in large-scale multiprocessors. We have developed an analytical model that includes the effects of increased contention for system resources and the impact of the arbitration algorithm on the network traffic. Our analysis has been validated with a detailed simulator. We have also examined two techniques for reducing memory latency: we assess the potential performance gains from replication of data, and we investigate the improvement in memory utilization obtained by allowing memory request buffering. Furthermore, we studied the sensitivity of memory performance to changes in background traffic and found that inter-station traffic has a significant performance effect.