Reducing Memory Access Delays in Large scale Shared memory Multiprocessors, written by University of Illinois at Urbana-Champaign. Center for Supercomputing Research and Development. Released in 1992, 266 pages.

Book excerpt: Memory access time is a key factor limiting the performance of large-scale, shared-memory multiprocessors. In such systems, limited bandwidth in the interconnection between the processors and the memories, coupled with long delays resulting from network and memory conflicts, can produce serious memory access delays. Memory hierarchies and asynchronous block transfer mechanisms are common methods for reducing these delays. However, for these two mechanisms to be used advantageously, they must be managed effectively, either in hardware or in software. Although this memory management problem is becoming increasingly important, good techniques are still lacking. The problem of reducing memory access delays can be attacked at several levels. The first is to improve the performance of the shared-memory system itself, where the shared-memory system implicitly includes both the network and the memory modules. The second is to develop techniques that manage the memory hierarchy more effectively and make use of the block transfer mechanisms. This thesis addresses the problem at both of these levels. The first part examines the behavior of a realistic shared-memory system and evaluates cost-effective hardware modifications for improving this balance. An additional goal is to achieve memory system scalability, where the term scalable describes systems whose per-processor performance is roughly constant across the range of system sizes examined.
The remainder of this thesis addresses the problem of improving the utilization of local storage in shared-memory systems where, at the very least, each processor has access to local (private) storage in addition to the global (shared) memory. A combined flow-and-dependence analysis algorithm is developed that produces the analytical information needed to optimize data accesses. It is shown how this information can be used as part of an integrated hardware/software approach to eliminating redundant (unnecessary) memory accesses and prefetching data.
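
To make the idea of a redundant global access concrete, here is a minimal sketch. It is not the thesis's flow-and-dependence algorithm, only an illustration under simplifying assumptions: a single processor, a straight-line trace of accesses, write-through stores, and a hypothetical "sync" marker standing for any point where other processors may have modified shared memory. The function name and the trace format are invented for this example.

```python
def eliminate_redundant_loads(trace):
    """Drop global loads whose value is already held in local storage.

    trace: list of (op, addr) tuples; op is "load", "store", or "sync".
    "sync" is a hypothetical marker (an assumption of this sketch) for a
    point where other processors may have written shared memory, so every
    local copy must be treated as stale afterwards.

    Returns (kept, removed): the accesses that still reach global memory
    and the number of loads that were eliminated.
    """
    local = set()   # addresses with a valid copy in local (private) storage
    kept = []       # accesses that still go to global (shared) memory
    removed = 0
    for op, addr in trace:
        if op == "sync":
            local.clear()              # all local copies may now be stale
            kept.append((op, addr))
        elif op == "load":
            if addr in local:
                removed += 1           # value is already local: skip the load
            else:
                local.add(addr)        # first load brings the value in
                kept.append((op, addr))
        elif op == "store":
            local.add(addr)            # write-through keeps the copy current
            kept.append((op, addr))
        else:
            raise ValueError(f"unknown op: {op!r}")
    return kept, removed


if __name__ == "__main__":
    trace = [
        ("load", "a"), ("load", "b"), ("load", "a"),
        ("store", "c"), ("load", "c"),
        ("sync", None), ("load", "a"),
    ]
    kept, removed = eliminate_redundant_loads(trace)
    print(kept, removed)
```

In this trace the second load of `a` and the load of `c` after its store are eliminated, while the load of `a` after the sync point is kept because the local copy can no longer be trusted. A compiler-based approach would derive the same kind of information statically from the program rather than from a recorded trace.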