EBookClubs

Read Books & Download eBooks Full Online

Book Hardware and Compiler directed Cache Coherence in Large scale Multiprocessors

Download or read book Hardware and Compiler directed Cache Coherence in Large scale Multiprocessors written by Lynn Choi and published by . This book was released on 1996 with total page 40 pages. Available in PDF, EPUB and Kindle. Book excerpt: Abstract: "In this paper, we study a hardware-supported, compiler-directed (HSCD) cache coherence scheme, which can be implemented on a large-scale multiprocessor using off-the-shelf microprocessors, such as the Cray T3D. The scheme can be adapted to various cache organizations, including multi-word cache lines and byte-addressable architectures. Several system related issues, including critical sections, inter-thread communication, and task migration have also been addressed. The cost of the required hardware support is minimal and proportional to the cache size. The necessary compiler algorithms, including intra- and interprocedural array data flow analysis, have been implemented on the Polaris parallelizing compiler [33]. From our simulation study using the Perfect Club benchmarks [5], we found that in spite of the conservative analysis made by the compiler, the performance of the proposed HSCD scheme can be comparable to that of a full-map hardware directory scheme. Given its comparable performance and reduced hardware cost, the proposed scheme can be a viable alternative for large-scale multiprocessors such as the Cray T3D, which rely on users to maintain data coherence."

Book Combining Hardware and Software Cache Coherence Strategies

Download or read book Combining Hardware and Software Cache Coherence Strategies written by David J. Lilja and published by . This book was released on 1991 with total page 11 pages. Available in PDF, EPUB and Kindle. Book excerpt: Abstract: "Efficiently maintaining cache coherence is a major problem in large-scale shared memory multiprocessors. Hardware directory schemes have very high memory requirements, while software-directed schemes must rely on imprecise compile-time memory disambiguation. Recently proposed dynamic directory schemes allocate pointers to blocks only as they are referenced, which significantly reduces their memory requirements, but they still allocate pointers to blocks that do not need them. We show how compiler marking can further reduce the directory size by allocating pointers only when necessary. Using trace-driven simulations, we find that the performance of this new approach is comparable to other coherence schemes, but with significantly lower memory requirements."

Book Compiler Analysis for Cache Coherence

Download or read book Compiler Analysis for Cache Coherence written by Lynn Choi and published by . This book was released on 1996 with total page 44 pages. Available in PDF, EPUB and Kindle. Book excerpt: Abstract: "In this paper, we present compiler algorithms for detecting references to stale data in shared-memory multiprocessors. The algorithm consists of two key analysis techniques, stale reference detection and locality preserving analysis. While the stale reference detection finds the memory reference patterns that may violate cache coherence, the locality preserving analysis minimizes the number of such stale references by analyzing both temporal and spatial reuses. By computing the regions referenced by arrays inside loops, we extend the previous scalar algorithms [8] for more precise analysis. We develop a full interprocedural array data-flow algorithm, which performs both bottom-up side-effect analysis and top-down context analysis on the procedure call graph to further exploit locality across procedure boundaries. The interprocedural algorithm eliminates cache invalidations at procedure boundaries, which were assumed in the previous compiler algorithms [9]. We have fully implemented the algorithm in the Polaris parallelizing compiler [27]. Using execution-driven simulations on Perfect Club benchmarks, we demonstrate how unnecessary cache misses can be eliminated by the automatic stale reference detection. The algorithm can be used to implement cache coherence in the shared-memory multiprocessors that do not have hardware directories, such as Cray T3D [20]."

Book Distributed Sparse Gaussian Elimination and Orthogonal Factorization

Download or read book Distributed Sparse Gaussian Elimination and Orthogonal Factorization written by Padma Raghavan and published by . This book was released on 1993 with total page 40 pages. Available in PDF, EPUB and Kindle. Book excerpt: Abstract: "We consider the solution of a linear system Ax = b on a distributed memory machine when the matrix A has full rank and is large, sparse and nonsymmetric. We use our Cartesian Nested Dissection algorithm to compute a fill-reducing column ordering of the matrix. We develop algorithms that use the associated separator tree to estimate the structure of the factor and to distribute and perform numeric computations. When the matrix is nonsymmetric but square, the numeric computations involve Gaussian elimination with row pivoting; when the matrix is overdetermined, row-oriented Householder transforms are applied to compute the triangular factor of an orthogonal factorization. We compare the fill incurred by our approach to that incurred by well known sequential methods and report on the performance of our implementation on the Intel iPSC/860."

Book Cache and Interconnect Architectures in Multiprocessors

Download or read book Cache and Interconnect Architectures in Multiprocessors written by Michel Dubois and published by Springer Science & Business Media. This book was released on 2012-12-06 with total page 286 pages. Available in PDF, EPUB and Kindle. Book excerpt: Cache and Interconnect Architectures in Multiprocessors, Eilat, Israel, May 25-26, 1989. Michel Dubois, University of Southern California; Shreekant S. Thakkar, Sequent Computer Systems. The aim of the workshop was to bring together researchers working on cache coherence protocols for shared-memory multiprocessors with various interconnect architectures. Shared-memory multiprocessors have become viable systems for many applications. Bus-based shared-memory systems (e.g., Sequent's Symmetry, Encore's Multimax) are currently limited to 32 processors. The first goal of the workshop was to learn about the performance of applications on current cache-based systems. The second goal was to learn about new network architectures and protocols for future scalable systems. These protocols and interconnects would allow shared-memory architectures to scale beyond current limitations. The workshop had 20 speakers who talked about their current research. The discussions were lively and cordial enough to keep the participants away from the wonderful sand and sun for two days. The participants got to know each other well and were able to share their thoughts in an informal manner. The workshop was organized into several sessions. The summary of each session is described below. This book presents revisions of some of the papers presented at the workshop.

Book Publications of the State of Illinois 1994

Download or read book Publications of the State of Illinois 1994 written by and published by . This book was released on with total page 668 pages. Available in PDF, EPUB and Kindle. Book excerpt:

Book Software Cache Coherence for Large Scale Multiprocessors

Download or read book Software Cache Coherence for Large Scale Multiprocessors written by University of Rochester. Department of Computer Science and published by . This book was released on 1994 with total page 0 pages. Available in PDF, EPUB and Kindle. Book excerpt: Abstract: "Shared memory provides an attractive and intuitive programming model that makes good use of programmer time and effort. Shared memory however requires a coherence mechanism to allow caching for performance and to ensure that processors do not use stale data in their caches. We evaluate several algorithmic and architectural alternatives in the design space of NCC-NUMA machines with a globally-accessible physical address space. We present a new adaptive algorithm for software cache coherence that reduces interprocessor communication and scales to large numbers of processors; we compare it to existing software and hardware coherence schemes. We also evaluate (1) the tradeoffs among various write policies (write-through, write-back, write-through with a write-collect buffer) and (2) the effect on performance of using remote memory access. Finally, we observe that certain simple program changes can greatly improve performance. For example, we find that the use of reader-writer locks, synchronization variable relocation, and data structure padding and alignment can allow a protocol to avoid significant amounts of coherence overhead."

Book Compiler directed Cache Coherence Strategies for Large scale Shared memory Multiprocessor Systems

Download or read book Compiler directed Cache Coherence Strategies for Large scale Shared memory Multiprocessor Systems written by Hoichi Cheong and published by . This book was released on 1990 with total page 278 pages. Available in PDF, EPUB and Kindle. Book excerpt: The cache coherence maintenance problem has been the major obstacle in using private cache memory to reduce memory access latency in large-scale multiprocessor systems. Two compiler-directed solutions, the fast selective invalidation scheme and the version control scheme, are proposed in this work. Contrary to the existing hardware-based approach, the proposed schemes expose caches to software-directed management techniques which have the advantage of requiring no global communication and maintaining expandability of the multiprocessor systems. The fast selective scheme employs compile-time flow analysis techniques to detect cache data that contain obsolete values, and uses simple hardware to prevent using such data. The version control scheme defines the concept of version of a program variable to maintain up-to-date copies in the cache and solves the difficult problem of preserving temporal locality in parallel execution. Unlike existing software-directed schemes, both schemes achieve selective invalidation with very low time penalty. The version control scheme is also extended to hierarchical cache systems for which no satisfactory solutions exist. Detailed discussion on the development of these schemes and their proofs are presented. Finally, experimental data by simulation are shown to support the advantage of the schemes.

Book Cache Coherence Protocols for Large scale Multiprocessors

Download or read book Cache Coherence Protocols for Large scale Multiprocessors written by D. L. Chaiken and published by . This book was released on 1990 with total page 153 pages. Available in PDF, EPUB and Kindle. Book excerpt:

Book Publications of the State of Illinois

Download or read book Publications of the State of Illinois written by Illinois. Office of Secretary of State and published by . This book was released on 1996 with total page 90 pages. Available in PDF, EPUB and Kindle. Book excerpt:

Book The Cache Group Scheme for Hardware controlled Cache Coherence and the General Need for Hardware Coherence Control in Large scale Multiprocessors

Download or read book The Cache Group Scheme for Hardware controlled Cache Coherence and the General Need for Hardware Coherence Control in Large scale Multiprocessors written by Joseph Edward Hoag and published by . This book was released on 1991 with total page 166 pages. Available in PDF, EPUB and Kindle. Book excerpt:

Book Software Cache Coherence for Large Scale Multiprocessors

Download or read book Software Cache Coherence for Large Scale Multiprocessors written by University of Rochester. Dept. of Computer Science and published by . This book was released on 1994 with total page 17 pages. Available in PDF, EPUB and Kindle. Book excerpt: Abstract: "Shared memory provides an attractive and intuitive programming model that makes good use of programmer time and effort. Shared memory however requires a coherence mechanism to allow caching for performance and to ensure that processors do not use stale data in their caches. We evaluate several algorithmic and architectural alternatives in the design space of NCC-NUMA machines with a globally-accessible physical address space. We present a new adaptive algorithm for software cache coherence that reduces interprocessor communication and scales to large numbers of processors; we compare it to existing software and hardware coherence schemes. We also evaluate (1) the tradeoffs among various write policies (write-through, write-back, write-through with a write-collect buffer) and (2) the effect on performance of using remote memory access. Finally, we observe that certain simple program changes can greatly improve performance. For example, we find that the use of reader-writer locks, synchronization variable relocation, and data structure padding and alignment can allow a protocol to avoid significant amounts of coherence overhead."

Book The Cache Coherence Problem in Shared Memory Multiprocessors

Download or read book The Cache Coherence Problem in Shared Memory Multiprocessors written by Igor Tartalja and published by Wiley-IEEE Computer Society Press. This book was released on 1996-02-13 with total page 368 pages. Available in PDF, EPUB and Kindle. Book excerpt: The book illustrates state-of-the-art software solutions for cache coherence maintenance in shared-memory multiprocessors. It begins with a brief overview of the cache coherence problem and introduces software solutions to the problem. The text defines and details static and dynamic software schemes, techniques for modeling performance evaluation mechanisms, and performance evaluation studies.

Book Scalable Shared Memory Multiprocessors

Download or read book Scalable Shared Memory Multiprocessors written by Michel Dubois and published by Springer Science & Business Media. This book was released on 1992 with total page 360 pages. Available in PDF, EPUB and Kindle. Book excerpt: Mathematics of Computing -- Parallelism.

Book Scalable Shared Memory Multiprocessing

Download or read book Scalable Shared Memory Multiprocessing written by Daniel E. Lenoski and published by Elsevier. This book was released on 2014-06-28 with total page 364 pages. Available in PDF, EPUB and Kindle. Book excerpt: Dr. Lenoski and Dr. Weber have experience with leading-edge research and practical issues involved in implementing large-scale parallel systems. They were key contributors to the architecture and design of the DASH multiprocessor. Currently, they are involved with commercializing scalable shared-memory technology.