EBookClubs

Read Books & Download eBooks Full Online

Book Energy efficient Data Processing Using Accelerators

Download or read book Energy efficient Data Processing Using Accelerators written by and published by . This book was released on 2015 with total page 127 pages. Available in PDF, EPUB and Kindle. Book excerpt: Energy efficiency of computing systems has become crucial with the end of Dennard scaling, in which voltage scaling has stalled, thereby increasing power density with decreasing transistor size. One approach to improving energy efficiency is to use accelerators specialized for a certain set of computing problems. Unlike traditional general-purpose processors, accelerators avoid the overhead of fetching and scheduling instructions. This dissertation investigates architectural techniques to enable energy-efficient data processing using reconfigurable accelerators: customizing L1 data caches for computing systems integrated with reconfigurable accelerators, and proposing a near-memory processing architecture using reconfigurable accelerators. Data transfers between accelerators and memory are often a bottleneck for both performance and energy efficiency. This dissertation demonstrates the potential of a configurable L1 data cache to exploit diversity in cache requirements across hybrid applications that use accelerators. One configurable feature is the cache topology; it can be reconfigured as a set of private L1 caches, or as a single L1 cache shared by a processor and an accelerator. This dissertation also proposes a technique to provide a configurable tradeoff between the number of ports and the capacity of the L1 cache. To further reduce the overhead of transferring data between compute engines and memory, this dissertation proposes NDA (Near-DRAM Acceleration), an architecture that stacks reconfigurable accelerators atop off-chip commodity DRAM devices.
To make this architecture practical in the short run, NDA uses commodity 2D DRAM devices and provides, in a practical way, high-bandwidth connections between accelerators and DRAM for the purpose of near-memory processing. This dissertation explores three NDA microarchitectures to stack accelerators atop DRAM and analyzes the impact of supporting such microarchitectures on DRAM area, timing, and energy. The first microarchitecture connects accelerators and DRAM through global I/O lines that are shared between all DRAM banks. In the second microarchitecture, global I/O lines are doubled to increase the internal bandwidth between accelerators and DRAM. The third microarchitecture connects accelerators and DRAM through global datalines that are private to each DRAM bank, substantially increasing internal DRAM bandwidth. This dissertation also identifies various software and hardware challenges in implementing the NDA architecture and provides cost-effective solutions.
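The configurable cache topology described above can be mimicked with a toy model: the same total capacity is organized either as a single L1 shared by the processor and the accelerator, or as two private halves. This is an illustrative sketch, not the dissertation's actual design:

```python
class DirectMappedCache:
    """Toy direct-mapped cache that tracks only tags (no data)."""
    def __init__(self, n_lines, line_size=64):
        self.n_lines, self.line_size = n_lines, line_size
        self.tags = [None] * n_lines
    def access(self, addr):
        line = addr // self.line_size
        idx, tag = line % self.n_lines, line // self.n_lines
        hit = self.tags[idx] == tag
        self.tags[idx] = tag          # fill the line on a miss
        return hit

def hit_rate(trace, cache_for):
    hits = 0
    for agent, addr in trace:
        hits += cache_for(agent).access(addr)
    return hits / len(trace)

# CPU and accelerator repeatedly touch the same 200-line working set
trace = [(agent, line * 64)
         for _ in range(3)
         for line in range(200)
         for agent in ("cpu", "acc")]

shared = DirectMappedCache(256)                  # one L1 shared by both agents
rate_shared = hit_rate(trace, lambda a: shared)

private = {"cpu": DirectMappedCache(128),        # same capacity, split in two
           "acc": DirectMappedCache(128)}
rate_private = hit_rate(trace, lambda a: private[a])
```

On this sharing-heavy trace the shared configuration wins because each line is fetched once and reused by both agents; a workload with disjoint access streams would favor the private split, which is exactly the diversity a configurable topology targets.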

Book Hardware Accelerators in Data Centers

Download or read book Hardware Accelerators in Data Centers written by Christoforos Kachris and published by Springer. This book was released on 2018-08-21 with total page 280 pages. Available in PDF, EPUB and Kindle. Book excerpt: This book provides readers with an overview of the architectures, programming frameworks, and hardware accelerators for typical cloud computing applications in data centers. The authors present the most recent and promising solutions, using hardware accelerators to provide high throughput, reduced latency and higher energy efficiency compared to current servers based on commodity processors. Readers will benefit from state-of-the-art information regarding application requirements in contemporary data centers, computational complexity of typical tasks in cloud computing, and a programming framework for the efficient utilization of the hardware accelerators.

Book Accelerators for Data Processing

Download or read book Accelerators for Data Processing written by and published by . This book was released on 2015 with total page 105 pages. Available in PDF, EPUB and Kindle. Book excerpt: Author keywords: performance; energy efficiency; database systems; analytics; hash tables; trees; indexes; accelerators; prefetching.

Book Computing with Memory for Energy Efficient Robust Systems

Download or read book Computing with Memory for Energy Efficient Robust Systems written by Somnath Paul and published by Springer Science & Business Media. This book was released on 2013-09-07 with total page 210 pages. Available in PDF, EPUB and Kindle. Book excerpt: This book analyzes energy and reliability as major challenges faced by designers of computing frameworks in the nanometer technology regime. The authors describe the existing solutions to address these challenges and then reveal a new reconfigurable computing platform, which leverages high-density nanoscale memory for both data storage and computation to maximize the energy-efficiency and reliability. The energy and reliability benefits of this new paradigm are illustrated and the design challenges are discussed. Various hardware and software aspects of this exciting computing paradigm are described, particularly with respect to hardware-software co-designed frameworks, where the hardware unit can be reconfigured to mimic diverse application behavior. Finally, the energy-efficiency of the paradigm described is compared with other, well-known reconfigurable computing platforms.

Book Energy Efficient Data Centers

Download or read book Energy Efficient Data Centers written by Sonja Klingert and published by Springer. This book was released on 2014-05-21 with total page 120 pages. Available in PDF, EPUB and Kindle. Book excerpt: This book constitutes the thoroughly refereed post-conference proceedings of the Second International Workshop on Energy Efficient Data Centers, E2DC 2013, held in Berkeley, CA, USA, in May 2013; co-located with SIGCOMM e-Energy 2013. The 8 revised full papers presented were carefully reviewed and selected from numerous submissions. The papers are organized in topical sections on energy and workload measurement; energy management; simulators and control.

Book Energy Efficient High Performance Processors

Download or read book Energy Efficient High Performance Processors written by Jawad Haj-Yahya and published by Springer. This book was released on 2018-03-22 with total page 176 pages. Available in PDF, EPUB and Kindle. Book excerpt: This book explores energy efficiency techniques for high-performance computing (HPC) systems using power-management methods. Adopting a step-by-step approach, it describes power-management flows, algorithms and mechanisms that are employed in modern processors such as Intel Sandy Bridge, Haswell, Skylake and other architectures (e.g. ARM). Further, it includes practical examples and recent studies demonstrating how modern processors dynamically manage wide power ranges, from a few milliwatts in the lowest idle power state to tens of watts in turbo state. Moreover, the book explains how thermal and power delivery are managed in the context of this huge power range. The book also discusses the different metrics for energy efficiency, presents several methods and applications of power and energy estimation, and shows how, by using innovative power estimation methods and new algorithms, modern processors are able to optimize metrics such as power, energy, and performance. Different power estimation tools are presented, including tools that break down the power consumption of modern processors at sub-processor core/thread granularity. The book also investigates software, firmware and hardware coordination methods of reducing power consumption, for example a compiler-assisted power management method to overcome power excursions. Lastly, it examines firmware algorithms for dynamic cache resizing and dynamic voltage and frequency scaling (DVFS) for memory sub-systems.
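The wide power range described above follows from the classic dynamic-power relation P = C_eff * V^2 * f; because supply voltage must rise roughly with frequency, dynamic power grows superlinearly with clock speed. A minimal sketch with illustrative, assumed operating points (not measurements of any real processor):

```python
def dynamic_power(c_eff_nf, v_volts, f_ghz):
    """Dynamic power P = C_eff * V^2 * f; nF * V^2 * GHz conveniently yields watts."""
    return c_eff_nf * v_volts ** 2 * f_ghz

# hypothetical per-core operating points (all numbers are assumptions)
states = {
    "deep idle": (0.5, 0.60, 0.4),   # little switching activity, low V and f
    "nominal":   (8.0, 0.90, 2.0),
    "turbo":     (8.0, 1.20, 4.0),
}
for name, (c_eff, v, f) in states.items():
    print(f"{name:9s}: {dynamic_power(c_eff, v, f):6.2f} W")
```

Because V and f scale together, DVFS of the kind examined in the book's final chapters yields roughly cubic power savings for a linear frequency reduction.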

Book Energy Efficient Hardware Software Co Synthesis Using Reconfigurable Hardware

Download or read book Energy Efficient Hardware Software Co Synthesis Using Reconfigurable Hardware written by Jingzhao Ou and published by CRC Press. This book was released on 2009-10-14 with total page 225 pages. Available in PDF, EPUB and Kindle. Book excerpt: Rapid energy estimation for energy-efficient applications using field-programmable gate arrays (FPGAs) remains a challenging research topic. Concerns about energy dissipation and efficiency have prevented the widespread use of FPGA devices in embedded systems, where energy efficiency is a key performance metric. Helping overcome these challenges, Energy Efficient

Book Fast and Energy Efficient Big Data Processing on FPGAs

Download or read book Fast and Energy Efficient Big Data Processing on FPGAs written by Sahand Salamat and published by . This book was released on 2021 with total page 114 pages. Available in PDF, EPUB and Kindle. Book excerpt: With the rapid development of the Internet of Things (IoT), networks, software, and computing platforms, the size of the generated data is dramatically increasing, bringing the dawn of the big data era. These ever-increasing data volumes and complexity require new algorithms and hardware platforms to deliver sufficient performance. Data from sensors, such as images, video, and text, contributed to the 2.5 quintillion bytes generated every day in 2020. The rate of data generation is outpacing the computational capabilities of conventional computing platforms and algorithms. CPU performance improvement has been stagnating in recent years, which is one of the causes of the rise of application-specific accelerators that process big data applications. FPGAs are also increasingly used to accelerate big data algorithms, such as machine learning. In this work, we develop and optimize both the hardware implementation and the algorithms for FPGA-based accelerators to increase the performance of machine learning applications. We leverage the Residue Number System (RNS) to optimize the execution of deep neural networks (DNNs) and develop an FPGA-based accelerator, called Residue-Net, to execute DNNs entirely in RNS on FPGAs. Residue-Net improves DNN throughput by 2.8x compared to the FPGA-based baseline. Even though running DNNs on FPGAs provides higher performance than running them on general-purpose processors, their intrinsic computational complexity makes it challenging to deliver high performance and low energy consumption, especially on edge devices. Less complex and more hardware-friendly machine learning algorithms are needed to revolutionize performance at and beyond the edge.
Hyperdimensional computing (HD) is a prime example of a highly efficient paradigm for machine learning. HD is intrinsically parallelizable and requires significantly fewer operations than DNNs, and thus can easily be accelerated in hardware. We develop an automated tool to generate an FPGA-based accelerator, called HD2FPGA, for classification and clustering applications, with accuracy comparable to state-of-the-art machine learning algorithms but orders of magnitude higher efficiency. HD2FPGA achieves 578x speedup and 1500x energy reduction in end-to-end execution of HD classification compared to the CPU baseline. Compared to a state-of-the-art DNN running on an FPGA, HD2FPGA delivers 277x speedup and 172x energy reduction. As the volume of data increases, a single FPGA is not enough to achieve the desired performance, so many cloud service providers offer multi-FPGA platforms. The size of data center workloads varies dramatically over time, leading to significant underutilization of computing resources such as FPGAs while they consume a large amount of power, a critical contributor to data center inefficiency. We propose an efficient framework to throttle the power consumption of multi-FPGA platforms by dynamically scaling the voltage, and thereby the frequency, at run time according to the prediction of, and adjustment to, the workload level while maintaining the desired Quality of Service (QoS). Our evaluations, implementing state-of-the-art deep neural network accelerators, revealed that while providing an average power reduction of 4.0x, the proposed framework surpasses previous works by 33.6% (up to 83%).
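The HD classification flow sketched above (random codebooks, bind-and-bundle encoding, one-shot prototype training, and similarity search) can be illustrated in a few lines. This is a minimal sketch with invented shapes and toy data, not HD2FPGA's actual pipeline:

```python
import numpy as np

rng = np.random.default_rng(0)
D = 4096  # hypervector dimensionality (real HD systems often use ~10,000)

def rand_hv():
    return rng.choice([-1, 1], size=D)

# random codebooks: one hypervector per feature position and per quantized level
n_features, n_levels = 8, 4
pos_hv = [rand_hv() for _ in range(n_features)]
lvl_hv = [rand_hv() for _ in range(n_levels)]

def encode(sample):
    """Bind each position with its level (elementwise product), bundle by summing."""
    acc = np.zeros(D)
    for i, v in enumerate(sample):
        acc += pos_hv[i] * lvl_hv[v]
    return np.sign(acc)

# one-shot training: bundle the encodings of each class into a prototype
train = {0: [[0, 0, 0, 0, 1, 1, 1, 1]] * 3,
         1: [[3, 3, 3, 3, 2, 2, 2, 2]] * 3}
protos = {c: np.sign(sum(encode(s) for s in samples))
          for c, samples in train.items()}

def classify(sample):
    """Predict the class whose prototype is most similar (dot product)."""
    hv = encode(sample)
    return max(protos, key=lambda c: float(np.dot(protos[c], hv)))
```

Every step is elementwise arithmetic over independent dimensions, which is why HD maps so naturally onto the wide parallelism of an FPGA.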

Book From Variability Tolerance to Approximate Computing in Parallel Integrated Architectures and Accelerators

Download or read book From Variability Tolerance to Approximate Computing in Parallel Integrated Architectures and Accelerators written by Abbas Rahimi and published by Springer. This book was released on 2017-04-23 with total page 204 pages. Available in PDF, EPUB and Kindle. Book excerpt: This book focuses on computing devices and their design at various levels to combat variability. The authors provide a review of key concepts with particular emphasis on timing errors caused by various variability sources. They discuss methods to predict and prevent, detect and correct, and finally the conditions under which such errors can be accepted; they also consider their implications for cost, performance, and quality. Coverage includes a comparative evaluation of methods for deployment across various layers of the system, from circuits and architecture to application software. These can be combined in various ways to achieve specific goals related to the observability and controllability of variability effects, providing means to achieve cross-layer or hybrid resilience.

Book Energy Efficiency in Data Centers and Clouds

Download or read book Energy Efficiency in Data Centers and Clouds written by and published by Academic Press. This book was released on 2016-01-28 with total page 298 pages. Available in PDF, EPUB and Kindle. Book excerpt: Advances in Computers carries on a tradition of excellence, presenting detailed coverage of innovations in computer hardware, software, theory, design, and applications. The book provides contributors with a medium in which they can explore their subjects in greater depth and breadth than journal articles typically allow. The articles included in this book will become standard references, with lasting value in this rapidly expanding field. - Presents detailed coverage of recent innovations in computer hardware, software, theory, design, and applications - Includes in-depth surveys and tutorials on new computer technology pertaining to computing: combinatorial testing, constraint-based testing, and black-box testing - Written by well-known authors and researchers in the field - Includes extensive bibliographies with most chapters - Presents volumes devoted to single themes or subfields of computer science

Book Compact and Fast Machine Learning Accelerator for IoT Devices

Download or read book Compact and Fast Machine Learning Accelerator for IoT Devices written by Hantao Huang and published by Springer. This book was released on 2018-12-07 with total page 157 pages. Available in PDF, EPUB and Kindle. Book excerpt: This book presents the latest techniques for machine learning based data analytics on IoT edge devices. A comprehensive literature review on neural network compression and machine learning accelerator is presented from both algorithm level optimization and hardware architecture optimization. Coverage focuses on shallow and deep neural network with real applications on smart buildings. The authors also discuss hardware architecture design with coverage focusing on both CMOS based computing systems and the new emerging Resistive Random-Access Memory (RRAM) based systems. Detailed case studies such as indoor positioning, energy management and intrusion detection are also presented for smart buildings.

Book Efficient Processing of Deep Neural Networks

Download or read book Efficient Processing of Deep Neural Networks written by Vivienne Sze and published by Springer Nature. This book was released on 2022-05-31 with total page 254 pages. Available in PDF, EPUB and Kindle. Book excerpt: This book provides a structured treatment of the key principles and techniques for enabling efficient processing of deep neural networks (DNNs). DNNs are currently widely used for many artificial intelligence (AI) applications, including computer vision, speech recognition, and robotics. While DNNs deliver state-of-the-art accuracy on many AI tasks, this accuracy comes at the cost of high computational complexity. Therefore, techniques that enable efficient processing of deep neural networks to improve key metrics—such as energy-efficiency, throughput, and latency—without sacrificing accuracy or increasing hardware costs are critical to enabling the wide deployment of DNNs in AI systems. The book includes background on DNN processing; a description and taxonomy of hardware architectural approaches for designing DNN accelerators; key metrics for evaluating and comparing different designs; features of DNN processing that are amenable to hardware/algorithm co-design to improve energy efficiency and throughput; and opportunities for applying new technologies. Readers will find a structured introduction to the field as well as formalization and organization of key concepts from contemporary work that provide insights that may spark new ideas.
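A taxonomy of DNN accelerators of the kind described above typically starts from simple cost counting: the multiply-accumulate (MAC) operations and tensor sizes of each layer. The following sketch uses generic shape parameters (not the book's notation) for a standard unpadded convolutional layer:

```python
def conv_layer_costs(n, c, h, w, k, r, s, stride=1):
    """MACs and tensor sizes (element counts) for one conv layer.
    n: batch, c: input channels, h/w: input spatial dims,
    k: output channels, r/s: filter spatial dims."""
    p = (h - r) // stride + 1            # output height (no padding)
    q = (w - s) // stride + 1            # output width
    return {
        "macs":        n * k * c * r * s * p * q,
        "weights":     k * c * r * s,
        "input_acts":  n * c * h * w,
        "output_acts": n * k * p * q,
    }

# example: a 3x3, 64 -> 128 channel layer on a 56x56 feature map
costs = conv_layer_costs(n=1, c=64, h=56, w=56, k=128, r=3, s=3)
```

Dividing the MAC count by the bytes actually moved gives an arithmetic intensity, which hints at whether a given accelerator design will be compute-bound or bandwidth-bound.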

Book High Performance Computing Systems Performance Modeling Benchmarking and Simulation

Download or read book High Performance Computing Systems Performance Modeling Benchmarking and Simulation written by Stephen A. Jarvis and published by Springer. This book was released on 2015-04-20 with total page 284 pages. Available in PDF, EPUB and Kindle. Book excerpt: This book constitutes the thoroughly refereed proceedings of the 5th International Workshop, PMBS 2014, held in New Orleans, LA, USA, in November 2014. The 12 full and 2 short papers presented in this volume were carefully reviewed and selected from 53 submissions. The papers cover topics on performance benchmarking and optimization; performance analysis and prediction; and power, energy and checkpointing.

Book Accelerating Data Center Applications Through Energy efficient Reconfigurable Computing

Download or read book Accelerating Data Center Applications Through Energy efficient Reconfigurable Computing written by and published by . This book was released on 2019 with total page 109 pages. Available in PDF, EPUB and Kindle. Book excerpt: Many data-intensive applications need to process massive data sets (e.g., scientific data, photographs, and videos) to discover useful information such as hidden patterns or market trends. Delivering these large data sets all the way from a storage system to the host CPU(s) puts tremendous pressure on a traditional host-CPU-based computing architecture, as it incurs substantial data transfer latency and energy consumption. Thus, a new computing paradigm is greatly needed. Non-volatile memory (NVM) technologies like NAND flash have advanced rapidly in the past decade, offering huge potential for processing data in situ or offloading computation near the data. In this dissertation research, we first propose a new in-storage processing (ISP) architecture called RISP (Reconfigurable In-Storage Processing), which employs a field-programmable gate array (FPGA) as the data processing unit and NVM controller. Unlike traditional ISP techniques, RISP can reconfigure storage data processing resources to achieve high energy efficiency without any performance degradation for big data analysis applications. Three case studies are provided in this project. Experimental results show that RISP significantly outperforms a CPU-centric processing architecture in terms of performance and energy efficiency. Second, a near-data processing (NDP) server architecture is proposed to evaluate its impact on a diverse range of data center applications, from data-intensive to compute-intensive. Several new findings have been observed. For example, we found that an FPGA-based NDP server can offer performance benefits not only for data-intensive applications but also for compute-intensive applications.
Third, by applying the idea of reconfigurability proposed in the first project to an NDP server, we developed a reconfigurable NDP server that can dynamically reconfigure its computing resources according to the characteristics of an application. Our results show that this NDP server achieves higher energy efficiency without any performance degradation compared to the NDP architecture proposed in the second project. Finally, two memory-access-efficient implementations of kNN (k-Nearest Neighbors) on FPGAs are presented. Principal component analysis (PCA) and low-precision data representation are employed to reduce data accesses. Results show that external memory accesses are substantially reduced (>28x), which markedly improves execution time and energy efficiency.
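The final contribution combines PCA and low-precision storage to cut kNN memory traffic; the idea can be sketched as follows (the synthetic data and parameter choices are invented for illustration, not taken from the dissertation):

```python
import numpy as np

rng = np.random.default_rng(1)

# synthetic data: two classes in 64-D, separated along the first dimension
n, d, p, k = 200, 64, 4, 5
X0 = rng.normal(0.0, 1.0, (n, d)); X0[:, 0] += 4.0
X1 = rng.normal(0.0, 1.0, (n, d)); X1[:, 0] -= 4.0
X = np.vstack([X0, X1]); y = np.array([0] * n + [1] * n)

# PCA via SVD of the centered data: each point shrinks from d floats to p values
mu = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
Z = (X - mu) @ Vt[:p].T

# low-precision representation: quantize the projections to int8
scale = np.abs(Z).max() / 127.0
Zq = np.clip(np.round(Z / scale), -127, 127).astype(np.int8)

def knn_predict(q):
    """Classify q by majority vote of its k nearest neighbors in the
    quantized, PCA-reduced space (integer arithmetic for the distances)."""
    zq = np.clip(np.round(((q - mu) @ Vt[:p].T) / scale), -127, 127).astype(np.int8)
    d2 = ((Zq.astype(np.int32) - zq.astype(np.int32)) ** 2).sum(axis=1)
    nearest = np.argsort(d2)[:k]
    return int(np.bincount(y[nearest]).argmax())
```

In this toy setup each stored point shrinks from 64 float64 values (512 bytes) to 4 int8 values, so far fewer bytes are fetched per candidate, in the spirit of the >28x access reduction reported above.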

Book Handbook of Semiconductors

Download or read book Handbook of Semiconductors written by Ram K. Gupta and published by CRC Press. This book was released on 2024-07-10 with total page 396 pages. Available in PDF, EPUB and Kindle. Book excerpt: This book provides readers with state-of-the-art knowledge of established and emerging semiconducting materials, their processing, and the fabrication of chips and microprocessors. In addition to covering the fundamentals of these materials, it details the basics and workings of many semiconducting devices and their role in modern electronics and explores emerging semiconductors and their importance in future devices. • Provides readers with latest advances in semiconductors. • Covers diodes, transistors, and other devices using semiconducting materials. • Covers advances and challenges in semiconductors and their technological applications. • Discusses fundamentals and characteristics of emerging semiconductors for chip manufacturing. This book provides directions to scientists, engineers, and researchers in materials engineering and related disciplines to help them better understand the physics, characteristics, and applications of modern semiconductors.

Book Energy Efficient Computing Through Compiler Assisted Dynamic Specialization

Download or read book Energy Efficient Computing Through Compiler Assisted Dynamic Specialization written by and published by . This book was released on 2014 with total page 0 pages. Available in PDF, EPUB and Kindle. Book excerpt: Due to the failure of threshold voltage scaling, per-transistor switching power is not scaling down at the pace of Moore's Law, causing power density to rise with each successive generation. Consequently, computer architects need to improve the energy efficiency of microarchitecture designs to sustain traditional performance growth. Hardware specialization, or the use of accelerators, is a promising direction for improving energy efficiency without sacrificing performance. However, it requires disruptive changes in hardware and software, including the programming model, applications, and operating systems. Moreover, specialized accelerators cannot help with general-purpose computing. Going forward, we need a solution that avoids such disruptive changes and can accelerate or specialize even general-purpose workloads. This thesis develops a hardware/software co-designed solution called Dynamically Specialized Execution, which uses compiler-assisted dynamic specialization to improve energy efficiency without radical changes to the microarchitecture, the ISA, or the programming model. This dissertation first develops a decoupled access/execute coarse-grain reconfigurable architecture called DySER (Dynamically Specialized Execution Resources), which achieves energy efficiency by creating specialized hardware at runtime for hot code regions. DySER exposes a well-defined interface and execution model, which makes it easy to integrate with an existing core microarchitecture.
To address the challenges of compiling for a specialized accelerator, this thesis develops a novel compiler intermediate representation called the Access/Execute Program Dependence Graph (AEPDG), which accurately models DySER and captures the spatio-temporal aspects of its execution. The thesis shows that, using this representation, a compiler can generate highly optimized code for a coarse-grain reconfigurable architecture, without manual intervention, from programs written in the traditional programming model. Detailed evaluation shows that automatic specialization of data-parallel workloads with DySER provides a mean speedup of 3.8x with 60% energy reduction when compared to a 4-wide out-of-order processor. On irregular workloads, exemplified by SPEC CPU, DySER provides an average speedup of 11% with a 10% reduction in energy consumption. On a highly relevant application, database query processing, which mixes data-parallel and irregular kernels, DySER provides a 2.7x speedup over the 4-wide out-of-order processor.
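The decoupled access/execute principle underlying DySER can be caricatured in software (a conceptual sketch only, not DySER's hardware or the AEPDG compiler): an access slice handles all address arithmetic and streams operands into a queue, while an execute slice performs pure computation:

```python
from collections import deque

def access_slice(memory, index_pairs, q):
    """Access slice: walks memory, pushing operand pairs into the queue."""
    for i, j in index_pairs:
        q.append((memory[i], memory[j]))

def execute_slice(q, op):
    """Execute slice: consumes operands and computes; no address arithmetic."""
    out = []
    while q:
        a, b = q.popleft()
        out.append(op(a, b))
    return out

memory = [3, 1, 4, 1, 5, 9, 2, 6]
q = deque()
access_slice(memory, [(0, 1), (2, 3), (4, 5)], q)
result = execute_slice(q, lambda a, b: a + b)  # [4, 5, 14]
```

In hardware the two slices run concurrently, so memory latency in the access slice overlaps with computation in the execute slice; here they run sequentially only for clarity.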

Book Management and Scheduling of Accelerators for Heterogeneous High Performance Computing

Download or read book Management and Scheduling of Accelerators for Heterogeneous High Performance Computing written by Tobias Beisel and published by . This book was released on 2015 with total page 0 pages. Available in PDF, EPUB and Kindle. Book excerpt: The use of heterogeneous computing resources, such as graphics processing units or other specialized co-processors, has become widespread in recent years because of their performance and energy-efficiency advantages. Operating system approaches that are limited to optimizing CPU usage are no longer sufficient for the efficient utilization of systems that comprise diverse resource types. Enabling task preemption on these architectures, and migration of tasks between different resource types at run time, is key not only to improving performance and energy consumption but also to enabling automatic scheduling methods for heterogeneous compute nodes. This thesis proposes novel techniques for run-time management of heterogeneous resources and for enabling tasks to migrate between diverse hardware. It provides fundamental work towards future operating systems by discussing the implications, limitations, and opportunities of heterogeneity and by introducing solutions for energy- and performance-efficient run-time systems. Scheduling methods that utilize heterogeneous systems through a centralized scheduler are presented and shown to outperform existing approaches in varying case studies.