Related Topics

  • Chip Multiprocessors
  • Many-core Systems
  • Multicore Systems

Articles published on Dark silicon

116 Search results
  • Open Access
  • Research Article
  • 10.3390/jlpea15010010
The REGALE Library: A DDS Interoperability Layer for the HPC PowerStack
  • Feb 12, 2025
  • Journal of Low Power Electronics and Applications
  • Giacomo Madella + 5 more

Large-scale computing clusters have been the basis of scientific progress for several decades and have now become a commodity fuelling the AI revolution. Dark Silicon, energy efficiency, power consumption, and hot spots are no longer looming threats of an Information and Communication Technologies (ICT) niche but are today the limiting factor of the capability of the entire human society and a contributor to global carbon emissions. However, from the perspective of end users, system administrators, and system integrators, handling and optimising the system for these constraints is not straightforward due to the elevated degree of fragmentation in the software tools and interfaces that handle power management in high-performance computing (HPC) clusters. In this paper, we present the REGALE Library. It is the result of a collaborative effort in the EU EuroHPC JU REGALE project, which aims to effectively materialize the HPC PowerStack initiative, providing a single layer of communication among different power management tools, libraries, and software. The proposed framework is based on the data distribution service (DDS) and real-time publish–subscribe (RTPS) protocols and FastDDS as their implementation. This enables the various actors in the ecosystem to communicate and exchange messages without any further modification inside their implementation. In this paper, we present the blueprint, functionality tests, and performance and scalability evaluation of the DDS implementation currently used in the REGALE Library in the HPC context.
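
The publish-subscribe pattern that DDS builds on can be illustrated with a minimal in-process sketch. This is not the REGALE Library or the FastDDS API; the `TopicBus` class, the `power_cap` topic name, and the message fields are hypothetical names chosen only to show how a publisher and a subscriber stay decoupled through a shared topic.

```python
from collections import defaultdict


class TopicBus:
    """Toy in-process stand-in for a topic-based publish-subscribe layer."""

    def __init__(self):
        # Maps a topic name to the callbacks registered for it.
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """Register a callback that receives every message sent to a topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver a message to all subscribers; return how many received it."""
        receivers = self._subscribers[topic]
        for callback in receivers:
            callback(message)
        return len(receivers)


# Hypothetical usage: a node-level power manager listens for power caps
# that some other tool (for example a job scheduler) publishes.
bus = TopicBus()
received_caps = []
bus.subscribe("power_cap", received_caps.append)
bus.publish("power_cap", {"node": "n01", "watts": 250})
```

Neither side holds a reference to the other; both only agree on the topic name and message shape, which is the property the abstract relies on when it says tools can exchange messages without modifying their implementations.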

  • Open Access
  • Research Article
  • 10.37917/ijeee.20.2.24
Understanding Power Gating Mechanism Based on Workload Classification of Modern Heterogeneous Many-Core Mobile Platform in the Dark Silicon Era
  • Oct 18, 2024
  • Iraqi Journal for Electrical and Electronic Engineering
  • Haider Alrudainy + 3 more

The rapid progress in mobile computing necessitates energy-efficient solutions to support substantially diverse and complex workloads. Heterogeneous many-core platforms are progressively being adopted in contemporary embedded implementations for high performance at low power cost estimations. These implementations experience diverse workloads that offer drastic opportunities to improve energy efficiency. In this paper, we propose a novel per core power gating (PCPG) approach based on workload classifications (WLC) for drastic energy cost minimization in the dark silicon era. The core of our paradigm is to use an integrated sleep mode management based on workload classification indicated by the performance counters. A number of real application benchmarks (PARSEC) are adopted as a practical example of diverse workloads, including memory- and CPU-intensive ones. In this paper, these applications are exercised on the Samsung Exynos 5422 heterogeneous many-core system, showing energy-efficiency improvements of up to 37% and 110% when compared with our most recently published work and the ondemand governor, respectively. Furthermore, we illustrate a low-complexity and low-cost runtime per core power gating algorithm that consistently maximizes IPS/Watt across the entire state space.

  • Research Article
  • 10.62904/tsrn1e82
DECODING RE-CONFIGURABLE PROCESSOR ARCHITECTURES
  • Jun 30, 2024
  • International Journal of Engineering Science and Humanities
  • Rohit Kumar

There have been significant advancements in the designs of processors and the tools used to produce and modify them. The issues related to using wall, dark silicon, and hardware reconfiguration along with code change have motivated progress in this area. These factors are crucial for microprocessor-based electronic projects and products that need significant, real-time, and mobile computational power in today's world. Efforts have also been made to lower the communication overhead in key areas of code through the use of specialized hardware and significantly decrease the fetch decode cycle. Recent studies have demonstrated that addressing both types of communication delays and addressing the issue of dark silicon can enhance power efficiency by 7 to 1000 times.

  • Open Access
  • Research Article
  • Citations: 7
  • 10.1109/tetc.2016.2563323
Notice of Violation of IEEE Publication Principles: An Energy-Efficient Heterogeneous Memory Architecture for Future Dark Silicon Embedded Chip-Multiprocessors
  • Jan 1, 2024
  • IEEE Transactions on Emerging Topics in Computing
  • Salman Onsori + 3 more

Main memories play an important role in overall energy consumption of embedded systems. Using conventional memory technologies in future designs in nanoscale era causes a drastic increase in leakage power consumption and temperature-related problems. Emerging non-volatile memory (NVM) technologies offer many desirable characteristics such as near-zero leakage power, high density and non-volatility. They can significantly mitigate the issue of memory leakage power in future embedded chip-multiprocessor (eCMP) systems. However, they suffer from challenges such as limited write endurance and high write energy consumption which restrict them for adoption in modern memory systems. In this article, we present a convex optimization model to design a 3D stacked hybrid memory architecture in order to minimize the future embedded systems energy consumption in the dark silicon era. This proposed approach satisfies endurance constraint in order to design a reliable memory system. Our convex model optimizes numbers and placement of eDRAM and STT-RAM memory banks on the memory layer to exploit the advantages of both technologies in future eCMPs. Energy consumption, the main challenge in the dark silicon era, is represented as a major target in this work and it is minimized by the detailed optimization model in order to design a dark silicon aware 3D Chip-Multiprocessor. Experimental results show that in comparison with the Baseline memory design, the proposed architecture improves the energy consumption and performance of the 3D CMP on average about 61.33 and 9 percent respectively.

  • Open Access
  • Research Article
  • Citations: 8
  • 10.1109/tc.2023.3266592
A Framework for Automated Exploration of Trojan Attack Space in FPGA Netlists
  • Oct 1, 2023
  • IEEE Transactions on Computers
  • Jonathan Cruz + 5 more

Field Programmable Gate Arrays (FPGAs) provide a flexible compute platform for quick prototyping or hardware acceleration in diverse application domains. However, similar to the global semiconductor life-cycle in the modern supply chain, FPGA-based product development includes processes and interactions with potentially untrusted parties outside the traditional scrutiny of a completely in-house development cycle. An untrusted party or software can maliciously alter a hardware intellectual property (IP) block mapped to an FPGA device during various stages of the FPGA life-cycle. Such malicious alterations, also known as hardware Trojan attacks, have garnered significant research into their detection and prevention in the context of application-specific integrated circuit (ASIC) design flow. However, Trojan attacks in FPGAs have not enjoyed this same attention. Designers often rely on mapping ASIC-specific solutions and evaluation benchmarks to the FPGA domain, which leaves much of the FPGA-specific Trojan space uncovered. We note that the distinctive business model as well as the architectural configurations of FPGAs present unique opportunities for Trojan attacks to an adversary. To this end, we introduce a framework to automatically explore the hardware Trojan attack space in FPGA netlists. It is capable of inserting different types of FPGA-specific Trojans in a netlist enabling rapid exploration of potential Trojan attacks in an FPGA design: soft-template, monolithic, and distributed dark silicon. Soft template Trojans use behavioral templates with random synthesis constraints to increase Trojan structural diversity. Monolithic and distributed dark silicon Trojans use the under-utilized input space (FPGA dark silicon) in FPGA primitives to realize Trojans with effectively zero area and power footprint. Further optimizations are also presented to remove any potential delay impact. 
We then generate over 1300 Trojan-inserted benchmarks using each of the introduced FPGA Trojan classes, and compare their impact on utilization, delay, and power. Finally, we evaluate our Trojans against a machine learning-based Trojan detection to highlight their evasiveness.

  • Open Access
  • Research Article
  • Citations: 7
  • 10.1145/3591470
Machine Learning Enabled Solutions for Design and Optimization Challenges in Networks-on-Chip based Multi/Many-Core Architectures
  • Jun 30, 2023
  • ACM Journal on Emerging Technologies in Computing Systems
  • Md Farhadur Reza

Due to the advancement of transistor technology, a single chip processor can now have hundreds of cores. Network-on-Chip (NoC) has been the superior interconnect fabric for multi/many-core on-chip systems because of its scalability and parallelism. Due to the rise of dark silicon with the end of Dennard Scaling, it becomes essential to design energy efficient and high performance heterogeneous NoC-based multi/many-core architectures. Because of the large and complex design space, the solution space becomes difficult to explore within a reasonable time for optimal trade-offs of energy-performance-reliability. Furthermore, reactive resource management is not effective in preventing problems from happening in adaptive systems. Therefore, in this work, we explore machine learning techniques to design and configure the NoC resources based on the learning of the system and applications workloads. Machine learning can automatically learn from past experiences and guide the NoC intelligently to achieve its objective on performance, power, and reliability. We present the challenges of NoC design and resource management and propose a generalized machine learning framework to uncover near-optimal solutions quickly. We propose and implement a NoC design and optimization solution enabled by neural networks, using the generalized machine learning framework. Simulation results demonstrated that the proposed neural networks-based design and optimization solution improves performance by 15% and reduces energy consumption by 6% compared to an existing non-machine learning-based solution while the proposed solution improves NoC latency and throughput compared to two existing machine learning-based NoC optimization solutions. The challenges of machine learning technique adaptation in multi/many-core NoC have been presented to guide future research.

  • Research Article
  • Citations: 4
  • 10.1109/tc.2022.3211417
COP: A Combinational Optimization Power Budgeting Method for Manycore Systems in Dark Silicon
  • May 1, 2023
  • IEEE Transactions on Computers
  • Xin Li + 5 more

Dark silicon is a phenomenon of under-utilization in today's manycore systems due to power and thermal limitations. In order to improve the performance of dark silicon systems, it is necessary to adopt dynamic power constraints for different core mapping decisions. However, existing power budgeting methods are generally over pessimistic, e.g. Thermal Safe Power (TSP), or over optimistic, e.g. Greedy based Dynamic Power (GDP). This paper proposes a practical power budgeting method, called Combinational Optimization Power (COP). Different from existing methods, which ignore some actual factors, such as communication overhead and lifetime reliability, COP formulates the power budgeting problem as a thermal-constrained combinational optimization power problem. For the steady-state case, COP achieves the target fusion of optimized temperature and communication energy consumption by applying task priority ranking and task-to-core mapping. For the transient case, COP uses the rainflow counting algorithm to construct the reliability framework based on the thermal cycling failure mechanism, and then establishes a linear time-invariant transient temperature model to obtain the core mapping selection and the corresponding dynamic power budget. Experimental results demonstrate that COP is capable of providing an optimized core mapping decision, which can maximize power budget while ensuring the system performance.

  • Open Access
  • Research Article
  • Citations: 1
  • 10.1016/j.procs.2023.01.029
Customized FPGA Design and Analysis of Soft-Core Processor for DNN
  • Jan 1, 2023
  • Procedia Computer Science
  • Harini Sriraman + 1 more

  • Research Article
  • Citations: 3
  • 10.1109/tcad.2022.3157685
DBP: Distributed Power Budgeting for Many-Core Systems in Dark Silicon
  • Dec 1, 2022
  • IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
  • Hai Wang + 4 more

Power budget is an important power constraint provided to guarantee the thermal reliability of an integrated system. In this work, we present DBP, a distributed power budgeting method, for dark silicon many-core systems. In DBP, two new techniques are proposed to bring accurate and optimized power budgets in a distributed way. First, a distributed active core locating technique is developed to find an active core distribution that leads to a high power budget. Second, a distributed power budget computing technique is introduced which computes the power budget for each active core accurately. Experiments show DBP outperforms the state-of-the-art power budgeting methods, thermal safe power (TSP) and greedy dynamic power (GDP), on many-core dark silicon systems by providing a high and accurate power budget with low overhead and good scalability.

  • Open Access
  • Research Article
  • Citations: 9
  • 10.1145/3501771
Thermal and Performance Efficient On-Chip Surface-Wave Communication for Many-Core Systems in Dark Silicon Era
  • Mar 22, 2022
  • ACM Journal on Emerging Technologies in Computing Systems
  • Ammar Karkar + 3 more

Due to the exceedingly high integration density of VLSI circuits and the resulting high power density, thermal integrity became a major challenge. One way to tackle this problem is Dark silicon. Dark silicon is the amount of circuitry in a chip that is forced to switch off to ensure thermal integrity of the system and prevent permanent thermal-related faults. In many-core systems, the presence of Dark Silicon adds new design constraints, in general, and on the communication fabric of such systems, in particular. This is due to the fact that system-level thermal-management systems tend to increase the distance between high activity cores to ensure better thermal balancing and integrity. Consequently, a design dilemma is created where a compromise has to be made between interconnect performance and power consumption. This study proposes a hybrid wire and surface-wave interconnect (SWI) based Network-on-Chip (NoC) to address the dark silicon challenge. Through efficient utilization of one-hop cross-chip SWI communication links, the proposed architecture is able to offer an efficient and scalable communication platform in terms of performance, power, and thermal impact. As a result, evaluations of the proposed architecture compared to the baseline architecture under dark silicon scenarios show a reduction in maximum temperature by 15 °C, in average delay by up to 73.1%, and energy savings of up to ∼3X. This study explores the promising potential of the proposed architecture in extending the utilization wall for current and future many-core systems in the dark silicon era.

  • Open Access
  • Research Article
  • Citations: 5
  • 10.1155/2022/3505439
Power and Area Efficient Cascaded Effectless GDI Approximate Adder for Accelerating Multimedia Applications Using Deep Learning Model
  • Mar 19, 2022
  • Computational Intelligence and Neuroscience
  • Manikandan Nagarajan + 4 more

Approximate computing is an emerging technique to accelerate processing through less computational effort while keeping admissible accuracy in error-tolerant applications such as multimedia and deep learning. Inherent properties of the deep learning process help the designer to simplify the circuitry and also to increase the computation speed at the cost of the accuracy of results. The high computational complexity and low-power requirements of portable devices in the dark silicon era call for a suitable alternative to Complementary Metal Oxide Semiconductor (CMOS) technology. Gate Diffusion Input (GDI) logic is one of the promising alternatives to CMOS logic to reduce transistor count and enable low-power design. In this work, a novel energy and area efficient 1-bit GDI-based full swing Energy and Area efficient Full Adder (EAFA) with minimum error distance is proposed. The proposed architecture was constructed to mitigate the cascaded effect problem in GDI-based circuits. It is proved by extending the proposed 1-bit GDI-based adder for different 16-bit Energy and Area Efficient High-Speed Error-Tolerant Adders (EAHSETA) segmented as accurate and inaccurate adder circuits. The proposed adder's design metrics in terms of delay, area, and power dissipation are verified through simulation using the Cadence tool. The proposed logic is deployed to accelerate the convolution process in the Low-Weight Digit Detector neural network for a real-time handwritten digit classification application as a case study on the Intel Cyclone IV Field Programmable Gate Array (FPGA). The results confirm that our proposed EAHSETA occupies fewer logic elements and improves operation speed with a speed-up factor of 1.29 over other similar techniques while achieving 95% classification accuracy.

  • Research Article
  • Citations: 2
  • 10.1145/3494535
Terminator: A Secure Coprocessor to Accelerate Real-Time AntiViruses Using Inspection Breakpoints
  • Mar 4, 2022
  • ACM Transactions on Privacy and Security
  • Marcus Botacin + 4 more

AntiViruses (AVs) are essential to face the myriad of malware threatening Internet users. AVs operate in two modes: on-demand checks and real-time verification. Software-based real-time AVs intercept system and function calls to execute AV’s inspection routines, resulting in significant performance penalties as the monitoring code runs among the suspicious code. Simultaneously, dark silicon problems push the industry to add more specialized accelerators inside the processor to mitigate these integration problems. In this article, we propose Terminator, an AV-specific coprocessor to assist software AVs by outsourcing their matching procedures to the hardware, thus saving CPU cycles and mitigating performance degradation. We designed Terminator to be flexible and compatible with existing AVs by using YARA and ClamAV rules. Our experiments show that our approach can save up to 70 million CPU cycles per rule when outsourcing on-demand checks for matching typical, unmodified YARA rules against a dataset of 30 thousand in-the-wild malware samples. Our proposal eliminates the AV’s need for blocking the CPU to perform full system checks, which can now occur in parallel. We also designed a new inspection breakpoint mechanism that signals to the coprocessor the beginning of a monitored region, allowing it to scan the regions in parallel with their execution. Overall, our mechanism mitigated up to 44% of the overhead imposed to execute and monitor the SPEC benchmark applications in the most challenging scenario.

  • Open Access
  • Research Article
  • Citations: 2
  • 10.1002/cpe.6667
Advances in parallel and distributed computing and its applications
  • Oct 16, 2021
  • Concurrency and Computation: Practice and Experience
  • Hui Tian + 2 more

Parallel and distributed computing has been the basis of many emerging areas, such as smart networks, cloud computing, big data analysis, and blockchain technology. Without the development of parallel and distributed computing technologies and various types of systems, it is not possible to meet the requirements on efficiency, accuracy, scalability, and reliability for various critical applications that support our modern economy and society. While parallel and distributed computing has played a vital role in modern science, engineering, biology, medicine, pharmacy, astronomy, geology, and archaeology, its application has also been extended to business, finance, economics, management, government, and defense, covering all aspects of our modern society and life. Furthermore, parallel and distributed computing has emerged in recent advances of many hotspot research directions including artificial intelligence, machine learning, Internet of Things, bioinformatics, digital medicine, cybersecurity, and social computing, resulting in numerous ground-breaking discoveries that are changing our society and life.

  • Research Article
  • Citations: 7
  • 10.1016/j.micpro.2021.104055
Energy-efficient task-resource co-allocation and heterogeneous multi-core NoC design in dark silicon era
  • Feb 20, 2021
  • Microprocessors and Microsystems
  • Md Farhadur Reza + 2 more

  • Open Access
  • Research Article
  • Citations: 3
  • 10.1109/access.2021.3109717
PEW: Prediction-Based Early Dark Cores Wake-up Using Online Ridge Regression for Many-Core Systems
  • Jan 1, 2021
  • IEEE Access
  • Mohammed Sultan Mohammed + 5 more

Future many-core systems need to address the dark silicon problem, where some cores would be turned off to control the chip’s thermal and power density, which effectively limits the performance gain from having a large number of processing cores. Task migration technique has been previously proposed to improve many-core system performance by moving tasks between active and dark cores. As task migration imposes system performance overhead due to the large wake-up latency of the dark cores, this paper proposes a prediction-based early wake-up (PEW) to reduce the dark cores’ wake-up latency during task migration. A window-based online ridge regression (RR) is used as the prediction model. The prediction model uses the past window’s thermal, power, and core status (i.e., active or dark) to predict the future core temperatures at run-time. If task migration is predicted in the next control period, the proposed PEW puts the dark cores in a power state with low wake-up latency. Thus, the proposed PEW reduces the time for the dark cores to start executing the tasks. The comparison results show that our proposed PEW reduces the completion time by up to 7.9% and 4.1% compared to non-early wake-up (NoEW) and a fixed threshold wake-up (FEW), respectively. It also shows that the proposed PEW increases the MIPS/Watt by up to 5.5% and 2.3% over NoEW and FEW, respectively. These results show that the proposed PEW improves the many-core system’s overall performance in terms of reducing dark cores’ wake-up latency and increasing the number of executed instructions per Watt.
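
The window-based online ridge regression named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `WindowRidge` class, the window length, the regularization strength `lam`, and the feature layout (temperature, power, active flag) are all assumptions, and the model is simply refit over the most recent samples at each control period.

```python
from collections import deque


def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


class WindowRidge:
    """Ridge regression refit on a sliding window of recent samples."""

    def __init__(self, window=8, lam=1e-3):
        self.samples = deque(maxlen=window)  # old samples drop out automatically
        self.lam = lam
        self.weights = None

    def update(self, features, target):
        """Add one (features, next-temperature) sample and refit the weights."""
        self.samples.append((list(features) + [1.0], target))  # 1.0 = bias input
        d = len(features) + 1
        # Normal equations (X^T X + lam * I) w = X^T y. For simplicity the
        # bias weight is regularized as well (an assumption of this sketch).
        xtx = [[self.lam if i == j else 0.0 for j in range(d)] for i in range(d)]
        xty = [0.0] * d
        for x, y in self.samples:
            for i in range(d):
                xty[i] += x[i] * y
                for j in range(d):
                    xtx[i][j] += x[i] * x[j]
        self.weights = solve(xtx, xty)

    def predict(self, features):
        """Predict the next-period temperature for one feature vector."""
        x = list(features) + [1.0]
        return sum(w * xi for w, xi in zip(self.weights, x))


def should_prewake(predicted_temp_c, threshold_c):
    """Hypothetical trigger: pre-wake a dark core if a migration looks likely."""
    return predicted_temp_c > threshold_c
```

In use, `update` would be called once per control period with the previous period's readings and the temperature that followed, and `predict` with the current readings; a prediction above the migration threshold would move a dark core into a low-latency power state ahead of time.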

  • Open Access
  • Research Article
  • Citations: 9
  • 10.3390/electronics9111980
DTaPO: Dynamic Thermal-Aware Performance Optimization for Dark Silicon Many-Core Systems
  • Nov 23, 2020
  • Electronics
  • Mohammed Sultan Mohammed + 4 more

Future many-core systems need to handle high power density and chip temperature effectively. Some cores in many-core systems need to be turned off or ‘dark’ to manage chip power and thermal density. This phenomenon is also known as the dark silicon problem. This problem prevents many-core systems from utilizing and gaining improved performance from a large number of processing cores. This paper presents a dynamic thermal-aware performance optimization of dark silicon many-core systems (DTaPO) technique for optimizing dark silicon many-core system performance under a temperature constraint. The proposed technique utilizes both task migration and dynamic voltage frequency scaling (DVFS) for optimizing the performance of a many-core system while keeping system temperature in a safe operating limit. Task migration puts hot cores in low-power states and moves tasks to cooler dark cores to aggressively reduce chip temperature while maintaining high overall system performance. To reduce task migration overhead due to cold start, the source core (i.e., active core) keeps its L2 cache content during the initial migration phase. The destination core (i.e., dark core) can access it to reduce the impact of cold start misses. Moreover, the proposed technique limits task migration to cores that share the last level cache (LLC). In the case of a major thermal violation with no cooler cores available, DVFS is used to reduce the hot cores' temperature gradually by reducing their frequency. Experimental results for different threshold temperatures show that DTaPO can keep the average system temperature below the thermal limit. The execution time penalty is reduced by up to 18% compared with using only DVFS for all thermal thresholds. Moreover, the average peak temperature is reduced by up to 10.8°C. In addition, the experimental results show that DTaPO improves the system’s performance by up to 80% compared to optimal sprinting patterns (OSP) and reduces the temperature by up to 13.6°C.
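
The decision order described in this abstract (migrate within the LLC group first, fall back to DVFS when no cooler core is available) can be sketched as one control step. This is an illustrative reading of the abstract, not the DTaPO algorithm itself: the `Core` fields, the `thermal_step` function, and the single-level frequency decrement are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Core:
    cid: int          # core identifier
    llc_group: int    # cores with the same value share a last level cache
    temp_c: float     # current temperature in degrees Celsius
    active: bool      # False means the core is dark
    freq_level: int   # index into an assumed DVFS table; 0 is the lowest


def thermal_step(cores, limit_c):
    """Run one control period and return the list of actions taken."""
    actions = []
    for hot in cores:
        if not hot.active or hot.temp_c <= limit_c:
            continue  # only active cores above the limit need handling
        # Candidate destinations: dark, cooler than the limit, same LLC group.
        candidates = [
            c for c in cores
            if not c.active and c.llc_group == hot.llc_group and c.temp_c < limit_c
        ]
        if candidates:
            dest = min(candidates, key=lambda c: c.temp_c)  # coolest dark core
            dest.active, hot.active = True, False
            dest.freq_level = hot.freq_level
            actions.append(("migrate", hot.cid, dest.cid))
        elif hot.freq_level > 0:
            hot.freq_level -= 1  # no cooler core: step the frequency down
            actions.append(("dvfs_down", hot.cid, hot.freq_level))
    return actions
```

Restricting candidates to the same `llc_group` mirrors the abstract's point that migration stays among cores sharing the LLC, which keeps cold-start misses low; the L2 retention during the initial migration phase is not modeled here.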

  • Research Article
  • Citations: 3
  • 10.1109/tcad.2020.3007555
Investigating Frequency Scaling, Nonvolatile, and Hybrid Memory Technologies for On-Chip Routers to Support the Era of Dark Silicon
  • Jul 7, 2020
  • IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
  • Khushboo Rani + 1 more

In the era of dark silicon, several components on the chip [i.e., cores, memory, and network on chip (NoC)] need to be powered-off or run in low-power mode. This is mainly due to the increased leakage power consumption at smaller technology nodes. Other than the power consumed by cores and caches, power and performance of the interconnects is a significant factor as the communication network consumes a considerable share of the power budget. In particular, the buffers used at every port of the NoC router consume considerable dynamic as well as static power. To support dark silicon and save energy, a popular approach is to power off the routers and wake them up when needed. However, this affects the packet latency, and we need to observe the traffic through the nodes to decide turning the routers ON-OFF. In this article, we propose to keep the routers always powered ON to maintain constant connectivity and investigate various approaches. One proposal is to frequency scale the routers connected to powered OFF nodes, and the other proposals are to use a combination of SRAM and nonvolatile spin-transfer torque random access memory-based VCs in the routers. By managing which VCs to be active at a given time, we achieve energy savings. The proposals are evaluated by varying the percentage of dark nodes on the chip. The experimental results show that all proposals yield significant energy savings while maintaining connectivity.

  • Research Article
  • Citations: 14
  • 10.1109/tcad.2020.3003288
READY: Reliability- and Deadline-Aware Power-Budgeting for Heterogeneous Multicore Systems
  • Jun 25, 2020
  • IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
  • Javad Saber-Latibari + 5 more

To tackle the dark silicon problem in a heterogeneous multicore system, the temperature constraints across the system should be addressed carefully by assigning a proper set of tasks to a pool of the heterogeneous cores during run-time. When such a system is utilized in a reliable/real-time application, the reliability/timing constraints of the application must be added to the temperature constraints, which makes the task mapping problem considerably more complex. To solve the mapping problem in such a situation, we propose READY, an online reliability- and deadline-aware mapping and scheduling algorithm for heterogeneous multicore systems. READY utilizes an adaptive power constraint (as a metric for temperature measurement) that is updated according to the number and position of the active cores on the chip. READY, first, attempts to meet the reliability target of the system by improving the reliability of each task. Then, it performs the mapping and scheduling of the tasks on cores of different islands, so that the peak power and timing constraints are met. The simulation results illustrate that while READY guarantees the timing constraints and meets reliability targets, it improves the peak-power-aware system schedulability (chip performance) by 23.77% (up to 40.69%).

  • Research Article
  • Citations: 3
  • 10.1109/tc.2020.3015711
Runtime Performance Optimization of 3-D Microprocessors in Dark Silicon
  • Jan 1, 2020
  • IEEE Transactions on Computers
  • Hai Wang + 5 more

Because the increasing power density is limited by the thermal constraint, multi-core integrated systems have stepped into the dark silicon era recently, meaning not all parts of the system can be powered on at the same time. Dark silicon effects are especially severe for 3-D microprocessors due to the even higher power density caused by the stacked structures, which greatly limits system performance. In this article, we propose a greedy-based core-cache co-optimization algorithm to optimize the performance of 3-D microprocessors in dark silicon at runtime. The new method determines many runtime settings of the 3-D system on the fly, including the active core and cache bank positions, active cache bank number, and the voltage/frequency (V/f) level of each active core, which optimizes the performance of the 3-D microprocessor under thermal constraint. Because the core-cache settings are co-optimized in the 3-D space and the power budgets are computed dynamically according to the running state of the 3-D microprocessor, the new method leads to a higher system performance compared with the existing methods. Experiments on two 3-D microprocessors show the greedy-based core-cache co-optimization algorithm outperforms the state-of-the-art 3-D dark silicon microprocessor performance optimization method by achieving a higher processing throughput with guaranteed thermal safety.

  • Research Article
  • Cite Count: 4
  • 10.1109/tetc.2019.2890867
Optimal Sprinting Pattern in Thermal Constrained CMPs
  • Jan 1, 2020
  • IEEE Transactions on Emerging Topics in Computing
  • Jian Wang + 4 more

As studied in the literature, Computational Sprinting (CS) is a promising technique for tackling the thermal challenge of Chip Multi-Processors (CMPs) in the dark silicon era. The sprinting pattern, that is, the boosted chip frequency and voltage during the sprinting time, greatly impacts CMP performance. In this paper, we address how to find the optimal sprinting pattern, which maximizes the performance of CMPs within the thermal limit. First, we give a mathematical proof that any thermal-constrained CMP, when it executes an application, has a specialized, sustainable configuration (v_o, f_o) under which the CMP can keep sprinting without resting while its performance is maximized. Then, we design a self-adaptive algorithm that automatically alters the chip frequency, with adjustable step size, and the voltage at runtime to reach the optimal value. Finally, our extensive experimental results reveal that our Optimal Sprinting Pattern (OSP) outperforms the state-of-the-art sprinting techniques, Full Sprinting Policy (FSP) and Adaptive Sprinting Pacing (ASP). Specifically, OSP improves computational efficiency in MIPS by up to 59 percent against FSP and 40 percent against ASP. It also achieves higher energy efficiency in MIPJ, by up to 41 and 25 percent over FSP and ASP, respectively. Moreover, we demonstrate that our method is effective for various CMPs with different scales, CPU architectures, and chip process technologies.
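The idea of searching for a sustainable operating point with an adjustable step size can be illustrated with a small sketch. This is not the OSP algorithm from the paper: it tunes frequency only (voltage is left out), and the function name, the `steady_temp` callback, and the cubic toy thermal model are assumptions made purely for illustration.

```python
def find_sustainable_frequency(steady_temp, t_limit, f_min, f_max,
                               step=0.4, min_step=0.01):
    """Raise the frequency while the modeled steady-state temperature stays
    at or below t_limit; when a step would overshoot (or hit f_max),
    halve the step. Stops once the step drops below min_step."""
    f = f_min
    if steady_temp(f) > t_limit:
        raise ValueError("even f_min violates the thermal limit")
    while step >= min_step:
        candidate = min(f + step, f_max)
        if candidate > f and steady_temp(candidate) <= t_limit:
            f = candidate   # safe: accept the higher frequency
        else:
            step /= 2       # unsafe or saturated: refine the step
    return f


# Hypothetical toy model: 40 degrees ambient plus a cubic term in frequency (GHz).
def toy_steady_temp(f):
    return 40 + 10 * f ** 3


f_o = find_sustainable_frequency(toy_steady_temp, t_limit=80,
                                 f_min=1.0, f_max=3.0)
```

Under the toy model the thermal limit is reached at the cube root of 4 (about 1.587 GHz), and the search settles just below that point. On real hardware the callback would be replaced by sensor feedback, and the step would need to adapt in both directions as the workload changes.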
