More than Moore and Granular Powering - the Next Revolution in Power

October 20, 2013 by Jeff Shepard

Darnell Group, Inc. has embarked on a groundbreaking analysis of the next frontier in power conversion, so-called “granular power.” Granular power refers to deeply embedded dc-dc converters that power individual microprocessor cores or similar partitions in large-scale integrated circuits such as SoCs, SiPs, and FPGAs. Solving the complex technical challenges needed to implement granular power will not only provide efficient power for large-scale ICs but will also dramatically change the power paradigm of a variety of portable devices, including tablets and handsets, which have dozens of power rails.

It is well known that large digital ICs have hit a “power wall.” The power wall problem is exacerbated by leakage power, which increases with CMOS scaling. With modern CMOS processes, leakage is a significant part of total power (whereas dynamic power historically comprised the majority of total power). Leakage power is exponentially dependent on temperature, creating a thermal runaway problem: increased temperature increases leakage power, and increased total power (switching + leakage) increases temperature. Leakage power therefore puts pressure on designers to lower junction temperature, further exacerbating the power wall problem.
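The feedback loop described above can be illustrated with a small numerical sketch. Everything in it is an assumption made for illustration, not data from the Darnell analysis: the function name `settle_temperature`, a leakage model that doubles every `t_double_c` degrees, a single lumped thermal resistance `theta_c_per_w`, and the 150 °C cutoff used to declare runaway.

```python
def settle_temperature(p_dynamic_w, leak_ref_w, theta_c_per_w,
                       t_ambient_c=25.0, t_ref_c=25.0, t_double_c=20.0,
                       t_limit_c=150.0, max_iter=1000):
    """Iterate the temperature/leakage loop until it settles or runs away.

    Assumed model (illustrative only):
      leakage(T) = leak_ref_w * 2 ** ((T - t_ref_c) / t_double_c)
      T          = t_ambient_c + theta_c_per_w * (p_dynamic_w + leakage(T))

    Returns (temperature_c, converged).
    """
    temp = t_ambient_c
    for _ in range(max_iter):
        # Leakage grows exponentially with the current temperature estimate.
        p_leak = leak_ref_w * 2.0 ** ((temp - t_ref_c) / t_double_c)
        # Total power (switching + leakage) sets the next temperature estimate.
        new_temp = t_ambient_c + theta_c_per_w * (p_dynamic_w + p_leak)
        if new_temp > t_limit_c:
            return new_temp, False   # treated as thermal runaway
        if abs(new_temp - temp) < 1e-6:
            return new_temp, True    # loop has settled
        temp = new_temp
    return temp, False


# A well-cooled, low-leakage part settles; a poorly cooled, leaky one does not.
print(settle_temperature(10.0, 1.0, 1.0))
print(settle_temperature(10.0, 5.0, 5.0))
```

With these made-up numbers the first case settles a little above ambient plus the dynamic-power temperature rise, while the second exceeds the cutoff after two iterations. In this sketch, that difference comes entirely from the higher leakage and the worse cooling path.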

The effects of the power wall are already apparent in modern processors: even though native transistor switching speeds have continued to double every two process generations, processor frequencies have not increased substantially over the last 8 years. The related “utilization wall,” the shrinking fraction of a chip that can be switched at full speed within a fixed power budget, is getting exponentially worse, roughly by a factor of two, with each process generation. The emergence of three-dimensional (3D) CMOS integration will further exacerbate this problem by substantially increasing device count without improving transistor energy efficiency.

Consequently, multiple innovations in power delivery and chip design are required to take full advantage of future improvements in IC technology. Near-term innovations (within the next few years) involve “More-than-Moore” (MtM) scaling. MtM scaling will attempt to extend the same design principles that have driven digital device scaling for decades over to analog/power, and to integrate those technologies on-die within a SoC/SiP. The ultimate goal of MtM power is to increase system-level power efficiency and capabilities through the integration of both digital and analog/power into compact systems.

A definition supplied by Sarda Technologies states, “Granular power delivery, consisting of fast power gating and fast dynamic voltage scaling of each load, can significantly improve energy efficiency. Granular power delivery uses a dedicated VR for each load – i.e., each component in a system or core (or clusters of cores) in a multi-core processor or system-on-chip (SoC). An analogy is to replace the garden hose with a drip sprinkler system. Each VR supplies only as much power as is needed by either power gating or dynamic voltage scaling. Systems use multiple components – processors, SoCs, FPGAs, memories, radios, etc. – and each component requires multiple voltage levels – for different cores, I/O, etc. Individually optimizing the power delivery to each of the 10-20 loads in a tablet or ultrabook or 50-100 loads in a server or router dramatically improves system energy efficiency and performance.”

Next-generation VHF power switches, such as the GaAs devices under development at Sarda (as well as GaN devices under development elsewhere), are expected to be a key to implementing MtM power. In addition to the significant device design, fabrication and packaging difficulties, power gating and dynamic voltage and frequency scaling (DVFS) have been identified as key challenges to the development of commercially successful MtM powering systems.

Power gating simply turns off the power to loads that are not actively being used at the moment. Processor cores, banks of memory, mass storage, I/O ports, etc. can all be selectively powered down. Although it is a common technique for power management, it suffers from some serious limitations. The performance and energy costs become significant because of the time it takes to power the load back up. Multi-core SoCs use microarchitectural predictive control techniques to determine if a core is likely to be idle for a relatively long duration. The overhead due to wake-up latency and frequent mispredictions can have a significant negative impact on power-performance.
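The trade-off behind that prediction step can be sketched as a break-even calculation. The helper names and all numbers below are hypothetical, and the model is deliberately simple: it lumps every cost of switching a load off and back on into a single overhead energy and ignores the performance penalty of wake-up latency.

```python
def break_even_idle_time(p_idle_w, p_gated_w, e_overhead_j):
    """Shortest idle period (seconds) for which gating saves energy.

    p_idle_w     : power drawn if the load is left on but idle
    p_gated_w    : residual power drawn while the load is gated off
    e_overhead_j : assumed energy cost of one power-down/power-up cycle
    """
    if p_idle_w <= p_gated_w:
        raise ValueError("gating must reduce idle power to be worthwhile")
    return e_overhead_j / (p_idle_w - p_gated_w)


def should_gate(predicted_idle_s, p_idle_w, p_gated_w, e_overhead_j):
    """Gate only if the predicted idle period exceeds the break-even time."""
    return predicted_idle_s > break_even_idle_time(p_idle_w, p_gated_w, e_overhead_j)


# Hypothetical core: 0.5 W when idle, ~0 W when gated, 100 uJ per gating cycle.
print(break_even_idle_time(0.5, 0.0, 100e-6))
print(should_gate(50e-6, 0.5, 0.0, 100e-6))   # short idle period
print(should_gate(1e-3, 0.5, 0.0, 100e-6))    # long idle period
```

Under this model a misprediction costs energy twice: gating a core that wakes up before the break-even time pays the overhead without recovering it, which is one reason faster, lower-overhead gating widens the set of idle periods worth exploiting.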

Dynamic voltage and frequency scaling (DVFS) lowers the supply voltage of a circuit toward its minimum energy point (MEP). DVFS alters the circuit’s performance and power consumption on the fly by changing its supply voltage (V) and frequency (f) together, which can provide a cubic reduction in dynamic power, since dynamic power is proportional to CV²f (where C is the effective load capacitance). At a fixed frequency, scaling the voltage by a given factor scales dynamic power by the square of that factor. However, reducing the voltage means that transistors need more time to switch on and off, which forces a reduction in the operating frequency.

For a given circuit, the MEP is not a fixed voltage. It can vary widely depending on the circuit’s workload and environmental conditions (e.g., temperature) due to opposing trends in the dynamic and leakage energy per clock cycle as the supply voltage scales down. Significant reductions in power consumption (up to 100x) are achieved by decreasing the supply voltage from ~1.1 V under heavy load during active operation down to ~0.3 V (close to the transistor’s threshold voltage) when idling. DVFS seeks to reduce power consumption when cores are idle, boost single-threaded performance in the presence of large workloads, and remap voltage and frequency settings to improve performance and energy utilization.
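Both effects, the cubic scaling of dynamic power and the opposing energy trends that define the MEP, can be explored with a toy model. This is a sketch under stated assumptions rather than a device model: the alpha-power-law speed estimate in `max_frequency`, the constant leakage current `i_leak`, and every default value are hypothetical, and the speed model only applies above the threshold voltage.

```python
def dynamic_power(c_eff, v, f):
    """Dynamic power, proportional to C * V^2 * f."""
    return c_eff * v ** 2 * f


def max_frequency(v, v_th=0.3, k=1e9, alpha=1.5):
    """Assumed alpha-power-law speed model; only valid for v > v_th."""
    return k * (v - v_th) ** alpha / v


def energy_per_cycle(v, i_leak, c_eff=1e-9):
    """Dynamic plus leakage energy for one clock cycle at supply voltage v."""
    e_dynamic = c_eff * v ** 2                  # falls as v is lowered
    e_leakage = i_leak * v / max_frequency(v)   # rises as cycles get longer
    return e_dynamic + e_leakage


def minimum_energy_point(i_leak, v_lo=0.35, v_hi=1.10, steps=76):
    """Grid search for the supply voltage with the lowest energy per cycle."""
    grid = [v_lo + i * (v_hi - v_lo) / (steps - 1) for i in range(steps)]
    return min(grid, key=lambda v: energy_per_cycle(v, i_leak))


# Scaling V and f together by 0.9 scales dynamic power by 0.9 cubed.
print(dynamic_power(1e-9, 0.9, 0.9e9) / dynamic_power(1e-9, 1.0, 1.0e9))

# A leakier (for example, hotter) circuit has its MEP at a higher voltage.
print(minimum_energy_point(i_leak=0.05), minimum_energy_point(i_leak=0.5))
```

In this model the MEP moves upward as leakage grows, which is consistent with the point that the MEP depends on conditions such as temperature and has to be tracked rather than set once.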

The effectiveness of power gating and DVFS depends upon their granularity and transient response. Per-chip DVFS, where a single VR supplies all cores, is widely used today but constrains all the cores to scale their voltage / frequency uniformly and simultaneously. Per-core DVFS, which requires a dedicated VR for each core, provides the greatest flexibility in controlling power. Architectural simulations show 20-30% power savings with fast, per-core DVFS. Per-cluster DVFS, an intermediate point between the extremes of per-chip and per-core DVFS, clusters together several cores in a common voltage / frequency domain. All cores in a domain use a common voltage / frequency setting. This approach takes advantage of the natural granularity of the division into cores and shared cache.
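The effect of granularity can be sketched by comparing the three domain layouts for the same workload. The model is an assumption chosen for simplicity: each core has a normalized frequency demand between 0 and 1, every domain runs at the demand of its busiest core, voltage tracks frequency linearly, and each core therefore draws power proportional to the cube of its domain's setting. The function name and workload values are hypothetical.

```python
def relative_power(demands, domains):
    """Total normalized dynamic power for a given grouping of cores.

    demands : normalized frequency demand of each core (0.0 to 1.0)
    domains : list of lists of core indices sharing one voltage/frequency
    """
    total = 0.0
    for cores in domains:
        # The whole domain must run fast enough for its most demanding core.
        setting = max(demands[i] for i in cores)
        # Assumed: V scales with f, so per-core power scales with setting**3.
        total += len(cores) * setting ** 3
    return total


demands = [1.0, 0.5, 0.5, 0.5]          # one busy core, three lightly loaded

per_chip = relative_power(demands, [[0, 1, 2, 3]])
per_cluster = relative_power(demands, [[0, 1], [2, 3]])
per_core = relative_power(demands, [[0], [1], [2], [3]])

print(per_chip, per_cluster, per_core)
```

The sketch overstates the gap in one respect, since a lightly loaded core in a fast domain would finish early and could be clock gated, but it shows why a single busy core is costly when it sets the operating point for the whole chip.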

Trends in current multi-core systems suggest the following: Future high-throughput systems are likely to pack together a large number of simple cores hosting many more applications. And even though per-core, independent voltage control is currently impractical, future systems with a multitude of cores can be expected to have a small number of independent voltage / frequency domains. As such, cores that differ in power-performance capabilities will exist. This clustered approach to DVFS domains is a way forward for multi-core ICs.

DVFS reduces voltage and frequency on the fly to reduce total power during periods of low activity. Reducing frequency can usually be done quickly, whereas changing voltage requires the regulators to settle at their new output voltage. Changes in voltage must therefore be carefully scheduled in advance so that the voltage has finished ramping up by the time activity in the chip resumes.
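The scheduling constraint amounts to a lead-time calculation. The sketch below assumes a regulator with a constant slew rate followed by a fixed settling time; both figures, like the function name, are hypothetical and would in practice come from the regulator's characterization.

```python
def ramp_up_start_time(t_needed_s, v_now, v_target, slew_v_per_s, settle_s=0.0):
    """Latest time at which a voltage ramp-up can begin.

    t_needed_s   : time at which the higher voltage must be stable
    v_now        : present supply voltage
    v_target     : voltage required for the upcoming activity
    slew_v_per_s : assumed constant regulator slew rate (volts per second)
    settle_s     : assumed settling time after the ramp completes
    """
    if slew_v_per_s <= 0:
        raise ValueError("slew rate must be positive")
    ramp_s = max(0.0, v_target - v_now) / slew_v_per_s
    return t_needed_s - (ramp_s + settle_s)


# Hypothetical case: 0.7 V -> 1.1 V at 4000 V/s with 20 us of settling,
# needed 1 ms from now.
start = ramp_up_start_time(1e-3, 0.7, 1.1, 4000.0, settle_s=20e-6)
print(start)
```

A faster regulator shrinks the lead time, which reduces how far ahead the activity has to be predicted. That is the link between transient response and the usefulness of fine-grained DVFS.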

The semiconductor and electronics industries have hit the energy efficiency wall. The consequence is diminishing performance improvement with each new generation of products and commoditization (which is increasingly apparent in systems ranging from smart phones to servers). MtM, DVFS, power gating and a variety of other developments will be necessary to address this changing and challenging situation.

The Moore’s Law cycle of “scaling” reduces the transistor size, which lowers its operating voltage and, in turn, decreases power consumption per transistor. Consequently, each new generation of ICs has historically doubled transistor density and increased operating speed within the same power budget.

The problem is that CMOS IC scaling is no longer providing the required energy efficiency improvement, because the decrease in the transistor’s threshold voltage has been halted in order to keep leakage current under control. This, in turn, has prevented the supply voltage from scaling. Given the strong dependence of the dynamic energy on supply voltage, chip power increases rapidly as more transistors are integrated on a fixed-size chip with every generation.
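The arithmetic behind this argument can be sketched with textbook scaling factors. These are assumptions, not figures from the article: per generation, transistor count on a fixed-size chip doubles, capacitance per transistor falls by a factor of √2, frequency rises by √2, and supply voltage either falls by √2 (classical scaling) or stays constant. Only dynamic power (C V² f per transistor) is modeled.

```python
import math


def relative_chip_power(generations, voltage_scales):
    """Chip dynamic power after N generations, relative to the starting point."""
    s = math.sqrt(2.0)          # assumed linear scaling factor per generation
    power = 1.0
    for _ in range(generations):
        transistors = s ** 2    # twice as many devices on the same die area
        capacitance = 1.0 / s   # smaller devices switch less charge
        frequency = s           # assumed classical speed-up
        voltage = 1.0 / s if voltage_scales else 1.0
        power *= transistors * capacitance * voltage ** 2 * frequency
    return power


print(relative_chip_power(3, voltage_scales=True))    # supply voltage scales
print(relative_chip_power(3, voltage_scales=False))   # supply voltage stalled
```

Under these assumptions power stays flat while voltage scales and doubles every generation once it stops, so a chip held to a fixed power budget can switch only about half as much of its logic at full speed with each new generation.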

Sarda is only one of several start-ups as well as established power management companies seeking solutions to MtM powering challenges. The first commercial products are already on the horizon from multiple vendors. Darnell’s analysis will provide critical insights into emerging powering technologies, potential competitors, customer needs, pricing requirements and other factors that will drive this revolution in power conversion.

The report is scheduled for release in the first quarter of 2014. For additional information, or to become a sponsor, contact Jeff Shepard at Darnell Group, [email protected] or +1-951-279-6684.