We introduce a methodology for characterizing the effect of workload-induced logic states on chip leakage power. Experimental evaluation on the IBM POWER8 showed that the variation in chip leakage induced by workload-driven logic states is 0.1 to 3.3%. To the best of our knowledge, this is the first work to study the impact of logic states on the post-silicon leakage power of an industry-class chip in a real server system while running workloads. Backed by this data, we derive new takeaways for design and design automation, especially in the context of high-performance microprocessor design.
Experimental Characterization of Workload-induced Logic States on Chip Leakage Power of a Server Class Microprocessor
1. Experimental Characterization of Workload-induced Logic States on Chip Leakage Power of a Server Class Microprocessor
Arun Joseph, Rahul M Rao, Anand Haridass, Spandana Rachamalla, Diyanesh B
IBM Systems Group, Bangalore, India
Contact: arujosep@in.ibm.com
2. Over the last decade or so, prior work has addressed different
aspects of the dependence of leakage power on logic states.
[Shiue, Chen] discussed accounting for this dependence during pre-silicon power modelling
and analysis.
[Pedram, Najm] proposed techniques for minimizing leakage power
consumption through logic state optimization.
Other prior work [Naidu, Aloul] studied efficient techniques for finding
low-leakage vectors, which are applied when the chip enters sleep mode.
Slide 2
Background
3. Interestingly, to the best of our knowledge, no prior work has carefully
characterized the impact of logic states on post-silicon leakage power of an
industry-class microprocessor chip in a real server system, while running
workloads.
This may be because it is challenging to separate leakage
power from total chip power while running workloads.
Additionally, the impact on leakage power must be carefully isolated from other
sources of variation, such as voltage and on-chip temperature, to
perform this characterization.
We believe this experimental analysis will provide valuable new insight for
future research in related areas of design and design automation.
Slide 3
Motivation
4. FreqLeak [Joseph et al.] enables accurate separation of leakage power from total power
while a workload is running, with on-chip voltage and temperature carefully held
constant.
Comparing the separated chip leakage across a range of workload conditions provides a good
estimate of the leakage power variation induced by workload-driven logic states.
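The slide does not describe how FreqLeak works internally. As a rough illustration only, the sketch below assumes a simplified model in which, at fixed voltage and temperature, total power is leakage plus a term proportional to clock frequency, so a straight-line fit over a few frequency steps gives leakage as the zero-frequency intercept. The function name, the linear model, and the sample measurements are all hypothetical and are not taken from [Joseph et al.].

```python
def estimate_leakage(freqs_ghz, total_power_w):
    """Fit P = leakage + slope * f by least squares; return (leakage, slope).

    Assumes (hypothetically) that voltage and temperature are held constant,
    so only the dynamic term changes with frequency.
    """
    n = len(freqs_ghz)
    if n < 2 or n != len(total_power_w):
        raise ValueError("need at least two matching (frequency, power) points")
    mean_f = sum(freqs_ghz) / n
    mean_p = sum(total_power_w) / n
    sxx = sum((f - mean_f) ** 2 for f in freqs_ghz)
    if sxx == 0:
        raise ValueError("frequencies must not all be identical")
    sxy = sum((f - mean_f) * (p - mean_p)
              for f, p in zip(freqs_ghz, total_power_w))
    slope = sxy / sxx
    leakage = mean_p - slope * mean_f  # intercept at f = 0
    return leakage, slope


# Hypothetical measurements for one workload at constant V and T.
freqs = [2.0, 2.5, 3.0, 3.5]        # GHz
power = [60.0, 70.0, 80.0, 90.0]    # W (made-up values)
leak_w, slope_w_per_ghz = estimate_leakage(freqs, power)
```

Repeating such a fit for each workload, under the same voltage and temperature, would give one leakage estimate per workload to compare.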
Slide 4
Main Idea
Fig. 1. FreqLeak Overview
Fig. 2. Separating contribution of workload-induced logic states
5. We used a Power S824 server [IBM RedBooks] built around 22nm IBM POWER8
microprocessors [Fluhr et al.].
The methodology was applied to evaluate the impact on POWER8 VDD leakage power.
Measurements covered a range of workload conditions (zero to high utilization), VDD voltages (1.1V,
1.2V), on-chip temperatures (55C, 75C, 85C), and two unique hardware parts.
Slide 5
Experimental Setup
Fig. 3. Power S824 Server
Fig. 4. IBM POWER8 microprocessor
6. The processor was configured to disable power gating.
Nothing was done during the design of the processor to retain logic states.
All measurements were done while carefully maintaining constant on-chip voltage and
temperature (using a combination of hardware voltage and fan controls, and careful
choice of workloads).
The on-chip temperature profile was kept within 2C across all experiments.
The variation caused by workload-induced logic states (WLS) was measured across
the entire range of conditions.
Slide 6
Experimental Setup
7. WLS Variation (%) = 100 * (Leakage power while running an active workload - Leakage power while running no workload) / Leakage power while running no workload.
Maximum WLS variation was observed to be
3.3% of chip VDD leakage power.
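As a minimal sketch of the definition above, the function below computes WLS variation from two leakage power readings. The wattage values are hypothetical, chosen only so that the result matches the reported 3.3% maximum; the slides do not give absolute leakage figures.

```python
def wls_variation(active_leakage_w, idle_leakage_w):
    """Percent change in leakage under an active workload vs. no workload."""
    if idle_leakage_w <= 0:
        raise ValueError("no-workload leakage power must be positive")
    return 100.0 * (active_leakage_w - idle_leakage_w) / idle_leakage_w


# Hypothetical readings (watts), not taken from the slides.
idle_w = 20.0
active_w = 20.66
variation = wls_variation(active_w, idle_w)
print(f"WLS variation: {variation:.1f}%")  # prints "WLS variation: 3.3%"
```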
Slide 7
Experimental Results
WLS Variation @T=85C, V=1.1V
Workload   WLS Variation (%)
W1         0.3
W2         3.3
W3         1.5
W4         0.6
W5         1.7

WLS Variation @T=55C, V=1.1V
Workload   WLS Variation (%)
W1         0.5
W2         2.1
W3         1.1
W4         1.0
W5         1.5

WLS Variation @T=85C, V=1.2V
Workload   WLS Variation (%)
W1         0.6
W2         2.6
W3         1.7
W4         2.0
W5         2.6
Table 1
Table 2 Table 3
8. Backed by these results, we make new observations, especially in the context of
high-performance microprocessor and server system design.
The best-case benefit of per-state leakage analysis is ~3% compared to
state-independent analysis. This is marginal considering the effort (in creating
per-state models) and the challenges (in handling state explosion) involved.
Power-grid analysis of IR drop due to leakage only needs to consider a maximum of
~3% variation across workloads.
Realistic returns on solutions that try to reduce leakage power by optimizing
logic states are fairly low, especially considering the investment and trade-offs involved.
These observations may hold even more strongly in future technologies (like 14nm), for which
reduced DIBL has been reported [Mouli, Zyuban], further reducing the
variation in leakage power across states.
Slide 8
Experimental Analysis
9. We introduce a methodology for characterizing the effect of workload-induced
logic states on chip leakage power.
Experimental evaluation on the IBM POWER8 showed that the variation in
chip leakage induced by workload-driven logic states is 0.1 to 3.3%.
To the best of our knowledge, this is the first work to study the impact of logic states on the
post-silicon leakage power of an industry-class chip in a real server system while running workloads.
Backed by this data, we derive new takeaways for design
and design automation, especially in the context of high-performance microprocessor
design.
In future work, the proposed methodology needs to be applied to other classes
of chips (non-high-performance) to perform similar characterization.
Slide 9
Summary
Editor's notes
[1] W. Shiue, "Leakage power estimation and minimization in VLSI circuits," ISCAS, 2001, pp. 178-181.
[2] Z. Chen, et al., "Estimation of Standby Leakage Power in CMOS Circuits Considering Accurate Modeling of Transistor Stacks," ISLPED, 1998, pp. 239-244.
[3] A. Abdollahi, F. Fallah, M. Pedram, "Leakage current reduction in CMOS VLSI circuits by input vector control," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Feb. 2004.
[4] J. Halter and F. Najm, "A Gate-Level Leakage Power Reduction Method for Ultra Low Power CMOS Circuits," Proc. of CICC, 1997, pp. 475-478.
[5] S.R. Naidu, E.T.A.F. Jacobs, "Minimizing stand-by leakage power in static CMOS circuits," Design, Automation and Test in Europe, 2001, pp. 370-376.
[6] F. Aloul, S. Hassoun, K. Sakallah, D. Blaauw, "Robust SAT-Based Search Algorithm for Leakage Power Reduction," PATMOS, 2002.
[7] A. Joseph, A. Haridass, C. Lefurgy, S. Pai, S. Rachamalla, F. Campisano, "FreqLeak: A frequency step based method for efficient leakage power characterization in a system," IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED), 22-24 July 2015, pp. 195-200.
[8] IBM RedBooks, "IBM Power Systems S814 and S824 Technical Overview and Introduction," Aug. 2014.
[9] E.J. Fluhr, et al., "POWER8: A server-class processor in 22nm SOI with 7.6 Tb/s off-chip bandwidth," Proc. Int'l Solid State Circuits Conference (ISSCC), Feb. 2014.
[10] R. Bertran, et al., "Systematic Energy Characterization of CMP/SMT Processor Systems via Automated Micro-Benchmarks," International Symposium on Microarchitecture (MICRO), Dec. 2012.
This is a two-socket server in a 19-inch rack-mounted, 4U (EIA units) mechanical form factor. It ships with two IBM POWER8 chips (in 6/12, 8/16, 24 core configurations) and supports a maximum of 1024 GB total memory (16 DDR3 CDIMM slots: 16 GB, 32 GB, or 64 GB @ 1600 MHz).
While some workloads were custom designed, others were created using frameworks like MicroProbe [10]. The workloads range from zero to high utilization of the POWER8 processor. Some workloads perform arithmetic or a combination of different arithmetic operations. Another workload represents a customer worst case, while yet another represents a true worst case workload. All these workloads are such that the temperature profile across the chip is fairly uniform.
In POWER8, there are four digital temperature sensors per core in the processor. A maximum variation of 2C was observed across the different on-chip thermal sensors on the processor.
For other systems without fine hardware controls, an external heater can be used for achieving a constant temperature profile, but at an additional cost.
Pessimistic operating conditions (like 85C, 1.2V) were deliberately chosen to magnify the contribution of leakage power to overall chip power. This is valuable in the context of the paper.
[11] US 7235468 B1: FinFET device with reduced DIBL
[12] Zyuban, V, “Power Optimized Processor Design” in The 2014 International Solid State Circuits Conference (ISSCC).
Although WLS Variation is presented relative to the leakage power while running no workload, the variation remained within 3.3% when the leakage at any other workload (W1-W5) was used as the reference.
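The reference-independence claim can be checked against the three tabulated conditions, as a sketch. The code below normalizes the no-workload leakage to 1.0, reconstructs each workload's relative leakage from its tabulated WLS variation, and recomputes the variation for every ordered pair of workloads, including no workload. The function name and the normalization are illustrative assumptions, and the check covers only the three conditions shown in the tables, not the full set of experiments.

```python
from itertools import permutations

# WLS variation (%) relative to no workload, copied from the slide tables.
tables = {
    "85C, 1.1V": {"W1": 0.3, "W2": 3.3, "W3": 1.5, "W4": 0.6, "W5": 1.7},
    "55C, 1.1V": {"W1": 0.5, "W2": 2.1, "W3": 1.1, "W4": 1.0, "W5": 1.5},
    "85C, 1.2V": {"W1": 0.6, "W2": 2.6, "W3": 1.7, "W4": 2.0, "W5": 2.6},
}


def max_pairwise_variation(variations_pct):
    """Largest |variation| (%) over all (workload, reference) pairs."""
    # Normalized leakage: the no-workload case is 1.0 by construction.
    leak = {"none": 1.0}
    leak.update({w: 1.0 + v / 100.0 for w, v in variations_pct.items()})
    return max(abs(100.0 * (leak[a] - leak[ref]) / leak[ref])
               for a, ref in permutations(leak, 2))


worst = {cond: max_pairwise_variation(v) for cond, v in tables.items()}
for cond, value in worst.items():
    print(f"{cond}: max variation with any reference = {value:.2f}%")
```

For these three conditions the worst case is always a workload measured against the no-workload baseline, so switching the reference does not push the variation above 3.3%.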
Overall leakage power variation is larger at extreme operating conditions (like Table 1: 85C, 1.2V), since leakage power is magnified there.