Experimental Characterization of Workload-induced Logic States on Chip Leakage Power of a Server Class Microprocessor
Arun Joseph, Rahul M Rao, Anand Haridass, Spandana Rachamalla, Diyanesh B
IBM Systems Group, Bangalore, India
Contact: arujosep@in.ibm.com
Slide 2: Background
• Over the last decade or so, prior work has addressed different aspects of the dependence of leakage power on logic states.
• [Shiue, Chen] discussed accounting for this dependence during pre-silicon power modelling and analysis.
• [Pedram, Najm] proposed techniques for minimizing leakage power consumption through logic state optimization.
• Other prior work [Naidu, Aloul] studied efficient techniques for finding low-leakage vectors, which are applied when the chip enters sleep mode.
Slide 3: Motivation
• To the best of our knowledge, no prior work has carefully characterized the impact of logic states on the post-silicon leakage power of an industry-class microprocessor chip in a real server system while running workloads.
• This might be because it is challenging to separate leakage power from total chip power while running workloads.
• Additionally, this characterization requires careful isolation of the logic-state impact on leakage power from other sources of variation, such as voltage and on-chip temperature.
• We believe this experimental analysis will provide valuable new insight for future research in related areas of design and design automation.
Slide 4: Main Idea
• FreqLeak [Joseph et al.] enables accurate separation of leakage power from total power while running a workload, while carefully maintaining constant on-chip voltage and temperature.
• Comparing the separated chip leakage across a range of workload conditions provides a good estimate of the leakage power variation induced by workload-driven logic states.
Fig. 1. FreqLeak Overview
Fig. 2. Separating contribution of workload-induced logic states
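
The slides do not spell out how FreqLeak performs the separation; the cited paper title only describes it as a frequency-step-based method. As a minimal sketch of that general idea, and not the authors' implementation, assume that at fixed voltage and temperature the total power is roughly `P(f) = P_leak + k * f`, so fitting a line through power readings taken at several frequencies gives the leakage as the intercept at `f = 0`. The function names, the linear model, and the readings below are all hypothetical.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("frequencies must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def estimate_leakage(freqs_mhz: Sequence[float],
                     total_power_w: Sequence[float]) -> float:
    """Estimate leakage power (W) as the zero-frequency intercept.

    Assumption: voltage and temperature are held constant across all
    readings, so only the frequency-dependent (dynamic) part changes.
    """
    intercept, _slope = fit_line(freqs_mhz, total_power_w)
    return intercept


# Hypothetical readings: 40 W of leakage plus 0.02 W per MHz of dynamic power.
freqs = [2000.0, 2500.0, 3000.0, 3500.0]
power = [40.0 + 0.02 * f for f in freqs]
leakage = estimate_leakage(freqs, power)
```

Repeating such an estimate under each workload, at the same voltage and temperature, would give the per-workload leakage values that the second bullet compares.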
Slide 5: Experimental Setup
• Used a Power S824 server [IBM RedBooks], which uses 22nm IBM POWER8 microprocessors [Fluhr et al.].
• Applied the method to evaluate the impact on POWER8 VDD leakage power.
• Evaluated across workload conditions (zero to high utilization), VDD voltages (1.1V, 1.2V), on-chip temperatures (55C, 75C, 85C), and two unique hardware parts.
Fig. 3. Power S824 Server
Fig. 4. IBM POWER8 microprocessor
Slide 6: Experimental Setup
• The processor was configured to disable power gating.
• Nothing was done during the design of the processor to retain logic states.
• All measurements were taken while carefully maintaining constant on-chip voltage and temperature (using a combination of hardware voltage and fan controls, and a careful choice of workloads).
• The on-chip temperature profile was kept within 2 degrees Celsius across all experiments.
• The variation caused by workload-induced logic states (WLS) was observed across the entire range of conditions.
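
The 2 degree Celsius limit can be treated as an acceptance check on each measurement run. The sketch below assumes the limit applies to the spread (maximum minus minimum) across the on-chip thermal sensor readings taken during a run; the function names and the sample readings are hypothetical and only illustrate the check.

```python
from typing import Iterable

MAX_SPREAD_C = 2.0  # limit quoted on the slide, in degrees Celsius


def temperature_spread(readings_c: Iterable[float]) -> float:
    """Return max minus min over the sensor readings of one run."""
    values = list(readings_c)
    if not values:
        raise ValueError("at least one sensor reading is required")
    return max(values) - min(values)


def is_thermally_uniform(readings_c: Iterable[float],
                         max_spread_c: float = MAX_SPREAD_C) -> bool:
    """True when the run stays within the allowed temperature spread."""
    return temperature_spread(readings_c) <= max_spread_c


# Hypothetical sensor readings (degrees Celsius) for two runs.
uniform_run = [84.5, 85.0, 85.5, 86.0]  # spread 1.5, acceptable
skewed_run = [82.0, 85.0, 86.5, 85.5]   # spread 4.5, would be rejected
```

A run that fails the check would be repeated or discarded, since a temperature difference would otherwise be mixed into the leakage comparison.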
Slide 7: Experimental Results
• WLS Variation (%) = 100 * (leakage power while running an active workload - leakage power while running no workload) / leakage power while running no workload.
• The maximum WLS variation observed was 3.3% of chip VDD leakage power.

Workload   WLS Variation (%)
W1         0.3
W2         3.3
W3         1.5
W4         0.6
W5         1.7
WLS Variation @ T=85C, V=1.1V

Workload   WLS Variation (%)
W1         0.5
W2         2.1
W3         1.1
W4         1.0
W5         1.5
WLS Variation @ T=55C, V=1.1V

Workload   WLS Variation (%)
W1         0.6
W2         2.6
W3         1.7
W4         2.0
W5         2.6
WLS Variation @ T=85C, V=1.2V

Table 1, Table 2, Table 3
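
The WLS variation definition and the tabulated results can be expressed in a few lines. This is a sketch, not the authors' tooling: the function names are hypothetical, the wattages in the worked example are invented, and each table is keyed by the caption that follows it in the slide text.

```python
from typing import Dict, Tuple


def wls_variation(active_leakage_w: float, idle_leakage_w: float) -> float:
    """WLS variation in percent, relative to leakage with no workload."""
    if idle_leakage_w <= 0:
        raise ValueError("idle leakage power must be positive")
    return 100.0 * (active_leakage_w - idle_leakage_w) / idle_leakage_w


# Values transcribed from the three tables, keyed by (temperature C, VDD V).
TABLES: Dict[Tuple[int, float], Dict[str, float]] = {
    (85, 1.1): {"W1": 0.3, "W2": 3.3, "W3": 1.5, "W4": 0.6, "W5": 1.7},
    (55, 1.1): {"W1": 0.5, "W2": 2.1, "W3": 1.1, "W4": 1.0, "W5": 1.5},
    (85, 1.2): {"W1": 0.6, "W2": 2.6, "W3": 1.7, "W4": 2.0, "W5": 2.6},
}


def worst_case(tables: Dict[Tuple[int, float], Dict[str, float]]):
    """Return (condition, workload, variation) for the largest variation."""
    return max(
        ((cond, wl, v) for cond, rows in tables.items() for wl, v in rows.items()),
        key=lambda item: item[2],
    )


# Hypothetical leakage readings in watts, chosen to give about 3.3%.
example = wls_variation(active_leakage_w=51.65, idle_leakage_w=50.0)
condition, workload, variation = worst_case(TABLES)
```

With the transcribed values, `worst_case` picks out workload W2 with 3.3%, matching the maximum quoted in the second bullet.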
Slide 8: Experimental Analysis
• Backed by these results, we make new observations, especially in the context of high-performance microprocessor and server system design.
• The best-case benefit of per-state leakage analysis is ~3% compared to state-independent analysis. This is marginal considering the effort (in creating per-state models) and challenges (in handling state explosion) involved.
• Power-grid analysis of IR drop due to leakage only needs to consider a maximum of ~3% variation across workloads.
• Realistic returns on solutions that try to reduce leakage power by optimizing logic states are fairly low, also considering the investment and trade-offs involved.
• These observations might hold even more strongly in future technologies (like 14nm), which have reported reduced DIBL [Mouli, Zyuban] and thereby further reduced variation in leakage power across states.
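
One way to read the IR-drop observation, as a sketch under stated assumptions rather than a derivation from the slides: treat the power grid as purely resistive with an effective resistance R_grid (a symbol introduced here for illustration) and hold VDD fixed, so that leakage power is proportional to leakage current.

```latex
\[
V_{\mathrm{IR,leak}} = I_{\mathrm{leak}}\, R_{\mathrm{grid}},
\qquad
P_{\mathrm{leak}} = V_{DD}\, I_{\mathrm{leak}}
\]
\[
\frac{\Delta V_{\mathrm{IR,leak}}}{V_{\mathrm{IR,leak}}}
= \frac{\Delta I_{\mathrm{leak}}}{I_{\mathrm{leak}}}
= \frac{\Delta P_{\mathrm{leak}}}{P_{\mathrm{leak}}}
\lesssim 3\%
\]
```

Under these assumptions, the relative change in leakage-induced IR drop across workloads equals the relative change in leakage power, so the measured bound of about 3% carries over directly.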
Slide 9: Summary
• We introduce a methodology for characterizing the effect of workload-induced logic states on chip leakage power.
• Experimental evaluation on the IBM POWER8 showed that the variation in chip leakage induced by workload-driven logic states is 0.1 to 3.3%.
• This is the first work to study the impact of logic states on the post-silicon leakage power of an industry-class chip in a real server system while running workloads.
• Backed by this data, we derive new takeaways for design and design automation, especially in the context of high-performance microprocessor design.
• In future work, the proposed methodology needs to be applied to other classes of chips (non-high-performance) to perform a similar characterization.

Editor's notes

  1. [1] W. Shiue, "Leakage power estimation and minimization in VLSI circuits," ISCAS, 2001, pp. 178-181.
     [2] Z. Chen et al., "Estimation of Standby Leakage Power in CMOS Circuits Considering Accurate Modeling of Transistor Stacks," ISLPED, 1998, pp. 239-244.
     [3] A. Abdollahi, F. Fallah, M. Pedram, "Leakage current reduction in CMOS VLSI circuits by input vector control," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Feb. 2004.
     [4] J. Halter and F. Najm, "A Gate-Level Leakage Power Reduction Method for Ultra Low Power CMOS Circuits," Proc. of CICC, 1997, pp. 475-478.
     [5] S.R. Naidu, E.T.A.F. Jacobs, "Minimizing stand-by leakage power in static CMOS circuits," Design, Automation and Test in Europe, 2001, pp. 370-376.
     [6] F. Aloul, S. Hassoun, K. Sakallah, D. Blaauw, "Robust SAT-Based Search Algorithm for Leakage Power Reduction," PATMOS, 2002.
  2. [7] A. Joseph, A. Haridass, C. Lefurgy, S. Pai, S. Rachamalla, F. Campisano, "FreqLeak: A frequency step based method for efficient leakage power characterization in a system," IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED), 22-24 July 2015, pp. 195-200.
  3. [8] IBM RedBooks, "IBM Power Systems S814 and S824 Technical Overview and Introduction," Aug. 2014.
     [9] E.J. Fluhr et al., "POWER8: A server-class processor in 22nm SOI with 7.6 Tb/s off-chip bandwidth," Proc. Int'l Solid State Circuits Conference (ISSCC), Feb. 2014.
     [10] R. Bertran et al., "Systematic Energy Characterization of CMP/SMT Processor Systems via Automated Micro-Benchmarks," International Symposium on Microarchitecture (MICRO), Dec. 2012.
     The Power S824 is a two-socket server in a 19-inch rack-mounted, 4U (EIA units) mechanical form factor. It ships with two IBM POWER8 chips (in 6/12, 8/16, 24 core configurations) and supports a maximum of 1024 GB of total memory (16 DDR3 CDIMM slots: 16 GB, 32 GB, or 64 GB at 1600 MHz).
     While some workloads were custom designed, others were created using frameworks like MicroProbe [10]. The workloads range from zero to high utilization of the POWER8 processor. Some workloads perform arithmetic or a combination of different arithmetic operations. Another workload represents a customer worst case, while yet another represents a true worst-case workload. All these workloads were chosen so that the temperature profile across the chip is fairly uniform.
     In POWER8, there are four digital temperature sensors per core. A maximum variation of 2C was noted across the different on-chip thermal sensors on the processor. For other systems without fine hardware controls, an external heater can be used to achieve a constant temperature profile, but at an additional cost.
  4. Pessimistic operating conditions (like 85C, 1.2V) were deliberately chosen to magnify the contribution of leakage power to overall chip power. This is valuable in the context of the paper.
  5. [11] US 7235468 B1, "FinFET device with reduced DIBL."
     [12] V. Zyuban, "Power Optimized Processor Design," International Solid State Circuits Conference (ISSCC), 2014.
     Although WLS variation is presented relative to leakage power while running no workload, the variation was still within 3.3% when leakage at any other workload (W1-W5) was used as the reference. Overall leakage power variation is larger at extreme operating conditions (like Table 1: 85C, 1.2V), since leakage power is magnified there.