2. CONTENT
• Introduction
• Approach
Power consumption in CMOS
Circuit level power optimization
Logic synthesis for low power
Algorithmic level Design
Architecture level power optimization
• Conclusion
• Acknowledgement
• References
3. Introduction
Why Low Power???
• Low power architecture has become necessary
with new-age demands:
– Increasing design complexity
– Demand for portable equipment
• Communication
• Media
• Mobile computers
• Most embedded systems run on batteries
– Objective: extend battery life as long as possible
without sacrificing too much performance
4. Objective
• Lower running costs
• Reduce cooling requirements.
• To reduce noise.
• To reduce operating costs for energy and
cooling.
• Reduce overall energy consumption
• Battery energy capacity will not grow drastically in the
near future, for technology and safety
reasons.
5. POWER CONSUMPTION IN CMOS
CMOS CIRCUIT:
• CMOS devices are best known for their low power
consumption.
• This section covers the different types of power consumption in a CMOS
logic circuit, focusing on the calculation of the power-dissipation
capacitance (Cpd) and, finally, the
determination of total power consumption in a CMOS
device.
6. WHY CMOS?
• To determine power-supply sizing
• To determine current requirements
• To determine cooling/ heat sink requirements
• Criteria for device selection
• Determine the maximum reliable operating
frequency.
7. COMPONENTS OF POWER
DISSIPATION
• Static power dissipation: dissipation that occurs
regardless of logic-level changes at the
inputs and outputs.
• Dynamic power dissipation: dissipation that occurs whenever the
logic level changes at different points in the
circuit because of a change in the
input signals.
8. Static Power Consumption
• Static power consumption is the product of
the device leakage current and the supply
voltage.
• The total static power consumption PS is therefore:
PS = (leakage current)*(supply voltage)
PS = VCC*ICC, where ICC is the leakage current
9. Dynamic Power Consumption
• The power-dissipation capacitance Cpd, used to determine dynamic
power consumption, can be calculated as:
Cpd = ICC/(VCC*fI) - CL(eff)
Where:
fI = input frequency (Hz)
VCC = supply voltage (V)
CL(eff) = effective load capacitance on the board (F)
ICC = measured value of current into the device (A)
10. • The effective load capacitance is calculated as:
CL(eff) = (CL*Nsw*fO)/fI
Where:
fO/fI = ratio of output frequency to input frequency (dimensionless)
Nsw = number of bits switching
CL = load capacitance (F)
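The two formulas above can be chained in a short calculation. The following is a minimal Python sketch; the function names and all measurement values are hypothetical and chosen only to exercise the formulas, not taken from any datasheet.

```python
def effective_load_capacitance(c_load, n_switching, f_out, f_in):
    """CL(eff) = CL * Nsw * fO / fI, in farads."""
    return c_load * n_switching * f_out / f_in


def power_dissipation_capacitance(i_cc, v_cc, f_in, c_load_eff):
    """Cpd = ICC / (VCC * fI) - CL(eff), in farads."""
    return i_cc / (v_cc * f_in) - c_load_eff


# Hypothetical measurement values (assumptions for illustration only).
V_CC = 3.3        # supply voltage (V)
F_IN = 1e6        # input frequency (Hz)
F_OUT = 1e6       # output frequency (Hz)
C_LOAD = 50e-12   # load capacitance (F)
N_SW = 1          # number of bits switching
I_CC = 250e-6     # measured current into the device (A)

cl_eff = effective_load_capacitance(C_LOAD, N_SW, F_OUT, F_IN)
cpd = power_dissipation_capacitance(I_CC, V_CC, F_IN, cl_eff)
print(f"CL(eff) = {cl_eff * 1e12:.2f} pF, Cpd = {cpd * 1e12:.2f} pF")
```

Note that CL(eff) must be computed first, since Cpd is what remains of the measured switching capacitance after the external load is subtracted.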
11. Power consumption can be minimized
in a number of ways:
• Reducing DC power consumption, which comes from leakage.
• Using minimum-size devices.
• Choosing low-power devices, with systems today
using devices in the 1.5-V to 3.3-V VCC range.
• Limiting dynamic power consumption by reducing
supply voltage, switched capacitance, and the frequency at
which the logic is clocked.
12. Benefits
• Power consumption is a function of load capacitance,
frequency of operation, and supply voltage. A reduction of
any one of these is beneficial.
• A reduction in power consumption provides several benefits:
• Less heat is generated.
• The reliability of the system is increased.
• Battery life is extended in battery-powered systems.
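To see how each of the three factors contributes, the sketch below uses the common first-order model for switching power, P = C * V^2 * f. This model is an assumption brought in for illustration (the slides only state the dependence qualitatively), and the numeric values are hypothetical.

```python
def dynamic_power(c_switched, v_supply, f_clock):
    """First-order switching power model: P = C * V^2 * f, in watts."""
    return c_switched * v_supply ** 2 * f_clock


# Hypothetical baseline: 100 pF switched capacitance, 3.3 V, 50 MHz.
base = dynamic_power(100e-12, 3.3, 50e6)
half_voltage = dynamic_power(100e-12, 1.65, 50e6)
half_frequency = dynamic_power(100e-12, 3.3, 25e6)
half_capacitance = dynamic_power(50e-12, 3.3, 50e6)

print(f"baseline:         {base * 1e3:.3f} mW")
print(f"half voltage:     {half_voltage / base:.2f} x baseline")
print(f"half frequency:   {half_frequency / base:.2f} x baseline")
print(f"half capacitance: {half_capacitance / base:.2f} x baseline")
```

Under this model, halving the supply voltage is worth more than halving either capacitance or frequency, because voltage enters quadratically.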
13. Circuit-level power optimization
Techniques used to reduce power consumption at the
circuit level
• Transistor Sizing
• Voltage Scaling
• Voltage Islands
• Variable VDD
• Multiple threshold voltages
• Power gating
• Long channel transistors
• Stacking and parking states
14. Transistor Sizing
• Adjusting the size of each
gate or transistor for
minimum power.
• Transistor sizing involves
increasing the gate width
to increase its speed.
• All combinational static
CMOS gates are
composed of pull-up
networks and pull-down
networks. Rise and fall times are
measured with reference to a load
capacitor charging and
discharging through the
device.
15. Voltage Scaling
• Lower supply voltages use less power, but circuits run slower.
• Voltage scaling is a power management technique in computer
architecture, where the voltage used in a component is increased
or decreased, depending upon circumstances.
Undervolting: decrease voltage to conserve power.
Overvolting: increase voltage to allow operation at higher speed.
Voltage Island
• Different blocks can be run at different voltages, saving power. This
design practice may require the use of level-shifters when two
blocks with different supply voltages communicate with each
other.
16. Variable VDD
• The voltage for a single block can be varied during operation - high
voltage (and high power) when the block needs to go fast, low
voltage when slow operation is acceptable.
Multiple threshold voltages
• Power can be saved by using a mixture of CMOS transistors with
two or more different threshold voltages.
• Two different thresholds available:
1. High-Vt 2. Low-Vt
• High threshold transistors are slower but leak less, and can be
used in non-critical circuits.
17. Power Gating
• This technique uses high-Vt sleep transistors that cut off a circuit
block when the block is not switching.
• Also known as MTCMOS (Multi-Threshold CMOS), it reduces
stand-by or leakage power.
• Enables Iddq testing: a method for testing CMOS integrated
circuits for the presence of manufacturing faults.
Long Channel Transistors
• Transistors of more than minimum length leak less, but are bigger
and slower.
18. Stacking and parking states
• Logic gates may leak differently during logically equivalent input
states (say 10 on a NAND gate, as opposed to 01).
• State machines may have less leakage in certain states.
Logic styles
• Dynamic and static logic have different speed/power tradeoffs.
19. Logic synthesis for low power
Logic synthesis: a process by which an abstract form of desired
circuit behavior, typically at register transfer level (RTL), is turned
into a design implementation in terms of logic gates, typically by a
computer program called a synthesis tool.
The following steps can have a significant impact on power
optimization:
• Clock Gating
• Technology Mapping
• Finite-State Machine Decomposition
20. Clock Gating
• Saves power by adding more logic to a circuit to prune
the clock tree.
• Pruning the clock disables portions of the circuitry so that
the flip-flops in them do not have to switch states. Switching
states consumes power. When the flip-flops are not being switched, the
switching power consumption goes to zero, and only leakage
currents are incurred.
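The effect of pruning the clock can be illustrated by counting how many flip-flop clock events occur with and without gating. This is a deliberately simplified Python sketch: the enable trace and register width are hypothetical, and the model ignores the power of the added gating logic and of the clock tree itself.

```python
def clocked_flop_events(enable_trace, n_flops, gated):
    """Count flip-flop clock events over a trace of per-cycle enable bits.

    Without gating, every flop is clocked in every cycle. With gating,
    the clock only reaches the flops in cycles where enable is 1.
    """
    if gated:
        active_cycles = sum(1 for enable in enable_trace if enable)
    else:
        active_cycles = len(enable_trace)
    return active_cycles * n_flops


# Hypothetical 32-bit register that is only updated in 2 of 8 cycles.
trace = [1, 0, 0, 0, 1, 0, 0, 0]
ungated = clocked_flop_events(trace, n_flops=32, gated=False)
gated = clocked_flop_events(trace, n_flops=32, gated=True)
print(f"ungated: {ungated} clock events, gated: {gated} clock events")
```

The saving grows with the fraction of idle cycles, which is why gating pays off most on blocks that are rarely updated.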
Technology mapping
• The process of converting a schematic (expression) built from AND,
OR, and NOT gates into one built from NAND and NOR gates.
• To reduce implementation cost and turnaround time,
designers use gate arrays.
• These gate arrays contain only m-input NAND and NOR gates,
where m is usually 3.
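As a small illustration of such a mapping, the Python sketch below rebuilds NOT, AND, and OR from 2-input NAND gates and checks a mapped expression against the original by exhaustive truth table. The example function f = (a AND b) OR (NOT c) is hypothetical, and real technology mappers also optimize for area, delay, and power, which this sketch does not attempt.

```python
from itertools import product


def nand(a, b):
    """2-input NAND on 0/1 values."""
    return 1 - (a & b)


def not_(a):
    return nand(a, a)


def and_(a, b):
    return not_(nand(a, b))


def or_(a, b):
    # De Morgan: a OR b = NAND(NOT a, NOT b)
    return nand(not_(a), not_(b))


def f_original(a, b, c):
    """Hypothetical source expression: (a AND b) OR (NOT c)."""
    return (a & b) | (1 - c)


def f_mapped(a, b, c):
    """Same function using two NAND gates: NAND(NAND(a, b), c)."""
    return nand(nand(a, b), c)


mapping_ok = all(
    f_original(*bits) == f_mapped(*bits) for bits in product((0, 1), repeat=3)
)
print("mapped circuit matches original:", mapping_ok)
```

A gate-by-gate substitution using `and_`, `or_`, and `not_` would need more NAND gates than `f_mapped`; merging gates like this is part of what a mapper does.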
22. Finite State Machine Decomposition
• Compute two sub-FSMs together having the same functionality as
the original FSM.
• For all the transitions within one sub-FSM, the clock for the other
sub-FSM is disabled.
• To minimize the average switching activity, we search for a small
cluster of states with high stationary state probability and use it to
create the small sub FSM.
• This way we will have a small amount of logic that is active most of
the time, during which the much larger circuit, the other
sub-FSM, is disabled.
• Power consumption can be substantially reduced, in some cases by up
to 80%.
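The search for states with high stationary probability can be sketched as follows. This Python example treats the FSM as a Markov chain with a hypothetical 4-state transition matrix, estimates the stationary distribution by power iteration, and simply picks the two most probable states. It is only a first step: an actual decomposition would also weigh how often control crosses between the two sub-FSMs.

```python
def stationary_distribution(transition, iterations=1000):
    """Estimate the stationary distribution of a row-stochastic matrix
    by repeatedly applying it to a uniform starting distribution."""
    n = len(transition)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * transition[i][j] for i in range(n)) for j in range(n)]
    return pi


# Hypothetical state transition probabilities (each row sums to 1).
P = [
    [0.90, 0.10, 0.00, 0.00],
    [0.20, 0.70, 0.10, 0.00],
    [0.00, 0.00, 0.50, 0.50],
    [0.50, 0.00, 0.50, 0.00],
]

pi = stationary_distribution(P)
# Candidate states for the small sub-FSM: the two most probable ones.
small_cluster = sorted(range(len(pi)), key=lambda s: pi[s], reverse=True)[:2]
coverage = sum(pi[s] for s in small_cluster)
print("stationary distribution:", [round(p, 3) for p in pi])
print("small sub-FSM states:", small_cluster, "coverage:", round(coverage, 3))
```

For this matrix, the machine spends most of its time in the chosen cluster, so the logic for the remaining states could stay clock-disabled most of the time.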
23. Algorithmic Level Design
• Minimizing the switching activity at a high level is one way to reduce
the power dissipation of digital processors.
• One method to minimize signal switching at the algorithmic
level is to use an appropriate coding for the signals rather than
straight binary code.
24. State Encoding for a Counter
• Two-bit binary counter:
State sequence: 00 → 01 → 10 → 11 → 00
Six bit transitions in four clock cycles
6/4 = 1.5 transitions per clock
• Two-bit Gray-code counter:
State sequence: 00 → 01 → 11 → 10 → 00
Four bit transitions in four clock cycles
4/4 = 1.0 transition per clock
• The Gray-code counter is more power efficient
28. N-Bit Counter: Toggles in Counting Cycle
• Binary counter: T(binary) = 2(2^N - 1)
• Gray-code counter: T(gray) = 2^N
• T(gray)/T(binary) = 2^(N-1)/(2^N - 1) → 0.5
(N is power of 2)
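The toggle counts for both encodings can be checked by brute force. The Python sketch below counts bit flips over one full counting cycle, including the wrap-around to the first state, and compares the result with the closed forms 2(2^N - 1) for binary and 2^N for Gray code. The helper names are illustrative, and the Gray sequence is generated with the standard i XOR (i >> 1) construction.

```python
def toggles_per_cycle(states):
    """Total bit flips over one full counting cycle, including wrap-around."""
    total = 0
    for i, state in enumerate(states):
        next_state = states[(i + 1) % len(states)]
        total += bin(state ^ next_state).count("1")
    return total


def binary_sequence(n_bits):
    return list(range(2 ** n_bits))


def gray_sequence(n_bits):
    # Reflected binary Gray code: consecutive values differ in one bit.
    return [i ^ (i >> 1) for i in range(2 ** n_bits)]


for n in range(1, 9):
    t_binary = toggles_per_cycle(binary_sequence(n))
    t_gray = toggles_per_cycle(gray_sequence(n))
    assert t_binary == 2 * (2 ** n - 1)
    assert t_gray == 2 ** n
    print(f"N={n}: binary={t_binary}, gray={t_gray}, ratio={t_gray / t_binary:.3f}")
```

The printed ratio starts at 1.0 for N = 1 and approaches 0.5 as N grows, so for wide counters Gray coding roughly halves the state-register switching.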
29. ARCHITECTURE LEVEL
Statically Pipelined Processor Supported by an
Optimizing Compiler
Idea: Control during each cycle for each
portion of the processor is explicitly
represented in each instruction.
31. TRADITIONAL PIPELINE:
1. Instructions spend several cycles in pipeline.
2. Information about each instruction flows through pipeline
via pipeline registers to control each portion of processor
that will take a specific action during each cycle.
STATIC PIPELINE:
1. Data still passes through processor in multiple cycles.
2. How each portion of processor is controlled during each
cycle is explicitly represented in each instruction.
3. Instructions are encoded to simultaneously perform actions
associated with separate pipeline stages.
32. FEATURES OF STATIC PIPELINE:
• It is determined statically by the compiler as opposed to
dynamically by hardware.
• It doesn’t need pipeline registers as it doesn’t need to break
the instructions into multiple stages.
• It has ten internal registers which are explicitly read and
written by instructions and can hold their values across
multiple cycles.
• Internal registers are accessed only when needed unlike
pipeline registers which are read and written every cycle.
33. • It is a two-stage processor: fetch, and everything after
fetch.
• Everything after the fetch operation happens in parallel.
• Instructions are already partially decoded as compared to
a traditional pipeline.
• Branch penalty is reduced to one cycle.
• Instruction Set Architecture is quite different as compared to
traditional processors.
• Each instruction consists of a set of effects.
• Each effect updates some portion of the processor.
34. • An instruction can include: 1 ALU operation, 1 memory operation, 2 register
reads, 1 register write and 1 sign extension.
• Next PC can be assigned the value of one of the internal
registers.
• If the ALU operation is a branch operation, next PC will be set
according to the outcome of the branch.
• A register is not read in the same instruction as the arithmetic
operation that uses it.
• To have both integer and floating point register files we would
need 1 extra bit for each register field.
• To avoid this problem, we use a single register file to hold
both integer and floating point values.
35. COMPILATION:
• Static Pipeline Architecture exposes more details of
data path to the compiler.
• It allows compiler to perform optimizations that
would not be possible on a conventional machine.
36. TYPES OF OPTIMIZATIONS
• COPY PROPAGATION:
It is used for an assignment like x = y, where the
compiler replaces later uses of x with y as long as
intervening instructions have not changed the value
of x or y.
• DEAD ASSIGNMENT ELIMINATION:
It removes assignments to registers when the value
is never read.
37. • SUB-EXPRESSION ELIMINATION:
It looks for instances where values are produced more
than once and replaces subsequent productions of
the value with the initial one.
• REDUNDANT ASSIGNMENT ELIMINATION:
It removes assignments that have been made
previously, so long as the values have not changed
since the last assignment.
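Two of these optimizations, copy propagation and dead assignment elimination, can be sketched on a toy straight-line program. The instruction format below, (dest, op, src1, src2) with string register names and integer constants, is a hypothetical representation invented for this example and is not the static pipeline's actual instruction encoding. The sketch also assumes no branches and no side effects.

```python
def copy_propagate(code):
    """Replace uses of x with y after 'x = y' until x or y is overwritten."""
    copies = {}  # maps x -> y for copies that are still valid
    out = []
    for dest, op, a, b in code:
        a = copies.get(a, a)
        b = copies.get(b, b)
        # Writing dest invalidates every recorded copy that mentions it.
        copies = {x: y for x, y in copies.items() if x != dest and y != dest}
        if op == "copy" and a != dest:
            copies[dest] = a
        out.append((dest, op, a, b))
    return out


def eliminate_dead_assignments(code, live_out):
    """Drop assignments whose destination is never read afterwards."""
    live = set(live_out)
    kept = []
    for dest, op, a, b in reversed(code):
        if dest in live:
            live.discard(dest)
            live.update(r for r in (a, b) if isinstance(r, str))
            kept.append((dest, op, a, b))
    kept.reverse()
    return kept


program = [
    ("x", "copy", "y", None),  # x = y
    ("z", "add", "x", 1),      # z = x + 1
    ("t", "add", "x", "x"),    # t = x + x (t is never read)
    ("w", "add", "z", "y"),    # w = z + y
]

propagated = copy_propagate(program)
optimized = eliminate_dead_assignments(propagated, live_out={"w"})
for instruction in optimized:
    print(instruction)
```

Running the passes in this order matters: once copy propagation has rewritten the uses of x to y, the copy itself becomes a dead assignment and is removed along with t.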
38. CONCLUSION
• Power is critical in processor design: it affects cost and
dependability.
• New approaches at the architectural and circuit levels are
being proposed.
• Accurate energy consumption values are yet to be
assessed.
• The compiler optimizations and circuit-level techniques
suggested have the potential to be viable techniques for
significantly reducing processor energy consumption.
39. FUTURE ENHANCEMENTS:
• Several other possibilities for encoding instructions
exist (using different formats for different sets of
effects to perform).
• A Verilog model would allow accurate
measurement of energy consumption as well as area
and timing.
40. ACKNOWLEDGEMENT
We would like to thank our Professor and Guide, Mr.
Koushlendra Kumar Singh, for giving us the
opportunity to present the topic of Low Power
Architecture in such a creative manner.
The journey was indeed very illuminating, and we
learned several new things.
41. REFERENCES
• Wikipedia, The Free Encyclopaedia
• Low-Power Architecture: Bill Dally, Stanford University.
• T. Austin, E. Larson, and D. Ernst. SimpleScalar: An
Infrastructure for Computer System Modelling.
Computer, 35(2):59–67, 2002.
• P. Sassone, D. Wills, and G. Loh. Static Strands: Safely
Collapsing Dependence Chains for Increasing
Embedded Power Efficiency. In Proceedings of the 2005
ACM SIGPLAN/SIGBED conference on Languages,
compilers, and tools for embedded systems, pages
127–136. ACM, 2005.