EC6009-Advanced Computer Architecture
UNIT I FUNDAMENTALS OF COMPUTER DESIGN 9
Review of Fundamentals of CPU, Memory and IO – Trends in technology, power, energy
and cost, Dependability - Performance Evaluation
UNIT II INSTRUCTION LEVEL PARALLELISM 9
ILP concepts – Pipelining overview - Compiler Techniques for Exposing ILP – Dynamic
Branch Prediction – Dynamic Scheduling – Multiple instruction Issue – Hardware Based
Speculation – Static scheduling - Multi-threading - Limitations of ILP – Case Studies.
UNIT III DATA-LEVEL PARALLELISM 9
Vector architecture – SIMD extensions – Graphics Processing units – Loop level
parallelism.
UNIT IV THREAD LEVEL PARALLELISM 9
Symmetric and Distributed Shared Memory Architectures – Performance Issues –
Synchronization – Models of Memory Consistency – Case studies: Intel i7 Processor, SMT
& CMP Processors
UNIT V MEMORY AND I/O 9
Cache Performance – Reducing Cache Miss Penalty and Miss Rate – Reducing Hit Time –
Main Memory and Performance – Memory Technology. Types of Storage Devices – Buses
– RAID – Reliability, Availability and Dependability – I/O Performance Measures.
7/3/2019 VII-ECE-B
On completion of the course, the students will be able to
CO1 Explain the performance of different architectures with respect to various parameters K2
CO2 Describe the performance of different ILP techniques K2
CO3 Discuss the performance of different architectures & exploiting DLP K2
CO4 Illustrate the concepts of Transport level protocol. K2
CO5 Distinguish cache and memory related issues in multiprocessor. K2
EC6009-Advanced Computer Architecture
CO/PO PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12 PSO1 PSO2
CO1 3 2 1
CO2 3 2 1
CO3 3 2 1
CO4 3 2 1
CO5 3 2 2
C404 3 2 1 - - - - - - - -- - - -
• PO 1 - Engineering Knowledge
• PO 2 - Problem analysis
• PO 3 - Design / development of solutions
• PO 4 - Conduct investigations of complex problems
• PO 5 - Modern tool usage
• PO 6 - Engineer and Society
• PO 7 - Environment and sustainability
• PO 8 - Ethics
• PO 9 - Individual and Team-work
• PO 10 - Communication
• PO 11 - Project management and finance
• PO 12 - Life-long learning
• EC6009
• Advanced Computer Architecture
• UNIT-1
• Fundamentals of Computer Design
Outline
• 1.1 Introduction
• 1.2 Classes of Computers
• 1.3 Defining Computer Architecture
• 1.4 Trends in Technology
• 1.5 Trends in Power in Integrated Circuits
• 1.6 Trends in Cost
• 1.7 Dependability
• 1.8 Measuring, Reporting, and Summarizing Performance
• 1.9 Quantitative Principles of Computer Design
• 1.10 Putting It All Together: Performance and Price-Performance
Computer Technology
• Performance improvements:
– Improvements in semiconductor technology
• Feature size, clock speed
– Improvements in computer architectures
• Enabled by HLL compilers, UNIX
• Led to RISC (reduced instruction set computer) architectures
– Together have enabled:
• Lightweight computers
• Productivity-based managed/interpreted programming languages
Single Processor Performance
[Figure: single-processor performance, annotated with "RISC" and "Move to multi-processor".]
Current Trends in Architecture
• Cannot continue to leverage instruction-level parallelism (ILP)
– The rapid improvement in single-processor performance ended in 2003
• New models for performance:
– Data-level parallelism (DLP)
– Thread-level parallelism (TLP)
– Request-level parallelism (RLP)
• These require explicit restructuring of the application
Organization, Hardware, and Architecture
• Organization: the high-level aspects of a computer’s design.
– Includes the memory system, the memory interconnect, and the design of the internal processor or CPU (arithmetic, logic, branching, and data transfer).
– For example, the AMD Opteron 64 and the Intel P4 have the same ISA but different internal pipeline and cache organizations.
• Hardware: the detailed logic design and the packaging technology.
– For example, the P4 and the Mobile P4 have the same ISA and organization but differ in clock frequency and memory system.
• Architecture: covers all three aspects of computer design: instruction set architecture, organization, and hardware.
– The designer must meet functional requirements as well as price, power, performance, and availability goals.
Instruction Set Architecture: Critical Interface
• Properties of a good abstraction
– Lasts through many generations (portability)
– Used in many different ways (generality)
– Provides convenient functionality to higher levels
– Permits an efficient implementation at lower levels
[Figure: the instruction set as the interface between software and hardware.]
Instruction Set Architecture (ISA)
• The ISA is the actual programmer-visible instruction set.
– Class of ISA:
• General-purpose register architecture (register-memory, load-store)
• Stack architecture
– Memory addressing: a program running on a 32-bit processor can address up to 4 GB (2^32 bytes) of address space.
– Addressing modes: direct, indirect, etc.
– Types and sizes of operands: the common types supported by an ISA include signed and unsigned integers and single- and double-precision floating-point numbers.
– Data processing and control flow instructions.
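The 4 GB figure for a 32-bit address can be checked with a short calculation. This is a minimal sketch that assumes byte addressing; the function name `addressable_bytes` is made up for this example.

```python
def addressable_bytes(address_bits: int) -> int:
    """Bytes reachable with an address of the given width, assuming byte addressing."""
    return 2 ** address_bits


GIB = 2 ** 30  # one gigabyte in the binary sense used on the slide

space_32 = addressable_bytes(32)
print(space_32 // GIB)  # 4
```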
Classes of Computers
• Personal Mobile Device (PMD)
– e.g., smart phones, tablet computers
– Emphasis on energy efficiency and real-time performance
• Desktop Computing (Work stations)
– Emphasis on price-performance
• Servers (Main frame)
– Emphasis on availability, scalability, throughput
• Clusters / Warehouse Scale Computers
– Used for “Software as a Service (SaaS)”
– Emphasis on availability and price-performance
– Sub-class: Supercomputers, emphasis: floating-point
performance and fast internal networks
• Embedded Computers
– Emphasis: price
Trends in Technology
• A successful new ISA may last decades; the IBM mainframe is one example.
• Four critical technologies:
– Integrated circuit logic technology: transistor density increases by about 35% per year, quadrupling in somewhat over four years.
– Semiconductor DRAM (dynamic random-access memory): capacity increases by about 40% per year, doubling roughly every two years.
– Magnetic disk technology: improvement has followed a roller coaster of rates; disks are 50 to 100 times cheaper per bit than DRAM.
– Network technology: network performance depends on the performance of both the switches and the transmission system.
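The growth rates quoted above can be turned into doubling and quadrupling times with a compound-growth calculation. The sketch below is only a sanity check of the slide's figures; the helper name `years_to_multiply` is an assumption for this example.

```python
import math


def years_to_multiply(annual_growth: float, factor: float) -> float:
    """Years needed to grow by `factor` at a compound rate (0.35 means 35% per year)."""
    return math.log(factor) / math.log(1.0 + annual_growth)


logic_quadruple_years = years_to_multiply(0.35, 4.0)  # transistor density
dram_double_years = years_to_multiply(0.40, 2.0)      # DRAM capacity

print(f"Logic density quadruples in about {logic_quadruple_years:.1f} years")
print(f"DRAM capacity doubles in about {dram_double_years:.1f} years")
```

Both results agree with the slide's wording: somewhat over four years to quadruple at 35% per year, and roughly two years to double at 40% per year.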
Scaling of Transistor Performance and Wires
• Feature size: the minimum size of a transistor or a wire in either the x or y dimension.
– From 10 microns in 1971 to 0.09 microns (90 nm) in 2006;
– The density of transistors increases quadratically with a linear decrease in feature size;
– Transistor performance improves linearly with decreasing feature size;
– Thanks to the improvement in transistor density, CPUs moved quickly from 4-bit to 8-bit, to 16-bit, to 32-bit microprocessors.
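As a rough illustration of the quadratic rule above, the sketch below estimates the density gain implied by a shrink in feature size. The function name `density_gain` is an assumption for this example, and the result is an idealized scaling estimate, not a measured transistor count.

```python
def density_gain(old_feature_um: float, new_feature_um: float) -> float:
    """Idealized density ratio: density scales with 1 / (feature size)^2."""
    linear_shrink = old_feature_um / new_feature_um
    return linear_shrink ** 2


# Feature sizes taken from the slide: 10 microns (1971) to 0.09 microns (2006).
gain_1971_to_2006 = density_gain(10.0, 0.09)
print(f"Idealized density gain: about {gain_1971_to_2006:,.0f}x")
```

A linear shrink of roughly 111x therefore corresponds to a density gain on the order of ten thousand under this idealized model.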
Performance Trends: Bandwidth over Latency
• Bandwidth or throughput: the total amount of work done in a given time.
– Such as megabytes per second for a disk transfer.
• Latency or response time: the time between the start and the completion of an event.
– Such as milliseconds for a disk access.
Power
• Power also provides challenges as devices are scaled.
– Dynamic power (watts, W) in a CMOS chip: traditionally, the dominant energy consumption has been in switching transistors:
$$\text{Power}_{\text{dynamic}} = \frac{1}{2} \times \text{Capacitive load} \times \text{Voltage}^2 \times \text{Frequency switched}$$
– Mobile devices care about battery life more than power, so energy, measured in joules, is the proper metric:
$$\text{Energy}_{\text{dynamic}} = \text{Capacitive load} \times \text{Voltage}^2$$
† In modern VLSI, the exact power measurement is the sum
$$\text{Power}_{\text{total}} = \text{Power}_{\text{dynamic}} + \text{Power}_{\text{static}} + \text{Power}_{\text{leakage}}$$
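The two dynamic formulas can be explored numerically. The sketch below assumes a hypothetical chip (the capacitance, voltage, and frequency values are invented for illustration) and shows what happens when both voltage and frequency are reduced by 15%.

```python
def dynamic_power(capacitive_load: float, voltage: float, frequency: float) -> float:
    """Dynamic power in watts: 1/2 x capacitive load x voltage^2 x frequency switched."""
    return 0.5 * capacitive_load * voltage ** 2 * frequency


def dynamic_energy(capacitive_load: float, voltage: float) -> float:
    """Dynamic energy in joules: capacitive load x voltage^2."""
    return capacitive_load * voltage ** 2


# Hypothetical baseline values, chosen only for illustration.
C_LOAD = 1e-9      # farads
VOLTAGE = 1.2      # volts
FREQUENCY = 2e9    # hertz

scale = 0.85  # 15% reduction in both voltage and frequency

power_ratio = (dynamic_power(C_LOAD, scale * VOLTAGE, scale * FREQUENCY)
               / dynamic_power(C_LOAD, VOLTAGE, FREQUENCY))
energy_ratio = (dynamic_energy(C_LOAD, scale * VOLTAGE)
                / dynamic_energy(C_LOAD, VOLTAGE))

print(f"Power falls to {power_ratio:.2f} of the original")
print(f"Energy falls to {energy_ratio:.2f} of the original")
```

Because voltage enters squared, power scales with the cube of the common scaling factor while energy scales with its square, which is why lowering voltage is such an effective lever.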
Power
• Static power: an important issue because leakage current flows even when a transistor is off:
$$\text{Power}_{\text{static}} = \text{Current}_{\text{static}} \times \text{Voltage}$$
– Thus, as the number of transistors ↑, power ↑;
– As feature size ↓, power ↑ (the reason is a topic for the VLSI area).
Silicon Wafer and Dies
• Exponential cost decrease, with the technology basically the same:
– A wafer is tested and chopped into dies that are packaged.
[Figure: an AMD K8 wafer with labels for the wafer, a die, and the dies along the edge. Source: http://www.amd.com]
Cost of an Integrated Circuit (IC)
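$$\text{Cost of die} = \frac{\text{Cost of wafer}}{\text{Dies per wafer} \times \text{Die yield}}$$
$$\text{Cost of IC} = \frac{\text{Cost of die} + \text{Cost of testing die} + \text{Cost of packaging and final test}}{\text{Final test yield}}$$
$$\text{Dies per wafer} = \frac{\pi \times (\text{Wafer radius})^2}{\text{Die area}} - \frac{\pi \times \text{Wafer diameter}}{\sqrt{2 \times \text{Die area}}}$$
$$\text{Die yield} = \text{Wafer yield} \times \left( 1 + \frac{\text{Defect density} \times \text{Die area}}{\alpha} \right)^{-\alpha}$$
Today’s technology: α = 4.0, defect density 0.4 ~ 0.8 per cm²
(The cost of the die is the greater portion of the cost that varies between machines, and it is sensitive to die size. The subtracted term in dies per wafer accounts for the dies along the edge.)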
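The cost formulas chain together naturally in code. The following is a minimal sketch under stated assumptions: the wafer diameter, die area, wafer cost, and wafer yield are hypothetical inputs chosen for illustration, while α = 4.0 and a defect density of 0.4 per cm² come from the slide.

```python
import math


def dies_per_wafer(wafer_diameter_cm: float, die_area_cm2: float) -> float:
    """Wafer area divided by die area, minus the dies lost along the edge."""
    wafer_area = math.pi * (wafer_diameter_cm / 2.0) ** 2
    edge_loss = math.pi * wafer_diameter_cm / math.sqrt(2.0 * die_area_cm2)
    return wafer_area / die_area_cm2 - edge_loss


def die_yield(wafer_yield: float, defects_per_cm2: float,
              die_area_cm2: float, alpha: float = 4.0) -> float:
    """Fraction of good dies: wafer yield x (1 + defect density x area / alpha)^(-alpha)."""
    return wafer_yield * (1.0 + defects_per_cm2 * die_area_cm2 / alpha) ** (-alpha)


def cost_of_die(wafer_cost: float, wafer_diameter_cm: float, die_area_cm2: float,
                wafer_yield: float, defects_per_cm2: float, alpha: float = 4.0) -> float:
    """Cost of wafer divided by the number of good dies it produces."""
    good_dies = (dies_per_wafer(wafer_diameter_cm, die_area_cm2)
                 * die_yield(wafer_yield, defects_per_cm2, die_area_cm2, alpha))
    return wafer_cost / good_dies


# Hypothetical inputs: 30 cm wafer, 1.5 cm x 1.5 cm die, $5000 wafer, 100% wafer yield.
dpw = dies_per_wafer(30.0, 2.25)
dy = die_yield(1.0, 0.4, 2.25)
die_cost = cost_of_die(5000.0, 30.0, 2.25, 1.0, 0.4)

print(f"Dies per wafer: {dpw:.1f}")
print(f"Die yield: {dy:.2f}")
print(f"Cost per good die: ${die_cost:.2f}")
```

Doubling the die area in this sketch both cuts the number of dies per wafer and lowers the yield, which illustrates why die cost is so sensitive to die size.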
Response Time, Throughput, and Performance
• Response time: the time between the start and the completion of an event, also referred to as execution time.
– This is what the computer user is interested in.
• Throughput: the total amount of work done in a given time.
– This is what the administrator of a large data processing center may be interested in.
• In comparing design alternatives,
– The phrase “X is faster than Y” is used here to mean that the response time or execution time is lower on X than on Y.
– In particular, “X is n times faster than Y” or “the throughput of X is n times higher than Y” will mean
$$n = \frac{\text{Execution time}_Y}{\text{Execution time}_X}$$
Performance Measuring
• Execution time is the reciprocal of performance,
$$\text{Performance}_X = \frac{1}{\text{Execution time}_X}$$
$$n = \frac{\text{Execution time}_Y}{\text{Execution time}_X} = \frac{1 / \text{Performance}_Y}{1 / \text{Performance}_X} = \frac{\text{Performance}_X}{\text{Performance}_Y}$$
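A short numeric check makes the equivalence of the two ratios concrete. The execution times below are hypothetical, and the function names are assumptions for this sketch.

```python
import math


def performance(execution_time: float) -> float:
    """Performance is the reciprocal of execution time."""
    return 1.0 / execution_time


def times_faster(exec_time_x: float, exec_time_y: float) -> float:
    """n such that X is n times faster than Y."""
    return exec_time_y / exec_time_x


# Hypothetical measurements: the same program takes 10 s on X and 15 s on Y.
time_x, time_y = 10.0, 15.0

n_from_times = times_faster(time_x, time_y)
n_from_performance = performance(time_x) / performance(time_y)

print(n_from_times)  # 1.5
print(math.isclose(n_from_times, n_from_performance))  # True
```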
Reliable Measure – User CPU Time
• Response time may include disk access, memory access,
input/output activities, CPU event and operating system
overhead – everything…
• In order to get an accurate measure of performance, we use
CPU time instead of using response time.
• CPU time is the time the CPU spends computing a program and
does not include time spent waiting for I/O or running other
programs.
• CPU time can also be divided into user CPU time (program) and
system CPU time (OS).
• For example, the UNIX time command might report 90.7s 12.9s 2:39 65% (user CPU time, system CPU time, total elapsed response time, and the percentage of elapsed time that is CPU time).
• In our performance measures, we use user CPU time because of its independence from the OS and other factors.
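The 65% in the example output can be reproduced from the other three numbers. This sketch simply redoes that arithmetic; the function name `cpu_share_percent` is an assumption for illustration.

```python
def cpu_share_percent(user_s: float, system_s: float, elapsed_s: float) -> float:
    """Percentage of elapsed time spent as CPU time (user plus system)."""
    return 100.0 * (user_s + system_s) / elapsed_s


elapsed_seconds = 2 * 60 + 39  # 2:39 of elapsed (response) time
share = cpu_share_percent(90.7, 12.9, elapsed_seconds)

print(f"{share:.0f}%")  # 65%
```

The remaining 35% or so of the elapsed time was spent waiting for I/O or running other programs, which is why response time alone is a noisy measure of CPU performance.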
CPU Performance
• Essentially all computers are constructed using a clock running at a constant rate; its discrete time events are called ticks, clock ticks, clock periods, clocks, cycles, or clock cycles.
– Clock rate: today in GHz
– Clock cycle time: clock cycle time = 1/clock rate
– Ex. 1 GHz clock rate = 1 ns cycle time
• Thus, the CPU time for a program can be expressed two ways:
$$\text{CPU time} = \text{CPU clock cycles for a program} \times \text{Clock cycle time}$$
or
$$\text{CPU time} = \frac{\text{CPU clock cycles for a program}}{\text{Clock rate}}$$
CPU Performance
• We can also count the number of instructions executed: the instruction path length or instruction count (IC).
• If we know the number of clock cycles and the IC, we can calculate the average number of clock cycles per instruction (CPI).
• CPI is computed as
$$\text{CPI} = \frac{\text{CPU clock cycles for a program}}{\text{IC}}$$
† This figure of merit provides insight into different styles of instruction sets and implementations.
• Thus, clock cycles can be defined as IC × CPI, which allows us to use CPI in the execution time formula:
$$\text{CPU time} = \text{IC} \times \text{CPI} \times \text{Clock cycle time} = \frac{\text{IC} \times \text{CPI}}{\text{Clock rate}}$$
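The execution time formula is easy to experiment with. In the sketch below, the instruction count, CPI, and clock rate are hypothetical values picked for illustration, and the function name `cpu_time_seconds` is an assumption.

```python
import math


def cpu_time_seconds(instruction_count: float, cpi: float, clock_rate_hz: float) -> float:
    """CPU time = IC x CPI / clock rate (equivalently IC x CPI x clock cycle time)."""
    return instruction_count * cpi / clock_rate_hz


# Hypothetical program: 2 billion instructions, CPI of 1.5, 1 GHz clock.
IC = 2e9
CPI = 1.5
CLOCK_RATE = 1e9

base_time = cpu_time_seconds(IC, CPI, CLOCK_RATE)

# The same result via clock cycle time = 1 / clock rate.
cycle_time = 1.0 / CLOCK_RATE
same_time = IC * CPI * cycle_time

# Reducing CPI by 10% while holding IC and the clock fixed.
improved_time = cpu_time_seconds(IC, 0.9 * CPI, CLOCK_RATE)

print(f"Baseline CPU time: {base_time:.2f} s")
print(f"With 10% lower CPI: {improved_time:.2f} s")
```

Since the three factors multiply, a 10% reduction in CPI alone yields a 10% reduction in CPU time, provided the change does not disturb the instruction count or the clock.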
CPU Performance
• The pieces of CPU time fit together:
$$\text{CPU time} = \text{CPU clock cycles for a program} \times \text{Clock cycle time}$$
$$\frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Clock cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Clock cycle}} = \frac{\text{Seconds}}{\text{Program}} = \text{CPU time}$$
• An α% improvement in any one of the three pieces leads to an α% improvement in CPU time.
– Unfortunately, it is difficult to change one parameter in complete isolation from the others, because the technologies behind them are interdependent:
• Clock cycle time: hardware technology and organization;
• CPI: organization and instruction set architecture;
• Instruction count: instruction set architecture and compiler technology.
† Processor performance depends on three characteristics: instruction count, clock cycles per instruction, and clock cycle time (or rate).
† Computer architecture focuses on the CPI and IC parameters.
CPU Performance
• To calculate the total number of processor clock cycles:
$$\text{CPU clock cycles} = \sum_{i=1}^{n} \text{IC}_i \times \text{CPI}_i$$
$\text{IC}_i$: the number of times instruction i is executed in a program.
$\text{CPI}_i$: the average number of clocks per instruction for instruction i.
• To express CPU time again:
$$\text{CPU time} = \left( \sum_{i=1}^{n} \text{IC}_i \times \text{CPI}_i \right) \times \text{Clock cycle time}$$
– And overall CPI as
$$\text{CPI} = \frac{\sum_{i=1}^{n} \text{IC}_i \times \text{CPI}_i}{\text{Instruction count}} = \sum_{i=1}^{n} \frac{\text{IC}_i}{\text{Instruction count}} \times \text{CPI}_i$$
† $\text{IC}_i / \text{Instruction count}$ represents the fraction of occurrences of that instruction in a program.
† It is useful in designing the processor.
Hint: $\text{CPI}_i$ should be measured, because it must include pipeline effects, cache misses, and any other memory system inefficiencies.
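The per-class sums above map directly onto a small table of instruction counts and CPIs. The instruction mix below is entirely hypothetical (the class names, counts, and CPI values are invented for illustration); the sketch shows that the two forms of the overall CPI formula agree.

```python
import math

# Hypothetical instruction mix: class name -> (IC_i in millions, CPI_i).
instruction_mix = {
    "alu":    (50.0, 1.0),
    "load":   (20.0, 2.0),
    "store":  (10.0, 2.0),
    "branch": (20.0, 2.0),
}

# CPU clock cycles = sum over i of IC_i x CPI_i
total_cycles = sum(ic * cpi for ic, cpi in instruction_mix.values())

# Instruction count = sum over i of IC_i
instruction_count = sum(ic for ic, _ in instruction_mix.values())

# Overall CPI, first form: total cycles divided by instruction count.
overall_cpi = total_cycles / instruction_count

# Overall CPI, second form: frequency-weighted sum of the per-class CPIs.
weighted_cpi = sum((ic / instruction_count) * cpi
                   for ic, cpi in instruction_mix.values())

print(f"Overall CPI: {overall_cpi:.2f}")
print(f"Forms agree: {math.isclose(overall_cpi, weighted_cpi)}")
```

The weighted form is the more useful one for design work: it shows at a glance which instruction classes contribute most to the overall CPI and are therefore worth optimizing.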

FUNDAMENTALS OF COMPUTER DESIGN

  • 1. EC6009-Advanced Computer Architecture UNIT I FUNDAMENTALS OF COMPUTER DESIGN 9 Review of Fundamentals of CPU, Memory and IO – Trends in technology, power, energy and cost, Dependability - Performance Evaluation UNIT II INSTRUCTION LEVEL PARALLELISM 9 ILP concepts – Pipelining overview - Compiler Techniques for Exposing ILP – Dynamic Branch Prediction – Dynamic Scheduling – Multiple instruction Issue – Hardware Based Speculation – Static scheduling - Multi-threading - Limitations of ILP – Case Studies. UNIT III DATA-LEVEL PARALLELISM 9 Vector architecture – SIMD extensions – Graphics Processing units – Loop level parallelism. UNIT IV THREAD LEVEL PARALLELISM 9 Symmetric and Distributed Shared Memory Architectures – Performance Issues – Synchronization – Models of Memory Consistency – Case studies: Intel i7 Processor, SMT & CMP Processors UNIT V MEMORY AND I/O 9 Cache Performance – Reducing Cache Miss Penalty and Miss Rate – Reducing Hit Time – Main Memory and Performance – Memory Technology. Types of Storage Devices – Buses – RAID – Reliability, Availability and Dependability – I/O Performance Measures. 7/3/2019 VII-ECE-B
  • 2. On completion of the course, the students will be able to CO1 Explain the performance of different architectures with respect to various parameters K2 CO2 Describe the performance of different ILP techniques K2 CO3 Discuss the performance of different architectures & exploiting DLP K2 CO4 Illustrate the concepts of Transport level protocol. K2 CO5 Distinguish cache and memory related issues in multiprocessor. K2 EC6009-Advanced Computer Architecture CO/PO PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12 PSO1 PSO2 CO1 3 2 1 CO2 3 2 1 CO3 3 2 1 CO4 3 2 1 CO5 3 2 2 C404 3 2 1 - - - - - - - -- - - -
  • 3. • PO 1 - Engineering Knowledge • PO 2 - Problem analysis • PO 3 - Design / development of solutions • PO 4 - Conduct investigations of complex problems • PO 5 - Modern tool usage • PO 6 - Engineer and Society • PO 7 - Environment and sustainability • PO 8 - Ethics • PO 9 - Individual and Team-work • PO 10 - Communication • PO 11 - Project management and finance • PO 12 - Life-long learning
  • 4. • EC6009 • Advanced Computer Architecture • UNIT-1 • Fundamentals of Computer Design
  • 5. Outline • 1.1 Introduction • 1.2 Classes of Computers • 1.3 Defining Computer Architecture • 1.4 Trends in Technology • 1.5 Trends in Power in Integrated Circuits • 1.6 Trends in Cost • 1.7 Dependability • 1.8 Measuring, Reporting, and Summarizing Performance • 1.9 Quantitative Principles of Computer Design • 1.10 Putting It All Together: Performance and Price-Performance
  • 6. Computer Technology • Performance improvements: – Improvements in semiconductor technology • Feature size, clock speed – Improvements in computer architectures • Enabled by high-level language (HLL) compilers, UNIX • Led to RISC (reduced instruction set computer) architectures – Together have enabled: • Lightweight computers • Productivity-based managed/interpreted programming languages
  • 8. Current Trends in Architecture • Cannot continue to leverage instruction-level parallelism (ILP) – Single processor performance improvement ended in 2003 • New models for performance: – Data-level parallelism (DLP) – Thread-level parallelism (TLP) – Request-level parallelism (RLP) • These require explicit restructuring of the application
  • 9. Organization, Hardware, and Architecture • Organization: includes the high-level aspects of a computer’s design. – The memory system, the memory interconnect, and the design of the internal processor or CPU (arithmetic, logic, branching, and data transfer). – For example, the AMD Opteron 64 and the Intel P4 have the same ISA, but they have different internal pipeline and cache organizations. • Hardware: the detailed logic design and the packaging technology. – For example, the P4 and the Mobile P4 have the same ISA and organization, but they have different clock frequencies and memory systems. • Architecture: covers all three aspects of computer design – instruction set architecture, organization, and hardware. – The designer must meet functional requirements as well as price, power, performance, and availability goals.
  • 10. Instruction Set Architecture: Critical Interface • Properties of a good abstraction – Lasts through many generations (portability) – Used in many different ways (generality) – Provides convenient functionality to higher levels – Permits an efficient implementation at lower levels (Figure: the instruction set as the interface between software and hardware.)
  • 11. Instruction Set Architecture (ISA) – The ISA is the actual programmer-visible instruction set. – Class of ISA: general-purpose register architectures (register-memory, load-store) and stack architectures – Memory addressing: a program running on a 32-bit processor can address up to 4 GB ($2^{32}$ bytes) of address space – Addressing modes: direct, indirect, etc. – Types and sizes of operands: the common types supported by an ISA include signed and unsigned integers and single- and double-precision floating-point numbers – Data processing and control flow instructions
  • 12. Classes of Computers • Personal Mobile Device (PMD) – e.g. smartphones, tablet computers – Emphasis on energy efficiency and real-time performance • Desktop Computing (workstations) – Emphasis on price-performance • Servers (mainframes) – Emphasis on availability, scalability, throughput • Clusters / Warehouse Scale Computers – Used for “Software as a Service (SaaS)” – Emphasis on availability and price-performance – Sub-class: Supercomputers, emphasis: floating-point performance and fast internal networks • Embedded Computers – Emphasis: price
  • 13. Trends in Technology • A successful new ISA may last decades, for example, the IBM mainframe. • Four critical technologies – Integrated circuit logic technology: transistor density increases by about 35% per year, quadrupling in somewhat over four years; – Semiconductor DRAM (Dynamic Random-Access Memory): capacity increases by about 40% per year, doubling roughly every two years; – Magnetic disk technology: a roller coaster of growth rates; disks are 50-100 times cheaper per bit than DRAM; – Network technology: network performance depends on the performance of both the switches and the transmission system.
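
The growth rates on slide 13 can be sanity-checked with compound-growth arithmetic. The following is a minimal sketch; the function name `years_to_multiply` is an illustrative assumption and does not come from the slides.

```python
import math


def years_to_multiply(annual_growth: float, factor: float) -> float:
    """Years needed to grow by `factor` at a compound annual rate (0.35 means 35%)."""
    return math.log(factor) / math.log(1.0 + annual_growth)


# Rates quoted on the slide: logic density 35% per year, DRAM capacity 40% per year.
logic_quadruple_years = years_to_multiply(0.35, 4.0)
dram_double_years = years_to_multiply(0.40, 2.0)

print(f"Logic density quadruples in about {logic_quadruple_years:.2f} years")
print(f"DRAM capacity doubles in about {dram_double_years:.2f} years")
```

Both results agree with the slide's wording: somewhat over four years to quadruple, and roughly two years to double.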
  • 14. Scaling of Transistor Performance and Wires • Feature size: the minimum size of a transistor or a wire in either the x or y dimension. – From 10 microns in 1971 to 0.09 microns (90 nm) in 2006; – The density of transistors increases quadratically with a linear decrease in feature size; – Transistor performance improves linearly with decreasing feature size; – Improvements in transistor density let CPUs move quickly from 4-bit to 8-bit, to 16-bit, to 32-bit microprocessors.
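
To make the quadratic relationship on slide 14 concrete, here is a small sketch that applies it to the two feature sizes quoted there. It assumes idealized scaling only (density proportional to the inverse square of feature size); `density_gain` is a hypothetical helper name, and real transistor counts depend on many other factors.

```python
def density_gain(old_feature_um: float, new_feature_um: float) -> float:
    """Idealized transistor density gain when feature size shrinks linearly."""
    linear_shrink = old_feature_um / new_feature_um
    # Area per transistor scales with the square of the feature size.
    return linear_shrink ** 2


# Feature sizes from the slide: 10 microns (1971) and 0.09 microns (2006).
gain = density_gain(10.0, 0.09)
print(f"Linear shrink: {10.0 / 0.09:.1f}x, idealized density gain: {gain:.0f}x")
```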
  • 15. Performance Trends: Bandwidth over Latency • Bandwidth or throughput: the total amount of work done in a given time. – Such as megabytes per second for a disk transfer. • Latency or response time: the time between the start and the completion of an event. – Such as milliseconds for a disk access.
  • 16. Power • Power also provides challenges as devices are scaled. – Dynamic power (watts, W) in a CMOS chip: traditionally, the dominant energy consumption has been in switching transistors: $\text{Power}_{\text{dynamic}} = \frac{1}{2} \times \text{Capacitive load} \times \text{Voltage}^2 \times \text{Frequency switched}$ – For mobile devices: battery life matters more than power, so energy is the proper metric, measured in joules: $\text{Energy}_{\text{dynamic}} = \text{Capacitive load} \times \text{Voltage}^2$ † In modern VLSI, the exact power measurement is the sum $\text{Power}_{\text{total}} = \text{Power}_{\text{dynamic}} + \text{Power}_{\text{static}} + \text{Power}_{\text{leakage}}$
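
The dynamic power and energy formulas on slide 16 can be explored numerically. The sketch below is illustrative only: the baseline capacitance, voltage, and frequency are hypothetical values, and the 15% reduction is an assumed scenario rather than a figure from the slides.

```python
def dynamic_power(capacitive_load: float, voltage: float, frequency_switched: float) -> float:
    """Power_dynamic = 1/2 x Capacitive load x Voltage^2 x Frequency switched (watts)."""
    return 0.5 * capacitive_load * voltage ** 2 * frequency_switched


def dynamic_energy(capacitive_load: float, voltage: float) -> float:
    """Energy_dynamic = Capacitive load x Voltage^2 (joules)."""
    return capacitive_load * voltage ** 2


# Hypothetical baseline, chosen only for illustration.
C, V, F = 1e-9, 1.0, 1e9   # farads, volts, hertz
scale = 0.85               # assumed 15% reduction in both voltage and frequency

energy_ratio = dynamic_energy(C, V * scale) / dynamic_energy(C, V)
power_ratio = dynamic_power(C, V * scale, F * scale) / dynamic_power(C, V, F)

print(f"Energy ratio: {energy_ratio:.4f}")  # voltage enters squared
print(f"Power ratio:  {power_ratio:.4f}")   # voltage squared times frequency
```

Because voltage enters both formulas squared, lowering the supply voltage is the most effective lever in this model: energy falls with the square of the scaling factor and power with its cube when frequency is scaled too.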
  • 17. Power • Static power: an important issue because leakage current flows even when a transistor is off: $\text{Power}_{\text{static}} = \text{Current}_{\text{static}} \times \text{Voltage}$ – Thus, more transistors ↑ means more power ↑; – Smaller feature size ↓ means more power ↑ (why? The reason is covered in VLSI courses).
  • 18. Silicon Wafer and Dies • Exponential cost decrease – technology basically the same: a wafer is tested and chopped into dies that are packaged. (Figure: AMD K8 wafer, with labels for the wafer, a die, and the dies along the edge; source: http://www.amd.com)
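  • 19. Cost of an Integrated Circuit (IC) • $\text{Cost of IC} = \dfrac{\text{Cost of die} + \text{Cost of testing die} + \text{Cost of packaging and final test}}{\text{Final test yield}}$ (the cost of the die is the greater portion of the cost that varies between machines) • $\text{Cost of die} = \dfrac{\text{Cost of wafer}}{\text{Dies per wafer} \times \text{Die yield}}$ (sensitive to die size) • $\text{Dies per wafer} = \dfrac{\pi \times (\text{Wafer radius})^2}{\text{Die area}} - \dfrac{\pi \times \text{Wafer diameter}}{\sqrt{2 \times \text{Die area}}}$ (the second term accounts for the number of dies along the edge) • $\text{Die yield} = \text{Wafer yield} \times \left(1 + \dfrac{\text{Defect density} \times \text{Die area}}{\alpha}\right)^{-\alpha}$ • Today’s technology: $\alpha \approx 4.0$, defect density 0.4 ~ 0.8 per cm²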
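
The cost formulas on slide 19 chain together naturally. The sketch below evaluates them for one illustrative case; the wafer diameter, die area, wafer yield, and wafer cost are assumptions made up for the example, while alpha = 4.0 and the defect density of 0.4 per cm² fall in the range given on the slide.

```python
import math


def dies_per_wafer(wafer_diameter_cm: float, die_area_cm2: float) -> float:
    """Gross dies per wafer: wafer area over die area, minus dies lost along the edge."""
    radius = wafer_diameter_cm / 2.0
    whole = math.pi * radius ** 2 / die_area_cm2
    edge_loss = math.pi * wafer_diameter_cm / math.sqrt(2.0 * die_area_cm2)
    return whole - edge_loss


def die_yield(wafer_yield: float, defect_density: float, die_area_cm2: float,
              alpha: float = 4.0) -> float:
    """Die yield = Wafer yield x (1 + Defect density x Die area / alpha)^(-alpha)."""
    return wafer_yield * (1.0 + defect_density * die_area_cm2 / alpha) ** (-alpha)


def cost_of_die(wafer_cost: float, dies: float, yield_fraction: float) -> float:
    """Cost of die = Cost of wafer / (Dies per wafer x Die yield)."""
    return wafer_cost / (dies * yield_fraction)


# Illustrative assumptions: 30 cm wafer, 1.5 cm x 1.5 cm die, 100% wafer yield,
# and a hypothetical wafer cost of 5000 currency units.
dies = dies_per_wafer(30.0, 2.25)
good_fraction = die_yield(1.0, 0.4, 2.25)
unit_cost = cost_of_die(5000.0, dies, good_fraction)

print(f"Dies per wafer: {dies:.1f}")
print(f"Die yield:      {good_fraction:.3f}")
print(f"Cost per die:   {unit_cost:.2f}")
```

Re-running the sketch with a larger die area shows why die cost is so sensitive to die size: fewer dies fit on the wafer and a smaller fraction of them are good.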
  • 20. Response Time, Throughput, and Performance • Response time: the time between the start and the completion of an event – also referred to as execution time. – This is what the computer user is interested in. • Throughput: the total amount of work done in a given time. – This is what the administrator of a large data processing center may be interested in. • In comparing design alternatives, – The phrase “X is faster than Y” is used here to mean that the response time or execution time is lower on X than on Y. – In particular, “X is n times faster than Y” or “the throughput of X is n times higher than Y” will mean $\dfrac{\text{Execution time}_Y}{\text{Execution time}_X} = n$
  • 21. Performance Measuring • Execution time is the reciprocal of performance: $\text{Performance}_X = \dfrac{1}{\text{Execution time}_X}$, so $n = \dfrac{\text{Execution time}_Y}{\text{Execution time}_X} = \dfrac{1/\text{Performance}_Y}{1/\text{Performance}_X} = \dfrac{\text{Performance}_X}{\text{Performance}_Y}$
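
The relationship between execution time and performance on slides 20 and 21 can be expressed in a few lines. This is a minimal sketch; the function names and the two execution times are hypothetical.

```python
def performance(execution_time: float) -> float:
    """Performance is the reciprocal of execution time."""
    return 1.0 / execution_time


def times_faster(execution_time_x: float, execution_time_y: float) -> float:
    """n such that X is n times faster than Y: Execution time_Y / Execution time_X."""
    return execution_time_y / execution_time_x


# Hypothetical measurements: the same program takes 10 s on X and 15 s on Y.
time_x, time_y = 10.0, 15.0
n_from_times = times_faster(time_x, time_y)
n_from_performance = performance(time_x) / performance(time_y)

print(f"X is {n_from_times:.2f} times faster than Y")
```

Computing n from the execution times or from the performance ratio gives the same value, which is the point of the derivation on slide 21.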
  • 22. Reliable Measure – User CPU Time • Response time may include disk access, memory access, input/output activities, CPU events, and operating system overhead – everything. • To get an accurate measure of performance, we use CPU time instead of response time. • CPU time is the time the CPU spends computing a program; it does not include time spent waiting for I/O or running other programs. • CPU time can be further divided into user CPU time (program) and system CPU time (OS). • Running the UNIX time command gives, for example, 90.7s 12.9s 2:39 65% (user CPU time, system CPU time, total elapsed response time, and the percentage of elapsed time that is CPU time). • In our performance measures, we use user CPU time, because it is independent of the OS and other factors.
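
The four numbers in the time output quoted on slide 22 are related by simple arithmetic. The sketch below recomputes the percentage from the other three values; `parse_elapsed` is a hypothetical helper that assumes the elapsed time is written as minutes:seconds.

```python
def parse_elapsed(text: str) -> float:
    """Convert an elapsed time written as 'minutes:seconds' into seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + float(seconds)


# Values quoted on the slide: 90.7s user, 12.9s system, 2:39 elapsed.
user_cpu, system_cpu = 90.7, 12.9
elapsed = parse_elapsed("2:39")

cpu_time = user_cpu + system_cpu
cpu_share = cpu_time / elapsed

print(f"Elapsed: {elapsed:.0f} s, CPU time: {cpu_time:.1f} s, share: {cpu_share:.0%}")
```

The remaining share of the elapsed time was spent waiting for I/O or running other programs, which is why elapsed time alone is a poor measure of processor performance.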
  • 23. CPU Performance • Essentially all computers are constructed using a clock running at a constant rate (the discrete time events are also called ticks, clock ticks, clock periods, clocks, cycles, or clock cycles). – Clock rate: today in GHz – Clock cycle time: clock cycle time = 1/clock rate – Ex. 1 GHz clock rate = 1 ns cycle time • Thus, the CPU time for a program can be expressed in two ways: $\text{CPU time} = \text{CPU clock cycles for a program} \times \text{Clock cycle time}$ or $\text{CPU time} = \dfrac{\text{CPU clock cycles for a program}}{\text{Clock rate}}$
  • 24. CPU Performance • We can also count the number of instructions executed – the instruction path length or instruction count (IC). • If we know the number of clock cycles and the IC, we can calculate the average number of clock cycles per instruction (CPI). • CPI is computed as $\text{CPI} = \dfrac{\text{CPU clock cycles for a program}}{\text{IC}}$ † This figure provides insight into different styles of instruction sets and implementations. • Thus, clock cycles can be defined as IC × CPI, which allows us to use CPI in the execution time formula: $\text{CPU time} = \text{IC} \times \text{CPI} \times \text{Clock cycle time} = \dfrac{\text{IC} \times \text{CPI}}{\text{Clock rate}}$
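
The CPU time formula on slides 23 and 24 can be checked with a short calculation. This is a sketch under assumed inputs: the instruction count, CPI, and clock rate below are hypothetical and not taken from the slides.

```python
def cpu_time_from_rate(instruction_count: float, cpi: float, clock_rate_hz: float) -> float:
    """CPU time = IC x CPI / Clock rate (seconds)."""
    return instruction_count * cpi / clock_rate_hz


def cpu_time_from_cycle(instruction_count: float, cpi: float, cycle_time_s: float) -> float:
    """CPU time = IC x CPI x Clock cycle time (seconds)."""
    return instruction_count * cpi * cycle_time_s


# Hypothetical program: 2 billion instructions, average CPI of 1.5, 1 GHz clock.
ic, cpi, clock_rate = 2e9, 1.5, 1e9
cycle_time = 1.0 / clock_rate  # 1 GHz clock rate = 1 ns cycle time

time_a = cpu_time_from_rate(ic, cpi, clock_rate)
time_b = cpu_time_from_cycle(ic, cpi, cycle_time)

print(f"CPU time: {time_a:.2f} s (via clock rate), {time_b:.2f} s (via cycle time)")
```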
  • 26. CPU Performance • How the pieces of CPU time fit together: $\text{CPU time} = \text{CPU clock cycles for a program} \times \text{Clock cycle time}$, or in units, $\dfrac{\text{Instructions}}{\text{Program}} \times \dfrac{\text{Clock cycles}}{\text{Instruction}} \times \dfrac{\text{Seconds}}{\text{Clock cycle}} = \dfrac{\text{Seconds}}{\text{Program}} = \text{CPU time}$ • An α% improvement in any one of the three pieces leads to an α% improvement in CPU time. – Unfortunately, it is difficult to change one parameter in complete isolation from the others, because the underlying technologies are interdependent: • Clock cycle time: hardware technology and organization; • CPI: organization and instruction set architecture; • Instruction count: instruction set architecture and compiler technology. † Processor performance depends on three characteristics: instruction count, clock cycles per instruction, and clock cycle time (or rate). † Computer architecture focuses on the CPI and IC parameters.
  • 27. CPU Performance • The total number of processor clock cycles can be calculated as $\text{CPU clock cycles} = \sum_{i=1}^{n} \text{IC}_i \times \text{CPI}_i$ – $\text{IC}_i$: the number of times instruction i is executed in a program. – $\text{CPI}_i$: the average number of clocks per instruction for instruction i. • CPU time can then be expressed as $\text{CPU time} = \left(\sum_{i=1}^{n} \text{IC}_i \times \text{CPI}_i\right) \times \text{Clock cycle time}$ – And the overall CPI as $\text{CPI} = \dfrac{\sum_{i=1}^{n} \text{IC}_i \times \text{CPI}_i}{\text{Instruction count}} = \sum_{i=1}^{n} \dfrac{\text{IC}_i}{\text{Instruction count}} \times \text{CPI}_i$ † $\text{IC}_i / \text{Instruction count}$ represents the fraction of occurrences of that instruction in a program. † It is useful in designing the processor. Hint: $\text{CPI}_i$ should be measured, because it must include pipeline effects, cache misses, and any other memory system inefficiencies.
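
The per-class CPI formulas on slide 27 amount to a weighted average. The sketch below computes the overall CPI both ways shown on the slide; the instruction classes, counts, and per-class CPI values are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
# Hypothetical instruction mix: class name -> (IC_i in millions, CPI_i).
instruction_mix = {
    "alu": (500, 1.0),
    "load_store": (300, 2.0),
    "branch": (200, 3.0),
}

instruction_count = sum(ic for ic, _ in instruction_mix.values())

# CPU clock cycles = sum over i of IC_i x CPI_i
cpu_clock_cycles = sum(ic * cpi for ic, cpi in instruction_mix.values())

# Overall CPI as total cycles divided by total instruction count ...
overall_cpi = cpu_clock_cycles / instruction_count

# ... and equivalently as the frequency-weighted sum of the per-class CPI_i.
weighted_cpi = sum((ic / instruction_count) * cpi for ic, cpi in instruction_mix.values())

print(f"Clock cycles (millions): {cpu_clock_cycles:.0f}")
print(f"Overall CPI: {overall_cpi:.2f} (weighted form: {weighted_cpi:.2f})")
```

The weighted form makes it easy to see which instruction class contributes most to the overall CPI, which is the design insight the slide points to.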
