1
 Very long instruction word (VLIW) refers to a processor
architecture designed to take advantage of instruction-level
parallelism (ILP).
 Conventional processors mostly allow only programs that
specify instructions to be executed one after another.
 A VLIW processor allows programs to specify explicitly which
instructions are to be executed at the same time (i.e. in parallel).
 This type of processor architecture is intended to allow higher
performance without the inherent complexity of some other
approaches.
 The term VLIW, and the concept of VLIW architecture itself, were
invented by Josh Fisher in his research group at Yale University in
the early 1980s.
Very long instruction word
2
 Explicitly parallel instruction computing (EPIC) is a term coined in
1997 by the HP–Intel alliance to describe a computing paradigm that
researchers had been investigating since the early 1980s.
 This paradigm is also called Independence architectures. It was the
basis for Intel and HP development of the Intel Itanium architecture,
and HP later asserted that "EPIC" was merely an old term for the
Itanium architecture.
 EPIC permits microprocessors to execute software instructions in
parallel by using the compiler, rather than complex on-die circuitry,
to control parallel instruction execution.
 This was intended to allow simple performance scaling without
resorting to higher clock frequencies.
Explicitly parallel instruction computing
3
 VLIW (very long instruction word):
- Compiler packs a fixed number of instructions into a single VLIW
instruction.
- The instructions within a VLIW instruction are issued and executed in
parallel
Example: High-end signal processors (TMS320C6201)
 EPIC (explicitly parallel instruction computing):
- Evolution of VLIW
Example: Intel’s IA-64, exemplified by the Itanium processor
VLIW or EPIC
4
VLIW
 VLIW (very long instruction word)
processors use a long instruction word that
contains a usually fixed number of
operations that are fetched, decoded,
issued, and executed synchronously.
 All operations specified within a VLIW
instruction must be independent of one
another.
5
VLIW
 Some of the key issues of a (V)LIW processor:
• (very) long instruction word (up to 1 024 bits per
instruction),
• each instruction consists of multiple independent
parallel operations,
• each operation requires a statically known number
of cycles to complete,
• a central controller that issues a long instruction
word every cycle,
• multiple FUs connected through a global shared
register file.
6
VLIW
 sequential stream of long instruction words
 instructions scheduled statically by the
compiler
 number of simultaneously issued
instructions is fixed during compile-time
 instruction issue is less complicated than in
a superscalar processor
7
VLIW
 Disadvantage: VLIW processors cannot react to dynamic
events, e.g. cache misses, with the same flexibility as superscalars.
 The number of instructions in a VLIW instruction word is
usually fixed.
 Padding VLIW instructions with no-ops is needed whenever
the full issue bandwidth cannot be used. This increases code
size. More recent VLIW architectures use a denser code
format that allows the no-ops to be removed.
 VLIW is an architectural technique, whereas superscalar is
a microarchitecture technique.
 VLIW processors take advantage of spatial parallelism.
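A minimal Python sketch of the static scheduling and no-op padding described above. The `Op` record, the hazard test, and the greedy in-order packer are illustrative assumptions, not the algorithm of any particular VLIW compiler; production compilers use far more elaborate schedulers. The hazard test is deliberately conservative: it treats any register overlap as a dependence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    name: str     # mnemonic, for display only
    dst: str      # destination register
    srcs: tuple   # source registers


NOP = Op("nop", "", ())


def depends(later, earlier):
    """True if `later` may not share a long instruction word with `earlier`."""
    return (earlier.dst in later.srcs       # read after write
            or earlier.dst == later.dst     # write after write
            or later.dst in earlier.srcs)   # write after read


def pack(ops, width):
    """Greedy in-order packing into fixed-width words, padded with no-ops."""
    words, current = [], []
    for op in ops:
        # Close the current word when it is full or a dependence appears.
        if len(current) == width or any(depends(op, e) for e in current):
            words.append(current + [NOP] * (width - len(current)))
            current = []
        current.append(op)
    if current:
        words.append(current + [NOP] * (width - len(current)))
    return words


program = [
    Op("add", "r1", ("r2", "r3")),
    Op("mul", "r4", ("r5", "r6")),
    Op("sub", "r7", ("r1", "r4")),   # reads r1 and r4, so it starts a new word
    Op("ld", "r8", ("r9",)),
]

words = pack(program, width=3)
for word in words:
    print(" | ".join(op.name for op in word))

nops = sum(op is NOP for word in words for op in word)
print(f"{nops} of {len(words) * 3} slots are no-ops")
```

With an issue width of 3, the four operations occupy two words and two of the six slots hold no-ops, which is the code-size cost the slide refers to.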
8
EPIC: a paradigm shift
 Superscalar RISC solution
• Based on sequential execution semantics
• Compiler’s role is limited by the instruction set architecture
• Superscalar hardware identifies and exploits parallelism
 EPIC solution – (the evolution of VLIW)
• Based on parallel execution semantics
• EPIC ISA enhancements support static parallelization
• Compiler takes greater responsibility for exploiting parallelism
• Compiler / hardware collaboration often resembles superscalar
9
EPIC: a paradigm shift
 Advantages of pursuing EPIC architectures
• Make wide issue & deep latency less expensive in
hardware
• Allow processor parallelism to scale with additional
VLSI density
 Architect the processor to do well with in-order execution
• Enhance the ISA to allow static parallelization
• Use compiler technology to parallelize program
• However, a purely static VLIW is not appropriate for
general-purpose use
10
The fusion of VLIW and superscalar techniques
 Superscalars need improved support for static parallelization
• Static scheduling
• Limited support for predicated execution
 VLIWs need improved support for dynamic parallelization
• Caches introduce dynamically changing memory latency
• Compatibility: issue width and latency may change with new
hardware
• Application requirements - e.g. object oriented programming with
dynamic binding
 EPIC processors exhibit features derived from both
• Interlock & out-of-order execution hardware are compatible with
EPIC (but not required!)
• EPIC processors can use dynamic translation to parallelize in
software
11
Many EPIC features are taken from VLIWs
 Minisupercomputer products stimulated VLIW research (FPS,
Multiflow, Cydrome)
• Minisupercomputers were specialized, costly, and short-lived
• Traditional VLIWs not suited to general purpose computing
• VLIW resurgence in single chip DSP & media processors
 Minisupercomputers exaggerated forward-looking challenges:
• Long latency
• Wide issue
• Large number of architected registers
• Compile-time scheduling to exploit exotic amounts of parallelism
 EPIC exploits many VLIW techniques
12
Shortcomings of early VLIWs
 Expensive multi-chip implementations
 No data cache
 Poor "scalar" performance
 No strategy for object code compatibility
13
EPIC design challenges
 Develop architectures applicable to general-purpose
computing
• Find substantial parallelism in “difficult to parallelize”
scalar programs
• Provide compatibility across hardware generations
• Support emerging applications (e.g. multimedia)
 Compiler must find or create sufficient ILP
 Combine the best attributes of VLIW & superscalar RISC
(incorporated best concepts from all available sources)
 Scale architectures for modern single-chip implementation
14
EPIC Processors, Intel's IA-64 ISA and Itanium
 Joint R&D project by Hewlett-Packard and Intel
(announced in June 1994)
 This resulted in explicitly parallel instruction
computing (EPIC) design style:
• specifying ILP explicitly in the machine code, that is, the
parallelism is encoded directly into the instructions,
similarly to VLIW;
• a fully predicated instruction set;
• an inherently scalable instruction set (i.e., the ability to
scale to a lot of FUs);
• many registers;
• speculative execution of load instructions
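The "fully predicated instruction set" item can be illustrated with a small Python sketch. The tuple format, the `cmp.lt` and `add` operation names, and the `run` interpreter are hypothetical and only loosely modelled on IA-64; the point is that every instruction names a qualifying predicate and has no effect when that predicate is false, so a short if/else needs no branch.

```python
def run(program, gr):
    """Execute straight-line predicated code on a dict of general registers.

    Each instruction is (qualifying_predicate, operation, *operands).
    Predicate 0 is treated as always true, as on the slides.
    """
    pr = [True] + [False] * 63
    for qp, op, *args in program:
        if not pr[qp]:
            continue                          # predicate false: no effect
        if op == "cmp.lt":                    # sets a predicate and its complement
            p_true, p_false, a, b = args
            pr[p_true] = gr[a] < gr[b]
            pr[p_false] = not pr[p_true]
        elif op == "add":
            dst, src, imm = args
            gr[dst] = gr[src] + imm
        else:
            raise ValueError(f"unknown operation {op!r}")
    return gr


# if (r1 < r2) r3 = r1 + 1; else r3 = r2 - 1;   written without a branch
program = [
    (0, "cmp.lt", 1, 2, "r1", "r2"),
    (1, "add", "r3", "r1", 1),     # commits only when p1 is true
    (2, "add", "r3", "r2", -1),    # commits only when p2 is true
]

print(run(program, {"r1": 4, "r2": 9, "r3": 0})["r3"])
print(run(program, {"r1": 9, "r2": 4, "r3": 0})["r3"])
```

Both arms are always fetched and issued; only the one whose predicate is true writes `r3`. This simplified interpreter does not protect predicate 0 against writes.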
15
IA-64 Architecture
 Unique architecture features & enhancements
• Explicit parallelism and templates
• Predication, speculation, memory support, and
others
• Floating-point and multimedia architecture
 IA-64 resources available to applications
• Large, application visible register set
• Rotating registers, register stack, register stack
engine
 IA-32 & PA-RISC compatibility models
16
Today’s Architecture Challenges
 Performance barriers :
• Memory latency
• Branches
• Loop pipelining and call / return overhead
 Headroom constraints :
• Hardware-based instruction scheduling
- Unable to efficiently schedule parallel execution
• Resource constrained
- Too few registers
- Unable to fully utilize multiple execution units
 Scalability limitations :
• Memory addressing efficiency
17
Intel's IA-64 ISA
 Intel 64-bit Architecture (IA-64) register model:
• 128, 64-bit general purpose registers GR0-GR127
to hold values for integer and multimedia computations
- each register has one additional NaT (Not a Thing) bit to
indicate whether the value stored is valid,
• 128, 82-bit floating-point registers FR0-FR127
- registers f0 and f1 are read-only with values +0.0 and +1.0,
• 64, 1-bit predicate registers PR0-PR63
- the first register p0 is read-only and always reads 1 (true)
• 8, 64-bit branch registers BR0-BR7 to specify the target
addresses of indirect branches
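How the NaT bit supports speculative loads can be sketched in Python as follows. This is a simplified model under stated assumptions: the `RegisterFile` class, the `MEMORY` dictionary, and the helper names are hypothetical, and the behaviour is only loosely modelled on IA-64 control speculation (a faulting speculative load marks its target as invalid instead of trapping, and the mark propagates until it is checked).

```python
MASK64 = (1 << 64) - 1


class RegisterFile:
    """128 general registers, each a 64-bit value plus one NaT bit."""

    def __init__(self):
        self.value = [0] * 128
        self.nat = [False] * 128

    def write(self, r, value, nat=False):
        if r == 0:
            raise ValueError("GR0 always reads 0 and cannot be written")
        self.value[r] = value & MASK64
        self.nat[r] = nat


MEMORY = {0x1000: 42}   # hypothetical memory: any other address "faults"


def speculative_load(rf, dst, addr):
    """A faulting access sets NaT on the target instead of raising."""
    if addr in MEMORY:
        rf.write(dst, MEMORY[addr])
    else:
        rf.write(dst, 0, nat=True)


def add(rf, dst, a, b):
    """NaT propagates: a result computed from an invalid value is invalid."""
    rf.write(dst, rf.value[a] + rf.value[b], nat=rf.nat[a] or rf.nat[b])


def needs_recovery(rf, r):
    """Speculation check: True means recovery code must redo the work."""
    return rf.nat[r]


rf = RegisterFile()
speculative_load(rf, 5, 0x1000)    # valid address
add(rf, 6, 5, 5)
speculative_load(rf, 7, 0xDEAD)    # invalid address, deferred fault
add(rf, 8, 7, 5)
print(rf.value[6], needs_recovery(rf, 6), needs_recovery(rf, 8))
```

Deferring the fault lets the compiler hoist a load above the branch that guards it: if the guarded path is never taken, the NaT value is never checked and no exception is raised.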
18
IA-64’s Large Register File
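[Figure: IA-64 register file]
• Integer registers (bits 63-0, plus NaT bit): GR0-GR31 are 32 static registers (GR0 reads 0); GR32-GR127 are 96 stacked, rotating registers
• Floating-point registers (bits 81-0): FR0-FR31 are 32 static registers (FR0 reads 0.0); FR32-FR127 are 96 rotating registers
• Predicate registers (bit 0): PR0-PR15 are 16 static registers (PR0 reads 1); PR16-PR63 are 48 rotating registers
• Branch registers (bits 63-0): BR0-BR7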
19
Intel's IA-64 ISA
• IA-64 instructions are 41 bits long and consist of
- op-code,
- predicate field (6 bits),
- two source register addresses (7 bits each),
- destination register address (7 bits), and
- special fields (includes integer and floating-point arithmetic).
• The 6-bit predicate field in each IA-64 instruction refers to a set of 64 predicate
registers.
• 6 types of instructions
- A: Integer ALU ==> I-unit or M-unit
- I: Non-ALU integer ==> I-unit
- M: Memory ==> M-unit
- B: Branch ==> B-unit
- F: Floating-point ==> F-unit
- L: Long Immediate ==> I-unit
• IA-64 instructions are packed by compiler into bundles.
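The field widths listed above can be checked with a short Python sketch that packs and unpacks a 41-bit instruction word. The field order in `FIELDS` and the lumping of the op-code and special fields into one 14-bit `op_special` field are assumptions made for illustration; the slides give only the widths, and the real positions depend on the instruction format.

```python
# (name, width in bits), listed from the least significant bit upwards.
# Assumed layout: 6 + 7 + 7 + 7 = 27 bits, leaving 14 for op-code and special fields.
FIELDS = [("qp", 6), ("dst", 7), ("src1", 7), ("src2", 7), ("op_special", 14)]
INSTRUCTION_BITS = sum(width for _, width in FIELDS)


def encode(**values):
    """Pack named field values into one instruction word."""
    word, shift = 0, 0
    for name, width in FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word |= value << shift
        shift += width
    return word


def decode(word):
    """Split an instruction word back into its named fields."""
    fields, shift = {}, 0
    for name, width in FIELDS:
        fields[name] = (word >> shift) & ((1 << width) - 1)
        shift += width
    return fields


word = encode(qp=3, dst=14, src1=15, src2=127, op_special=0x2000)
print(INSTRUCTION_BITS, word.bit_length(), decode(word))
```

A 6-bit predicate field addresses 2^6 = 64 predicate registers and a 7-bit register field addresses 2^7 = 128 registers, which matches the register model on the earlier slide.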
20
IA-64 Bundles
 A bundle is a 128-bit long instruction word (LIW) containing three 41-bit IA-64
instructions along with a so-called 5-bit template that contains instruction
grouping information.
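 IA-64 does not need no-ops to pad every issue group to a fixed width, because stops mark the group boundaries; a bundle slot that holds no useful operation is still filled with a no-op.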
 The template explicitly indicates:
• first 4 bits: the types of the three instructions
• last bit (stop bit): whether the bundle can be executed in parallel with the
next bundle
• (according to earlier literature): whether the instructions in the bundle can
be executed in parallel or whether one or more must be executed serially.
 Bundled instructions don't have to be in their original program order, and they
can even represent entirely different paths of a branch.
 Also, the compiler can mix dependent and independent instructions together in a
bundle, because the template keeps track of which is which.
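The bundle arithmetic (3 × 41 + 5 = 128 bits) can be sketched in Python as below. The function names are hypothetical. The layout places the template in the low 5 bits with instruction slots 0 to 2 above it, and the stop-bit test follows the slide's description of the template's last bit; treat both as assumptions to be checked against the architecture manual.

```python
SLOT_BITS = 41
TEMPLATE_BITS = 5
SLOT_MASK = (1 << SLOT_BITS) - 1
TEMPLATE_MASK = (1 << TEMPLATE_BITS) - 1


def pack_bundle(template, slots):
    """Combine a 5-bit template and three 41-bit instructions into 128 bits."""
    if not 0 <= template <= TEMPLATE_MASK:
        raise ValueError("template must fit in 5 bits")
    if len(slots) != 3 or any(not 0 <= s <= SLOT_MASK for s in slots):
        raise ValueError("expected three 41-bit instruction slots")
    bundle = template
    for i, slot in enumerate(slots):
        bundle |= slot << (TEMPLATE_BITS + i * SLOT_BITS)
    return bundle


def unpack_bundle(bundle):
    """Return (template, [slot0, slot1, slot2]) from a 128-bit bundle."""
    template = bundle & TEMPLATE_MASK
    slots = [(bundle >> (TEMPLATE_BITS + i * SLOT_BITS)) & SLOT_MASK
             for i in range(3)]
    return template, slots


def ends_with_stop(template):
    """Per the slide: the last template bit marks a stop after the bundle."""
    return bool(template & 1)


bundle = pack_bundle(0b01001, [0x1, 0x2, 0x3])
raw = bundle.to_bytes(16, "little")        # 128 bits = 16 bytes
print(len(raw), unpack_bundle(bundle), ends_with_stop(0b01001))
```

Because the bundle is exactly 16 bytes, bundles stay aligned in memory, and the fetch logic can read the template first to route each slot to an execution unit.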
21
IA-64 : Explicitly Parallel Architecture
 IA-64 template specifies
• The type of operation for each instruction
- MFI, MMI, MII, MLI, MIB, MMF, MFB, MMB,
MBB, BBB
• Intra-bundle relationship
- M / MI or MI / I
• Inter-bundle relationship
 Most common combinations covered by templates
• Headroom for additional templates
 Simplifies hardware requirements
 Scales compatibly to future generations
[Figure: 128-bit bundle layout: Instruction 2 (41 bits) | Instruction 1 (41 bits) | Instruction 0 (41 bits) | Template (5 bits)]
Legend: M = Memory, F = Floating-point, I = Integer, L = Long Immediate, B = Branch
Example: template MMI = Memory (M), Memory (M), Integer (I)

Instrumentation, measurement and control of bio process parameters ( Temperat...
 
US Department of Education FAFSA Week of Action
US Department of Education FAFSA Week of ActionUS Department of Education FAFSA Week of Action
US Department of Education FAFSA Week of Action
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AI
 
11. Properties of Liquid Fuels in Energy Engineering.pdf
11. Properties of Liquid Fuels in Energy Engineering.pdf11. Properties of Liquid Fuels in Energy Engineering.pdf
11. Properties of Liquid Fuels in Energy Engineering.pdf
 
Main Memory Management in Operating System
Main Memory Management in Operating SystemMain Memory Management in Operating System
Main Memory Management in Operating System
 
Configuration of IoT devices - Systems managament
Configuration of IoT devices - Systems managamentConfiguration of IoT devices - Systems managament
Configuration of IoT devices - Systems managament
 

VLIW or EPIC

  • 1. Very long instruction word
      - Very long instruction word (VLIW) refers to a processor architecture designed to take advantage of instruction-level parallelism (ILP).
      - Conventional processors mostly allow only programs that specify instructions to be executed one after another, whereas a VLIW processor allows programs to explicitly specify instructions to be executed at the same time (i.e., in parallel).
      - This type of processor architecture is intended to allow higher performance without the inherent complexity of some other approaches.
      - The term VLIW, and the concept of VLIW architecture itself, were invented by Josh Fisher in his research group at Yale University in the early 1980s.
  • 2. Explicitly parallel instruction computing
      - Explicitly parallel instruction computing (EPIC) is a term coined in 1997 by the HP–Intel alliance to describe a computing paradigm that researchers had been investigating since the early 1980s.
      - This paradigm is also called Independence architectures. It was the basis for Intel and HP development of the Intel Itanium architecture, and HP later asserted that "EPIC" was merely an old term for the Itanium architecture.
      - EPIC permits microprocessors to execute software instructions in parallel by using the compiler, rather than complex on-die circuitry, to control parallel instruction execution.
      - This was intended to allow simple performance scaling without resorting to higher clock frequencies.
  • 3. VLIW or EPIC
      - VLIW (very long instruction word):
          - The compiler packs a fixed number of instructions into a single VLIW instruction.
          - The instructions within a VLIW instruction are issued and executed in parallel.
          - Example: high-end signal processors (TMS320C6201)
      - EPIC (explicitly parallel instruction computing):
          - Evolution of VLIW
          - Example: Intel's IA-64, exemplified by the Itanium processor
  • 4. VLIW
      - VLIW (very long instruction word) processors use a long instruction word that contains a usually fixed number of operations that are fetched, decoded, issued, and executed synchronously.
      - All operations specified within a VLIW instruction must be independent of one another.
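The independence requirement can be illustrated with a small, conservative check a compiler might run before placing operations in the same instruction word. This is only a sketch: the `Op` record and the `independent` function are hypothetical names, registers are modelled as plain strings, and real VLIW machines differ in exactly which same-word register overlaps they tolerate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """One operation: a destination register and its source registers (hypothetical model)."""
    dest: str
    srcs: tuple


def independent(ops):
    """Return True if no operation in the word touches a register written by another one.

    Conservative rule assumed for this sketch: an operation may neither read
    nor write a register that a different operation in the same word writes.
    """
    for i, a in enumerate(ops):
        for j, b in enumerate(ops):
            if i == j:
                continue
            if a.dest in b.srcs:
                return False  # b reads a register that a writes
            if a.dest == b.dest:
                return False  # both operations write the same register
    return True


# Two operations on disjoint registers can share one instruction word.
word_ok = [Op("r1", ("r2", "r3")), Op("r4", ("r5", "r6"))]
# The second operation consumes r1, which the first one produces.
word_bad = [Op("r1", ("r2", "r3")), Op("r4", ("r1", "r5"))]
```

Under this rule `independent(word_ok)` holds and `independent(word_bad)` does not, so a scheduler would have to move the second operation of `word_bad` into a later instruction word.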
  • 5. VLIW
      - Some of the key issues of a (V)LIW processor:
          - (very) long instruction word (up to 1,024 bits per instruction),
          - each instruction consists of multiple independent parallel operations,
          - each operation requires a statically known number of cycles to complete,
          - a central controller that issues a long instruction word every cycle,
          - multiple FUs connected through a global shared register file.
  • 6. VLIW
      - sequential stream of long instruction words
      - instructions scheduled statically by the compiler
      - number of simultaneously issued instructions is fixed at compile time
      - instruction issue is less complicated than in a superscalar processor
  • 7. VLIW
      - Disadvantage: VLIW processors cannot react to dynamic events, e.g. cache misses, with the same flexibility as superscalars.
      - The number of instructions in a VLIW instruction word is usually fixed.
      - Padding VLIW instructions with no-ops is needed when the full issue bandwidth is not met. This increases code size. More recent VLIW architectures use a denser code format that allows the no-ops to be removed.
      - VLIW is an architectural technique, whereas superscalar is a microarchitecture technique.
      - VLIW processors take advantage of spatial parallelism.
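The code-size cost of no-op padding can be made concrete with a toy packer. This is a minimal sketch, not a real scheduler: `pack` and `nop_fraction` are hypothetical helpers, operations are plain strings, and each inner list is assumed to already hold operations that may issue together in one cycle.

```python
def pack(groups, width):
    """Pad each group of parallel operations to a fixed-width instruction word."""
    words = []
    for group in groups:
        if len(group) > width:
            raise ValueError("group does not fit in one instruction word")
        # Unused issue slots are filled with explicit no-ops.
        words.append(list(group) + ["nop"] * (width - len(group)))
    return words


def nop_fraction(words):
    """Fraction of all slots in the program that hold a no-op."""
    total = sum(len(word) for word in words)
    nops = sum(word.count("nop") for word in words)
    return nops / total


# Three cycles of work for a hypothetical 4-wide machine.
schedule = [["add", "mul", "ld"], ["sub"], ["st", "add"]]
words = pack(schedule, width=4)
```

Here six useful operations occupy twelve slots, so `nop_fraction(words)` is 0.5: half of the encoded program is padding, which is the overhead that denser code formats aim to remove.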
  • 8. EPIC: a paradigm shift
      - Superscalar RISC solution
          - Based on sequential execution semantics
          - Compiler's role is limited by the instruction set architecture
          - Superscalar hardware identifies and exploits parallelism
      - EPIC solution (the evolution of VLIW)
          - Based on parallel execution semantics
          - EPIC ISA enhancements support static parallelization
          - Compiler takes greater responsibility for exploiting parallelism
          - Compiler / hardware collaboration often resembles superscalar
  • 9. EPIC: a paradigm shift
      - Advantages of pursuing EPIC architectures
          - Make wide issue & deep latency less expensive in hardware
          - Allow processor parallelism to scale with additional VLSI density
      - Architect the processor to do well with in-order execution
          - Enhance the ISA to allow static parallelization
          - Use compiler technology to parallelize the program
          - However, a purely static VLIW is not appropriate for general-purpose use
  • 10. The fusion of VLIW and superscalar techniques
      - Superscalars need improved support for static parallelization
          - Static scheduling
          - Limited support for predicated execution
      - VLIWs need improved support for dynamic parallelization
          - Caches introduce dynamically changing memory latency
          - Compatibility: issue width and latency may change with new hardware
          - Application requirements, e.g. object-oriented programming with dynamic binding
      - EPIC processors exhibit features derived from both
          - Interlock & out-of-order execution hardware are compatible with EPIC (but not required!)
          - EPIC processors can use dynamic translation to parallelize in software
  • 11. Many EPIC features are taken from VLIWs
      - Minisupercomputer products stimulated VLIW research (FPS, Multiflow, Cydrome)
          - Minisupercomputers were specialized, costly, and short-lived
          - Traditional VLIWs not suited to general-purpose computing
          - VLIW resurgence in single-chip DSP & media processors
      - Minisupercomputers exaggerated forward-looking challenges:
          - Long latency
          - Wide issue
          - Large number of architected registers
          - Compile-time scheduling to exploit exotic amounts of parallelism
      - EPIC exploits many VLIW techniques
  • 12. Shortcomings of early VLIWs
      - Expensive multi-chip implementations
      - No data cache
      - Poor "scalar" performance
      - No strategy for object code compatibility
  • 13. EPIC design challenges
      - Develop architectures applicable to general-purpose computing
          - Find substantial parallelism in "difficult to parallelize" scalar programs
          - Provide compatibility across hardware generations
          - Support emerging applications (e.g. multimedia)
      - Compiler must find or create sufficient ILP
      - Combine the best attributes of VLIW & superscalar RISC (incorporated best concepts from all available sources)
      - Scale architectures for modern single-chip implementation
  • 14. EPIC Processors, Intel's IA-64 ISA and Itanium
      - Joint R&D project by Hewlett-Packard and Intel (announced in June 1994)
      - This resulted in the explicitly parallel instruction computing (EPIC) design style:
          - specifying ILP explicitly in the machine code, that is, the parallelism is encoded directly into the instructions, similarly to VLIW;
          - a fully predicated instruction set;
          - an inherently scalable instruction set (i.e., the ability to scale to a large number of FUs);
          - many registers;
          - speculative execution of load instructions
  • 15. IA-64 Architecture
      - Unique architecture features & enhancements
          - Explicit parallelism and templates
          - Predication, speculation, memory support, and others
          - Floating-point and multimedia architecture
      - IA-64 resources available to applications
          - Large, application-visible register set
          - Rotating registers, register stack, register stack engine
      - IA-32 & PA-RISC compatibility models
  • 16. Today's Architecture Challenges
      - Performance barriers:
          - Memory latency
          - Branches
          - Loop pipelining and call / return overhead
      - Headroom constraints:
          - Hardware-based instruction scheduling
              - Unable to efficiently schedule parallel execution
          - Resource constrained
              - Too few registers
              - Unable to fully utilize multiple execution units
      - Scalability limitations:
          - Memory addressing efficiency
  • 17. Intel's IA-64 ISA
      - Intel 64-bit Architecture (IA-64) register model:
          - 128 64-bit general-purpose registers GR0-GR127 to hold values for integer and multimedia computations
              - each register has one additional NaT (Not a Thing) bit to indicate whether the value stored is valid,
          - 128 82-bit floating-point registers FR0-FR127
              - registers FR0 and FR1 are read-only with values +0.0 and +1.0,
          - 64 1-bit predicate registers PR0-PR63
              - the first register PR0 is read-only and always reads 1 (true),
          - 8 64-bit branch registers BR0-BR7 to specify the target addresses of indirect branches
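The role of the read-only predicate register PR0 can be sketched with a toy model of the 64-entry predicate file. Everything here is illustrative: `PredicateFile` and `execute_if` are hypothetical names, general registers are a plain dictionary, and silently discarding writes to PR0 is an assumption of this sketch. The point is that an instruction qualified by PR0 behaves as unconditional, while one qualified by a false predicate leaves the register state untouched.

```python
class PredicateFile:
    """Toy model of 64 one-bit predicate registers; PR0 always reads true."""

    def __init__(self):
        self._bits = [False] * 64
        self._bits[0] = True

    def read(self, index):
        return True if index == 0 else self._bits[index]

    def write(self, index, value):
        # Assumption of this sketch: writes to PR0 are silently discarded.
        if index != 0:
            self._bits[index] = bool(value)


def execute_if(preds, qp, regs, dest, value):
    """Commit the register write only when the qualifying predicate qp is true."""
    if preds.read(qp):
        regs[dest] = value


preds = PredicateFile()
regs = {}
preds.write(6, True)                      # e.g. set by a compare whose condition held
execute_if(preds, 0, regs, "r1", 10)      # qualified by PR0: always executes
execute_if(preds, 6, regs, "r2", 20)      # predicate true: executes
execute_if(preds, 7, regs, "r3", 30)      # predicate false: no effect
```

After these calls `regs` holds only `r1` and `r2`; the write guarded by register 7 never happens.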
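  • 18. IA-64's Large Register File
      - [Figure: register file diagram]
          - Integer registers GR0-GR127: bits 63..0 plus a NaT bit; GR0-GR31 are the 32 static registers (GR0 holds 0), GR32-GR127 are the 96 stacked, rotating registers
          - Floating-point registers: bits 81..0; 32 static (the first holds 0.0) and 96 rotating
          - Predicate registers PR0-PR63: 1 bit each; PR0-PR15 are the 16 static registers (PR0 holds 1), PR16-PR63 are the 48 rotating registers
          - Branch registers BR0-BR7: bits 63..0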
  • 19. Intel's IA-64 ISA
      - IA-64 instructions are 41 bits long (earlier literature stated 40 bits) and consist of
          - op-code,
          - predicate field (6 bits),
          - two source register addresses (7 bits each),
          - destination register address (7 bits), and
          - special fields (includes integer and floating-point arithmetic).
      - The 6-bit predicate field in each IA-64 instruction refers to a set of 64 predicate registers.
      - 6 types of instructions
          - A: Integer ALU ==> I-unit or M-unit
          - I: Non-ALU integer ==> I-unit
          - M: Memory ==> M-unit
          - B: Branch ==> B-unit
          - F: Floating-point ==> F-unit
          - L: Long Immediate ==> I-unit
      - IA-64 instructions are packed by the compiler into bundles.
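The field widths above add up as 6 + 7 + 7 + 7 = 27 bits, which leaves 14 of the 41 bits for the op-code and special fields. The sketch below packs and unpacks such a word. The slide gives only the widths, so the bit positions chosen here (predicate in the lowest bits, then destination, then the two sources, with everything else lumped into one 14-bit `rest` field) are an assumption of this example, and `encode` and `decode` are hypothetical helper names.

```python
QP_BITS = 6       # predicate field
REG_BITS = 7      # each register address
REST_BITS = 41 - QP_BITS - 3 * REG_BITS   # 14 bits left for op-code and special fields


def encode(rest, src2, src1, dest, qp):
    """Pack the fields into a 41-bit integer (field order is an assumption)."""
    for value, bits in ((rest, REST_BITS), (src2, REG_BITS), (src1, REG_BITS),
                        (dest, REG_BITS), (qp, QP_BITS)):
        if not 0 <= value < (1 << bits):
            raise ValueError("field value out of range")
    return (rest << 27) | (src2 << 20) | (src1 << 13) | (dest << 6) | qp


def decode(word):
    """Split a 41-bit integer back into its fields."""
    return {
        "qp": word & 0x3F,
        "dest": (word >> 6) & 0x7F,
        "src1": (word >> 13) & 0x7F,
        "src2": (word >> 20) & 0x7F,
        "rest": (word >> 27) & 0x3FFF,
    }


# Hypothetical instruction: dest GR1, sources GR2 and GR3, qualified by PR0.
word = encode(rest=0x2000, src2=3, src1=2, dest=1, qp=0)
fields = decode(word)
```

A 7-bit register field addresses exactly the 128 general or floating-point registers, and the 6-bit predicate field addresses exactly the 64 predicate registers.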
  • 20. IA-64 Bundles
      - A bundle is a 128-bit long instruction word (LIW) containing three 41-bit IA-64 instructions along with a so-called 5-bit template that contains instruction grouping information.
      - IA-64 does not insert no-op instructions to fill slots in the bundles.
      - The template explicitly indicates:
          - first 4 bits: types of instructions
          - last bit (stop bit): whether the bundle can be executed in parallel with the next bundle
          - (previous literature): whether the instructions in the bundle can be executed in parallel or if one or more must be executed serially.
      - Bundled instructions don't have to be in their original program order, and they can even represent entirely different paths of a branch.
      - Also, the compiler can mix dependent and independent instructions together in a bundle, because the template keeps track of which is which.
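The bundle size follows directly from the numbers above: 3 × 41 + 5 = 128 bits. The following sketch packs three instruction slots and a template into one 128-bit integer and unpacks them again. Placing the template in the lowest five bits with slot 0 directly above it is an assumption of this example, and `pack_bundle` and `unpack_bundle` are hypothetical helper names, not part of any real toolchain.

```python
SLOT_BITS = 41
TEMPLATE_BITS = 5
SLOT_MASK = (1 << SLOT_BITS) - 1
TEMPLATE_MASK = (1 << TEMPLATE_BITS) - 1


def pack_bundle(template, slots):
    """Combine a 5-bit template and three 41-bit instructions into a 128-bit value."""
    if len(slots) != 3:
        raise ValueError("a bundle holds exactly three instructions")
    if not 0 <= template <= TEMPLATE_MASK:
        raise ValueError("template must fit in 5 bits")
    bundle = template
    for index, slot in enumerate(slots):
        if not 0 <= slot <= SLOT_MASK:
            raise ValueError("instruction must fit in 41 bits")
        bundle |= slot << (TEMPLATE_BITS + index * SLOT_BITS)
    return bundle


def unpack_bundle(bundle):
    """Recover the template and the three instruction slots from a bundle."""
    template = bundle & TEMPLATE_MASK
    slots = [(bundle >> (TEMPLATE_BITS + index * SLOT_BITS)) & SLOT_MASK
             for index in range(3)]
    return template, slots


bundle = pack_bundle(0x10, [1, 2, SLOT_MASK])
template, slots = unpack_bundle(bundle)
```

Because the last slot in this example has all 41 bits set, the resulting integer uses the full 128 bits, which confirms that the three slots and the template fill the bundle exactly.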
  • 21. IA-64: Explicitly Parallel Architecture
      - The IA-64 template specifies:
          - the type of operation for each instruction: MFI, MMI, MII, MLI, MIB, MMF, MFB, MMB, MBB, BBB
          - the intra-bundle relationship: M / MI or MI / I
          - the inter-bundle relationship
      - Most common combinations are covered by templates
          - Headroom for additional templates
      - Simplifies hardware requirements
      - Scales compatibly to future generations
      - [Figure: a 128-bit bundle made up of Instruction 2 (41 bits), Instruction 1 (41 bits), Instruction 0 (41 bits), and a 5-bit template. The example shown is an MMI bundle: Memory (M), Memory (M), Integer (I). Legend: M = Memory, F = Floating-point, I = Integer, L = Long Immediate, B = Branch.]