Advantages of FPGA
• The FPGA is one of the most popular logic circuit components
and has revolutionized the way digital systems are designed.
Some FPGA advantages include:
– Low-cost
– Fast-turnaround prototype implementation
– Supported by CAD/EDA tools
– High density
– High speed
– Programmable and versatile
– Flexible
– Reusable
– Large amounts of logic gates, registers, RAM and routing resources
– Quick time-to-market
– SRAM-based FPGAs provide the benefits of custom CMOS
FPGA
• There are two primary FPGA architectures:
– fine-grained
– coarse-grained
• Another difference in architectures is the underlying
technology used to manufacture the device. The
common technologies are:
– PROM/EPROM/EEPROM/FLASH based
– Anti-fuse based
– SRAM based
Programmable Switch Technology
• SRAM: disadvantages
– Volatile
– External permanent memory required
– Large area required
• SRAM: advantages
– Reprogrammable, easily and quickly
– Requires only standard integrated circuit process technology (as
opposed to antifuse)
Programmable Switch Technology
• Antifuse: disadvantages
– Not reprogrammable; links made are permanent
– Requires extra circuitry to deliver the high programming voltage
• Antifuse: advantages
– Small size
– Relatively low series resistance
– Low parasitic capacitance
Programmable Switch Technology
• EPROM
[Figure: floating-gate EPROM transistor shown in its two states (storing 1 and storing 0), with labels for the word line, control gate, bit line, oxide layer, floating gate, drain and source]
Programmable Switch Technology
• EPROM: disadvantages
– High resistance of EPROM transistors
– High static power consumption
– UV light exposure needed to reprogram
• EPROM: advantages
– No external memory required; retains memory even without power
– Reprogrammable
FPGA Architectures
• Fine-grained Architecture
– Fine-grained FPGAs are made up of a sea of gates,
transistors, or small macrocells
– Programmable interconnect runs between them
Almost the opposite of a CPLD
FPGA Architectures
• Coarse-grained Architecture
– Coarse-grained FPGAs include bigger macrocells
– The macrocells usually include Flip-Flops and Look
Up Tables (LUTs) which are used to implement
combinatorial logic functions
– In a majority of these architectures, a four-input look-up
table (think of it as a 16x1 ROM) implements the
actual logic
– The larger logic block usually results in improved
performance when compared to fine-grained
architectures
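The "16x1 ROM" view of a four-input look-up table can be sketched in a few lines of Python. This is an illustration only: the function names, the bit ordering of the address (A as the most significant bit) and the majority function used as the example are assumptions, not taken from the slides.

```python
def make_lut4(func):
    """Build the 16-entry, 1-bit-wide ROM image for a 4-input function."""
    rom = []
    for address in range(16):
        # Assumed ordering: A is the most significant address bit.
        a = (address >> 3) & 1
        b = (address >> 2) & 1
        c = (address >> 1) & 1
        d = address & 1
        rom.append(func(a, b, c, d) & 1)
    return rom


def read_lut4(rom, a, b, c, d):
    """Evaluate the logic function by reading one ROM location."""
    address = (a << 3) | (b << 2) | (c << 1) | d
    return rom[address]


# Hypothetical example: output is 1 when at least three inputs are 1.
majority_rom = make_lut4(lambda a, b, c, d: int(a + b + c + d >= 3))
```

Changing the logic function means changing only the 16 stored bits; the read path stays the same.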
FPGA Process Technology
• PROM/EPROM/EEPROM/FLASH based
– These implementations are typically programmed
out of circuit and may or may not be reprogrammable
• A PROM is a one-time programmable (OTP) device: it can be
programmed only once
– EPROM cells are electrically programmed in a
device programmer
– Some EPROM-based devices are erasable using
ultra-violet (UV) lights if they are in a windowed
package
– EEPROMs are in low-cost plastic packaging for
production
• Plastic packages cannot be UV erased; they are electrically
erased
FPGA Process Technology
• PROM/EPROM/EEPROM/FLASH based
– An Electrically Erasable Programmable Read-Only
Memory (EEPROM) cell is physically
larger than an EPROM cell but offers the advantage
of being erased electrically, with no special UV
eraser required.
• EEPROM devices can be erased, even in low-cost plastic
packaging.
– FLASH is flash-erased (bulk-erased) electrically erasable
programmable read-only memory.
• FLASH has the electrically erasable benefits of EEPROM
but the small, economical cell size of EPROM technology.
FPGA Process Technology
• Anti-fuse based
– Anti-fuse is a one-time programmable (OTP) technology
– Fuses are permanently put in place
– The anti-part of anti-fuse comes from its
programming method
• Instead of breaking a metal connection by passing current
through it, a link is grown to make a connection
– Anti-fuses are either amorphous silicon or metal-to-
metal connections
FPGA Process Technology
• Anti-fuse based
– The advantages of anti-fuse FPGAs include:
• They are usually physically quite small
• They have low resistance interconnect
– Disadvantages include
• They require large programming transistors on the device
• They cannot be reused (they are OTP)
FPGA Process Technology
• SRAM based
– SRAM cells are used as function generators to
implement combinatorial logic and are also used to
control multiplexers and routing resources
– This is by far the most popular process technology
– This method is similar to the technology used in static
RAM devices but with a few modifications
• The RAM cells in a memory device are designed for fastest
possible read/write performance
• The RAM cells in a programmable device are usually designed
for stability instead of read/write performance
• Consequently, RAM cells in a programmable device have a
low-impedance connection to VCC and ground to provide maximum
stability over voltage fluctuations
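How SRAM configuration cells steer routing can be sketched as below. This is a behavioural model only: the function names, the most-significant-bit-first select ordering, and the use of None for an open (high-impedance) switch are assumptions made for illustration.

```python
def pass_gate(config_bit, signal):
    """Pass-transistor switch: closed when the SRAM bit is 1, open when 0.

    None stands in for an undriven (high-impedance) wire segment.
    """
    return signal if config_bit else None


def routing_mux(inputs, select_bits):
    """Pick one input wire using SRAM configuration bits (MSB first)."""
    index = 0
    for bit in select_bits:
        index = (index << 1) | bit
    return inputs[index]


# Hypothetical example: four candidate wires, two configuration bits.
wires = [0, 1, 0, 0]
selected = routing_mux(wires, [0, 1])  # configuration bits 01 pick wires[1]
```

The configuration bits are written once when the device is configured and then held, which is why stability matters more than read/write speed for these cells.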
FPGA Process Technology
• SRAM based (cont.)
– Because static memory is volatile (the contents
disappear when the power is turned off), SRAM-
based devices are "booted" after power-on
– This makes them in-system programmable and re-
programmable, even in real-time
– As a result, SRAM-based FPGAs are common in
reconfigurable computing applications where the
device's function is dynamically changed
FPGA Process Technology
• SRAM based (cont.)
– The configuration process typically requires only a
few hundred milliseconds at most
– Most SRAM-based devices can boot themselves
automatically at power-on much like a
microprocessor
– Most SRAM-based devices are designed to work
with either standard byte-wide PROMs or with
sequential-access serial PROMs
FPGAs
• Historically, FPGA architectures and
companies began around the same time as
CPLDs
• FPGAs are closer to “programmable ASICs”
– Large emphasis on interconnection routing
– Timing is difficult to predict -- multiple hops vs. the
fixed delay of a CPLD’s switch matrix
– But more “scalable” to large sizes
• FPGA programmable logic blocks have only a
few inputs and 1 or 2 flip-flops, but there are a
lot more of them compared to the number of
macrocells in a CPLD
FPGAs
• FPGAs do not contain AND or OR planes
• Three major elements, all of them programmable:
– Logic blocks
– I/O blocks
– Interconnection wires and switches
[Figure: FPGA structure showing logic blocks, I/O blocks, and interconnection switches]
Other FPGA Building Blocks
• Clock distribution
• Embedded memory blocks
• Special purpose blocks:
– DSP blocks:
• Hardware multipliers, adders and registers
– Embedded microprocessors/microcontrollers
– High-speed serial transceivers
FPGA – Basic Logic Element
• LUT to implement combinatorial logic
• Register for sequential circuits
• Additional logic (not shown):
– Carry logic for arithmetic functions
– Expansion logic for functions requiring more than 4 inputs
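[Figure: basic logic element; inputs A, B, C, D feed a LUT, the LUT output feeds a D flip-flop driven by Clock, and a Select-controlled multiplexer drives Out]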
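The basic logic element described on this slide (a LUT, a register, and an output select) can be modelled roughly as follows. The class name, method names, address bit ordering and the parity example are all assumptions for illustration; carry and expansion logic are left out.

```python
class BasicLogicElement:
    """Behavioural sketch of a 4-input LUT followed by an optional register."""

    def __init__(self, lut_bits, select_registered):
        if len(lut_bits) != 16:
            raise ValueError("a 4-input LUT needs exactly 16 bits")
        self.lut_bits = list(lut_bits)
        self.select_registered = select_registered  # the Select control
        self.q = 0  # flip-flop state

    def lut(self, a, b, c, d):
        # Assumed ordering: A is the most significant address bit.
        return self.lut_bits[(a << 3) | (b << 2) | (c << 1) | d]

    def clock_edge(self, a, b, c, d):
        """On a clock edge the flip-flop captures the LUT output."""
        self.q = self.lut(a, b, c, d)

    def out(self, a, b, c, d):
        """Out is either the registered value or the raw LUT output."""
        return self.q if self.select_registered else self.lut(a, b, c, d)


# Hypothetical example: LUT programmed as 4-input odd parity.
parity_bits = [bin(i).count("1") & 1 for i in range(16)]
combinational = BasicLogicElement(parity_bits, select_registered=False)
registered = BasicLogicElement(parity_bits, select_registered=True)
```

With select_registered set, the output changes only after clock_edge is called, which is the behaviour needed for sequential circuits.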
Look-Up Tables (LUT)
• Look-up table with N-inputs can be used to implement
any combinatorial function of N inputs
• LUT is programmed with the truth-table
[Figure: a function Z of inputs A, B, C, D shown three ways: as a truth table, as a gate implementation, and as a LUT implementation]
LUT Implementation
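• Example: 3-input LUT
• Based on multiplexers (pass transistors)
• LUT entries stored in configuration memory cells
[Figure: 3-input LUT built as a multiplexer tree; eight configuration memory cells (each holding 0/1) feed the tree, inputs X1, X2 and X3 drive the select lines, and the output is F]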
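A multiplexer-based 3-input LUT can be sketched as a tree of 2-to-1 multiplexers fed by eight configuration cells. Which input drives which level of the tree is an assumption here (X1 at the first level, X3 at the last), as are the function names and the example function.

```python
def mux2(select, in0, in1):
    """2-to-1 multiplexer: returns in1 when select is 1, otherwise in0."""
    return in1 if select else in0


def lut3(cells, x1, x2, x3):
    """Evaluate a 3-input LUT as a three-level multiplexer tree."""
    if len(cells) != 8:
        raise ValueError("a 3-input LUT needs exactly 8 configuration cells")
    # Level 1: X1 chooses within each pair of configuration cells.
    level1 = [mux2(x1, cells[i], cells[i + 1]) for i in range(0, 8, 2)]
    # Level 2: X2 chooses within each pair of level-1 outputs.
    level2 = [mux2(x2, level1[i], level1[i + 1]) for i in range(0, 4, 2)]
    # Level 3: X3 chooses the final output F.
    return mux2(x3, level2[0], level2[1])


# Hypothetical example: F = X1 AND (X2 OR X3).
# Cell i holds F for X1 = bit 0, X2 = bit 1, X3 = bit 2 of i.
example_cells = [(i & 1) & (((i >> 1) & 1) | ((i >> 2) & 1)) for i in range(8)]
```

With this ordering, the tree returns the cell at index X1 + 2·X2 + 4·X3, so the eight cells are simply the truth table of F.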
Special Features
• Clock management
– PLL, DLL
– Eliminate clock skew between external clock input
and on-chip clock
– Low-skew global clock distribution network
• Support for various interface standards
• High-speed serial I/Os
• Embedded processor cores
• DSP blocks
Configuration Storage Elements
• Static Random Access Memory (SRAM)
– Logical configuration is controlled by the state of
SRAM bits
– FPGA needs to be configured at power-on from
a separate ROM
• Flash Erasable Programmable ROM (Flash)
– Logical configuration is implemented by floating-gate
transistors that can be turned off by injecting
charge onto their floating gates.
– FPGA itself holds the program
– reprogrammable, even in-circuit
FPGA
• Xilinx refers to the “interconnection switches” as the
switch matrix
[Figure: array of CLBs with programmable switch matrices (SM) between them, surrounded by IOBs on all four sides]
FPGA Logic Block
• The storage cells in the LUTs in an FPGA are volatile
– losing stored contents whenever the power is off
• A PROM is used to hold the data permanently
• The storage cells are loaded automatically from
PROM when the chip is initialized
[Figure: logic block; inputs In1 to In4 feed a LUT whose 0/1 storage cells hold function f, followed by a D flip-flop driven by Clock and a Select-controlled multiplexer driving Out]
FPGAs
• An example of programming an FPGA
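– f1 = x1 x2
– f2 = x2 x3
– f = x1 x2 + x2 x3
[Figure: three 2-input LUTs with their programmed contents; the first LUT takes x1 and x2 and produces f1, the second takes x2 and x3 and produces f2, and the third (f3) combines f1 and f2 to produce f]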
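The programming example can be sketched as three 2-input LUTs whose contents are the truth tables of f1, f2 and the final OR. The equations are taken as printed on the slide; the LUT addressing (first input as the high bit) and all names are assumptions for illustration.

```python
def lut2(contents, in1, in2):
    """2-input LUT: contents[0..3] hold the outputs for inputs 00, 01, 10, 11."""
    return contents[(in1 << 1) | in2]


# Programmed contents (truth tables), taking the equations as printed.
F1_LUT = [0, 0, 0, 1]  # f1 = x1 AND x2
F2_LUT = [0, 0, 0, 1]  # f2 = x2 AND x3
F3_LUT = [0, 1, 1, 1]  # f  = f1 OR f2


def f(x1, x2, x3):
    """Evaluate f = x1 x2 + x2 x3 through the three programmed LUTs."""
    f1 = lut2(F1_LUT, x1, x2)
    f2 = lut2(F2_LUT, x2, x3)
    return lut2(F3_LUT, f1, f2)
```

Implementing a different function on the same hardware only requires loading different contents into the three tables.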
Logic Function Generators
• Look-Up Tables (LUT)
– Memory to store truth tables
• F, G
– 16 x 1 SRAMs
• H
– 8 x 1 SRAM
• Can be configured as memory
CLB function generators (F, G, H)
• Use RAM to store a truth table
– F, G: 4 inputs, 16 bits of RAM each
– H: 3 inputs, 8 bits of RAM
– RAM is loaded from an external PROM at system
initialization.
• Broad capability using F, G, and H:
– Any 2 funcs of 4 vars, plus a func of 3 vars
– Any func of 5 vars
– Any func of 4 vars, plus some funcs of 6 vars
– Some funcs of 9 vars, including parity and 4-bit
cascadable equality checking
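One way to see why F, G and H together cover any function of 5 variables is a Shannon expansion on the fifth variable: F holds the function with e = 0, G holds it with e = 1, and H selects between them. The sketch below assumes that F and G share the same four inputs and that H sees the F output, the G output and e; the function names and the test functions are hypothetical.

```python
def bits4(i):
    """Split a 4-bit address into (a, b, c, d), with a as the high bit."""
    return (i >> 3) & 1, (i >> 2) & 1, (i >> 1) & 1, i & 1


def build_tables(func5):
    """Program F (16 bits), G (16 bits) and H (8 bits) for a 5-input function."""
    f_table = [func5(*bits4(i), 0) & 1 for i in range(16)]  # cofactor e = 0
    g_table = [func5(*bits4(i), 1) & 1 for i in range(16)]  # cofactor e = 1
    # H address = (f_out << 2) | (g_out << 1) | e; H acts as a 2-to-1 selector.
    h_table = [(g if e else f) for f in (0, 1) for g in (0, 1) for e in (0, 1)]
    return f_table, g_table, h_table


def evaluate(tables, a, b, c, d, e):
    """Evaluate the 5-input function through the three function generators."""
    f_table, g_table, h_table = tables
    address = (a << 3) | (b << 2) | (c << 1) | d
    f_out = f_table[address]
    g_out = g_table[address]
    return h_table[(f_out << 2) | (g_out << 1) | e]


# Hypothetical example: 5-input odd parity.
parity5 = build_tables(lambda a, b, c, d, e: a ^ b ^ c ^ d ^ e)
```

The table sizes match the slide: 16 bits each for F and G, and 8 bits for H.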
FPGAs
• CLB input and output connections – buried in the sea
of interconnect
[Figure: a CLB surrounded by interconnect]
The Fitter’s Job
• Partition logic functions into CLBs
• Arrange the CLBs
• Interconnect the CLBs
• Minimize the number of CLBs used
• Minimize the size and delay of interconnect
used
• Work with constraints
– “Locked” I/O pins
– Critical-path delays
– Setup and hold times of storage elements
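The "arrange the CLBs" and "minimize interconnect" steps can be illustrated with a toy placement problem. This is only a sketch: it tries every placement exhaustively, which works for a handful of blocks but is far too slow for a real device, and it is not a description of how any actual fitter works. The block names, the 2 x 2 grid, the net list and the Manhattan-distance cost are all assumptions.

```python
from itertools import permutations


def wirelength(placement, nets):
    """Total Manhattan distance over all two-point nets."""
    total = 0
    for a, b in nets:
        (xa, ya), (xb, yb) = placement[a], placement[b]
        total += abs(xa - xb) + abs(ya - yb)
    return total


def best_placement(blocks, sites, nets):
    """Exhaustively search for the placement with the least wirelength."""
    best_cost, best = None, None
    for perm in permutations(sites, len(blocks)):
        placement = dict(zip(blocks, perm))
        cost = wirelength(placement, nets)
        if best_cost is None or cost < best_cost:
            best_cost, best = cost, placement
    return best_cost, best


# Hypothetical example: four CLBs connected in a chain, placed on a 2 x 2 grid.
BLOCKS = ["A", "B", "C", "D"]
SITES = [(0, 0), (0, 1), (1, 0), (1, 1)]
NETS = [("A", "B"), ("B", "C"), ("C", "D")]
cost, placement = best_placement(BLOCKS, SITES, NETS)
```

Constraints such as locked I/O pins would shrink the set of legal placements, and critical-path limits would add terms to the cost.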
Logic Fabric
• Logic Cell
– Lookup table (LUT)
– Flip-Flop
– Carry logic
– Muxes (not shown)
• Slice
– Two Logic Cells
• Spartan-3E FPGAs
– 2K to 33K logic cells
[Figure: logic cell with a 4-input LUT (inputs I0 to I3, output O) feeding a D flip-flop with CE, SET and RST; a slice is drawn as two such cells]
Memory
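• Block RAM
– RAM or ROM
– True dual port
• Separate read and write ports
– Independent port size
• Data width translation
– Excellent for FIFOs
[Figure: dual-port block RAM symbol; port A has DIA, DIPA, ADDRA and CLKA inputs and DOA, DOPA outputs; port B has DIB, DIPB, ADDRB and CLKB inputs and DOB, DOPB outputs]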
Block RAM Configurations
Configuration   Depth   Data bits   Parity bits
16K x 1         16K     1           0
8K x 2          8K      2           0
4K x 4          4K      4           0
2K x 9          2K      8           1
1K x 18         1K      16          2
512 x 36        512     32          4
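The table can be checked with a little arithmetic: every configuration holds the same number of data bits, and the wider configurations add parity bits on top. The sketch below only restates the table; the variable and function names are made up for illustration.

```python
K = 1024

# (configuration, depth in words, data bits per word, parity bits per word)
CONFIGS = [
    ("16K x 1", 16 * K, 1, 0),
    ("8K x 2", 8 * K, 2, 0),
    ("4K x 4", 4 * K, 4, 0),
    ("2K x 9", 2 * K, 8, 1),
    ("1K x 18", 1 * K, 16, 2),
    ("512 x 36", 512, 32, 4),
]


def capacity(depth, data_bits, parity_bits):
    """Return (total data bits, total parity bits) for one configuration."""
    return depth * data_bits, depth * parity_bits


capacities = {name: capacity(depth, data, parity)
              for name, depth, data, parity in CONFIGS}
```

Each entry works out to 16,384 data bits; the x9, x18 and x36 configurations carry a further 2,048 parity bits, for 18,432 bits in total.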
Multipliers
• 18 x 18 Multipliers
– Signed or unsigned
– Optional pipeline stage
– Cascadable
[Figure: multiplier block with two 18-bit inputs and a 36-bit output]
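Why two 18-bit operands need a 36-bit result can be checked with a small model. The function names are hypothetical, and treating the unsigned mode as a plain 18-bit by 18-bit unsigned multiply is an assumption made for illustration rather than a statement about the actual hardware block.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        return value - (1 << bits)
    return value


def multiply_18x18(a_raw, b_raw, signed=True):
    """Multiply two 18-bit operands and return the raw 36-bit product."""
    mask18 = (1 << 18) - 1
    if signed:
        a, b = to_signed(a_raw, 18), to_signed(b_raw, 18)
    else:
        a, b = a_raw & mask18, b_raw & mask18
    return (a * b) & ((1 << 36) - 1)


# Largest-magnitude signed case: (-2**17) * (-2**17) = 2**34.
largest_signed = multiply_18x18(0x20000, 0x20000)
# -1 * 1 gives -1, which is all ones in 36 bits.
minus_one = to_signed(multiply_18x18(0x3FFFF, 0x00001), 36)
```

In both modes the largest possible product fits in 36 bits, so no result is ever truncated.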
Summary
• Complex Programmable Logic Devices
– Function Blocks
• AND Arrays and Macrocells
– Programmable Interconnect
– I/O
• Field Programmable Gate Arrays
– Configurable Logic Blocks
• Look-up Tables
– Programmable Interconnect
– I/O
Editor's notes
From Rose article (AMR): SRAM, where the switch is a pass transistor controlled by the state of an SRAM bit. SRAM = Static Random Access Memory. Control pass gate: when a one is stored in the SRAM cell, the pass gate acts as a closed switch and can be used to make a connection between two wire segments. When a zero is stored, the switch is open and the transistor presents a high resistance between the two wire segments. For a multiplexer, the SRAM cells connected to the select lines control which one of the multiplexer inputs is connected to the output. D flip-flops may be included with the multiplexer to hold the state of the selection. (KCompton) Same idea as with the multiplexer: the bit can be used as the selection for an ALU (add, subtract, whatever…). (KCompton) Or that bit can be used to select the output from the LUT.
From Rose article (AMR): SRAM, where the switch is a pass transistor controlled by the state of an SRAM bit. Since SRAM is volatile (it needs to be reprogrammed after losing power), the FPGA must be loaded and configured at the time of chip power-up. This requires external permanent memory, such as PROM, EPROM, EEPROM or magnetic disk, to provide the programming bits. A major disadvantage of SRAM programming technology is its large area: it takes at least five transistors to implement an SRAM cell, plus at least one transistor to serve as a programmable switch. However, SRAM programming technology has two major advantages: fast reprogrammability, and the fact that it requires only standard integrated circuit process technology.
From Rose article (AMR): Antifuse, which, when electrically programmed, forms a low-resistance path. When a high voltage (from 11 to 20 volts, depending on the type of antifuse) is applied across its terminals, the antifuse will “blow” and create a low-resistance link.
From Rose article (AMR) and howstuffworks: EPROM, where the switch is a floating-gate transistor that can be turned off by injecting charge onto its floating gate. EPROM stands for Erasable Programmable Read-Only Memory. Bits can only go from 1 to 0. Read more on how EPROM works. … transistor that can be permanently “disabled.” This is accomplished by injecting a charge on the floating gate (gate 2 in the figure) using a high voltage between the control gate 1 and the drain of the transistor. This charge increases the threshold voltage of the transistor so that it turns off. The charge is removed by exposing the floating gate to UV light. This lowers the threshold voltage of the transistor and makes the transistor function normally. Rather than using an EPROM transistor directly as a programmable switch, the unprogrammed transistor is used to pull down a “bit line” when the “word line” is set high, as illustrated in Fig. 3. While this approach can be simply used to make a connection between the word and bit lines, it can also be used to implement a wired-AND style of logic, thereby providing both logic and routing. http://computer.howstuffworks.com/rom4.htm
From Rose article (AMR), wikipedia (EPROM): EPROM, where the switch is a floating-gate transistor that can be turned off by injecting charge onto its floating gate. High static power consumption (because of the pull-up resistor).
Avnet SpeedWay Workshops Explain the basic LUT/Slice architecture. Since this is generic, you may decide to mention the CLB – don’t muddy the water, however. The main idea is to explain the composition of the logic fabric, and the typical fpga sizes in terms of logic cells. The F5MUX and FiMUX benefits and operations are covered in XAPP466 here: http://www.xilinx.com/bvdocs/appnotes/xapp466.pdf The basics are that the additional muxes have the capability of implementing any function of 5 inputs (F5), 6 inputs (F6), 7 inputs (F7) and 8 inputs (F8) without leaving the CLB. This doesn’t introduce any level of logic delay because the routes are inside the CLB. As far as mux functionality, the F5,6,7,8 muxes can be used to create 4:1, 8:1, 16:1 and 32:1 muxes.
Avnet SpeedWay Workshops Pipelining in Spartan-3E means using the registers at the input and output of the multiplier. Unlike Spartan-3, the registers are part of the multiplier block and are not used from the fabric.