
FPGA

Abhilash Nair
20 Feb 2013


FPGA

  1. EET 3350 Digital Systems Design • Textbook: John Wakerly • Chapter 9: 9.6 FPGAs
  2. FPGA • FPGA Basics • FPGA Architecture – CLBs – I/O Blocks – Switch Matrix • Xilinx FPGAs • System Development Boards
  3. Advantages of FPGA • The FPGA is one of the most popular logic circuit components and has revolutionized the way digital systems are designed. Some FPGA advantages include: – Low cost – Fast-turnaround prototype implementation – Supported by CAD/EDA tools – High density – High speed – Programmable and versatile – Flexible – Reusable – Large amounts of logic gates, registers, RAM and routing resources – Quick time-to-market – SRAM-based FPGAs provide the benefits of custom CMOS
  4. FPGA • There are two primary FPGA architectures: – fine-grained – coarse-grained • Another difference in architectures is the underlying technology used to manufacture the device. The common technologies are: – PROM/EPROM/EEPROM/FLASH based – Anti-fuse based – SRAM based
  5. Programmable Switch Technology • SRAM • Antifuse • EPROM [Figure: an SRAM cell controlling a pass gate, and SRAM cells holding 0 or 1 driving the select input of a multiplexer (MUX)]
  6. Programmable Switch Technology • SRAM • Antifuse • EPROM • SRAM disadvantages: – Volatile – External permanent memory required – Large area required • SRAM advantages: – Reprogrammable, easily and quickly – Requires only standard integrated circuit process technology (as opposed to antifuse)
  7. Programmable Switch Technology • SRAM • Antifuse • EPROM [Figure: antifuse switch, states 0 and 1]
  8. AntiFuse Technology • Growing an antifuse
  9. Programmable Switch Technology • SRAM • Antifuse • EPROM • Antifuse disadvantages: – Not reprogrammable; links made are permanent – Requires extra circuitry to deliver the high programming voltage • Antifuse advantages: – Small size – Relatively low series resistance – Low parasitic capacitance
  10. Programmable Switch Technology • SRAM • Antifuse • EPROM [Figure: two EPROM floating-gate cells, one storing 1 and one storing 0, each labelled with word line, control gate, oxide layer, floating gate, bit line, drain and source]
  11. Programmable Switch Technology • SRAM • Antifuse • EPROM • EPROM disadvantages: – High resistance of EPROM transistors – High static power consumption – UV light exposure needed to reprogram • EPROM advantages: – No external memory required; retains memory even without power – Reprogrammable
  12. Programmable Switch Technology • A summary of programmable switches
  13. FPGA Architectures • Fine-grained Architecture – Fine-grained FPGAs are made up of a sea of gates, transistors or small macrocells, with programmable interconnect between them – Almost the opposite of a CPLD
  14. FPGA Architectures • Coarse-grained Architecture – Coarse-grained FPGAs include bigger macrocells – The macrocells usually include flip-flops and look-up tables (LUTs), which are used to implement combinatorial logic functions – In a majority of these architectures, a four-input look-up table (think of it as a 16x1 ROM) implements the actual logic – The larger logic block usually results in improved performance when compared to fine-grained architectures
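The "16x1 ROM" view of a four-input LUT can be sketched in a few lines of Python. This is only an illustration: the function names and the example logic function are assumptions made here, not part of the slides.

```python
def make_lut4(func):
    """Return the 16 stored bits (the 'ROM contents') for a 4-input function."""
    bits = []
    for addr in range(16):
        # Split the 4-bit address into the four LUT inputs, a = most significant.
        a, b, c, d = (addr >> 3) & 1, (addr >> 2) & 1, (addr >> 1) & 1, addr & 1
        bits.append(func(a, b, c, d) & 1)
    return bits


def read_lut4(bits, a, b, c, d):
    """Evaluate the LUT: the inputs form an address into the 16x1 memory."""
    addr = (a << 3) | (b << 2) | (c << 1) | d
    return bits[addr]


def example_function(a, b, c, d):
    # Hypothetical logic function chosen only for this sketch.
    return (a & b) | (c & d)


rom = make_lut4(example_function)
print(rom)
```

Once the 16 bits are stored, evaluating the logic is a single memory read, whatever the function is; that is why any function of four inputs costs the same LUT.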
  15. FPGA Process Technology • PROM/EPROM/EEPROM/FLASH based – These implementations are typically programmed out of circuit and may or may not be reprogrammable • A PROM is a one-time programmable (OTP) device: it can be programmed only once – EPROM cells are electrically programmed in a device programmer – Some EPROM-based devices are erasable using ultraviolet (UV) light if they are in a windowed package – EEPROMs are in low-cost plastic packaging for production • Plastic packages cannot be UV erased; they are electrically erased
  16. FPGA Process Technology • PROM/EPROM/EEPROM/FLASH based – An Electrically Erasable Programmable Read-Only Memory (EEPROM) cell is physically larger than an EPROM cell but offers the advantage of being erased electrically, with no special UV eraser required • EEPROM devices can be erased even in low-cost plastic packaging – FLASH is flash-erased (or bulk-erased) electrically erasable programmable read-only memory • FLASH has the electrically erasable benefits of EEPROM but the small, economical cell size of EPROM technology
  17. FPGA Process Technology • Anti-fuse based – Anti-fuse devices are one-time programmable (OTP) – Fuses are permanently put in place – The anti-part of anti-fuse comes from its programming method • Instead of breaking a metal connection by passing current through it, a link is grown to make a connection – Anti-fuses are either amorphous silicon or metal-to-metal connections
  18. FPGA Process Technology • Anti-fuse based – The advantages of anti-fuse FPGAs include: • They are usually physically quite small • They have low-resistance interconnect – Disadvantages include: • They require large programming transistors on the device • They cannot be reused (they are OTP)
  19. FPGA Process Technology • SRAM based – SRAM cells are implemented as function generators to simulate combinatorial logic and are also used to control multiplexers and routing resources – This is by far the most popular process technology – This method is similar to the technology used in static RAM devices but with a few modifications • The RAM cells in a memory device are designed for the fastest possible read/write performance • The RAM cells in a programmable device are usually designed for stability instead of read/write performance • Consequently, RAM cells in a programmable device have a low-impedance connection to VCC and ground to provide maximum stability over voltage fluctuations
  20. FPGA Process Technology • SRAM based (cont.) – Because static memory is volatile (the contents disappear when the power is turned off), SRAM-based devices are "booted" after power-on – This makes them in-system programmable and re-programmable, even in real time – As a result, SRAM-based FPGAs are common in reconfigurable computing applications where the device's function is dynamically changed
  21. FPGA Process Technology • SRAM based (cont.) – The configuration process typically requires only a few hundred milliseconds at most – Most SRAM-based devices can boot themselves automatically at power-on much like a microprocessor – Most SRAM-based devices are designed to work with either standard byte-wide PROMs or with sequential-access serial PROMs
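A rough feel for the "few hundred milliseconds at most" figure comes from dividing the bitstream size by the configuration data rate. The sketch below is a back-of-the-envelope estimate only: the bitstream size and configuration clock are hypothetical values picked for illustration, not figures from the slides or from any data sheet, and real devices add start-up overhead that is ignored here.

```python
def config_time_ms(bitstream_bits, clock_hz, bits_per_clock=1):
    """Estimate configuration time in milliseconds.

    bits_per_clock is 1 for a serial PROM and 8 for a byte-wide PROM.
    """
    if bitstream_bits < 0 or clock_hz <= 0 or bits_per_clock <= 0:
        raise ValueError("sizes and rates must be positive")
    return 1000.0 * bitstream_bits / (clock_hz * bits_per_clock)


# Hypothetical example: 1,000,000 configuration bits, 10 MHz configuration clock.
serial_ms = config_time_ms(1_000_000, 10_000_000)         # serial PROM
byte_wide_ms = config_time_ms(1_000_000, 10_000_000, 8)   # byte-wide PROM
print(serial_ms, byte_wide_ms)
```

Under these assumed numbers the serial load takes about a tenth of a second, and the byte-wide load is eight times shorter.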
  22. FPGAs • Historically, FPGA architectures and companies began around the same time as CPLDs • FPGAs are closer to “programmable ASICs” – Large emphasis on interconnection routing – Timing is difficult to predict: multiple hops vs. the fixed delay of a CPLD’s switch matrix – But more “scalable” to large sizes • FPGA programmable logic blocks have only a few inputs and 1 or 2 flip-flops, but there are a lot more of them compared to the number of macrocells in a CPLD
  23. FPGAs • General FPGA chip architecture, coarse-grained – Logic blocks a.k.a. CLBs: “configurable logic blocks”
  24. FPGAs • FPGAs do not contain AND or OR planes • Three major elements: – Logic blocks – I/O blocks – Interconnection wires and switches • All elements are programmable [Figure labels: Interconnection Switches, Logic Block, I/O Block]
  25. Other FPGA Building Blocks • Clock distribution • Embedded memory blocks • Special purpose blocks: – DSP blocks: • Hardware multipliers, adders and registers – Embedded microprocessors/microcontrollers – High-speed serial transceivers
  26. FPGA – Basic Logic Element • LUT to implement combinatorial logic • Register for sequential circuits • Additional logic (not shown): – Carry logic for arithmetic functions – Expansion logic for functions requiring more than 4 inputs [Figure: basic logic element with inputs A, B, C, D, a LUT, a D flip-flop (D, Q, Clock), a Select signal and output Out]
  27. Look-Up Tables (LUT) • A look-up table with N inputs can be used to implement any combinatorial function of N inputs • The LUT is programmed with the truth table [Figure: a function Z of inputs A, B, C, D shown as a truth table, as a gate implementation and as a LUT implementation]
  28. LUT Implementation • Example: 3-input LUT • Based on multiplexers (pass transistors) • LUT entries stored in configuration memory cells [Figure: eight configuration memory cells (0/1) feeding a multiplexer tree selected by X1, X2 and X3, with output F]
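The multiplexer-based 3-input LUT described above can be modelled as a tree of 2:1 multiplexers whose leaves are the eight configuration cells. This is a minimal sketch: the ordering of the select inputs (X1 on the first level, X3 on the last) is an assumption made for illustration, and the function names are not from the slides.

```python
def mux2(sel, in0, in1):
    """2:1 multiplexer: passes in1 when sel is 1, otherwise in0."""
    return in1 if sel else in0


def lut3(cells, x1, x2, x3):
    """3-input LUT built from seven 2:1 muxes over eight configuration cells."""
    if len(cells) != 8:
        raise ValueError("a 3-input LUT needs exactly 8 configuration cells")
    # Level 1: four muxes, each choosing between two adjacent cells using X1.
    level1 = [mux2(x1, cells[i], cells[i + 1]) for i in range(0, 8, 2)]
    # Level 2: two muxes selected by X2.
    level2 = [mux2(x2, level1[0], level1[1]), mux2(x2, level1[2], level1[3])]
    # Level 3: one mux selected by X3 drives the output F.
    return mux2(x3, level2[0], level2[1])


# Program the cells with the truth table of a 3-input XOR (odd parity).
xor3_cells = [bin(i).count("1") % 2 for i in range(8)]
print(lut3(xor3_cells, 1, 0, 1))
```

With this ordering the tree selects cell number x1 + 2*x2 + 4*x3, so reprogramming the eight cells changes the function without touching the multiplexers.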
  29. Other FPGA Building Blocks • Clock distribution • Embedded memory blocks • Special purpose blocks: – DSP blocks: • Hardware multipliers, adders and registers – Embedded microprocessors/microcontrollers – High-speed serial transceivers
  30. Special Features • Clock management – PLL, DLL – Eliminate clock skew between external clock input and on-chip clock – Low-skew global clock distribution network • Support for various interface standards • High-speed serial I/Os • Embedded processor cores • DSP blocks
  31. Configuration Storage Elements • Static Random Access Memory (SRAM) – Logical configuration is controlled by the state of SRAM bits – The FPGA needs to be configured at power-on from a separate ROM • Flash Erasable Programmable ROM (Flash) – Logical configuration is implemented by floating-gate transistors that can be turned off by injecting charge onto their floating gates – The FPGA itself holds the program – Reprogrammable, even in-circuit
  32. FPGA • Xilinx refers to the “interconnection switches” as the switch matrix [Figure: array of CLBs bordered by IOBs, with programmable switch matrices (SM) among the CLBs]
  33. FPGAs • Programmable Switch Matrix [Figure: programmable switch elements, e.g. for turning the corner]
  34. FPGA Logic Block • The storage cells in the LUTs in an FPGA are volatile, losing their stored contents whenever the power is off • A PROM is used to hold the data permanently • The storage cells are loaded automatically from the PROM when the chip is initialized [Figure: logic block with inputs In1 to In4, a LUT of 0/1 storage cells, a D flip-flop with Clock, a Select signal and output Out]
  35. FPGAs • An example of programming an FPGA • f1 = x1 x2 • f2 = x2 x3 • f = x1 x2 + x2 x3 [Figure: section of a programmed FPGA in which two LUTs produce f1 and f2 from inputs x1, x2, x3 and a third LUT combines them into f]
  36. FPGAs • An example of programming an FPGA (same content as slide 35)
  37. Xilinx 4000-Series FPGAs • Characteristics of the Xilinx 4000-series FPGAs
  38. Configurable Logic Block (CLB)
  39. Logic Function Generators • Look-Up Tables (LUT) – Memory to store truth tables • F, G – 16 x 1 SRAMs • H – 8 x 1 SRAM • Can be configured as memory
  40. CLB function generators (F, G, H) • Use RAM to store a truth table – F, G: 4 inputs, 16 bits of RAM each – H: 3 inputs, 8 bits of RAM – RAM is loaded from an external PROM at system initialization. • Broad capability using F, G, and H: – Any 2 funcs of 4 vars, plus a func of 3 vars – Any func of 5 vars – Any func of 4 vars, plus some funcs of 6 vars – Some funcs of 9 vars, including parity and 4-bit cascadable equality checking
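Two of the capabilities listed above can be checked with a small model in which F and G are 16-bit tables, H is an 8-bit table, and H takes the F and G outputs plus one extra CLB input. This is a sketch under that assumed wiring (the exact input routing options of the real CLB are not described in the slides); all names and the example target function are hypothetical.

```python
from itertools import product


def program(func, n_inputs):
    """Fill a LUT: one stored bit per input combination, first input = MSB."""
    return [func(*bits) & 1 for bits in product((0, 1), repeat=n_inputs)]


def read(lut, *inputs):
    """Read the stored bit addressed by the inputs (first input = MSB)."""
    addr = 0
    for bit in inputs:
        addr = (addr << 1) | bit
    return lut[addr]


def clb(f_lut, g_lut, h_lut, f_in, g_in, h1):
    """Assumed wiring: H sees the F output, the G output and one extra input."""
    return read(h_lut, read(f_lut, *f_in), read(g_lut, *g_in), h1)


# Any function of 5 variables: F holds the e=0 half, G the e=1 half,
# and H is programmed as a 2:1 multiplexer selected by e.
def target(a, b, c, d, e):
    return (a & b) ^ (c | (d & e))  # arbitrary example function


F5 = program(lambda a, b, c, d: target(a, b, c, d, 0), 4)
G5 = program(lambda a, b, c, d: target(a, b, c, d, 1), 4)
H_MUX = program(lambda f, g, e: g if e else f, 3)


def five_input(a, b, c, d, e):
    return clb(F5, G5, H_MUX, (a, b, c, d), (a, b, c, d), e)


# 9-input parity: F and G each take the parity of four inputs, H XORs in the ninth.
PARITY4 = program(lambda a, b, c, d: a ^ b ^ c ^ d, 4)
XOR3 = program(lambda f, g, h1: f ^ g ^ h1, 3)


def parity9(bits):
    return clb(PARITY4, PARITY4, XOR3, bits[0:4], bits[4:8], bits[8])


print(five_input(1, 1, 0, 0, 1), parity9((1, 0, 1, 1, 0, 0, 0, 1, 1)))
```

The 5-variable case works for every function because it is a Shannon expansion on the fifth input; the 9-variable case works only for functions, such as parity, that decompose into two 4-input pieces combined with one more input.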
  41. FPGAs • CLB input and output connections – buried in the sea of interconnect
  42. Detail CLB connections controlled by RAM bits
  43. The Fitter’s Job • Partition logic functions into CLBs • Arrange the CLBs • Interconnect the CLBs • Minimize the number of CLBs used • Minimize the size and delay of interconnect used • Work with constraints – “Locked” I/O pins – Critical-path delays – Setup and hold times of storage elements
  44. I/O Blocks
  45. Spartan-II FPGA
  46. Logic Fabric • Logic Cell – Lookup table (LUT) – Flip-Flop – Carry logic – Muxes (not shown) • Slice – Two Logic Cells • Spartan-3E FPGAs – 2K to 33K logic cells [Figure: two logic cells, each a LUT with inputs I0 to I3 and output O feeding a D flip-flop with CE, SET and RST]
  47. Memory • Block RAM – RAM or ROM – True dual port • Separate read and write ports – Independent port size • Data width translation – Excellent for FIFOs [Figure: dual-port block RAM; port A signals DIA, DIPA, ADDRA, CLKA, DOA, DOPA; port B signals DIB, DIPB, ADDRB, CLKB, DOB, DOPB]
      Block RAM Configurations
      Configuration   Depth   Data bits   Parity bits
      16K x 1         16K     1           0
      8K x 2          8K      2           0
      4K x 4          4K      4           0
      2K x 9          2K      8           1
      1K x 18         1K      16          2
      512 x 36        512     32          4
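The configuration table above is internally consistent: every row stores the same number of data bits, and the wider rows add parity bits on top. A short check, using only the numbers from the table (the variable and function names are illustrative):

```python
# (configuration, depth in words, data bits per word, parity bits per word)
CONFIGS = [
    ("16K x 1", 16 * 1024, 1, 0),
    ("8K x 2", 8 * 1024, 2, 0),
    ("4K x 4", 4 * 1024, 4, 0),
    ("2K x 9", 2 * 1024, 8, 1),
    ("1K x 18", 1 * 1024, 16, 2),
    ("512 x 36", 512, 32, 4),
]


def data_capacity(depth, data_bits):
    """Total data bits stored in one configuration."""
    return depth * data_bits


def total_capacity(depth, data_bits, parity_bits):
    """Total stored bits including parity."""
    return depth * (data_bits + parity_bits)


for name, depth, data_bits, parity_bits in CONFIGS:
    print(name, data_capacity(depth, data_bits),
          total_capacity(depth, data_bits, parity_bits))
```

Each configuration holds 16,384 data bits; the three parity-capable widths use 18,432 bits in total, which is what "data width translation" between ports of different widths relies on.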
  48. Multipliers • 18 x 18 Multipliers – Signed or unsigned – Optional pipeline stage – Cascadable [Figure: multiplier with two 18-bit inputs and a 36-bit output]
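Why two 18-bit operands give a 36-bit result can be seen from a bit-accurate model. The sketch below covers only the signed (two's complement) case and ignores the pipeline registers and cascading; the function names are assumptions made for this illustration.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def mult18x18(a, b):
    """Signed 18 x 18 multiply; the result always fits in 36 signed bits."""
    product = to_signed(a, 18) * to_signed(b, 18)
    return to_signed(product, 36)


most_negative = -(1 << 17)  # smallest 18-bit signed value
print(mult18x18(most_negative, most_negative))
```

The largest magnitude product is (-2^17) x (-2^17) = 2^34, which is below the 36-bit signed limit of 2^35 - 1, so no product can overflow the output.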
  49. Clock Management • Digital Clock Managers (DCMs) – Clock de-skew – Phase shifting – Clock multiplication – Clock division – Frequency synthesis [Figure: DCM with input CLKIN and outputs CLK0, CLK90, CLKFX]
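Frequency synthesis on the CLKFX output amounts to multiplying the input clock by a ratio of two integers, set on Xilinx DCMs through the CLKFX_MULTIPLY and CLKFX_DIVIDE attributes. The sketch below searches for a ratio that hits a target frequency. The 50 MHz input, the 87.5 MHz target and the integer ranges used as defaults are assumptions for illustration; the allowed ranges and frequency limits should be taken from the data sheet of the actual device.

```python
from fractions import Fraction


def clkfx_hz(clkin_hz, multiply, divide):
    """Synthesized frequency: CLKIN * multiply / divide, kept exact."""
    return Fraction(clkin_hz) * multiply / divide


def best_ratio(clkin_hz, target_hz, m_range=range(2, 33), d_range=range(1, 33)):
    """Return the (multiply, divide) pair whose output is closest to target_hz.

    Ties are broken in favour of the smallest pair. The default ranges are an
    assumption; check the device documentation for the real limits.
    """
    candidates = ((m, d) for m in m_range for d in d_range)
    return min(
        candidates,
        key=lambda md: (abs(clkfx_hz(clkin_hz, *md) - target_hz), md),
    )


# Hypothetical example: derive 87.5 MHz from a 50 MHz input clock.
ratio = best_ratio(50_000_000, 87_500_000)
print(ratio, float(clkfx_hz(50_000_000, *ratio)))
```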
  50. CLB Logic Cells (x4)
  51. Dual-Port Block RAM (SRAM)
  52. BASYS Board Components • A training resource
  53. BASYS Board Components • 100K FPGA • USB2 Port • Flash ROM • I/O Devices • PS/2 and VGA • Clock • Expansion Connectors
  54. FPGA Selection Guide • Xilinx Spartan-3 series FPGAs
  55. Summary • Complex Programmable Logic Devices – Function Blocks • AND Arrays and Macrocells – Programmable Interconnect – I/O • Field Programmable Gate Arrays – Configurable Logic Blocks • Look-up Tables – Programmable Interconnect – I/O

Editor's notes

  1. From Rose article (AMR) • SRAM = Static Random Access Memory. The switch is a pass transistor controlled by the state of an SRAM bit. • Control pass gate: when a one is stored in the SRAM cell, the pass gate acts as a closed switch and can be used to make a connection between two wire segments. When a zero is stored, the switch is open and the transistor presents a high resistance between the two wire segments. • Multiplexer: SRAM cells connected to the select lines control which one of the multiplexer inputs is connected to the output. D flip-flops may be included with the multiplexer to hold the state of the selection. • (KCompton) Same idea as with the multiplexer: the bit can be used as the selection for an ALU (add, subtract, whatever…). • (KCompton) Or that bit can be used to select the output from the LUT.
  2. From Rose article (AMR) • SRAM, where the switch is a pass transistor controlled by the state of an SRAM bit. Since SRAM is volatile (it needs to be reprogrammed after losing power), the FPGA must be loaded and configured at the time of chip power-up. This requires external permanent memory to provide the programming bits, such as PROM, EPROM, EEPROM or magnetic disk. A major disadvantage of SRAM programming technology is its large area: it takes at least five transistors to implement an SRAM cell, plus at least one transistor to serve as a programmable switch. However, SRAM programming technology has two major advantages: fast re-programmability, and the fact that it requires only standard integrated circuit process technology.
  3. From Rose article (AMR) • Antifuse: when electrically programmed, it forms a low-resistance path. When a high voltage (from 11 to 20 volts, depending on the type of antifuse) is applied across its terminals, the antifuse will “blow” and create a low-resistance link.
  4. From Rose article (AMR) • Antifuse: when electrically programmed, it forms a low-resistance path.
  5. From Rose article (AMR) and howstuffworks • EPROM stands for Erasable Programmable Read Only Memory. The switch is a floating-gate transistor that can be turned off by injecting charge onto its floating gate. A bit can only go from 1 to 0. • … transistor that can be permanently “disabled.” This is accomplished by injecting a charge on the floating gate (gate 2 in the figure) using a high voltage between the control gate 1 and the drain of the transistor. This charge increases the threshold voltage of the transistor so that it turns off. The charge is removed by exposing the floating gate to UV light. This lowers the threshold voltage of the transistor and makes the transistor function normally. • Rather than using an EPROM transistor directly as a programmable switch, the unprogrammed transistor is used to pull down a “bit line” when the “word line” is set high, as illustrated in Fig. 3. While this approach can be simply used to make a connection between the word and bit lines, it can also be used to implement a wired-AND style of logic, thereby providing both logic and routing. • Read more on how EPROM works: http://computer.howstuffworks.com/rom4.htm
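The wired-AND behaviour described in this note can be modelled in a few lines. This is a simplified logical sketch, assuming a bit line that is pulled high by default and pulled low by any unprogrammed transistor whose word line is high; the function and parameter names are hypothetical, and electrical details such as the pull-up resistor's static current are not modelled.

```python
def bit_line(word_lines, programmed):
    """Logic level of one bit line shared by several EPROM transistors.

    word_lines[i]  -- 1 if word line i is driven high
    programmed[i]  -- True if transistor i has charge on its floating gate
                      (turned off, so it can never pull the bit line down)
    """
    for word, is_programmed in zip(word_lines, programmed):
        if word and not is_programmed:
            return 0  # an active, unprogrammed transistor pulls the line low
    return 1  # nothing pulls down, so the pull-up keeps the line high


# Transistors 0 and 2 are left unprogrammed, transistor 1 is programmed (disabled).
mask = [False, True, False]
print(bit_line([0, 1, 0], mask), bit_line([1, 0, 0], mask))
```

The bit line is high only when every word line attached to an unprogrammed transistor is low, which is an AND of the complemented inputs selected by the programming pattern.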
  6. From Rose article (AMR), wikipedia (EPROM) • EPROM, where the switch is a floating-gate transistor that can be turned off by injecting charge onto its floating gate. High static power consumption (because of the pull-up resistor).
  7. Avnet SpeedWay Workshops • Explain the basic LUT/Slice architecture. Since this is generic, you may decide to mention the CLB; don’t muddy the water, however. The main idea is to explain the composition of the logic fabric, and the typical FPGA sizes in terms of logic cells. The F5MUX and FiMUX benefits and operations are covered in XAPP466 here: http://www.xilinx.com/bvdocs/appnotes/xapp466.pdf The basics are that the additional muxes have the capability of implementing any function of 5 inputs (F5), 6 inputs (F6), 7 inputs (F7) and 8 inputs (F8) without leaving the CLB. This doesn’t introduce any level of logic delay because the routes are inside the CLB. As far as mux functionality, the F5,6,7,8 muxes can be used to create 4:1, 8:1, 16:1 and 32:1 muxes.
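The 4:1 case mentioned in this note can be sketched as two 4-input LUTs, each programmed as a 2:1 multiplexer, combined by a dedicated 2:1 mux standing in for the F5MUX. The pin assignment inside each LUT and all names below are assumptions made for illustration; XAPP466 describes the actual primitives.

```python
from itertools import product


def lut4(bits, i3, i2, i1, i0):
    """4-input LUT read: the inputs address one of 16 stored bits."""
    return bits[(i3 << 3) | (i2 << 2) | (i1 << 1) | i0]


# LUT contents for a 2:1 mux on pins (unused, sel, d1, d0): output d1 if sel else d0.
MUX2_BITS = [
    (d1 if sel else d0)
    for unused in (0, 1)
    for sel in (0, 1)
    for d1 in (0, 1)
    for d0 in (0, 1)
]


def mux4(data, s1, s0):
    """4:1 mux: two LUT-based 2:1 muxes plus one dedicated mux (the F5MUX role)."""
    low_pair = lut4(MUX2_BITS, 0, s0, data[1], data[0])
    high_pair = lut4(MUX2_BITS, 0, s0, data[3], data[2])
    return high_pair if s1 else low_pair  # dedicated mux, no extra LUT needed


print(mux4((0, 1, 1, 0), 1, 0))
```

Repeating the same step with one more dedicated mux per level gives the 8:1, 16:1 and 32:1 cases.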
  8. Avnet SpeedWay Workshops
  9. Avnet SpeedWay Workshops Pipelining in Spartan-3E means using the registers at the input and output of the multiplier. Unlike Spartan-3, the registers are part of the multiplier block and are not used from the fabric.
  10. Avnet SpeedWay Workshops