Arteris NoC SoC Interconnect presentation given by Jonah Probell at ARM Technology Conference 9-11 Nov 2010. Explains how traditional AXI fabrics require huge numbers of wires and lead to routing congestion, and how network-on-chip interconnects address routing congestion by using fewer wires. Explains the basics of NoC packetization and serialization.
2. P&R congestion is the focus of EDA "...upstream tools need to be clairvoyant deep into the layout." "The worst crises are when you're deep into the layout and realize that my floorplan's no good. So how do you avoid that? Well, what's needed are clairvoyant tools. That is, a chain of steps where each step already knows a little bit about the changes downstream." "The synthesizer can, this year, avoid congestion; and congestion is really the killer of schedules." -Aart de Geus, Synopsys Symposium 2010
3. Interconnects, logically. The interconnect transports AXI transactions between masters and slaves. The means of transportation are not defined by the AXI spec. [Diagram: several masters and slaves connected through the interconnect via AXI.]
4. Interconnect physically The interconnect lives in the hallways between IP cores. The width of the links affects the compactness of the die.
5. 1. Growing interface complexity. AHB channels: Address (32 bits + a few control), Write data (data width), Read data (data width). AXI channels: Write address (32 bits + a few control), Write data (data width + control), Read data (data width + control), Read address (32 bits + a few control), Write response (a few signals). Signal counts by data width:
  Data width:    32   64   128
  AHB signals:  113  177   305
  AXI signals:  204  272   408
7. 3. Relative wire cost growing. Transistor sizes shrink faster than wire widths. 286 CPU (1982): 69 mm². Atom N450 (2010): 66 mm². Chips are, on average, the same size as ever.
9. Packetizing AXI to transport transactions. At the master side: read address, write address, and write data are packetized into a request packet; the response packet is depacketized into read data and write response. [Diagram: request from master, response to master.]
10. Packetizing AXI to transport transactions. At the slave side: the request packet is depacketized into read address, write address, and write data; read data and write response are packetized into a response packet. [Diagram: request to slave, response to master.]
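The packetize/depacketize step above can be sketched in software. This is an illustrative model only, not Arteris' actual packet format; the `RequestPacket` type and the header fields (`addr`, `len`, `write`) are assumptions made for the sketch.

```python
# Sketch: folding the separate AXI write-address and write-data channels
# into one request packet that travels over a single generic link.
# Field names are illustrative, not Arteris' wire format.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class RequestPacket:
    header: dict        # encodes address/control from the AXI AW channel
    payload: List[int]  # data beats from the AXI W channel

def packetize_write(awaddr: int, awlen: int, wdata: List[int]) -> RequestPacket:
    """Master side: combine address/control and data into one packet."""
    # AXI encodes a burst of N beats as AWLEN = N - 1
    assert len(wdata) == awlen + 1
    header = {"addr": awaddr, "len": awlen, "write": True}
    return RequestPacket(header, wdata)

def depacketize_write(pkt: RequestPacket) -> Tuple[int, int, List[int]]:
    """Slave side: recover the address, burst length, and data beats."""
    return pkt.header["addr"], pkt.header["len"], pkt.payload
```

A four-beat write burst round-trips through `packetize_write` and `depacketize_write` unchanged, which is the whole point: the interconnect in between only ever sees generic packets.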
11. Serializing With a packetized protocol, serializing data simply requires a register and a mux. Serializing packets is much easier than serializing the AXI interface protocol.
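A rough software model of that register-and-mux serializer, assuming the packet is already a list of equal-width words and that the total bit count divides evenly by the link width. The term "flit" and the parameter names are illustrative, not Arteris terminology.

```python
# Sketch: serializing packet words over a narrow link. In hardware this
# is a holding register plus a mux that selects which slice of the
# register drives the link wires each cycle.
from typing import List

def serialize(words: List[int], word_width: int, link_width: int) -> List[int]:
    """Concatenate the words MSB-first, then emit link_width-bit flits,
    one per cycle. Assumes word_width * len(words) % link_width == 0."""
    total_bits = word_width * len(words)
    value = 0
    for w in words:
        value = (value << word_width) | w   # load the holding register
    flits = []
    for shift in range(total_bits - link_width, -1, -link_width):
        flits.append((value >> shift) & ((1 << link_width) - 1))  # mux select
    return flits
```

For example, one 16-bit word over an 8-bit link takes two cycles, and two 8-bit words over a 4-bit link take four; the wire count halves while the cycle count doubles.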
12. Throughput and wires. [Diagram: four link configurations, each shown as a sequence of header and data cycles.]
(a) Link width = data width + header width: header penalty = 0.
(b) Link width = header width: header penalty = 1 cycle per transaction.
(c) Link width < data width: header penalty > 1 cycle per transaction.
(d) Link width = data width: header penalty = 1 cycle per transaction.
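The header-penalty arithmetic above can be checked with a small model. It assumes a single-beat payload and that the header rides alongside the data in the same cycle only when the link is at least header width + data width wide; the 32-bit header and 128-bit data widths below are illustrative, not figures from the presentation.

```python
# Sketch: cycles to move one transaction (header_bits of header plus
# data_bits of payload) over a link of a given width. Widths illustrative.
def transaction_cycles(header_bits: int, data_bits: int, link_width: int) -> int:
    if link_width >= header_bits + data_bits:
        return 1  # header travels beside the data: zero header penalty
    header_cycles = -(-header_bits // link_width)  # ceiling division
    data_cycles = -(-data_bits // link_width)
    return header_cycles + data_cycles
```

With a 32-bit header and 128 bits of data: a 160-bit link moves the transaction in 1 cycle (zero penalty), a 128-bit link takes 2 cycles (1 header cycle of penalty), and a 32-bit link takes 5 cycles (1 header cycle plus 4 serialized data cycles), mirroring the trade-off the slide diagrams.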
13. Selection of link width. [Diagram: L2, DDR, peripherals.] Place cores with high communication throughput and low latency requirements near each other. Use zero-header-penalty links between such cores. Use narrow links for long paths to low-throughput peripherals. This minimizes the number of long wires for P&R.
14. Experimental packetized link width results, obtained with the Arteris FlexNoC packetized interconnect generator:
  Data width:                   32   64   128
  AHB signals:                 113  177   305
  AXI signals:                 204  272   408
  Packets with 0 penalty cycles: 146  218   362
  Packets with 1 penalty cycle:   84  156   300
16. Summary Routing congestion is the problem of the decade for chip implementation. AXI is expensive in wires. Packetizing and serializing transaction data effectively reduces routing congestion.
Place-and-route wire congestion is the problem of the decade for the EDA industry.
An interconnect, done right, is a black box that simply allows masters to perform transactions with slaves without consideration of the internal topology.
Narrower interconnects allow a smaller chip floorplan.
AXI has separate channels for write address, write data, read address, read data, and write response. This allows reads to pass writes, which improves the performance of CPUs accessing caches. However, AXI requires those extra wires everywhere else in the chip, too.
The average number of IP cores in chip designs sets a new record each year. The largest that I have seen is about 100.
Wire widths and average wire lengths are shrinking, but not as fast as transistors are. With more logic and relatively fewer wire tracks in the same square millimeter, routing congestion is increasing. Die sizes, and therefore the lengths of the longest wires, are the same as ever, so wires grow larger and larger relative to gates and as a portion of die area.
Increasing congestion raises costs in: * silicon area (floorplan compactness) * manufacturing cost (metal layers, vias, and reliability) * time to market (timing closure and ECOs)
Packetizing is a necessary first step in our approach to reducing wire congestion. Address and control signals are encoded in a header and transported on the same generic link as the data. Link wires are untyped, meaning that they can carry header bits in one cycle and data bits in another.
A narrow link transfers wider data elements serially over multiple cycles. A wide link transmits multiple narrower data elements simultaneously in parallel.
The typical view of packetized data transmission. A configuration that uses fewer long wires where less throughput is required. A configuration that compromises on wire savings so that single-cycle transactions have less overhead. A configuration that ensures maximum throughput and minimum latency between those IPs where it is necessary.
Configure narrow links for the long wires around the chip to relatively low-bandwidth peripherals such as USB, Flash, I2C, and GPIO. Configure links with low header penalty for high-performance connections such as video and graphics cores. Configure maximum-throughput, minimum-latency, zero-header-penalty links between CPUs and caches.
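As an illustration only (all core names and widths here are hypothetical, not from the presentation), that per-connection guidance might be captured as a configuration table fed to an interconnect generator:

```python
# Hypothetical per-connection link plan following the slide's guidance:
# wide zero-penalty links for CPU<->cache, moderate links for media
# cores, narrow links on the long paths to slow peripherals.
link_plan = {
    ("cpu", "l2_cache"): {"link_width": 160, "goal": "zero header penalty"},
    ("gpu", "ddr_ctrl"): {"link_width": 128, "goal": "1-cycle header penalty"},
    ("cpu", "usb"):      {"link_width": 32,  "goal": "fewest long wires"},
    ("cpu", "gpio"):     {"link_width": 32,  "goal": "fewest long wires"},
}
```

The peripheral links trade cycles for wires, which is exactly where the trade is cheap: those paths are long but carry little traffic.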
The number of wires in a physical link using AXI, AHB, or Arteris packet connections is shown. Remember: a packet-based link with zero header penalty gives the full throughput of the data width times the clock speed, but with fewer wires. A packet-based link with one header cycle per transaction uses fewer wires than AHB while keeping the benefits of the AXI protocol.
This is a small piece of a layout congestion diagram from a chip design done twice, once with an interconnect using AXI based links and once with an Arteris packet based interconnect. The same floorplan was used in both cases.
An ounce of prevention is worth a pound of cure. Solving a problem early in a process takes much less effort than solving it later. Physical synthesis uses placement awareness to reduce the average wire length. A serialized interconnect reduces the total number of wires in the first place.