© 2018 NETRONOME SYSTEMS, INC. 1
December 3 - 6, 2018
Santa Clara Convention Center
CA, USA
REVOLUTIONIZING
THE COMPUTING
LANDSCAPE AND
BEYOND.
https://tmt.knect365.com/risc-v-summit
@risc_v
Steven Zagorianakos
VP Silicon Development
Netronome
MASSIVELY PARALLEL
RISC-V PROCESSING WITH
TRANSACTIONAL MEMORY
Introduction
• Discuss Transactional Memories
• Walk Through an Example Implementation, Utilizing Transactional Memories and
RISC-V Harts
• Full Chip, Island, Cluster and Groups of RISC-V Harts
• RISC-V Feature Set for RFPC
• Summary
“Transactional Memory”
But still running in arbitrary C code of any size ...
Instruction-Driven Switch Fabric
• Transactional Memory Hierarchy
▶ Memory
▶ Closely coupled
▶ Threaded processing engines
▶ And hardwired transaction types
▶ Atomics
▶ CRC
▶ Crypto
• Many, Many CPU Cores
• Require
▶ Many Cores
▶ Efficient Command Dispatch / Fetch / Result / Synchronization
• (Not interrupt based, for example…)!
▶ WFE
▶ Currently Planned as Custom-1
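The dispatch model on this slide (post a command to the memory, keep running, then wait for a completion signal instead of taking an interrupt) can be sketched as a small simulation. This is a minimal Python model under stated assumptions, not Netronome's design: `Command`, `Hart`, `MemoryEngine`, the two opcodes, and the signal numbering are hypothetical names chosen for illustration, and `wait_for_event` only stands in for the WFE-style custom-1 instruction the slide mentions.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Command:
    """One posted transaction (hypothetical layout)."""
    op: str       # hardwired transaction type, e.g. "atomic_add"
    addr: int     # target address in the transactional memory
    operand: int
    hart: int     # hart to notify on completion
    signal: int   # signal number raised on completion


@dataclass
class Hart:
    """A hart with pending-signal bits and per-signal result slots."""
    hart_id: int
    signals: set = field(default_factory=set)
    results: dict = field(default_factory=dict)

    def wait_for_event(self, signal, engine):
        # Stand-in for a WFE-style instruction: the hart stalls until the
        # signal is raised. In this model, stalling lets the engine run.
        while signal not in self.signals:
            if not engine.step():
                raise RuntimeError("no command in flight can raise this signal")
        self.signals.discard(signal)
        return self.results.pop(signal)


class MemoryEngine:
    """Memory-side engine that executes commands one at a time (atomically)."""

    def __init__(self, harts):
        self.mem = {}
        self.queue = deque()
        self.harts = {h.hart_id: h for h in harts}

    def post(self, cmd):
        # Posting returns immediately; the issuing hart keeps executing.
        self.queue.append(cmd)

    def step(self):
        # Execute one queued command; return False when the queue is empty.
        if not self.queue:
            return False
        cmd = self.queue.popleft()
        old = self.mem.get(cmd.addr, 0)
        if cmd.op == "atomic_add":
            self.mem[cmd.addr] = old + cmd.operand
        elif cmd.op == "test_and_set":
            self.mem[cmd.addr] = old | cmd.operand
        else:
            raise ValueError(f"unknown op {cmd.op!r}")
        hart = self.harts[cmd.hart]
        hart.results[cmd.signal] = old  # the prior value is the result
        hart.signals.add(cmd.signal)    # completion is a signal, not an interrupt
        return True


if __name__ == "__main__":
    harts = [Hart(0), Hart(1)]
    engine = MemoryEngine(harts)
    engine.post(Command("atomic_add", 0x100, 5, hart=0, signal=1))
    engine.post(Command("atomic_add", 0x100, 7, hart=1, signal=1))
    print(harts[1].wait_for_event(1, engine))  # prior value seen by hart 1
    print(engine.mem[0x100])                   # both adds have been applied
```

Because each command is executed whole by the memory-side engine, two harts updating the same location need no lock; each one only waits on its own signal.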
A Practical Implementation
[Figure: full-chip block diagram. Blocks shown: RFPC Island (~100 Cores) ×7; SRAM Memory Island ×4; DRAM-Backed Memory Island; DRAM Cache; Host Interface Island; Config Island; Expansion Island; Network Interface Island; Host Memory; Host]
• The chip or chiplet is made up of islands, which are connected through the instruction-driven switch fabric
• This allows for implementations from small to large
• The memory hierarchy provides equal access to all types of memories
• The config, host interface, and network interface islands allow for feeding data into the system
• Basic flow of data in a SmartNIC
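The "small to large" scaling can be made concrete with simple arithmetic on the island count. The sketch below assumes exactly 100 cores per RFPC island, whereas the slide only says "~100"; the function names are illustrative.

```python
CORES_PER_ISLAND = 100  # assumption: the slide gives "~100 Cores" per RFPC Island


def total_cores(islands, cores_per_island=CORES_PER_ISLAND):
    """Approximate core count for a chip built from `islands` RFPC islands."""
    return islands * cores_per_island


def islands_needed(target_cores, cores_per_island=CORES_PER_ISLAND):
    """Smallest number of RFPC islands that reaches `target_cores` (ceiling division)."""
    return -(-target_cores // cores_per_island)


if __name__ == "__main__":
    print(total_cores(7))        # the seven RFPC islands drawn in the diagram
    print(islands_needed(1000))  # islands for a thousand-core part
```

Under that assumption, the seven islands in the diagram give roughly 700 cores, and a thousand-core part would need about ten.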
RFPC Island
[Figure: RFPC Island block diagram. Blocks shown: RFPC Cluster (Many RFPC Cores) ×3; Slice Cache ×3; Local Scratch Memory; Config/Island Bridge; Tile Link to Island Bus Agent; Global Bus; Island Bus. Connection labels: Tile Link ×3; Transactional Memory Ops; Datapath: Posted Coprocessor and Memory Transactions; Caching Data/Instructions, C Memory Structures, etc.; Remote-Cache Coherency Ops]
RFPC Cluster
[Figure: RFPC Cluster block diagram. Blocks shown: RFPC Group (~10 Cores) ×4; Tile Link Interface; Manages Binding; Local Prefetch/Write Buffer; Island Bus interface ×2; Island Bus. Connection labels: Load Store; Tile Link; Transactional Memory Ops; Caching Data/Instructions, C Memory Structures, etc.; Datapath: Posted Coprocessor and Memory Transactions; Remote-Cache Coherency Ops]
RFPC Group
[Figure: RFPC Group block diagram. Blocks shown: RFPC Core (Several Cores Per RFPC Group); RISC-V Pipeline; Coproc (Multiply +); Signals / Timers; Internal Cmd/Atomic/Prefetch/Write Buffer; Local Shared Memory; Data; Prefetch/Write Buffer; Instruction Fetch. Connection labels: Transactional Memory Ops; Remote-Cache Coherency Ops; Code, High-Speed Thread-Local Data Structures]
RISC-V Feature Set for RFPC
• RFPC Cores are RV32IMC cores with custom-0/1 instructions
▶ RV32IMC keeps performance high with a low silicon gate count; it supports User, Machine and Debug modes only, but provides some memory protection and both user-level and machine-level interrupts
▶ Custom-0 instructions permit dynamic binding of 48+-bit host addresses and bulk DDR addresses to 32-bit RISC-V addresses
▶ Custom-1 instructions permit transactional memory and signaling operations
• RFPC Cores are collected into RFPC Groups
▶ Sharing local memory, which is directly accessed (not cached)
▶ Simple address translation permits core-local data and stack without changing code and register initialization values
• RFPC Groups are collected into RFPC Clusters
▶ Transaction initiation and signal handling (for transaction acceptance/completion) are also handled in the island bus interfaces
▶ Island bus access goes through a shared memory, and local transactional (atomic pipeline) memory is shared within the cluster only; access to the cache slices is non-transactional
• RFPC Clusters are collected together
▶ A RISC-V Debug module shared amongst 40 cores permits JTAG-based debugging of every core
▶ The slices of cache combine as an ‘L2’ cache
▶ Provides windowing to 48-bit PCIe and 40-bit MU address spaces
• RFPC is size and performance optimized
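Two of the address-mapping ideas on this slide can be sketched in a few lines: binding a window of the 32-bit RISC-V address space to a wider host address, and translating a core-local address into a per-core slice of group-shared memory. This is an illustrative Python model only. The slide does not give the custom-0 encoding or semantics, so `WindowTable`, `bind`, `translate`, `core_local_to_shared`, and every base address and size below are assumptions.

```python
from dataclasses import dataclass

LOCAL_ADDR_BITS = 32  # RV32 address width
HOST_ADDR_BITS = 48   # PCIe address width named on the slide


@dataclass(frozen=True)
class Window:
    """[local_base, local_base + size) maps to target_base + offset."""
    local_base: int
    size: int
    target_base: int


class WindowTable:
    """Hypothetical model of dynamically bound address windows."""

    def __init__(self):
        self.windows = []

    def bind(self, local_base, size, target_base):
        # Stand-in for a custom-0 style bind operation (assumed semantics).
        if size <= 0 or local_base + size > 1 << LOCAL_ADDR_BITS:
            raise ValueError("window does not fit in the 32-bit address space")
        if target_base + size > 1 << HOST_ADDR_BITS:
            raise ValueError("target does not fit in the 48-bit address space")
        for w in self.windows:
            if local_base < w.local_base + w.size and w.local_base < local_base + size:
                raise ValueError("window overlaps an existing binding")
        self.windows.append(Window(local_base, size, target_base))

    def translate(self, local_addr):
        # Map a 32-bit address inside a bound window to its wide target address.
        for w in self.windows:
            if w.local_base <= local_addr < w.local_base + w.size:
                return w.target_base + (local_addr - w.local_base)
        raise LookupError(f"address {local_addr:#x} is not bound")


def core_local_to_shared(addr, core_index,
                         local_base=0x1000_0000,   # assumed core-local region
                         local_size=0x1000,        # assumed 4 KiB per core
                         shared_base=0x2000_0000): # assumed shared-memory base
    """Map a core-local address to that core's slice of group-shared memory.

    Every core runs the same code with the same initial stack pointer, yet
    each one lands in a different slice. Other addresses pass through unchanged.
    """
    if not local_base <= addr < local_base + local_size:
        return addr
    return shared_base + core_index * local_size + (addr - local_base)


if __name__ == "__main__":
    table = WindowTable()
    table.bind(0x8000_0000, 0x1000, 0x1234_5678_9000)
    print(hex(table.translate(0x8000_0010)))
    print(hex(core_local_to_shared(0x1000_0004, core_index=3)))
```

The second function shows why code and register initialization values need not change per core: the core index is applied by the translation step, not by the program.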
Summary
• RISC-V harts are well suited to serve as the processors in a thousand-CPU SmartNIC.
• RISC-V solutions can be tailored to the needs of embedded applications through a suitable choice of instruction set features, privileged modes and debug methodology.
• We covered, at a high level, an organization of memories and RISC-V harts that provides efficient processing with high-latency memory transactions.
• We looked at the instruction set customizations that handle RISC-V hart interaction with the memory systems and other harts.
ODSA Workgroup
By implementing open specifications contributed by participating
companies, any vendor’s silicon die can become a building
block for a chiplet-based SoC design
Working together to standardize processors, accelerators,
and memory and I/O peripherals using optimal process nodes
Companies wishing to learn more, participate and become an integral part
of the ODSA Workgroup can inquire further at odsa@netronome.com or visit us
in booth #407!
THANK
YOU
