The document discusses trends in system-on-chip (SoC) and network-on-chip (NoC) architectures, including the integration of multiple CPU cores, hardware accelerators, and peripherals on a single chip connected by an on-chip network. NoCs are presented as a scalable solution to connect the growing number of computational resources in modern SoCs. Examples of ARM-based multicore SoCs including the big.LITTLE subsystem are provided to illustrate the challenges of cache coherency, interrupts, and task migration in multicore systems.
1. May 1, 2013 1
Multicores & Network On Chip Architectures
Trends & Design Considerations
ChipEx 2013
ALL Rights Reserved
Oren Hollander
FPGA & ARM Expert
2. May 1, 2013 2
What is SoC?
• On-chip integration of a variety of functional
hardware blocks to suit a specific product
application
– CPU/CPUs + Accelerators (GPU, VPU, IPU, etc.)
– Small form factor
– High volume of peripherals
• Blocks can operate at lower frequencies while
delivering higher system-level performance and
consuming much lower system-level power
Enable rich features at reasonable computing
speed and reasonable price points
3. May 1, 2013 3
SoC Trends
• Apple acquired PA-Semi
– Enabling it to design its own application processors
• Qualcomm acquired Atheros
– Strengthen its wireless connectivity suite and Summit
Technology for enhanced power management capability
• Nvidia acquired Icera
– Strengthen its connectivity offering
• Intel acquired Infineon Wireless
– Gain entry into the baseband connectivity market
In just five years, the SoC technology has
catapulted from enabling basic
computation/connectivity on a feature phone
to being at the heart of all smartphones and
early stage ultrabooks, capable of a wide
range of functions including audio/video,
gaming, communication and productivity
4. May 1, 2013 4
ARM Connected Community – 800+
6. May 1, 2013 6
What is NoC?
• A NoC is a network of computational, storage and I/O
resources, interconnected by a network of switches
– Connects processing cores and subsystems in
Multiprocessor System-on-Chips
• One of the main components of a NoC is the router, which
is attached to a processing core (CPU or hardware
accelerator) and transfers messages from one processing
core to another
– Resources communicate with each other using addressed
data packets routed to their destination by the switch
fabric
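The addressed-packet idea above can be sketched in a few lines. This is a minimal illustration only, assuming a 2D mesh of routers and dimension-ordered (XY) routing; the slide does not name a topology or routing algorithm, and `Packet`, `next_hop` and `route` are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src: tuple      # (x, y) of the router attached to the sending core
    dst: tuple      # (x, y) of the router attached to the receiving core
    payload: bytes


def next_hop(current, dst):
    """Return the next router under XY routing, or None once arrived.

    Assumption: the packet first travels along X, then along Y.
    """
    cx, cy = current
    dx, dy = dst
    if cx != dx:
        return (cx + (1 if dx > cx else -1), cy)
    if cy != dy:
        return (cx, cy + (1 if dy > cy else -1))
    return None


def route(packet):
    """List every router the packet visits, source and destination included."""
    path = [packet.src]
    while (hop := next_hop(path[-1], packet.dst)) is not None:
        path.append(hop)
    return path


if __name__ == "__main__":
    print(route(Packet(src=(0, 0), dst=(2, 1), payload=b"data")))
```

Each router only needs the destination address carried in the packet to pick the next hop, which is what lets the fabric grow without a central arbiter.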
7. May 1, 2013 7
Why do we need NoC?
• State-of-the-art SoC communication architectures are starting
to face scalability as well as modularity limitations
– More advanced bus specifications are emerging to deal with
these issues at the expense of silicon area and complexity
• The evolution of communication architectures mainly concerns bus
protocols (to better exploit available bandwidth) and bus
topologies (to increase bandwidth)
– More aggressive solutions are needed to overcome the
scalability limitation
• NoCs are currently viewed as a ‘revolutionary’ approach to
provide a scalable, high performance and robust
infrastructure for on-chip communication
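A rough way to see the scalability argument is to count how many transfers can be in flight at once. The sketch below assumes a single shared bus that carries one transfer at a time and a k x k 2D mesh as the NoC; both are simplifying assumptions not stated on the slide, and the function names are hypothetical. The mesh figure is an upper bound that ignores contention and routing.

```python
def bus_concurrent_transfers(num_cores: int) -> int:
    """A single shared bus serves one transfer at a time, however many cores."""
    return 1


def mesh_link_count(k: int) -> int:
    """Bidirectional router-to-router links in a k x k 2D mesh.

    k rows with (k - 1) horizontal links each, plus
    k columns with (k - 1) vertical links each.
    """
    return 2 * k * (k - 1)


if __name__ == "__main__":
    for k in (2, 4, 8):
        cores = k * k
        print(cores, "cores:",
              "bus =", bus_concurrent_transfers(cores),
              "| mesh links =", mesh_link_count(k))
```

The bus stays at one transfer while the number of mesh links grows with the number of cores, which is the sense in which a NoC is described as scalable.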
9. May 1, 2013 9
Multicore Challenges
• Coherency between Multi-Cores
• Coherency between Multi-Clusters
• Homogeneous and Heterogeneous MP
• Cluster booting
• System interrupts
• Tools issues (compiler & debugger)
• Energy
10. May 1, 2013 10
The ARM big.LITTLE Subsystem
• High performance Cortex-A15 cluster
• Energy efficient Cortex-A7 cluster
• CCI-400 provides cache coherency between clusters
• Shared GIC-400 interrupt controller
• Note: C-A7 is not required to have an L2 cache for coherency
management
[Figure: Cortex-A15 cluster and Cortex-A7 cluster, each with CPU 0 and CPU 1 (I$ and D$ per CPU) and an L2 Cache + SCU, joined by the CCI-400 cache coherent interconnect; GIC-400 distributor interface with CPU 0 to CPU 3 interfaces for interrupts]
11. May 1, 2013 11
CCI-400 and System Coherency
• CCI-400 2+3 (x3)
– 2 full AMBA 4 ACE slave
interfaces
– +3 ACE-Lite I/O Coherent
Slave interfaces
– +3 ACE-Lite master
interfaces
• CCI interfaces:
– AMBA 4 ACE and ACE-Lite manage all
coherency and barriers
– Distributed Virtual
Memory signaling for
System MMU
12. May 1, 2013 12
Heterogeneous Multi-Processing
• SMP OS runs across all CPUs, all clusters
• Some CPUs may be taken offline to save power
– Possibly even all CPUs in a cluster
• OS may support heterogeneous cluster configurations
– Scheduler potentially limits resource-sensitive threads to a specific cluster
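Restricting a workload to one cluster can be approximated from user space with CPU affinity. The sketch below is an assumption-laden illustration: the CPU numbering in `CLUSTERS` is hypothetical (real boards number their cores differently), `cpus_for` and `pin_current_process` are names introduced here, and `os.sched_getaffinity` / `os.sched_setaffinity` are only available on Linux. It is not the scheduler mechanism the slide refers to, only a user-level analogue.

```python
import os

# Hypothetical topology: CPUs 0-3 are the Cortex-A7 cluster,
# CPUs 4-7 are the Cortex-A15 cluster. Adjust for the real board.
CLUSTERS = {
    "little": {0, 1, 2, 3},
    "big": {4, 5, 6, 7},
}


def cpus_for(cluster, available=None):
    """CPUs of one cluster, optionally limited to those currently available."""
    cpus = set(CLUSTERS[cluster])
    if available is not None:
        cpus &= set(available)
    return cpus


def pin_current_process(cluster):
    """Limit the calling process to one cluster (Linux only)."""
    allowed = cpus_for(cluster, os.sched_getaffinity(0))
    if not allowed:
        # Every CPU of the cluster may have been taken offline to save power.
        raise RuntimeError(f"no usable CPU in cluster {cluster!r}")
    os.sched_setaffinity(0, allowed)
    return allowed
```

For example, a resource-sensitive job would call `pin_current_process("big")` before starting its heavy work, and fall back to the other cluster if the call raises.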
[Figure: an SMP Operating System runs threads across Cluster 0 (four C-A7 CPUs) and Cluster 1 (four C-A15 CPUs)]
13. May 1, 2013 13
Principles of Task Migration
• System running on Cluster 0; Virtualizer decides more computational power is needed
• Cluster 1 powered up
• Threads migrated to Cluster 1 but Cluster 0 caches kept powered so they can still be
snooped
• When the Cluster 0 caches have gone cold, remaining system state cleaned from Cluster 0,
Cluster 0 powered down
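The four migration steps above can be sketched as a small state model. This is a toy illustration, not ARM's switcher software: the `Power` states, the `MigrationModel` class and its method names are all hypothetical, and the "caches have gone cold" condition is reduced to an explicit call.

```python
from enum import Enum, auto


class Power(Enum):
    OFF = auto()
    ON = auto()
    CACHES_ONLY = auto()  # no threads running, caches kept powered for snooping


class MigrationModel:
    def __init__(self, threads):
        # Step 1: the system starts out running on Cluster 0 only.
        self.power = {0: Power.ON, 1: Power.OFF}
        self.threads = {0: list(threads), 1: []}

    def request_more_performance(self):
        """Steps 2 and 3: power up Cluster 1 and move the threads to it."""
        self.power[1] = Power.ON
        self.threads[1].extend(self.threads[0])
        self.threads[0].clear()
        # Cluster 0 caches stay powered so Cluster 1 can still snoop them.
        self.power[0] = Power.CACHES_ONLY

    def caches_gone_cold(self):
        """Step 4: clean remaining state, then power Cluster 0 down."""
        if self.power[0] is Power.CACHES_ONLY:
            self.power[0] = Power.OFF


if __name__ == "__main__":
    model = MigrationModel(["t0", "t1"])
    model.request_more_performance()
    model.caches_gone_cold()
    print(model.power, model.threads)
```

The intermediate `CACHES_ONLY` state is the point of the sequence: the old cluster is not switched off until its cached data is no longer needed.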
[Figure: SMP Operating System and its threads, Cluster 0 (four C-A7 CPUs), Cluster 1 (four C-A15 CPUs), and the Virtualizer]
14. May 1, 2013 14
Coherent multi-core
• In MPCore systems a resource may be shared between threads
running on different CPUs within the cluster
– The coherency logic connects Local Monitors in each of the CPUs in the cluster
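How linked local monitors protect a shared resource can be sketched with a toy model of exclusive load and store. This is a simplified illustration under stated assumptions: one tagged address per CPU, a Boolean result instead of the hardware status value, and no modelling of the global monitor; the class and method names are hypothetical.

```python
class ClusterModel:
    def __init__(self, num_cpus):
        self.memory = {}
        # One local monitor per CPU: the address it currently tags, or None.
        self.monitors = [None] * num_cpus

    def load_exclusive(self, cpu, addr):
        """Read a value and tag the address in this CPU's local monitor."""
        self.monitors[cpu] = addr
        return self.memory.get(addr, 0)

    def store_exclusive(self, cpu, addr, value):
        """Write only if this CPU still holds the tag. Returns True on success."""
        if self.monitors[cpu] != addr:
            return False
        self.memory[addr] = value
        # Coherency logic: the write clears every monitor tagging this address,
        # so a competing CPU's pending exclusive store will fail.
        for i, tagged in enumerate(self.monitors):
            if tagged == addr:
                self.monitors[i] = None
        return True


if __name__ == "__main__":
    cluster = ClusterModel(num_cpus=2)
    lock = 0x100  # hypothetical address of a shared lock word
    cluster.load_exclusive(0, lock)
    cluster.load_exclusive(1, lock)
    print(cluster.store_exclusive(0, lock, 1))  # CPU 0 wins the race
    print(cluster.store_exclusive(1, lock, 1))  # CPU 1 must retry
```

A thread whose exclusive store fails simply repeats the load and store pair, which is how a lock or atomic counter is built on top of this mechanism.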
[Figure: Cortex-A MPCore running Thread 0 and Thread 1 on two Cortex-A CPUs, each with a Local Monitor, joined by the Coherency Logic; AXI Interconnect, Global Monitor and Memory]
15. May 1, 2013 15
Summary
• Multicore, multiprocessing, SoC and NoC are
the current technologies
• There are many challenges and considerations
when designing and programming an MP system
• You have to acquire architecture, tools and
programming know-how in order to get the
best trade-off between performance and power