"Processor Options for Edge Inference: Options and Trade-offs," a Presentation from Micron Technology

  1. © 2019 Micron Processor Architectures For Machine Learning Dr. Raj Talluri SVP & GM, Mobile Business Unit Micron Technology May 2019
  2. © 2019 Micron AI, Machine Learning, Neural Networks, Deep Learning… Nested-fields diagram: Artificial Intelligence (John McCarthy, 1950s) ⊃ Machine Learning ⊃ Brain-Inspired ⊃ Neural Networks ⊃ Deep Learning, with Spiking networks as a sibling branch of Neural Networks.
  3. © 2019 Micron Myriad of Applications of DNNs • Image, Object Recognition and Tracking • Video Security and Surveillance Cameras • Autonomous Navigation • Online Shopping • Recommendation Engines • Medical Diagnosis • New Drug Creation • Material Science….
  4. © 2019 Micron Basic Neural Compute Model and Terminology. Diagram of a network with an input layer, a hidden layer, and an output layer: inputs (activations) are combined with weights (synapses) to produce each layer's activations; training adjusts the weights via backpropagation, while inference runs the forward pass only.
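To make the terminology concrete, here is a minimal NumPy sketch of the forward pass (inference) through one hidden layer; the layer sizes and values are invented for illustration, and training would additionally run backpropagation to update the weights:

```python
import numpy as np

# Minimal forward pass through one hidden layer: each layer multiplies
# its input activations by a weight matrix (the "synapses"), then applies
# a nonlinearity. Shapes and values here are illustrative only.
def relu(x):
    return np.maximum(0.0, x)

rng = np.random.default_rng(0)
inputs = rng.random(4)            # input-layer activations
w_hidden = rng.random((8, 4))     # weights: input -> hidden
w_output = rng.random((3, 8))     # weights: hidden -> output

hidden = relu(w_hidden @ inputs)  # hidden-layer activations
outputs = w_output @ hidden       # output-layer activations

print(outputs)
```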
  5. © 2019 Micron DNNs • DNNs have become commercially important • DNNs represent a very different type of workload than previous mainstream compute workloads • DNN workloads are big, but very uniform and parallelizable – and hence ideal for specialized processors
  6. © 2019 Micron Cloud vs Edge Cloud • Training and Inference • Complex and Resource Intensive • Floating Point Operations • Non-real time • Millions of data points • GPUs, CPUs • Supervised Learning Edge • Mostly Inference • Real time • Low Latency • Can be Integer or even binary • Battery and Energy sensitive • Limited resources
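One edge-specific point above is that inference can use integer or even binary arithmetic. A hypothetical sketch of symmetric int8 quantization (not any particular vendor's scheme) shows the basic idea of trading float32 precision for cheaper integer math:

```python
import numpy as np

# Illustrative symmetric int8 quantization: edge inference often trades
# float32 precision for integer math to save energy and memory.
def quantize_int8(w):
    scale = np.abs(w).max() / 127.0        # map the largest magnitude to 127
    q = np.round(w / scale).astype(np.int8)
    return q, scale

w = np.random.default_rng(1).normal(size=(4, 4)).astype(np.float32)
q, scale = quantize_int8(w)
w_restored = q.astype(np.float32) * scale  # dequantize to check the error
print(np.abs(w - w_restored).max())        # small reconstruction error
```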
  7. © 2019 Micron Convolutional Networks. Diagram: a filter slides over an input feature map, producing an output feature map via multiplications and additions. • Each layer in the network generates a successively higher level of abstraction • A stack of filters applied to a stack of images generates a stack of output feature maps • Convolutions are typically implemented as matrix multiplications – the basic compute element is a MAC (Multiply And Accumulate)
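As a sketch of where the MACs come from, a naive single-channel 2-D convolution (no padding, stride 1, invented sizes) makes each multiply-accumulate explicit; production implementations instead lower this loop nest to a matrix multiplication:

```python
import numpy as np

# Naive 2-D convolution (single channel, no padding, stride 1) that makes
# the MAC structure explicit: every output value is a sum of
# multiply-accumulate operations over one filter window.
def conv2d(feature_map, filt):
    h, w = feature_map.shape
    fh, fw = filt.shape
    out = np.zeros((h - fh + 1, w - fw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            acc = 0.0
            for i in range(fh):
                for j in range(fw):
                    acc += feature_map[y + i, x + j] * filt[i, j]  # one MAC
            out[y, x] = acc
    return out

fm = np.arange(25, dtype=float).reshape(5, 5)
print(conv2d(fm, np.ones((3, 3))))
```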
  8. © 2019 Micron Memory Access and Computation in a MAC. Diagram: each Multiply And Accumulate (MAC) reads its operands from memory (DRAM or local memory) and feeds the ALU. Memory access is the bottleneck.
  9. © 2019 Micron Memory Hierarchy and Data Movement Energy. Fetching data to run the MAC engine costs energy that grows with distance from the ALU (PE = Processing Element, ALU = Arithmetic Logic Unit, RF = Register File). Normalized energy cost per access:
     ALU operation                        1x (reference)
     RF (0.5 – 1 KB)                      1x
     NoC between PEs (200 – 1000 PEs)     2x
     Global buffer (100 – 500 KB)         6x
     DRAM                                 200x
     Source: Efficient Processing of Deep Neural Networks: A Tutorial and Survey, Proceedings of the IEEE, vol. 105, issue 12, 2017
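A back-of-the-envelope sketch using the normalized costs above; the access counts below are invented purely to show how even a small amount of DRAM traffic can dominate total energy:

```python
# Back-of-the-envelope data-movement energy, using the normalized costs
# from the slide (ALU op = 1x). The access counts are hypothetical,
# chosen only to illustrate how DRAM traffic dominates the total.
COST = {"alu": 1, "rf": 1, "noc": 2, "buffer": 6, "dram": 200}

accesses = {"alu": 1_000_000, "rf": 1_000_000,
            "noc": 100_000, "buffer": 50_000, "dram": 10_000}

total = sum(COST[k] * accesses[k] for k in COST)
for k in COST:
    share = COST[k] * accesses[k] / total
    print(f"{k:>6}: {share:5.1%} of energy")
# Despite having the fewest accesses, DRAM accounts for ~44% of the energy.
```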
  10. © 2019 Micron Temporal and Spatial Hardware Architectures. Diagram contrasting a temporal architecture (a centralized control unit driving an array of ALUs that all fetch from shared memory) with a spatial architecture (an array of processing elements with local storage that pass data directly to one another).
  11. © 2019 Micron Hardware Architectures for DNNs at the Edge • CPUs, GPUs, DSPs • nVidia Jetson, Intel CPUs, Arduino, ARM Cores, TI C6X, Qualcomm Hexagon • Specialized Processors • Brainchip, Kneron, KnuEdge, Gyrfalcon, Wave Computing, MIT Eyeriss, ThinCI, Graphcore, Intel Movidius, Mythic etc. • Licensable cores – Cadence, Imagination, Cambricon etc. • Mobile SoCs – combinations of CPUs, GPUs, DSPs, and hardware accelerators to augment DNN processing • Samsung Exynos, Qualcomm Snapdragon, HiSilicon Kirin, Mediatek Helio P90 • FPGAs • Xilinx, Altera • In-Memory Compute Architectures • Mythic
  12. © 2019 Micron General Purpose CPUs and GPUs • Most versatile for a variety of AI and non-AI tasks • Extensive software APIs for popular DNN frameworks – e.g. nVIDIA Jetson • Much higher power consumption than specialized processors or mobile SoCs • Readily available development platforms – easy to get up and running on your task
  13. © 2019 Micron nVIDIA Jetson
  14. © 2019 Micron Jetson AI Pipeline
  15. © 2019 Micron ARM DNN Offerings
  16. © 2019 Micron ARM ML Processor
  17. © 2019 Micron Mobile SoCs • Versatile for a variety of AI and non-AI tasks • Good support for software APIs for popular DNN frameworks • Much higher power consumption than specialized processors but less than general purpose CPUs and GPUs • Moderate support for development kits • More challenging to get started – mostly Android platforms, limited availability of dev platforms
  18. © 2019 Micron Qualcomm Snapdragon 845
  19. © 2019 Micron Qualcomm Snapdragon 845
  20. © 2019 Micron Specialized Processors • Typically most efficient in terms of power and performance • Good support for software APIs for popular DNN frameworks • Not as flexible for general purpose compute tasks • Mostly from start-ups • Used as accelerators for general purpose processors • Lower cost, scalable • Limited support and general availability
  21. © 2019 Micron Eyeriss DNN Accelerator – from MIT (http://eyeriss.mit.edu/)
  22. © 2019 Micron In-Memory Computing • e.g. Mythic • Avoids the memory bottleneck in the MAC operation • Ohm's law is used to compute the multiplication (Y = V·G) and Kirchhoff's current law to calculate the sum • Good support for software APIs for popular DNN frameworks • Could offer potentially much lower power consumption than even specialized processors • Could be limited in application domains and scale
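An idealized sketch of the analog MAC described above, with invented conductance and voltage values (not Mythic's actual implementation): weights are stored as conductances, inputs applied as voltages, and the array computes the dot products in place:

```python
import numpy as np

# Idealized model of an analog in-memory MAC: weights are stored as
# conductances G, inputs are applied as voltages V. Ohm's law gives the
# per-cell current I = V * G, and Kirchhoff's current law sums the
# currents on each column wire, so the multiply and the accumulate both
# happen inside the memory array. All values here are illustrative.
G = np.array([[0.2, 0.5],
              [0.1, 0.3],
              [0.4, 0.2]])         # conductance matrix (the stored weights)
V = np.array([1.0, 0.5, 0.8])     # input voltages (the activations)

I = V @ G                         # column currents = the MAC results
print(I)
```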
  23. © 2019 Micron Thoughts on Choosing an Architecture • Application Requirements • Flexibility (general purpose vs specific) • DNN support • Accuracy • Energy and Power • Latency • Cost • Volume • Support Requirements, Longevity
  24. © 2019 Micron Resources
     Qualcomm Snapdragon: https://www.qualcomm.com/
     nVIDIA Jetson: https://www.nvidia.com/en-us/
     Eyeriss: http://eyeriss.mit.edu/
     Mythic: https://www.mythic-ai.com/
     Kneron: http://www.kneron.com/
     ARM: https://www.arm.com/
     Efficient Processing of Deep Neural Networks: A Tutorial and Survey (2017): https://ieeexplore.ieee.org/document/8114708