
Bring Intelligent Motion Using Reinforcement Learning Engines | SIGGRAPH 2019 Technical Sessions

Review state-of-the-art techniques that use neural networks to synthesize motion, such as mode-adaptive neural network and phase-functioned neural networks. See how next-generation CPUs with reinforcement learning can offer better performance.

Published in: Technology


  1. SIGGRAPH 2019 | LOS ANGELES | 28 JULY - 1 AUGUST
  2. Bringing Intelligent Motion Using Reinforcement Learning on Intel Client. Manuj Sabharwal, Yaz Khabiri
  3. Agenda
      • Overview of Reinforcement Learning (RL)
      • Reinforcement Learning in Gaming
      • Training RL Algorithms
      • Intelligent Motion Use Case
      • Performance Optimization on Intel® CPU
      • Inference RL Algorithms
      • Understanding Motion Models
      • Using DirectML* to Leverage Intel GPUs
      • Summary
  4. Overview of Machine Learning
      • Supervised: data, labels → class; task driven
      • Unsupervised: data → cluster
      • Reinforcement: state → action; learn from mistakes
  5. Successes of Reinforcement Learning
  6. High-Level Reinforcement Learning Overview
      • Agent gets state (s) from the environment
      • Agent takes action (a) using policy (π)
      • Agent receives reward (r)
      • Goal: maximize the return (R), i.e. the total future reward
      • https://unity3d.com/machine-learning
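The loop on slide 6 can be sketched in a few lines of Python. This is only an illustration: the `LineWorld` environment, its reward values, and the `run_episode` helper are hypothetical and do not come from the talk.

```python
import random


class LineWorld:
    """Hypothetical 1-D environment: start at 0, the episode ends at `goal`."""

    def __init__(self, goal=5):
        self.goal = goal
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos  # the state (s)

    def step(self, action):
        # action is -1 (left) or +1 (right); the position is clamped at 0
        self.pos = max(0, self.pos + action)
        done = self.pos >= self.goal
        reward = 1.0 if done else -0.1  # reward (r): small cost per step
        return self.pos, reward, done


def random_policy(state, rng):
    """Policy (pi): here it ignores the state and picks a random action."""
    return rng.choice([-1, 1])


def run_episode(env, policy, rng, max_steps=1000):
    """Run one episode and return the undiscounted return (R)."""
    state = env.reset()
    total = 0.0
    for _ in range(max_steps):
        action = policy(state, rng)             # agent takes action (a)
        state, reward, done = env.step(action)  # environment answers with s, r
        total += reward
        if done:
            break
    return total


if __name__ == "__main__":
    print(run_episode(LineWorld(), random_policy, random.Random(0)))
```

A learning algorithm would replace `random_policy` with a policy that is improved from the collected rewards.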
  7. Examples of RL Algorithms
      • Actor-Critic algorithms (model based learning)*
          • Reduce the variance of the policy gradient using the actor (the policy) and the critic (value function)
      • Value based
          • Q-Learning
          • Find the best action under the current state
      • Policy based
          • Trust Region Policy Optimization
          • Generalized Advantage Estimation
      • http://rail.eecs.berkeley.edu/deeprlcourse-fa17/f17docs/lecture_3_rl_intro.pdf
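As an illustration of how a critic can reduce the variance of the policy gradient, here is a minimal sketch of Generalized Advantage Estimation as it is commonly formulated. The function name and the default values of `gamma` and `lam` are assumptions for this example, not taken from the slides.

```python
def generalized_advantages(rewards, values, gamma=0.99, lam=0.95):
    """Compute GAE advantages for one trajectory.

    `values` must hold one more entry than `rewards`: the critic's value
    estimate for every visited state plus the state after the last step.
    """
    if len(values) != len(rewards) + 1:
        raise ValueError("values needs len(rewards) + 1 entries")
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        # TD residual: how much better the step was than the critic expected
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        # exponentially weighted sum of future residuals
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages
```

With `lam=0` each advantage is the one-step TD residual (low variance, more bias); with `lam=1` it is the full return minus the critic's estimate (less bias, more variance).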
  8. Brain behind the Algorithms
      • Value functions
          • Estimate how good a state or an action is by predicting the total future reward (return)
      • Policy methods
          • Find the best action directly
          • Optimize the policy (behavior) directly
      • Vanilla policy gradients
          • For every episode with positive reward, use the gradient to increase the probability of the actions taken
      • Improved policy gradients
          • Multiple gradient steps per episode
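The "total future reward (return)" that value functions predict is usually computed as a discounted sum of rewards. A minimal sketch, assuming a discount factor `gamma` (the slide does not give one):

```python
def discounted_returns(rewards, gamma=0.99):
    """Return the discounted reward-to-go for every step of one episode."""
    returns = []
    running = 0.0
    # walk backwards so each step reuses the return of the step after it
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns
```

In a vanilla policy gradient, these per-step returns are the weights applied to the gradient of the log-probability of each action taken.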
  9. Popular Paths to Bring Machine Learning into Games
      • Microsoft*
          • DirectML (DML) framework
      • Ubisoft* – LaForge
          • Bringing research into industry
          • Access to game engines and data
      • Unity*
          • First-party support via ML-Agents
          • Interface between research and gaming
          • DML backend coming soon
  10. Motion with Reinforcement Learning
      • Understanding the path or motion planning problem is crucial in unstructured environments
      • Data-driven input combined with a physics-based animated character creates smooth and robust animation
      • RL offers a convenient framework for learning different strategies without a mountain of data
      • Solves generalization problems through path or motion planning
      • Deep Q-Networks: Volodymyr Mnih, Deep RL Bootcamp, Berkeley, DeepMind*
  11. Equations to framework (e.g. Q-Learning → DQN Learning)
      • Q-learning: if we had a function Q : State × Action → Result that told us what our return would be if we were to take an action in a given state, then we could easily construct a policy that maximizes our rewards:
      • $a = \arg\max_{a} Q(s, a)$
      • A neural network can stand in for Q, because neural networks are universal function approximators
      • $Q(s, a) = r + \gamma \max_{a'} Q(s', a')$
      • [Diagram: state → conv layers (Layer-1, Layer-2, Layer-3), each followed by an activation function → FC → FC → Q values Q(s, n) for the actions Straight, Left, Right]
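Before a neural network replaces the table, the same update can be written as tabular Q-learning. This is a sketch under stated assumptions: the action names follow the diagram on slide 11, while the learning rate `alpha`, the default `gamma`, and the helper names are introduced here for illustration only.

```python
from collections import defaultdict

# Action names taken from the slide's diagram
ACTIONS = ("straight", "left", "right")


def greedy_action(q, state):
    """a = argmax_a Q(s, a)."""
    return max(ACTIONS, key=lambda action: q[(state, action)])


def q_update(q, state, action, reward, next_state,
             alpha=0.1, gamma=0.9, done=False):
    """Move Q(s, a) a step of size alpha toward r + gamma * max_a' Q(s', a')."""
    best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


if __name__ == "__main__":
    q = defaultdict(float)  # unseen (state, action) pairs start at 0.0
    q_update(q, "s0", "left", reward=1.0, next_state="s1")
    print(greedy_action(q, "s0"))
```

A DQN keeps the same target but replaces the dictionary with a network that maps a state to one Q value per action.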
  12. Evaluating Motion Algorithms on Intel® Core Processors
      • https://github.com/xbpeng/DeepMimic
      • [Chart: TensorFlow baseline; axes: minutes, iterations (millions)]
      • ~52 hours of training on an 8-core platform
      • ~52 hours to train on CPU → Can we do better?
      • Testing by Intel as of June 28th, 2019. Intel® i9-9900k, 95W TDP, 8C16T; Frequency: 4.3 GHz, Turbo enabled; Graphics: NVIDIA* GTX 2080; Memory: 4x8GB @ 2133 MHz; Storage: Intel SSD 545 Series 240GB; OS: Windows* 10 RS5; BIOS build: CFLSFX1.R00.X151B01. All data was collected with TensorFlow* 1.12 and the DeepMimic branch dated June 28th, 2019.
  13. Analyzing the Software Stack
      • Only ~20% of the time is spent in actual compute; the rest is overhead
      • [Intel® VTune™ Amplifier XE profile; labels: actual compute, inefficiency due to spins]
  14. Optimizing the Software Stack - 1
      • Re-evaluating the libraries included in the software stack for DeepMimic
          • Recompiling TensorFlow* with Intel® MKLDNN:
                bazel --output_base=output_dir build --config=mkl --config=opt //tensorflow/tools/pip_package:build_pip_package
                python -c "import tensorflow; print(tensorflow.pywrap_tensorflow.IsMklEnabled())"
            → Result: True
          • Evaluating different threading parameters to reduce spin time:
                import tensorflow  # this sets KMP_BLOCKTIME and OMP_PROC_BIND
                import os
                # delete the existing values
                del os.environ['OMP_PROC_BIND']
                del os.environ['KMP_BLOCKTIME']
      • Moving the Python installation → optimized Intel Python libraries
          • Simple optimizations by moving from the NumPy libraries to the more efficient Intel NumPy libraries
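One way to experiment with the threading parameters mentioned on slide 14 is to set the Intel OpenMP runtime variables explicitly before the framework is imported. The sketch below is an assumption-laden illustration: the helper `configure_openmp` and the chosen values are hypothetical starting points, not the settings used in the talk, and whether a given TensorFlow build keeps or overrides them should be checked.

```python
import os


def configure_openmp(num_threads, blocktime_ms=1,
                     affinity="granularity=fine,compact,1,0"):
    """Set OpenMP runtime variables; call this before importing TensorFlow.

    KMP_BLOCKTIME is the time in milliseconds a worker thread spins after
    finishing a parallel region before it goes to sleep, so lower values
    reduce spin time. The values here are illustrative assumptions.
    """
    settings = {
        "OMP_NUM_THREADS": str(num_threads),
        "KMP_BLOCKTIME": str(blocktime_ms),
        "KMP_AFFINITY": affinity,
    }
    os.environ.update(settings)
    return settings


# 8 threads is an assumption matching the 8 physical cores of the test platform
applied = configure_openmp(num_threads=8)
for name, value in sorted(applied.items()):
    print(name, "=", value)

# import tensorflow  # import only after the variables above are in place
```

Profiling each candidate setting, as the talk does with VTune, is what shows whether the spin time actually drops.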
  15. Optimizing the Software Stack - 2
      • Optimizing the math libraries to use the FP32 datatype and parallelism instead of double precision and scalar code
          • Mapping libraries from scalar Eigen to Eigen with MKL
      • Compiling Eigen with MKL, and Bullet3 (physics SDK: real-time collision library), to use the AVX2 code path
  16. Optimization Results
      • [Figure: Baseline vs. After Optimizations]
      • Putting CPUs to work
          • The application now spends its time on actual compute instead of spinning
          • Most of the spinning from OpenMP and threading is removed thanks to TensorFlow with MKLDNN
          • The Eigen MKL library in the DeepMimic core is able to take advantage of intrinsic code
  17. Training Result with Optimized Stack
      • Optimizing training is the first step toward deployment
      • The correct libraries and datatypes are important for deep learning training performance
      • Training time reduced by 2.6x by enabling multithreading and using MKLDNN instead of Eigen → 50 hours to 19 hours
      • [Chart: Timing After Optimizations; axes: minutes, iterations (millions); series: TensorFlow - Baseline, TensorFlow - MKLDNN, TensorFlow + MKLDNN + Eigen libs; 2.6x better training performance]
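The 2.6x figure follows directly from the two training times quoted on slide 17. A short check, with a hypothetical `speedup` helper:

```python
def speedup(baseline_hours, optimized_hours):
    """Ratio of baseline time to optimized time."""
    return baseline_hours / optimized_hours


baseline, optimized = 50, 19  # hours, as quoted on slide 17
print(round(speedup(baseline, optimized), 2))         # 2.63
print(round(100 * (1 - optimized / baseline)), "%")   # 62 % less training time
```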
  18. Take-away: using optimization libraries to train machine learning algorithms helps boost performance and reduce training time
  19. Bringing Motion to Production
  20. Understanding the Inference Model
      • Training checkpoint
      • Inference model
      • How can a developer read it?
  21. Unity® ML Agents: Bridging the Gap between Research and Game Integration
  22. Overview: Unity ML-Agents
      • [Diagram labels: Unity Environment; Agent; Collect Observations; Agent Action; Vector Action; Brain; Academy; Unity Inference Engine; DirectML; CS; CPU]
  23. Puppo Motion Using Unity ML Agent
      • Goal: puppy runs for the bone
      • Agent: Corgi
      • About 50 float32 inputs
      • Three hidden layers of 512 nodes
      • About 20 float outputs
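The size of the Puppo network can be estimated from the layer widths on slide 23. The sketch below assumes plain fully connected layers with biases and takes the approximate input and output counts (50 and 20) as exact; both are assumptions for illustration.

```python
def dense_param_count(layer_sizes):
    """Weights plus biases of a stack of fully connected layers."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


def dense_multiply_adds(layer_sizes):
    """Multiply-accumulate operations for one forward pass (one inference)."""
    return sum(n_in * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


# ~50 inputs, three hidden layers of 512 nodes, ~20 outputs
PUPPO_LAYERS = [50, 512, 512, 512, 20]

params = dense_param_count(PUPPO_LAYERS)
print(params)                             # 561684 parameters
print(params * 4)                         # 2246736 bytes when stored as float32
print(dense_multiply_adds(PUPPO_LAYERS))  # 560128 multiply-adds per inference
```

A model this small costs little per inference, so per-dispatch overhead and the number of agents evaluated together can matter as much as raw compute.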
  24. Analyzing Inference Performance → 1 Agent
      • No meta command: 1.8 seconds/inference
      • Meta command: 0.8 seconds/inference
      • Execution time reduced by 2x with meta commands at the kernel level
      • https://devblogs.microsoft.com/pix/download/
  25. Microsoft® PIX Tool – Benefits of Using Meta Commands
      • Timings shown: 3.064 msec and 1.364 msec
      • The more agents, the better the performance with metacommands
  26. Results
      • [Chart: Scaling with Multiple Agents; x-axis: 1 agent, 10 agents, 50 agents; axes: msec, gain (%); series: Compute Shader, Metacommands, Gain; lower is better]
      • Metacommands give a significant boost in performance by leveraging Intel® Graphics driver optimizations
  27. Intel® Graphics Performance Analyzer (GPA) DX12 Profiling Preview
      • DX12 DirectML profiling in Intel® GPA
  28. Summary
      • TensorFlow with Intel® MKLDNN build is now available on Windows
      • Leveraging the new instruction sets on Intel® Xeon™ and Core™ processors
      • Performance boost in training, as reinforcement learning use cases are CPU-friendly
      • Using optimized pre- and post-processing libraries gives an end-to-end (E2E) performance boost
      • DirectML from Microsoft leverages metacommands, which give a good performance boost for workloads that combine games and deep learning
  29. References
      • TensorFlow: https://www.tensorflow.org/
      • TensorFlow Optimization guide: https://software.intel.com/en-us/articles/intel-optimization-for-tensorflow-installation-guide
      • DeepMimic: https://github.com/xbpeng/DeepMimic/tree/master/learning
      • AI4Animation: https://github.com/xbpeng/DeepMimic/tree/master/learning
      • Unity ML-Agents: https://github.com/Unity-Technologies/ml-agents
      • RL beginner guide: https://skymind.ai/wiki/deep-reinforcement-learning
      • Gym: https://gym.openai.com/
      • Ubisoft: https://montreal.ubisoft.com/en/our-engagements/research-and-development/
      • Intel® GPA: https://software.intel.com/en-us/gpa