
ujava.org Reinforcement Learning (2nd)


ujava.org workshop 2016-08-12. Reinforcement Learning (2nd)

Published in: Data & Analytics


  1. Reinforcement Learning (2nd). ujava.org Workshop, 2016-08-12. www.idosi.com. CEO 강신동 Shindong KANG, (주)지능도시
  2. ujava.org
  3. spaceapi.org
  4. Reinforcement Learning for Brick Game
  5. Reinforcement Learning for Brick Game
  6. To Flip Pancake
  7. Crawling Robot on Carpet
  8. Pavlov's Dog
  9. Pavlov
  10. Reinforcement (강화)
  11. Reinforcement Learning
  12. Forecast
  13. Forecast with probability
  14. Unknown model & real facts: Deep Neural Network, Bayesian Probability
  15. Variance (분산)
  16. Variance (분산)
  17. Random Variable
  18. Types of Random Variable
  19. Discrete Probability Distribution
  20. Continuous Probability Distribution, Probability Density Function. Density (밀도)
  21. Expected value (기대값): $EV = \sum x\,P(x)$
  22. Expected Value for Continuous variable
  23. Covariance (공분산)
  24. Covariance
  25. Probability (확률)
  26. Conditional Probability (조건부 확률)
  27. Bayes rule
  28. Bayesian Probability (베이지안 확률)
  29. Bayesian Probability (베이지안 확률): P(fair|H) = ? Given P(A) = P(fair) = ½, P(B) = P(H) = ¾, P(B|A) = P(H|fair) = ½: $P(\text{fair} \mid H) = \frac{P(H \mid \text{fair})\,P(\text{fair})}{P(H)} = \frac{\frac{1}{2} \cdot \frac{1}{2}}{\frac{3}{4}} = \frac{1}{3}$
  30. Brownian motion (브라운 운동)
  31. Brownian motion, Gaussian distribution
  32. Snapshot of state
  33. Markov Chain
  34. Process Probability (과정 확률). States: s1, s2, s3. Episode process: s1, s2 = ? s2, s3 = ? s1, s3 = ?
  35. Markov Process
  36. Markov Process
  37. Math Product Symbol
  38. Markov Process
  39. Markov Process
  40. Markov Process
  41. Stochastic Matrix
  42. Stochastic Matrix: $\begin{pmatrix} 0.4 & 0.6 \\ 0.7 & 0.3 \end{pmatrix}$
  43. 2 Snapshots of state: Direction using Second Order
  44. Markov Process
  45. 3 Snapshots of state: Acceleration using 3rd order
  46. Exploitation and Exploration (개발 and 탐험)
  47. State-action exploration vs. Parameter exploration
  48. Multi-armed bandit problem
  49. Thompson sampling
  50. Simulated Bandit Performance
  51. Multi-armed bandit problem
  52. Multi-Armed Bandit Algorithms
  53. MAB Reward
  54. Function's Probability Distribution?
  55. Function's Probability Distribution: y = ax^2 + b
  56. Function's Probability Distribution with Gaussian Distribution: y = ax^2 + b
  57. Function's Probability Distribution with Gaussian Distribution
  58. Gaussian Process Regression
  59. Gaussian Process. From "C. E. Rasmussen & C. K. I. Williams, Gaussian Processes for Machine Learning, the MIT Press, 2006"
  60. Thompson sampling
  61. Thank you! (주)지능도시 Intelligent City Ltd. 강신동 Shindong KANG. www.idosi.com, ceo@idosi.com
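
The Bayes rule calculation on slide 29 can be checked with a few lines of Python. This is only a minimal sketch: the function name `posterior` is hypothetical, and the three input probabilities are taken directly from the slide. (The value P(H) = ¾ would be consistent with, for example, a second coin that always lands heads, but the transcript does not state this, so treat that reading as an assumption.)

```python
from fractions import Fraction


def posterior(prior, likelihood, evidence):
    """Bayes' rule: P(A|B) = P(B|A) * P(A) / P(B)."""
    return likelihood * prior / evidence


# Values as given on slide 29.
p_fair = Fraction(1, 2)              # P(A)   = P(fair)
p_heads = Fraction(3, 4)             # P(B)   = P(H)
p_heads_given_fair = Fraction(1, 2)  # P(B|A) = P(H|fair)

p_fair_given_heads = posterior(p_fair, p_heads_given_fair, p_heads)
print(p_fair_given_heads)  # 1/3
```

Exact fractions are used instead of floats so the result matches the slide's 1/3 without rounding.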
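
The stochastic matrix on slide 42 and the product notation on slide 37 can be tied together in a small sketch. It assumes the four numbers are read row by row, so that each row is the transition distribution out of one state; the transcript does not confirm this layout. The helper names (`is_row_stochastic`, `step`, `path_probability`) and the 0-based state indices are illustrative, not taken from the slides.

```python
# Transition matrix from slide 42, read row by row (an assumption):
# P[i][j] is the probability of moving from state i to state j.
P = [[0.4, 0.6],
     [0.7, 0.3]]


def is_row_stochastic(matrix, tol=1e-9):
    """Every entry is non-negative and every row sums to 1."""
    return all(
        all(p >= 0 for p in row) and abs(sum(row) - 1.0) <= tol
        for row in matrix
    )


def step(dist, matrix):
    """Advance a distribution over states by one Markov transition."""
    n = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def path_probability(path, matrix):
    """Probability of a state sequence given its first state:
    the product of the one-step transition probabilities."""
    prob = 1.0
    for a, b in zip(path, path[1:]):
        prob *= matrix[a][b]
    return prob


dist = [1.0, 0.0]            # start in state 0 with certainty
for _ in range(2):
    dist = step(dist, P)
print(dist)                  # roughly [0.58, 0.42]
print(path_probability([0, 1, 0], P))  # 0.6 * 0.7, roughly 0.42
```

The Markov property is what allows `path_probability` to multiply one-step probabilities: each transition depends only on the current state, not on the earlier history.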
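
Thompson sampling (slides 49 and 60) applied to the multi-armed bandit problem (slide 48) can be sketched as follows. The transcript gives no details of the variant presented, so this assumes the common Bernoulli-reward setting with a uniform Beta(1, 1) prior per arm; the function name `thompson_sampling` and the arm success rates are made up for illustration.

```python
import random


def thompson_sampling(true_rates, rounds, seed=0):
    """Bernoulli bandit with a Beta(1, 1) prior on each arm.

    Returns per-arm success and failure counts after `rounds` pulls.
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    successes = [0] * n_arms
    failures = [0] * n_arms
    for _ in range(rounds):
        # Draw one plausible success rate per arm from its posterior,
        # Beta(1 + successes, 1 + failures).
        samples = [
            rng.betavariate(1 + successes[a], 1 + failures[a])
            for a in range(n_arms)
        ]
        # Pull the arm whose sampled rate is highest.
        arm = max(range(n_arms), key=samples.__getitem__)
        # Simulate the reward and update that arm's counts.
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures


# Hypothetical arms; the true rates are hidden from the algorithm.
successes, failures = thompson_sampling([0.2, 0.5, 0.8], rounds=2000)
pulls = [s + f for s, f in zip(successes, failures)]
print(pulls)  # most pulls should go to the last arm
```

Sampling from the posterior handles the exploitation and exploration trade-off from slide 46 without a separate exploration parameter: uncertain arms occasionally produce high samples and get tried, while well-estimated good arms win most rounds.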
