
What the Brain says about Machine Intelligence

What the Brain says about Machine Intelligence. Presentation by Jeff Hawkins of Numenta

Published in: Technology


  1. 1. November 21, 2014 Jeff Hawkins jhawkins@Numenta.com What the Brain Says About Machine Intelligence
  2. 2. The Birth of Programmable Computing. 1940's-1950's: many approaches competed: dedicated vs. universal, analog vs. digital, decimal vs. binary, wired vs. memory-based programming, serial vs. random access memory. One dominant paradigm emerged: universal, digital, binary, memory-based programming, two-tier memory. Why did one paradigm win? Network effects. Why did this paradigm win? It was the most flexible and the most scalable.
  3. 3. The Birth of Machine Intelligence. 2010's-2020's: many approaches compete: specific vs. universal algorithms, mathematical vs. memory-based, batch vs. on-line learning, labeled vs. behavior-based learning. One dominant paradigm will emerge: universal algorithms, memory-based, on-line learning, behavior-based learning. Why will one paradigm win? Network effects. Why will this paradigm win? It is the most flexible and the most scalable. How do we know this is going to happen? The brain is the proof case, and we have made great progress.
  4. 4. Numenta's Mission: 1) Discover the operating principles of the neocortex. 2) Create machine intelligence technology based on neocortical principles. Talk topics: cortical facts, cortical theory, research roadmap, applications, thoughts on machine intelligence.
  5. 5. What the Cortex Does: it learns a model of the world from changing sensory data (patterns of light, sound, and touch arriving from the retina, cochlea, and somatic sensors). The model generates predictions, anomalies, and actions. Most sensory changes are due to your own movement; the neocortex learns a sensory-motor model of the world.
  6. 6. Cortical Facts: a 2.5 mm sheet of cells (layers 2/3, 4, 5, 6) organized into a hierarchy, with cellular layers and mini-columns. Neurons have 3-10K synapses (10% proximal, 90% distal) and active dendrites; learning = new synapses. The cortex is remarkably uniform, anatomically and functionally.
  7. 7. Cortical Theory: HTM (Hierarchical Temporal Memory). 1) A hierarchy of identical regions. 2) Each region learns sequences. 3) Stability increases going up the hierarchy if the input is predictable. 4) Sequences unfold going down. Open questions: What does a region do? What do the cellular layers do? How do neurons implement this? How does this work in a hierarchy?
  8. 8. Cellular Layers: layers 2/3, 4, 5, and 6 each run a variation of a common sequence memory algorithm: layer 2/3 does high-order inference, layer 4 sensory-motor inference, layer 5 motor, layer 6 attention. These are universal functions; they apply to all cortical regions and all sensory-motor modalities. The diagram shows sensor data and a copy of motor commands arriving as feedforward input (from lower regions and sub-cortical motor centers) and feedback arriving from higher regions.
  9. 9. How Does Sequence Memory Work? (Layers 2/3, 4, 5, and 6 are each shown as a sequence memory, with their functions left as question marks.)
  10. 10. HTM Temporal Memory: learns sequences, recognizes and recalls sequences, predicts next inputs. Properties: high capacity, distributed, local learning rules, fault tolerant, no sensitive parameters, generalizes.
  11. 11. HTM Temporal Memory Not Just Another ANN 1) Cortical Anatomy Mini-columns Inhibitory cells Cell connectivity patterns 2) Sparse Distributed Representations 3) Realistic Neurons Active dendrites Thousands of synapses Learn via synapse formation numenta.com/learn/
  12. 12. Research Roadmap (by cortical layer): High-order Inference: theory 98%, extensively tested, commercial. Sensory-motor Inference: theory 80%, in development. Motor Sequences: theory 50%. Attention/Feedback: theory 30%. Streaming Data capabilities: prediction, anomaly detection, classification. Applications: predictive maintenance, security, natural language processing.
  13. 13. Streaming Data Applications: data stream -> encoder -> SDR -> HTM -> predictions, anomalies, classification. Input data types: numbers, categories, date/time, GPS, words. Application domains: servers, biometrics, medical, vehicles, industrial equipment, social media, communication networks.
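To make the slide's pipeline concrete, here is a minimal, self-contained Python sketch. The ScalarEncoder and TransitionMemory classes are simplified stand-ins invented for illustration; a real HTM system would use NuPIC's encoders, spatial pooler, and temporal memory instead.

```python
import numpy as np

class ScalarEncoder:
    """Encode a number in [min_val, max_val] as a sparse binary vector
    (a contiguous block of w active bits whose position tracks the value)."""
    def __init__(self, n=400, w=21, min_val=0.0, max_val=100.0):
        self.n, self.w, self.min_val, self.max_val = n, w, min_val, max_val

    def encode(self, value):
        value = min(max(value, self.min_val), self.max_val)
        start = int((value - self.min_val) / (self.max_val - self.min_val) * (self.n - self.w))
        sdr = np.zeros(self.n, dtype=bool)
        sdr[start:start + self.w] = True
        return sdr

class TransitionMemory:
    """Toy stand-in for HTM sequence memory: learns which bits tend to follow
    which, predicts the next SDR, and scores how unexpected each input is."""
    def __init__(self, n):
        self.counts = np.zeros((n, n))   # counts[i, j]: bit i was followed by bit j
        self.prev = None

    def compute(self, sdr):
        predicted = np.zeros(len(sdr), dtype=bool)
        if self.prev is not None:
            votes = self.counts[self.prev].sum(axis=0)
            predicted = votes > 0.5 * self.prev.sum()        # majority of previous active bits agree
            self.counts[np.ix_(self.prev, sdr)] += 1.0       # Hebbian-style transition update
        overlap = np.logical_and(predicted, sdr).sum()
        anomaly = 1.0 - overlap / max(sdr.sum(), 1)          # 0 = fully predicted, 1 = complete surprise
        self.prev = sdr
        return anomaly

encoder = ScalarEncoder(min_val=0.0, max_val=100.0)
memory = TransitionMemory(encoder.n)
values = [20, 30, 40, 50] * 5 + [90, 20, 30]                 # repeating cycle, then an unexpected jump
for t, v in enumerate(values):
    print(f"t={t:2d} value={v:3d} anomaly={memory.compute(encoder.encode(v)):.2f}")
```

After a pass or two through the repeating cycle the anomaly score should fall toward zero, and the jump to 90 should push it back up, which is the behavior the dashboards on the following slides visualize.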
  14. 14. Streaming Data Applications: server metrics, human metrics, natural language, GPS data, EEG data, financial data.
  15. 15. Anomaly Detection in Server Metrics (Grok for AWS): each server metric is encoded into an SDR and fed to an HTM, which outputs an anomaly score. Mobile and web dashboards show servers sorted by anomaly score, continuously updated.
  16. 16. What Kind of Anomalies Can HTM Detect? Sudden changes, slow changes, changes in noisy data, subtle changes in regular data.
  17. 17. What Kind of Anomalies Can HTM Detect? Changes that humans can't see (in the example shown, an engineer manually started a build on an automated build server).
  18. 18. Anomaly Detection in Human Metrics: keystrokes, file access, CPU usage, app access (the example anomaly shown: a user created a large ZIP file).
  19. 19. Anomaly Detection in Financial and Social Media Data: stock volume alongside social media activity.
  20. 20. Berkeley Cognitive Technology Group Classification of EEG Data
  21. 21. GPS Data: SmartHarbors
  22. 22. Natural Language: a document corpus (e.g. Wikipedia) is turned into roughly 100K word SDRs on a 128 x 128 grid. SDR arithmetic captures meaning: subtracting "Fruit" from "Apple" leaves bits associated with computing terms (Computer, Macintosh, Microsoft, Mac, Linux, Operating system, ...).
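A toy illustration of this kind of word-SDR arithmetic. The bit sets below are invented for the example; in a real system (e.g. Cortical.IO-style semantic folding) they would be derived from word contexts across a large corpus.

```python
# Toy word-SDR arithmetic in the spirit of slide 22. The bit indices are made up for
# illustration; real word SDRs come from encoding a document corpus.
fruit_bits    = {3, 17, 42, 88, 150}            # bits shared by fruit-related contexts
computer_bits = {9, 64, 200, 305, 511}          # bits shared by computing-related contexts

word_sdrs = {
    "apple":     fruit_bits | computer_bits | {700},
    "fruit":     fruit_bits | {12, 77},
    "macintosh": computer_bits | {700, 820},
    "linux":     computer_bits | {830, 900},
}

# "apple" minus its fruit meaning leaves mostly its computing meaning.
residue = word_sdrs["apple"] - word_sdrs["fruit"]

def overlap(a, b):
    """Shared active bits: the SDR measure of semantic similarity."""
    return len(a & b)

for word in ("macintosh", "linux", "fruit"):
    print(word, overlap(residue, word_sdrs[word]))
# "macintosh" and "linux" overlap the residue far more than "fruit" does.
```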
  23. 23. Sequences of Word SDRs: the HTM is trained on three-word sequences (Word 1, Word 2, Word 3) such as: frog eats flies; cow eats grain; elephant eats leaves; goat eats grass; wolf eats rabbit; cat likes ball; elephant likes water; sheep eats grass; cat eats salmon; wolf eats mice; lion eats cow; dog likes sleep; elephant likes water; cat likes ball; coyote eats rodent; coyote eats rabbit; wolf eats squirrel; dog likes sleep; cat likes ball; ...
  24. 24. Given the same training set, the HTM is asked to complete a new sequence: "fox eats ?".
  25. 25. The HTM completes it as "fox eats rodent", even though "fox" never appears in the training set. Learning is unsupervised, generalization is semantic, and the approach works across languages. Many applications: intelligent search, sentiment analysis, semantic filtering.
  26. 26. Server metrics, human metrics, natural language, GPS data, EEG data, financial data: all these applications run on the exact same HTM code.
  27. 27. Research Roadmap (continued): High-order Inference: theory 98%, extensively tested, commercial. Sensory-motor Inference: theory 80%, in development. Motor Sequences: theory 50%. Attention/Feedback: theory 30%. Streaming Data: capabilities: prediction, anomaly detection, classification; applications: IT security, natural language processing. Static Data (via active learning): capabilities: classification, prediction; applications: vision/image classification, network classification, classification of connected graphs.
  28. 28. Research Roadmap (continued): adds Static and/or Streaming Data with goal-oriented behavior as the capability; applications: robotics, smart bots, proactive defense.
  29. 29. Research Roadmap (continued): adds what this enables: multi-sensory modalities and multi-behavioral modalities.
  30. 30. Research Transparency: NuPIC (www.Numenta.org). Numenta's software is open source (GPLv3); Numenta's daily research code is online; algorithms are documented; there are multiple independent implementations; active discussion groups cover theory and implementation. Collaborative: IBM Almaden Research (San Jose, CA), DARPA (Washington, D.C.), Cortical.IO (Austria).
  31. 31. NuPIC Community
  32. 32. Machine Intelligence Landscape: comparing three approaches: Cortical (e.g. HTM), ANNs (e.g. deep learning), and A.I. (e.g. Watson).
  33. 33. Machine Intelligence Landscape, Premise row: Cortical is biological, ANNs are mathematical, A.I. is engineered.
  34. 34. Machine Intelligence Landscape, Data row: Cortical handles spatial-temporal data, language, and behavior; ANNs handle spatial-temporal data; A.I. works from language documents.
  35. 35. Machine Intelligence Landscape, Capabilities row: Cortical offers classification, prediction, and goal-oriented behavior; ANNs offer classification; A.I. offers natural-language query.
  36. 36. Machine Intelligence Landscape, Path to M.I.? row: Cortical: yes; ANNs: probably not; A.I.: probably not.
  37. 37. Learning Normal Behavior
  38. 38. Learning Normal Behavior
  39. 39. Learning Normal Behavior
  40. 40. Geospatial Anomalies Deviation in path Change in direction
  41. 41. Learning Transitions
  42. 42. Time = 1 Learning Transitions
  43. 43. Time = 2 Learning Transitions
  44. 44. Learning Transitions Form connections to previously active cells. Predict future activity.
  45. 45. Learning Transitions: multiple predictions can occur at once (A-B, A-C, A-D). This is a first-order sequence memory; it cannot learn A-B-C-D vs. X-B-C-Y. Mini-columns turn it into a high-order sequence memory, as the sketch below illustrates.
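A minimal Python sketch of the mini-column idea from slides 44-47. It is not Numenta's temporal memory implementation (segment growth, winner selection, and thresholds are all heavily simplified), but it shows the key property: the same columns always represent a given input, while the cell chosen within each column encodes the preceding context, so A-B-C-D and X-B-C-Y stay distinct.

```python
from collections import defaultdict

CELLS_PER_COLUMN = 4
# Toy "encoder": each symbol activates a fixed pair of columns.
columns_for = {"A": [0, 1], "B": [2, 3], "C": [4, 5], "D": [6, 7], "X": [8, 9], "Y": [10, 11]}

segments = defaultdict(list)   # cell -> list of distal segments (sets of presynaptic cells)
prev_active, prev_winners = set(), set()

def step(symbol, learn=True):
    """One time step: activate cells for `symbol`, using predictions from the previous step."""
    global prev_active, prev_winners
    def predictive(cell):
        return any(seg <= prev_active for seg in segments[cell])
    active, winners = set(), set()
    for col in columns_for[symbol]:
        col_cells = [(col, i) for i in range(CELLS_PER_COLUMN)]
        predicted = [c for c in col_cells if predictive(c)]
        if predicted:                       # prediction confirmed: only the context cells fire
            active |= set(predicted)
            winners |= set(predicted)
        else:                               # unpredicted input: the whole column bursts
            active |= set(col_cells)
            winner = min(col_cells, key=lambda c: len(segments[c]))   # pick a fresh cell
            winners.add(winner)
            if learn and prev_winners:
                segments[winner].append(set(prev_winners))   # grow a segment to the prior context
    prev_active, prev_winners = active, winners
    return winners

def run(sequence, learn=True):
    global prev_active, prev_winners
    prev_active, prev_winners = set(), set()
    return [step(s, learn) for s in sequence]

for _ in range(3):                          # train on two sequences that share B and C
    run("ABCD")
    run("XBCY")

c_after_ab = run("ABC", learn=False)[-1]
c_after_xb = run("XBC", learn=False)[-1]
print("C after A-B:", sorted(c_after_ab))
print("C after X-B:", sorted(c_after_xb))
print("cells shared between the two contexts:", sorted(c_after_ab & c_after_xb))
```

After a few training passes, C should be represented by one set of cells when it follows A-B and a different set when it follows X-B, which is the property slide 47 quantifies as 10^40 possible contexts for 40 columns of 10 cells each.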
  46. 46. Forming High-Order Representations: feedforward input causes sparse activation of columns. If the input was unpredicted, every cell in the active columns fires (a burst of activity); if it was predicted, only the predicted cells fire, giving a highly sparse, unique pattern.
  47. 47. Representing High-Order Sequences: before training, the sequences A-B-C-D and X-B-C-Y activate the same cells for the shared elements; after training they become A-B'-C'-D' and X-B''-C''-Y'': the same columns, but only one cell active per column. If there are 40 active columns and 10 cells per column, there are 10^40 ways to represent the same input in different contexts.
  48. 48. SDR Properties: 1) Similarity: shared bits = semantic similarity. 2) Store and compare: store only the indices of the active bits (e.g. indices 1 2 3 4 5 | 40); comparing a subsample is OK. 3) Union membership: OR together many SDRs (1, 2, 3, ..., 10; each about 2% active, the union about 20% active) and ask whether a given SDR is a member.
  49. 49. What Can Be Done With Software: 1 layer, 2048 columns, 65,000 neurons, 300M synapses; one learning-inference-prediction step every 30 msec; roughly 10^-6 of the human cortex.
  50. 50. Challenges and Opportunities for Neuromorphic HW. Challenges: dendritic regions, active dendrites, 1,000s of synapses, 10,000s of potential synapses, continuous learning. Opportunities: low-precision memory (synapses); fault tolerance in memory, connectivity, and neurons, with natural recovery; simple activation states (no spikes); very sparse, topological connectivity.
  51. 51. Cellular Layers (2/3, 4, 5, 6): each layer implements a variation of a common sequence memory algorithm, labelled inference, inference, motor, and attention respectively. Feedforward comes from the sensor/lower cortex, feedback from higher cortex; other connections go to lower cortex and to motor centers.
  52. 52. Why Will Machine Intelligence Be Based on Cortical Principles? 1) The cortex uses a common learning algorithm (vision, hearing, touch, behavior). 2) The cortical algorithm is incredibly adaptable (languages, engineering, science, arts, ...). 3) Network effects: hardware and software efforts will focus on the most universal solution.
  53. 53. Cellular Layers (2/3, 4, 5, 6): each layer is a variation of a common sequence memory algorithm (inference, inference, motor, attention); the inputs and outputs define the role of each layer. Feedforward comes from the sensor/lower cortex, feedback from higher cortex; outputs also go to lower cortex and the sub-cortical motor center.
  54. 54. Learning Transitions Feedforward activation
  55. 55. Learning Transitions Inhibition
  56. 56. Sparse Distributed Representations (SDRs) are used everywhere in the cortex: sensory perception, planning, motor control, prediction, attention.
  57. 57. Sparse Distributed Representations What are they • Many bits (thousands) • Few 1’s mostly 0’s • Example: 2,000 bits, 2% active • Each bit has semantic meaning • No bit is essential 01000000000000000001000000000000000000000000000000000010000…………01000 Desirable attributes • High capacity • Robust to noise and deletion • Efficient and fast • Enable new operations
  58. 58. SDR Operations: 1) Similarity: shared bits = semantic similarity. 2) Store and compare: store only the indices of the active bits; comparing a subsample is OK. 3) Union membership: OR many SDRs together (each about 2% active, the union about 20%) and ask whether a given SDR is a member. A sketch of the three operations follows.
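An illustrative sketch of these three operations on random SDRs represented as sets of active-bit indices, using the 2,048-bit, 2%-active sizing quoted elsewhere in the deck.

```python
import random
random.seed(42)

N, W = 2048, 40                                    # SDR width and number of active bits (~2%)

def random_sdr():
    return frozenset(random.sample(range(N), W))

a, b = random_sdr(), random_sdr()

# 1) Similarity: the number of shared active bits measures semantic overlap.
#    Two unrelated random SDRs share almost no bits.
print("overlap(a, b):", len(a & b))

# 2) Store and compare: keep only the indices of the active bits; comparing a small
#    subsample of a stored SDR against a noisy input is usually enough to recognize it.
subsample = set(random.sample(sorted(a), 10))
noisy_a = set(random.sample(sorted(a), 30)) | set(random.sample(range(N), 10))   # ~25% noise
print("subsample bits still present:", len(subsample & noisy_a), "of", len(subsample))

# 3) Union membership: OR several SDRs together; a member's bits are (nearly) all
#    contained in the union, while an unrelated SDR's bits are not.
members = [random_sdr() for _ in range(10)]
union = frozenset().union(*members)
print("union density:", round(len(union) / N, 2))            # roughly 10 x 2%, minus collisions
print("stored SDR in union?   ", members[0] <= union)        # True by construction
print("unrelated SDR in union?", len(a & union) >= 0.9 * W)  # almost certainly False
```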
  59. 59. SmartHarbors
  60. 60. GPS to SDR Encoder
  61. 61. GPS to SDR Encoder
  62. 62. GPS to SDR Encoder
  63. 63. GPS to SDR Encoder
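Slides 60-63 are figures only, so here is a hedged sketch of one way a GPS-to-SDR encoder can work. It illustrates the idea (quantize the coordinate to a grid and hash nearby grid cells to bits, so nearby positions share active bits) and is not the NuPIC coordinate encoder; the resolution and neighborhood values are assumptions.

```python
import hashlib

N_BITS = 1024
RESOLUTION = 0.0005          # grid size in degrees (assumed; a real encoder might scale it by speed)
RADIUS = 2                   # neighborhood half-width in grid cells

def gps_to_sdr(lat, lon):
    gx, gy = int(lat / RESOLUTION), int(lon / RESOLUTION)
    bits = set()
    for dx in range(-RADIUS, RADIUS + 1):
        for dy in range(-RADIUS, RADIUS + 1):
            key = f"{gx + dx},{gy + dy}".encode()
            digest = hashlib.sha256(key).digest()
            bits.add(int.from_bytes(digest[:4], "big") % N_BITS)   # one bit per nearby grid cell
    return bits

a = gps_to_sdr(37.7749, -122.4194)        # San Francisco
b = gps_to_sdr(37.7750, -122.4195)        # roughly ten metres away: large overlap expected
c = gps_to_sdr(37.8044, -122.2712)        # Oakland: little or no overlap expected
print("nearby overlap:", len(a & b), "distant overlap:", len(a & c))
```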
  64. 64. Neurons and synapses. Biological neuron: feedforward input activates the cell; non-linear dendritic action potentials depolarize the soma; dendrites act as coincidence detectors. HTM neuron: feedforward input activates the cell (recognizing dozens of unique patterns); local and feedback input on distal dendrites puts the cell into a predictive state (recognizing hundreds of unique patterns). Biological synapses: learning is the formation of new synapses; synapses have low fidelity. HTM synapses: the connection weight is binary (0 or 1), while learning forms new connections through a scalar "permanence" between 0.0 and 1.0 (connected above a threshold, shown at 0.4).
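A minimal sketch of the permanence idea on this slide; the parameter values are illustrative, not Numenta's defaults. Each potential synapse carries a scalar permanence, the effective connection weight is the binary result of thresholding it, and a dendritic segment acts as a coincidence detector that fires when enough connected synapses see active inputs.

```python
CONNECTED_PERM = 0.5       # permanence threshold for a synapse to count as connected
ACTIVATION_THRESHOLD = 10  # connected, active synapses needed for the segment to fire

class Segment:
    def __init__(self, presynaptic_cells, initial_perm=0.3):
        # one scalar permanence per potential synapse; starts below the connected threshold
        self.perm = {cell: initial_perm for cell in presynaptic_cells}

    def active(self, active_cells):
        connected_and_active = [c for c, p in self.perm.items()
                                if p >= CONNECTED_PERM and c in active_cells]
        return len(connected_and_active) >= ACTIVATION_THRESHOLD

    def learn(self, active_cells, inc=0.1, dec=0.05):
        # Hebbian-style: synapses from active cells are reinforced, the rest decay;
        # reinforced synapses eventually cross the threshold and become connected.
        for c in self.perm:
            delta = inc if c in active_cells else -dec
            self.perm[c] = min(1.0, max(0.0, self.perm[c] + delta))

# A segment sampling 20 cells out of a larger population.
segment = Segment(presynaptic_cells=range(100, 120))
pattern = set(range(100, 115))                           # 15 of the segment's potential synapses
print("before learning:", segment.active(pattern))       # False: nothing connected yet
for _ in range(3):
    segment.learn(pattern)
print("after learning: ", segment.active(pattern))       # True: enough synapses now connected
```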
  65. 65. Sparse Distributed Representations (SDRs) are used everywhere in the cortex.
  66. 66. From: Prof. Hasan, Max-Planck-Institute for Research
  67. 67. SDR Basics: a large number of neurons with few active at once; every cell represents something; information is distributed; SDRs are binary, e.g. x = 0100000000000000000100000000000110000000. Attributes: extremely high capacity, robust to noise and deletions, many desirable properties, solve the semantic representation problem. 10 to 15 synapses are sufficient to recognize patterns in thousands of cells; a single dendrite can recognize multiple unique patterns without confusion.
  68. 68. Example: SDR Classification Capacity in the Presence of Noise. Let $n$ be the number of bits in the SDR, $w$ the number of 1 bits, and $\Omega_x(n, w, b)$ the set of vectors that overlap a vector $x$ by exactly $b$ bits, so that $|\Omega_x(n, w, b)| = \binom{w_x}{b}\binom{n - w_x}{w - b}$. The probability of a false positive for one stored pattern at match threshold $\theta$ is $fp_w^n(\theta) = \frac{\sum_{b=\theta}^{w} |\Omega_x(n, w, b)|}{\binom{n}{w}}$, and for $M$ stored patterns $fp_X(\theta) \le \sum_{i=0}^{M-1} fp_{w_{x_i}}^n(\theta)$. With $n = 2048$, $w = 40$ and 50% noise, you can classify $10^{15}$ patterns with an error < $10^{-11}$; with $n = 64$, $w = 12$ and 33% noise, you can classify only 10 patterns with an error of 0.04%. Link.to.whitepaper.com
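The per-pattern false-positive probability above can be evaluated directly with exact binomial coefficients. A short sketch for the two parameter sets on the slide; theta is the number of overlapping bits required for a match, and the slide's headline error figures additionally multiply the per-pattern probability by the number of stored patterns M, as in the last formula.

```python
from math import comb

def false_positive_prob(n, w, theta):
    """Probability that a random SDR overlaps a stored SDR in at least `theta` bits."""
    matches = sum(comb(w, b) * comb(n - w, w - b) for b in range(theta, w + 1))
    return matches / comb(n, w)

# Large, sparse SDRs: 2048 bits, 40 active, tolerating 50% noise (threshold of 20 bits).
print(false_positive_prob(n=2048, w=40, theta=20))
# Small, dense SDRs: 64 bits, 12 active, tolerating 33% noise (threshold of 8 bits).
print(false_positive_prob(n=64, w=12, theta=8))
```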
