Cognitive Computing
From Brain Modelling to Large
   Scale Machine Learning
      Dariusz Plewczynski, PhD

        ICM, University of Warsaw
          D.Plewczynski@icm.edu.pl
How does the brain work?




                       P. Latham
                        P. Dayan
Simulating the Brain

The term "neuron" was introduced by Heinrich von Waldeyer-Hartz in 1891




              http://en.wikipedia.org/wiki/Neuron
Simulating the Brain

The term "synapse" was introduced by Charles Sherrington in 1897




             http://en.wikipedia.org/wiki/Synapse
How does the brain work?

Cortex numbers:

                  1 mm^3 of cortex:

                  50,000 neurons
                  10,000 connections/neuron
                  (=> 500 million connections)
                  4 km of axons


                  whole brain (2 kg):

                  10^11 neurons
                  10^15 connections
                  8 million km of axons

                                                          P. Latham
                                                           P. Dayan
How does the brain really learn?

Time & Learning:

• You have about 10^15 synapses.

• If it takes 1 bit of information to set a synapse, you need 10^15
  bits to set all of them.

• 30 years ≈ 10^9 seconds.

• To set 1/10 of your synapses in 30 years, you must absorb
  100,000 bits/second.

Learning in the brain is almost completely unsupervised!
                                                              P. Latham
                                                               P. Dayan
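A quick sanity check of this arithmetic (a minimal Python sketch; the constants are the ones quoted above):

  # Bit rate needed to set 1/10 of all synapses within 30 years
  synapses = 1e15                      # total synapses in the human brain
  seconds = 30 * 365 * 24 * 3600       # 30 years, roughly 1e9 seconds
  bits_needed = 0.1 * synapses * 1     # 1 bit per synapse, 1/10 of them
  print(round(bits_needed / seconds))  # ~105,700 bits/second, i.e. ~100,000

The conclusion on the slide follows directly: no plausible supervised signal delivers that many labeled bits per second, so most synapse-setting must be driven by unsupervised learning.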
Neurons: Models




                  E. Izhikevich
Levels of detail




                    E. Izhikevich
Something Smaller ...

Mammalian thalamo-cortical
System by E. Izhikevich:
     The simulation of a model that has the size of the human
     brain: a detailed large-scale thalamocortical model based
     on experimental measures in several mammalian species.

     The model exhibits behavioral regimes of normal brain activity
     that were not explicitly built-in but emerged spontaneously as the
     result of interactions among anatomical and dynamic processes. It
     describes spontaneous activity, sensitivity to changes in individual
     neurons, emergence of waves and rhythms, and functional
     connectivity on different scales.


                                                                   E. Izhikevich
...and Less Complicated




                          E. Izhikevich
... and Even More ...




                        E. Izhikevich
Smaller?

Large-Scale Model of
Mammalian Thalamocortical System

    The model has 10^11 neurons and almost 10^15 synapses.
    It represents 300×300 mm^2 of mammalian thalamo-cortical
    surface together with the reticular thalamic nuclei, with spiking
    neurons whose firing properties correspond to those recorded in
    the mammalian brain.

    The model simulates one million multicompartmental spiking
    neurons calibrated to reproduce known types of responses
    recorded in vitro in rats. It has almost half a billion synapses with
    appropriate receptor kinetics, short-term plasticity, and long-term
    dendritic spike-timing-dependent synaptic plasticity (dendritic
    STDP).                                                           E. Izhikevich
Why?

<<Why did I do that?>>

     “Indeed, no significant contribution to neuroscience could
     be made by simulating one second of a model, even if it
     has the size of the human brain. However, I learned what it
     takes to simulate such a large-scale system.

     Implementation challenges:
     Since 2^32 < 10^11, a standard integer number cannot
     even encode the indices of all neurons.
     To store all synaptic weights, one needs 10,000 terabytes.
     How was the simulation done? Instead of saving synaptic
     connections, I regenerated the anatomy every time step (1
     ms).”                                                E. Izhikevich
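The trick Izhikevich describes (regenerate the wiring rather than store it) can be sketched in a few lines: seed a per-neuron random number generator so the same "anatomy" is reproduced on demand at every time step. A toy illustration, not his actual code; the names and sizes here are mine:

  import numpy as np

  N = 10**6      # scaled-down neuron count (the real model had 10^11)
  K = 100        # synapses per neuron (the real model had ~10^4)

  def targets_of(neuron_id, master_seed=42):
      # Regenerate this neuron's postsynaptic targets on demand.
      # A per-neuron seeded RNG makes the wiring deterministic, so
      # no synaptic connectivity has to be stored between time steps.
      rng = np.random.default_rng([master_seed, neuron_id])
      return rng.integers(0, N, size=K)

  assert (targets_of(7) == targets_of(7)).all()  # same wiring every step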
Time ...

<<Why did I do that?>>

     “Question: When can we simulate the human brain in real
     time?
     Answer: The computational power to handle such a
     simulation will be available sooner than you think. “
     His benchmark:
        1 sec of model time = 50 days of wall-clock time on 27 3-GHz processors

     “However, many essential details of the anatomy and
     dynamics of the mammalian nervous system would
     probably be still unknown.”
     Size doesn't matter; it's what you put into your model and
     how you embed it into the environment (to close the loop).
                                                            E. Izhikevich
... and Space

Spiking activity of the human brain
model

       Connects three drastically different scales:
   o   It is based on global (white-matter) thalamocortical anatomy
       obtained by means of diffusion tensor imaging (DTI) of a human
       brain.
   o   It includes multiple thalamic nuclei and six-layered cortical
       microcircuitry based on in vitro labeling and three-dimensional
       reconstruction of single neurons of cat visual cortex.
   o   It has 22 basic types of neurons with appropriate laminar
       distribution of their branching dendritic trees.

                                                                  E. Izhikevich
Less Complicated?

Spiking activity of the human brain
model
       Connects three drastically different scales:
   o   Single neurons with branching dendritic morphology
       (pyramidal, stellate, basket, non-basket, etc.); synaptic dynamics with
       GABA, AMPA, and NMDA kinetics; short-term and long-term synaptic
       plasticity (in the form of dendritic STDP); neuromodulation of plasticity by
       dopamine; firing patterns representing 21 basic types found in the rat cortex
       and thalamus (including Regular Spiking, Intrinsically
       Bursting, Chattering, Fast Spiking, Late Spiking, Low-Threshold
       Spiking, Thalamic Bursting, etc.);
   o   6-layer thalamo-cortical microcircuitry based on quantitative anatomical
       studies of cat cortex (area 17) and on anatomical studies of thalamic
       circuitry in mammals;
   o   Large-scale white-matter anatomy using human DTI (diffusion tensor
       imaging) and fiber-tracking methods.
                                                                             E. Izhikevich
so - is it less complicated?




                               E. Izhikevich
Less Complicated?




                    E. Izhikevich
Goal

Large-Scale Computer model of
the whole human brain


    “This research has a more ambitious goal than the Blue Brain
    Project conducted by IBM and EPFL (Lausanne, Switzerland).
    The Blue Brain Project builds a small part of the brain that
    represents a cortical column, though to a much greater level of
    detail than the models I develop. In contrast, my goal is a
    large-scale, biologically accurate computer model of the whole
    human brain. At present, it has only cortex and thalamus; other
    subcortical structures, including the hippocampus, cerebellum,
    basal ganglia, etc., will be added later. Spiking models of
    neurons in these structures have already been developed and fine-tuned.”
                                                                E. Izhikevich
Another Approach: Cortical Simulator

C2 Simulator IBM Research




                                       D. Modha
Cortical Simulator: C2 simulator

C2 Simulator incorporates:

   1. Phenomenological spiking neurons by Izhikevich, 2004 (a sketch
      of this neuron model follows below);
   2. Phenomenological STDP synapses: the model of spike-timing-
      dependent plasticity by Song, Miller, Abbott, 2000;
   3. axonal conductance delays of 1-20 ms;
   4. 80% excitatory neurons and 20% inhibitory neurons;
   5. a certain random graph of neuronal interconnectivity, e.g.
      mouse-scale by Braitenberg & Schüz, 1998:
           16×10^6 neurons
           8×10^3 synapses per neuron
           0.09 local probability of connection


                                                            D. Modha
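For reference, item 1 refers to Izhikevich's simple spiking-neuron model: two coupled equations plus a reset rule. A minimal Euler-step implementation in Python, using the published regular-spiking parameters (the published code uses slightly finer sub-stepping for v):

  def izhikevich_step(v, u, I, a=0.02, b=0.2, c=-65.0, d=8.0, dt=1.0):
      # One clock step (1 ms) of the Izhikevich model; returns (v, u, fired)
      v = v + dt * (0.04 * v * v + 5.0 * v + 140.0 - u + I)
      u = u + dt * a * (b * v - u)
      if v >= 30.0:            # spike: reset membrane and bump recovery
          return c, u + d, True
      return v, u, False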
Cortical Simulator: Connectivity

Micro Anatomy
Gray matter, short-distance, statistical/random;
• Binzegger, Douglas, Martin, 2004 (cat anatomy);
• 13% I + 87% E, with 77% E->E; 10% E->I; 11% I->E; 2% I->I
Macro Anatomy
White matter, long-distance, specific;
• CoCoMac www.cocomac.org (Collations of Connectivity data
  on the Macaque brain);
• 413 papers;
• two relationships: subset (20,000) and connectivity (40,000)
• roughly 1,000 areas with 10,000 connections:

The first neuro-anatomical graph of this size:
emerging dynamics, small-world phenomena
                                                           D. Modha
Cortical Simulator

A cortical network puts together neurons connected via
interconnections. The goal: to understand the information-
processing capability of such networks (see the sketch below).

1. For every neuron:
   a. For every clock step (say 1 ms):
      i. Update the state of each neuron
      ii. If the neuron fires, generate an event for each synapse that
          the neuron is post-synaptic to and pre-synaptic to.
2. For every synapse:
   When it receives a pre- or post-synaptic event, update its state
   and, if necessary, the state of the post-synaptic neuron.

                                                                         D. Modha
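A toy version of this two-phase loop (clock-driven neurons, event-driven spike delivery), using Izhikevich neurons. This is a sketch of the scheme described above, not C2 itself; all sizes and constants are illustrative:

  import numpy as np

  rng = np.random.default_rng(0)
  N, K = 1000, 100                          # neurons, synapses per neuron
  post = rng.integers(0, N, size=(N, K))    # postsynaptic targets
  w = rng.uniform(0.0, 0.5, size=(N, K))    # synaptic weights
  v = np.full(N, -65.0)                     # membrane potentials
  u = 0.2 * v                               # recovery variables
  I = np.zeros(N)                           # input accumulated from events

  for step in range(1000):                  # 1 ms clock steps
      I += rng.normal(0.0, 5.0, size=N)     # background drive
      fired = v >= 30.0                     # phase 1: clock-driven update
      v[fired], u[fired] = -65.0, u[fired] + 8.0
      v += 0.04 * v**2 + 5.0 * v + 140.0 - u + I
      u += 0.02 * (0.2 * v - u)
      I[:] = 0.0
      for n in np.flatnonzero(v >= 30.0):   # phase 2: event-driven synapses
          np.add.at(I, post[n], w[n])       # deliver one event per synapse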
Cortical Simulator in action




                               D. Modha
Cortical Simulator in action

The C2 Simulator demonstrates how information percolates and
propagates. It is NOT learning!




                                                         D. Modha
yet, the Cortical Simulator is not the Brain

The cortex is an
analog, asynchronous, parallel, biophysical, fault-tolerant, and
distributed memory machine. Therefore, a biophysically-realistic
simulation is NOT the focus of C2!
The goal is to simulate only those details that lead us towards
insights into the brain's high-level computational principles. C2
represents one logical abstraction of the cortex that is suitable
for its simulation on modern distributed-memory
multiprocessors.
Simulated high-level principles will hopefully guide us to novel
cognitive systems, computing architectures, programming
paradigms, and numerous practical applications.

                                                             D. Modha
Brain: Physical Structure




                            P. Latham
                             P. Dayan
Brain: Connectivity

Red - Interhemispheric fibers projecting
between the corpus callosum and frontal
cortex.
Green - Interhemispheric fibers projecting
between primary visual cortex and the
corpus callosum.
Yellow - Interhemispheric fibers projecting
from corpus callosum and not Red or Green.
Brown - Fibers of the superior longitudinal
fasciculus, connecting regions critical for
language processing.
Orange - Fibers of inferior longitudinal
fasciculus and uncinate fasciculus,
connecting regions to cortex responsible for
memory.
Purple - Projections between parietal lobe
and lateral cortex.
Blue - Fibers connecting local regions of the
frontal cortex.


                                                D. Modha
but what about the Brain in Action?




                                  P. Latham
                                   P. Dayan
Very Long Story: Artificial Intelligence
Q. What is artificial intelligence?
A. It is the science and engineering of
making intelligent machines, especially
intelligent computer programs. It is related
to the similar task of using computers to
understand human intelligence, but AI does
not have to confine itself to methods that
are biologically observable.
Q. Yes, but what is intelligence?
A. Intelligence is the computational part of
the ability to achieve goals in the world.
Varying kinds and degrees of intelligence
occur in people, many animals and some
machines.
Q. Isn't there a solid definition of
intelligence that doesn't depend on relating
it to human intelligence?
A. Not yet.
John McCarthy
http://en.wikiversity.org/wiki/Artificial_Intelligence
Less Ambitious: Machine Learning

The goal of machine learning is to build computer systems that can
                 adapt and learn from their experience.
Different learning techniques have been developed for different
performance tasks.

The primary tasks that have been investigated are SUPERVISED
LEARNING for discrete decision-making, supervised learning for
continuous prediction, REINFORCEMENT LEARNING for sequential
decision-making, and UNSUPERVISED LEARNING.




                                                             T. Dietterich
Machine Learning

Problems:
•   High dimensionality of data;
•   Complex rules;
•   Amount of data;
•   Automated data processing;
•   Statistical analysis;
•   Pattern construction;

Examples:
•   Support Vector Machines
•   Artificial Neural Networks
•   Boosting
•   Hidden Markov Models
•   Random Forest
•   Trend Vectors ...
Machine Learning Tasks

Learning Task
 • Given: a set of cases verified by experiments to be positives,
   and a given set of negatives
 • Compute: a model distinguishing whether an item has the
   preferred characteristic or not

Classification Task
• Given: calculated characteristics of a new case + a learned
   model
• Determine: whether the new case is a positive or not (see the
   example below).

          Training data  ->  black box (learning machine)  ->  Predictive Model
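To make the two tasks concrete, here is a minimal scikit-learn example (an illustration added here, not part of the original deck); any classifier from the earlier list could stand in for the "black box":

  from sklearn.datasets import make_classification
  from sklearn.ensemble import RandomForestClassifier
  from sklearn.model_selection import train_test_split

  # Learning task: fit a model on verified positives and negatives
  X, y = make_classification(n_samples=500, n_features=20, random_state=0)
  X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
  model = RandomForestClassifier(n_estimators=100).fit(X_tr, y_tr)

  # Classification task: decide whether new cases are positives
  print(model.predict(X_te[:5]))   # predicted classes for five new cases
  print(model.score(X_te, y_te))   # accuracy on held-out cases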
Machine Learning Tasks

Training data
 • Given: cases with known class membership (Positives) +
   background preferences (Negatives)

Model
• The model which can distinguish between active and non-
  active cases. Can be used for prediction.
Knowledge
• Rules and additional information (databases and ontologies)

   Training data + Existing Knowledge  ->  Discovery (Learning) Process
                                       ->  Acquired Model + Knowledge

                                           Emad, El-Sebakhy: Ensemble Learning
A Single Model

 The goal is always to minimize the probability of model errors
                      on future data!

Motivation - build a single good model.

• Models that don’t adhere to Occam’s razor:
  o Minimax Probability Machine (MPM)
  o Trees
  o Neural Networks
  o Nearest Neighbor
  o Radial Basis Functions
• Occam’s razor models: The best model is the simplest one!
  o Support Vector Machines
  o Bayesian Methods
  o Other kernel-based methods (Kernel Matching Pursuit, ...)
                                                          G. Grudic
An Ensemble of Models

Motivation - a good single model is difficult (impossible?) to
compute, so build many and combine them. Combining
many uncorrelated models produces better predictors...

• don’t use randomness or use directed randomness:
   o Boosting
   o Specific cost function
   o Gradient Boosting
   o Derive a boosting algorithm for any cost function
• Models that incorporate randomness:
   o Bagging: bootstrap sample by uniform random sampling
     (with replacement) - see the sketch below
   o Stochastic Gradient Boosting: bootstrap sample by uniform
     random sampling (with replacement)
   o Random Forests: uniform random sampling & randomized
     inputs
                                                          G. Grudic
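A minimal bagging sketch (bootstrap sampling plus majority vote), illustrating why combining uncorrelated models helps; an illustration in Python, not Grudic's material:

  import numpy as np
  from sklearn.datasets import make_classification
  from sklearn.tree import DecisionTreeClassifier

  X, y = make_classification(n_samples=400, random_state=1)
  rng = np.random.default_rng(1)

  trees = []
  for _ in range(25):
      idx = rng.integers(0, len(X), size=len(X))   # bootstrap sample
      trees.append(DecisionTreeClassifier().fit(X[idx], y[idx]))

  votes = np.array([t.predict(X) for t in trees])     # one row per tree
  consensus = (votes.mean(axis=0) > 0.5).astype(int)  # majority vote
  print((consensus == y).mean())                      # ensemble accuracy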
Machine Learning Software
•   WEKA (University of Waikato, New Zealand)
•   YALE (University of Dortmund, Germany)
•   MiningMart (University of Dortmund, Germany)
•   Orange (University of Ljubljana, Slovenia)
•   Rattle (Togaware)
•   AlphaMiner (University of Hong Kong)
•   Databionic ESOM Tools (University of Marburg, Germany)
•   MLC++ (SGI, USA)
•   MLJ (Kansas State University)
•   BORGELT (University of Magdeburg, Germany)
•   GNOME DATA MINE (Togaware)
•   TANAGRA (University of Lyon 2)
•   XELOPES (Prudsys)
•   SPAGOBI (ObjectWeb, Italy)
•   JASPERINTELLIGENCE (JasperSoft)
•   PENTAHO (Pentaho)
•   RAPIDMINER, formerly YALE (Rapid-I GmbH)
•   SCRIPTELLA ETL (Extract-Transform-Load)
•   JAVA MACHINE LEARNING LIBRARY (machine learning library)
•   IBM Intelligent MINER (IBM)
•   MALIBU (University of Illinois)
•   KNime (University of Konstanz)
•   SHOGUN (Friedrich Miescher Laboratory)
•   ELEFANT (National ICT Australia)
•   PLearn (University of Montreal)
•   TORCH (Institut de Recherche Idiap)
•   PyML/MLPy (Colorado State University / Fondazione Bruno Kessler)

                                                         http://mloss.org/
Consensus Learning

Consensus Approaches in Machine Learning

Consensus: A general agreement about a matter of opinion.

Learning: Knowledge obtained by study.

Machine Learning: Discover the relationships between the
variables of a system (input, output and hidden) from direct
samples of the system.

Brainstorming: A method of solving problems in which all the
members of a group suggest ideas and then discuss them (a
brainstorming session).
                                                Oxford Advanced Learner’s Dictionary
Consensus Learning

Motivation

Distributed artificial intelligence (DAI): there is a strong need to
equip multi-agent systems with a learning mechanism so as to
adapt them to a complex environment.

Goal: improve the problem-solving abilities of individual agents
and of the overall system in complex multi-agent systems.

Using a learning approach, agents can exchange their ideas with
each other and enrich their knowledge about the domain
problem.

Solution: the consensus learning approach.
Consensus Learning

different data representations + an ensemble of ML algorithms

Build the expert system using various input data representations
& different learning algorithms (a small consensus sketch follows the list):

•   Support Vector Machines
•   Artificial Neural Networks
•   Boosting
•   Hidden Markov Models
•   Decision Trees
•   Random Forest
•   ….
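A small consensus sketch over heterogeneous learners, using scikit-learn's hard-voting wrapper (an illustration; HMMs and boosting variants could be added the same way):

  from sklearn.datasets import make_classification
  from sklearn.ensemble import RandomForestClassifier, VotingClassifier
  from sklearn.neural_network import MLPClassifier
  from sklearn.svm import SVC
  from sklearn.tree import DecisionTreeClassifier

  X, y = make_classification(n_samples=400, random_state=2)
  consensus = VotingClassifier([          # majority vote across algorithms
      ("svm", SVC()),
      ("ann", MLPClassifier(max_iter=2000)),
      ("tree", DecisionTreeClassifier()),
      ("forest", RandomForestClassifier()),
  ], voting="hard").fit(X, y)
  print(consensus.score(X, y))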
Brainstorming
                                             INPUT: objects & features
                                             (annotation generation)

Representations: Structure, Similarity, Properties, PCA/ICA, Text Mining,
External Tools, External Annotations, External Databases, ...

Feature decomposition -> SQL Training Database
(Scores, Thresholds, Structural Similarity, Feature Similarity)

Machine learning: Support Vector Machines, Neural Networks,
Random Forests, Decision Trees -> MLcons consensus

                                             OUTPUT: Model, Features,
                                             Decision, Reliability Score
Brainstorming

Advantages

• Statistical: if the training data is very small, there are many
  hypotheses which describe it. Consensus reduces the
  chance of selecting a bad classifier.
• Computational: each ML method might get stuck in a local
  minimum of the training error. Consensus reduces the chance
  of getting stuck in a wrong minimum.
• Representational: the consensus may represent a complex
  classifier which could not be formulated within the original
  set of hypotheses.
Brainstorming
Dariusz Plewczyński (darman@icm.edu.pl)
Krzysztof Ginalski (kginal@icm.edu.pl)
Uwe Koch & Stephane Spieser
Pawel G Sadowski, Tom, Kathryn S Lilley
Marcin von Grotthuss
Leszek Rychlewski
Adrian Tkacz
Jan Komorowski & Marcin Kierczak
Lucjan Wyrwicz
Back to the Brain

A cortical simulator puts together neurons connected via
interconnections. The goal: to understand the information-
processing capability of such networks.

1. For every neuron:
   a. For every clock step (say 1 ms):
      i. Update the state of each neuron
      ii. If the neuron fires, generate an event for each synapse that
          the neuron is post-synaptic to and pre-synaptic to.
2. For every synapse:
   When it receives a pre- or post-synaptic event, update its state
   and, if necessary, the state of the post-synaptic neuron.

                                                                         D. Modha
Cognitive Network

A cognitive network (CN) similarly puts together single learning
units connected via interconnections.

The goal: to understand learning as the spatiotemporal
information processing and storage capability of such networks
(a toy sketch follows below).

1. For every LU (learning unit):
   a. For every clock step:
      i. Update the state of each LU if its data has changed
      ii. If the LU was retrained successfully, generate an event for
          each coupling that the LU is post-event coupled to and
          pre-event coupled to.
2. For every TC (time coupling):
   When it receives a pre- or post-event, update its state
   and, if necessary, the state of the post-event LUs.
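A toy sketch of this LU/TC loop in Python (the class and method names are mine, invented for illustration; the retraining step is a placeholder):

  class TimeCoupling:
      def __init__(self, post_lu):
          self.post_lu = post_lu
      def receive_event(self, source):
          # update coupling state; if needed, perturb the post-event LU
          print(f"event: {source.name} -> {self.post_lu.name}")

  class LearningUnit:
      def __init__(self, name):
          self.name, self.post_coupled = name, []
      def retrain(self):
          return True          # placeholder: retrain the underlying ML model
      def step(self, data_changed):
          if data_changed and self.retrain():
              for tc in self.post_coupled:   # one event per coupling
                  tc.receive_event(source=self)

  a, b = LearningUnit("LU-a"), LearningUnit("LU-b")
  a.post_coupled.append(TimeCoupling(b))
  for t in range(3):           # the clock-driven outer loop
      a.step(data_changed=True)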
Spatial Cognitive Networks


Classical cellular automata have only two states per cell and do
not retrain the individual LUs.


Each cell in the static cellular-automata model represents a single
machine learning algorithm, or a certain combination of
parameters and optimization conditions affecting the
classification output of that particular method.
Spatial Cognitive Networks

I refer to such a single learner as a "learning agent", each
characterized, for example, by its prediction quality on a selected
training dataset.
The coupling of individual learners is described by a short-,
medium- or long-range interaction strength, the so-called
learning coupling.
The actual structure, or topology, of the coupling between various
learners is described by the term "learning space", and can
have different representations, such as a Cartesian space, a fully
connected geometry, or a hierarchical one.
The result of the evolution, the dynamics of such a system given
by its stationary state, is defined here as consensus equilibration.
The majority of learners define, in the stationary limit, the
"learning consensus" outcome of the meta-learning procedure.
Spatial Cognitive Networks

1) Binary Logic

  I assume binary logic for the individual learners, i.e. we deal
  with a cellular automaton consisting of N agents, each holding
  one of two opposite states ("NO" or "YES"). These states are
  binary, similarly to the Ising model of a ferromagnet. In most
  cases the machine learning algorithms that can model these
  agents, such as support vector machines, decision
  trees, trend vectors, artificial neural networks, or random
  forests, predict two classes for incoming data, based on
  previous experience in the form of trained models. The
  prediction of an agent answers a single question: is a query
  item contained in class A ("YES"), or is it different from the
  items gathered in this class ("NO")?
Spatial Cognitive Networks

2) Disorder and random strength parameter

  Each learner is characterized by two random parameters,
  persuasiveness and supportiveness, that describe how an
  individual agent interacts with others. Persuasiveness
  describes how effectively the agent's state is propagated to
  neighboring agents, whereas supportiveness represents the
  self-support of a single agent.
Spatial Cognitive Networks

3) Learning space and learning metric

  Each agent is characterized by a location in the learning
  space; therefore one can calculate an abstract learning
  distance between two learners i and j.
  The strength of the coupling between two agents tends to
  decrease with the learning distance between them.
  Determining the learning metric is a separate problem; the
  particular form of the metric and of the learning-distance
  function should be empirically determined, and in principle
  the geometry can be very peculiar.
  For example, the fully connected learning space, where all
  distances between agents are equal, is useful for a simple
  consensus between different, yet unorganized, machine
  learning algorithms.
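One possible coupling function consistent with the description above (the decay form 1/(1 + d^alpha) is my assumption for illustration; the text only requires that coupling decreases with learning distance):

  import numpy as np

  def coupling(xi, xj, alpha=2.0, fully_connected=False):
      # Coupling strength between learners located at xi, xj
      if fully_connected:
          return 1.0                      # all pairwise distances equal
      d = np.linalg.norm(np.asarray(xi) - np.asarray(xj))
      return 1.0 / (1.0 + d**alpha)       # short-range couplings dominate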
Spatial Cognitive Networks

4) Learning coupling

  Agents exchange their opinions by biasing others toward
  their own classification outcome. This influence can be
  described by the total learning impact I_i that the i-th agent
  experiences from all other learners.

  Within the cellular-automata approach this impact is the difference
  between the positive coupling of those agents that hold an identical
  classification outcome and the negative influence of those who hold
  the opposite state; schematically,

     I_i = \sum_{j:\,\sigma_j = \sigma_i} I_s(s_j, d_{ij}) - \sum_{j:\,\sigma_j \neq \sigma_i} I_p(p_j, d_{ij}),

  where I_p and I_s are functions of the persuasiveness and
  supportiveness impact of the other agents on the i-th agent.
Spatial Cognitive Networks

5) Meta-Learning equations

  The equation of the dynamics of the learning model defines the
  state of the i-th individual at the next time step; schematically,
  the state follows the sign of the total rescaled learning influence:

     \sigma_i(t+1) = \sigma_i(t)\,\mathrm{sign}(I_i(t)).

  I assume synchronous dynamics, i.e. the states of all agents
  are updated in parallel. In comparison to standard Monte
  Carlo methods, synchronous dynamics takes a shorter time
  to equilibrate than serial methods, yet it can be trapped in
  periodic asymptotic states with oscillations between
  neighboring agents.
Spatial Cognitive Networks

6) Presence of noise

  The randomness of state changes (a phenomenological model
  of various random elements in the learning system and in the
  training data) is introduced as noise in the dynamics:

     \sigma_i(t+1) = \sigma_i(t)\,\mathrm{sign}(I_i(t) + h_i(t)),

  where h is site-dependent white noise or a uniform white
  noise. In the first case the h are random variables independent
  across both agents and time instants, whereas in the second
  case the h are independent only across time instants.
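Putting points 1-6 together, here is a toy synchronous simulation of the whole scheme: binary agents with random persuasiveness and supportiveness, distance-scaled impact, and site-dependent noise. The functional forms (the distance scaling g and the flip rule) are my assumptions for illustration:

  import numpy as np

  rng = np.random.default_rng(3)
  N = 50
  pos = rng.uniform(0, 1, size=(N, 2))     # locations in the learning space
  sigma = rng.choice([-1, 1], size=N)      # binary classifier outputs
  p = rng.uniform(0, 1, size=N)            # persuasiveness
  s = rng.uniform(0, 1, size=N)            # supportiveness
  d = np.linalg.norm(pos[:, None] - pos[None, :], axis=-1)
  g = 1.0 + d**2                           # assumed distance scaling

  for t in range(100):                     # synchronous (parallel) updates
      agree = sigma[None, :] == sigma[:, None]
      # support from agreeing agents minus persuasion from opponents
      impact = ((np.where(agree, s, 0.0) / g).sum(axis=1)
                - (np.where(~agree, p, 0.0) / g).sum(axis=1))
      noise = 0.1 * rng.standard_normal(N)  # site-dependent white noise
      sigma = np.where(impact + noise < 0, -sigma, sigma)  # flip if opposed

  print("consensus fraction:", max((sigma == 1).mean(), (sigma == -1).mean()))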
Back to AI Cognitive Networks
Cognitive networks (CNs) are inspired by the structure of the
brain and the cognitive functions it performs.

CNs put together single machine learning units connected via
interconnections. The goal is to understand learning as the
spatiotemporal information processing and storage capability of
such networks (Meta-Learning!).

1. Space: for every LU (learning unit):
   a. For every time step:
      i. Update the state of each LU using the changed training data
      ii. If the LU learning was performed successfully,
          generate an event for each coupling that the LU is
          post-event coupled to and pre-event coupled to.
2. Time: for every TC (time coupling):
   When it receives a pre- or post-event, update its state and,
   if necessary, the state of the post-event LUs.
and Cortical Simulator

A cortical simulator puts together neurons connected via
interconnections. The goal: to understand the information-
processing capability of such networks.

1. For every neuron:
   a. For every clock step (say 1 ms):
      i. Update the state of each neuron
      ii. If the neuron fires, generate an event for each synapse that
          the neuron is post-synaptic to and pre-synaptic to.
2. For every synapse:
   When it receives a pre- or post-synaptic event, update its state
   and, if necessary, the state of the post-synaptic neuron.

                                                                         D. Modha
both are Information Processing
NETWORKS



A network puts together single units connected via
interconnections. The goal is to understand the information
processing of such a network, as well as its spatiotemporal
characteristics and its ability to store information
(Learning).
Network: technical challenges

Memory constraints: to achieve near real-time simulation
times, the state of all learning units (LUs) and time couplings (TCs)
must fit in the random-access memory of the system.
In the real brain, synapses far outnumber neurons; therefore the
total available memory divided by the number of bytes per
synapse limits the number of synapses that can be modeled.
The rat-scale model (~55 million neurons, ~450 billion
synapses) needs to store state for all synapses and neurons,
with the latter being negligible in comparison to the former.




                                                              D. Modha
Network: technical challenges

Communication constraints: assume that, on average, each LU
fires once per second.
Each neuron connects to roughly 8,000 other neurons;
therefore each neuron would generate 8,000 spikes
("messages", or so-called neurobits) per second.
This amounts to a total of 448 billion messages/neurobits per
second.




                                                         D. Modha
Network: technical challenges

Computation constraints: on average, each LU fires once per
second.

In brain modelling, each synapse would on average be
activated twice: once when its pre-synaptic neuron fires and
once when its post-synaptic neuron fires.
This amounts to 896 billion synaptic updates per second.
Let us assume that the state of each neuron is updated every
millisecond. This amounts to 55 billion neuronal updates per
second. Synapses thus dominate the computational cost.



                                                           D. Modha
Network: hardware

Hardware solution: a state-of-the-art supercomputer,
BlueGene/L, with 32,768 CPUs, 256 MB of memory per
processor (8 TB in total), and 1.05 GB/sec of in/out
communication bandwidth per node.
To meet the above three constraints, one has to design data
structures and algorithms that require no more than 16 bytes of
storage per TC, 175 Flops per TC per second, and 66 bytes per
neurobit (spike message).
A rat-scale brain as a near real-time learning network!




                                                           D. Modha
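Checking that these budgets actually fit the stated rat-scale numbers (a worked-arithmetic sketch using only figures quoted on the slides above):

  neurons   = 55e6                   # rat-scale model
  syn_per_n = 8000
  synapses  = neurons * syn_per_n    # ~4.4e11, the "~450 billion"

  memory_tb = synapses * 16 / 1e12         # 16 bytes per TC (synapse)
  msgs      = neurons * 1.0 * syn_per_n    # 1 Hz firing -> messages/sec
  comm_tbps = msgs * 66 / 1e12             # 66 bytes per neurobit
  flops     = synapses * 175               # 175 Flops per TC per second

  print(f"memory  ~{memory_tb:.1f} TB    (8 TB available)")
  print(f"traffic ~{comm_tbps:.0f} TB/s  (32768 x 1.05 GB/s ~ 34 TB/s)")
  print(f"compute ~{flops / 1e12:.0f} TFlops")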
Network: software

Software solution: a massively parallel learning-network
simulator, LN, that runs on distributed-memory multiprocessors.
Algorithmic enhancements:
 1. a computationally efficient way to simulate learning units in a
    clock-driven ("synchronous") fashion and couplings in an
    event-driven ("asynchronous") fashion;
 2. a memory-efficient representation to compactly represent the
    state of the simulation;
 3. a communication-efficient way to minimize the number of
    messages sent, by aggregating them in several ways and by
    mapping message exchanges between processors onto
    judiciously chosen MPI primitives for synchronization.

                                                               D. Modha
Computational Size of the problem
BlueGene/L can simulate 1 sec of model time in 10 sec, at a 1 Hz
firing rate and 1 ms simulation resolution, using a random stimulus.

                         MOUSE          HUMAN          BlueGene/L
   Neurons               2x8x10^6       2x50x10^9      3x10^4 CPUs
   Synapses              128x10^9       10^15          10^9 CPU pairs
   Communication         128x10^9       10^15          1.05 GB/sec in/out
   (66 B/spike)          spikes/sec     spikes/sec     (64x10^9 b/sec)
   Computation           2x128x10^9     2x10^15        45 TF (4x10^13 F),
   (350 F/synapse/sec)   synapse/sec    synapse/sec    8,192 CPUs
   Memory                O(128x10^9)    O(10^15)       4 TB, O(64x10^12)
   (32 B/synapse)

                                        (D. Modha: assuming neurons fire at 1 Hz)
Progress in large-scale cortical simulations
IBM BlueGene
                                                           D. Modha
Future of cortical simulations
                                                           D. Modha

C2 Simulator vs Human Brain:

     The human cortex has about 22 billion neurons, which is
     roughly a factor of 400 more than the rat-scale model's
     55 million neurons. They used a BlueGene/L
     with 92 TF and 8 TB to carry out rat-scale simulations in
     near real-time [one-tenth speed].
     So, by naive extrapolation, one would require at least a
     machine with a computation capacity of 36.8 PF and a
     memory capacity of 3.2 PB. Furthermore, assuming that
     there are 8,000 synapses per neuron, that neurons fire at
     an average rate of 1 Hz, and that each spike message can
     be communicated in, say, 66 bytes, one would need an
     aggregate communication bandwidth of ~2 PBps.            D. Modha
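The naive ×400 extrapolation in the quote can be reproduced directly (a sketch; only the two scaled quantities are derived here, and the ~2 PBps bandwidth figure is taken as stated):

  scale = 22e9 / 55e6                 # human cortex vs rat-scale model: ~400x
  print(f"compute: {92e12 * scale / 1e15:.1f} PF")   # -> 36.8 PF
  print(f"memory:  {8e12 * scale / 1e15:.1f} PB")    # -> 3.2 PB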
SyNAPSE

The goal is to create new
electronics hardware and
architecture that can
understand, adapt and respond to
an informative environment in ways
that extend traditional computation
to include fundamentally
different capabilities found in
biological brains.
Stanford University: Brian A. Wandell, H.-S. Philip Wong
Cornell University: Rajit Manohar
Columbia University Medical Center: Stefano Fusi
University of Wisconsin-Madison: Giulio Tononi
University of California-Merced: Christopher Kello
IBM Research: Rajagopal Ananthanarayanan, Leland Chang, Daniel
Friedman, Christoph Hagleitner, Bulent Kurdi, Chung Lam, Paul
Maglio, Stuart Parkin, Bipin Rajendran, Raghavendra Singh
NeuralEnsemble.org

NeuralEnsemble is a multilateral
effort to coordinate and organize
Neuroscience software
development efforts into a larger
meta-simulator software system, a
natural and alternate approach to
incrementally address what is
known as the complexity
bottleneck, presently a major
roadblock for neural modelling.
"Increasingly, the real limit on what computational
scientists can accomplish is how quickly and reliably
they can translate their ideas into working code."

Gregory V. Wilson, Where's the Real Bottleneck in
Scientific Computing?
Emergent


A comprehensive, full-featured neural network simulator that
allows for the creation and analysis of complex, sophisticated
models of the brain.

It has a high-level drag-and-drop programming interface, built on
top of a scripting language with full introspective access to
all aspects of the networks and the software itself, which allows
one to write programs that seamlessly weave together the training
of a network and the evolution of its environment without ever
typing a line of code.

Networks and all of their state variables can be visually inspected
in 3D, allowing for a quick "visual regression" of network
dynamics and robot behavior.
Cognitive Computers: The Future


Today's algorithms and computers deal with structured data
(age, salary, etc.) and semi-structured data (text and web
pages).
No mechanism exists that is able to act in a context-
dependent fashion while integrating ambiguous information
across different senses (sight, hearing, touch, taste, and
smell) and coordinating multiple motor modalities.
Cognitive computing targets the boundary between the digital and
physical worlds, where raw sensory information abounds.
For example, one could instrument the outside world with sensors and
stream this information in real time to a cognitive computer that
may be able to detect spatio-temporal correlations.
Niels Bohr



             "Predicting is very difficult,
                  especially about the
                               future…"
Brainstorming
Dariusz Plewczyński (darman@icm.edu.pl)
Krzysztof Ginalski (kginal@icm.edu.pl)
Uwe Koch & Stephane Spieser
Pawel G Sadowski, Tom, Kathryn S Lilley
Marcin von Grotthuss
Leszek Rychlewski
Adrian Tkacz
Jan Komorowski & Marcin Kierczak
Lucjan Wyrwicz

Contenu connexe

Tendances

The Predictive Neuron: How Active Dendrites Enable Spatiotemporal Computation...
The Predictive Neuron: How Active Dendrites Enable Spatiotemporal Computation...The Predictive Neuron: How Active Dendrites Enable Spatiotemporal Computation...
The Predictive Neuron: How Active Dendrites Enable Spatiotemporal Computation...Numenta
 
Sparsity In The Neocortex, And Its Implications For Machine Learning
Sparsity In The Neocortex,  And Its Implications For Machine LearningSparsity In The Neocortex,  And Its Implications For Machine Learning
Sparsity In The Neocortex, And Its Implications For Machine LearningNumenta
 
Jeff Hawkins Human Brain Project Summit Keynote: "Location, Location, Locatio...
Jeff Hawkins Human Brain Project Summit Keynote: "Location, Location, Locatio...Jeff Hawkins Human Brain Project Summit Keynote: "Location, Location, Locatio...
Jeff Hawkins Human Brain Project Summit Keynote: "Location, Location, Locatio...Numenta
 
Locations in the Neocortex: A Theory of Sensorimotor Prediction Using Cortica...
Locations in the Neocortex: A Theory of Sensorimotor Prediction Using Cortica...Locations in the Neocortex: A Theory of Sensorimotor Prediction Using Cortica...
Locations in the Neocortex: A Theory of Sensorimotor Prediction Using Cortica...Numenta
 
Blue brain - A Revolution in Medical Technology
Blue brain - A Revolution in Medical TechnologyBlue brain - A Revolution in Medical Technology
Blue brain - A Revolution in Medical TechnologyNivetha Clementina D
 
Have We Missed Half of What the Neocortex Does? A New Predictive Framework ...
 Have We Missed Half of What the Neocortex Does?  A New Predictive Framework ... Have We Missed Half of What the Neocortex Does?  A New Predictive Framework ...
Have We Missed Half of What the Neocortex Does? A New Predictive Framework ...Numenta
 
CVPR 2020 Workshop: Sparsity in the neocortex, and its implications for conti...
CVPR 2020 Workshop: Sparsity in the neocortex, and its implications for conti...CVPR 2020 Workshop: Sparsity in the neocortex, and its implications for conti...
CVPR 2020 Workshop: Sparsity in the neocortex, and its implications for conti...Christy Maver
 
Introduction to Spiking Neural Networks: From a Computational Neuroscience pe...
Introduction to Spiking Neural Networks: From a Computational Neuroscience pe...Introduction to Spiking Neural Networks: From a Computational Neuroscience pe...
Introduction to Spiking Neural Networks: From a Computational Neuroscience pe...Jason Tsai
 
Sparse Distributed Representations: Our Brain's Data Structure
Sparse Distributed Representations: Our Brain's Data Structure Sparse Distributed Representations: Our Brain's Data Structure
Sparse Distributed Representations: Our Brain's Data Structure Numenta
 
What is (computational) neuroscience?
What is (computational) neuroscience?What is (computational) neuroscience?
What is (computational) neuroscience?SSA KPI
 
ICMNS Presentation: Presence of high order cell assemblies in mouse visual co...
ICMNS Presentation: Presence of high order cell assemblies in mouse visual co...ICMNS Presentation: Presence of high order cell assemblies in mouse visual co...
ICMNS Presentation: Presence of high order cell assemblies in mouse visual co...Numenta
 
Numenta Brain Theory Discoveries of 2016/2017 by Jeff Hawkins
Numenta Brain Theory Discoveries of 2016/2017 by Jeff HawkinsNumenta Brain Theory Discoveries of 2016/2017 by Jeff Hawkins
Numenta Brain Theory Discoveries of 2016/2017 by Jeff HawkinsNumenta
 
Recognizing Locations on Objects by Marcus Lewis
Recognizing Locations on Objects by Marcus LewisRecognizing Locations on Objects by Marcus Lewis
Recognizing Locations on Objects by Marcus LewisNumenta
 
Blue brain n
Blue brain nBlue brain n
Blue brain nSumipuchu
 
Loihi many core_neuromorphic_chip
Loihi many core_neuromorphic_chipLoihi many core_neuromorphic_chip
Loihi many core_neuromorphic_chipMehmood Saleem
 

Tendances (20)

The Predictive Neuron: How Active Dendrites Enable Spatiotemporal Computation...
The Predictive Neuron: How Active Dendrites Enable Spatiotemporal Computation...The Predictive Neuron: How Active Dendrites Enable Spatiotemporal Computation...
The Predictive Neuron: How Active Dendrites Enable Spatiotemporal Computation...
 
Sparsity In The Neocortex, And Its Implications For Machine Learning
Sparsity In The Neocortex,  And Its Implications For Machine LearningSparsity In The Neocortex,  And Its Implications For Machine Learning
Sparsity In The Neocortex, And Its Implications For Machine Learning
 
Jeff Hawkins Human Brain Project Summit Keynote: "Location, Location, Locatio...
Jeff Hawkins Human Brain Project Summit Keynote: "Location, Location, Locatio...Jeff Hawkins Human Brain Project Summit Keynote: "Location, Location, Locatio...
Jeff Hawkins Human Brain Project Summit Keynote: "Location, Location, Locatio...
 
Locations in the Neocortex: A Theory of Sensorimotor Prediction Using Cortica...
Locations in the Neocortex: A Theory of Sensorimotor Prediction Using Cortica...Locations in the Neocortex: A Theory of Sensorimotor Prediction Using Cortica...
Locations in the Neocortex: A Theory of Sensorimotor Prediction Using Cortica...
 
Blue brain - A Revolution in Medical Technology
Blue brain - A Revolution in Medical TechnologyBlue brain - A Revolution in Medical Technology
Blue brain - A Revolution in Medical Technology
 
Have We Missed Half of What the Neocortex Does? A New Predictive Framework ...
 Have We Missed Half of What the Neocortex Does?  A New Predictive Framework ... Have We Missed Half of What the Neocortex Does?  A New Predictive Framework ...
Have We Missed Half of What the Neocortex Does? A New Predictive Framework ...
 
Soft computing BY:- Dr. Rakesh Kumar Maurya
Soft computing BY:- Dr. Rakesh Kumar MauryaSoft computing BY:- Dr. Rakesh Kumar Maurya
Soft computing BY:- Dr. Rakesh Kumar Maurya
 
CVPR 2020 Workshop: Sparsity in the neocortex, and its implications for conti...
CVPR 2020 Workshop: Sparsity in the neocortex, and its implications for conti...CVPR 2020 Workshop: Sparsity in the neocortex, and its implications for conti...
CVPR 2020 Workshop: Sparsity in the neocortex, and its implications for conti...
 
Ann first lecture
Ann first lectureAnn first lecture
Ann first lecture
 
Introduction to Spiking Neural Networks: From a Computational Neuroscience pe...
Introduction to Spiking Neural Networks: From a Computational Neuroscience pe...Introduction to Spiking Neural Networks: From a Computational Neuroscience pe...
Introduction to Spiking Neural Networks: From a Computational Neuroscience pe...
 
Sparse Distributed Representations: Our Brain's Data Structure
Sparse Distributed Representations: Our Brain's Data Structure Sparse Distributed Representations: Our Brain's Data Structure
Sparse Distributed Representations: Our Brain's Data Structure
 
What is (computational) neuroscience?
What is (computational) neuroscience?What is (computational) neuroscience?
What is (computational) neuroscience?
 
Sporns kavli2008
Sporns kavli2008Sporns kavli2008
Sporns kavli2008
 
ICMNS Presentation: Presence of high order cell assemblies in mouse visual co...
ICMNS Presentation: Presence of high order cell assemblies in mouse visual co...ICMNS Presentation: Presence of high order cell assemblies in mouse visual co...
ICMNS Presentation: Presence of high order cell assemblies in mouse visual co...
 
Numenta Brain Theory Discoveries of 2016/2017 by Jeff Hawkins
Numenta Brain Theory Discoveries of 2016/2017 by Jeff HawkinsNumenta Brain Theory Discoveries of 2016/2017 by Jeff Hawkins
Numenta Brain Theory Discoveries of 2016/2017 by Jeff Hawkins
 
Ayush
AyushAyush
Ayush
 
Lec 5
Lec 5Lec 5
Lec 5
 
Recognizing Locations on Objects by Marcus Lewis
Recognizing Locations on Objects by Marcus LewisRecognizing Locations on Objects by Marcus Lewis
Recognizing Locations on Objects by Marcus Lewis
 
Blue brain n
Blue brain nBlue brain n
Blue brain n
 
Loihi many core_neuromorphic_chip
Loihi many core_neuromorphic_chipLoihi many core_neuromorphic_chip
Loihi many core_neuromorphic_chip
 

Similaire à Nencki321 day2

Art of Neuroscience 2017 - Team Favorites
Art of Neuroscience 2017 - Team FavoritesArt of Neuroscience 2017 - Team Favorites
Art of Neuroscience 2017 - Team FavoritesArt of Neuroscience
 
Lect1_Threshold_Logic_Unit lecture 1 - ANN
Lect1_Threshold_Logic_Unit  lecture 1 - ANNLect1_Threshold_Logic_Unit  lecture 1 - ANN
Lect1_Threshold_Logic_Unit lecture 1 - ANNMostafaHazemMostafaa
 
Computational neuropharmacology drug designing
Computational neuropharmacology drug designingComputational neuropharmacology drug designing
Computational neuropharmacology drug designingRevathi Boyina
 
Cracking the neural code
Cracking the neural codeCracking the neural code
Cracking the neural codeAlbanLevy
 
Brain machine-interface
Brain machine-interfaceBrain machine-interface
Brain machine-interfaceVysakh Sharma
 
Thoughtworks Ignite On Brains And Computing
Thoughtworks  Ignite    On  Brains And  ComputingThoughtworks  Ignite    On  Brains And  Computing
Thoughtworks Ignite On Brains And ComputingAnthony Hsiao
 
Blue brain seminar report
Blue brain seminar reportBlue brain seminar report
Blue brain seminar reportSuryoday Nth
 
Neural Netwrok
Neural NetwrokNeural Netwrok
Neural NetwrokRabin BK
 
Neural Computing
Neural ComputingNeural Computing
Neural ComputingESCOM
 
Computational neuroscience
Computational neuroscienceComputational neuroscience
Computational neuroscienceNicolas Rougier
 
2023-1113e-INFN-Seminari-Paolucci-BioInspiredSpikingLearningSleepCycles.pdf
2023-1113e-INFN-Seminari-Paolucci-BioInspiredSpikingLearningSleepCycles.pdf2023-1113e-INFN-Seminari-Paolucci-BioInspiredSpikingLearningSleepCycles.pdf
2023-1113e-INFN-Seminari-Paolucci-BioInspiredSpikingLearningSleepCycles.pdfpierstanislaopaolucc1
 

Similaire à Nencki321 day2 (20)

Blue Brain
Blue BrainBlue Brain
Blue Brain
 
Art of Neuroscience 2017 - Team Favorites
Art of Neuroscience 2017 - Team FavoritesArt of Neuroscience 2017 - Team Favorites
Art of Neuroscience 2017 - Team Favorites
 
blue brain
blue brainblue brain
blue brain
 
Lect1_Threshold_Logic_Unit lecture 1 - ANN
Lect1_Threshold_Logic_Unit  lecture 1 - ANNLect1_Threshold_Logic_Unit  lecture 1 - ANN
Lect1_Threshold_Logic_Unit lecture 1 - ANN
 
138693 28152-brain-chips
138693 28152-brain-chips138693 28152-brain-chips
138693 28152-brain-chips
 
1.ppt
1.ppt1.ppt
1.ppt
 
Computational neuropharmacology drug designing
Computational neuropharmacology drug designingComputational neuropharmacology drug designing
Computational neuropharmacology drug designing
 
Axon to axon
Axon to axonAxon to axon
Axon to axon
 
Cracking the neural code
Cracking the neural codeCracking the neural code
Cracking the neural code
 
Blue Brain Project
Blue Brain Project Blue Brain Project
Blue Brain Project
 
Brain chips
Brain chipsBrain chips
Brain chips
 
Brain machine-interface
Brain machine-interfaceBrain machine-interface
Brain machine-interface
 
Thoughtworks Ignite On Brains And Computing
Thoughtworks  Ignite    On  Brains And  ComputingThoughtworks  Ignite    On  Brains And  Computing
Thoughtworks Ignite On Brains And Computing
 
Blue brain seminar report
Blue brain seminar reportBlue brain seminar report
Blue brain seminar report
 
Blue brain technology
Blue brain technologyBlue brain technology
Blue brain technology
 
Neural Netwrok
Neural NetwrokNeural Netwrok
Neural Netwrok
 
Neural Computing
Neural ComputingNeural Computing
Neural Computing
 
Computational neuroscience
Computational neuroscienceComputational neuroscience
Computational neuroscience
 
Mind uploading
Mind uploadingMind uploading
Mind uploading
 
2023-1113e-INFN-Seminari-Paolucci-BioInspiredSpikingLearningSleepCycles.pdf
2023-1113e-INFN-Seminari-Paolucci-BioInspiredSpikingLearningSleepCycles.pdf2023-1113e-INFN-Seminari-Paolucci-BioInspiredSpikingLearningSleepCycles.pdf
2023-1113e-INFN-Seminari-Paolucci-BioInspiredSpikingLearningSleepCycles.pdf
 

Nencki321 day2

  • 1. Cognitive Computing From Brain Modelling to Large Scale Machine Learning Dariusz Plewczynski, PhD ICM, University of Warsaw D.Plewczynski@icm.edu.pl
  • 2. How the brain works? P. Latham P. Dayan
  • 3. Simulating the Brain neuron introduced by Heinrich von Waldeyer-Hartz 1891 http://en.wikipedia.org/wiki/Neuron
  • 4. Simulating the Brain synapse introduced by Charles Sherrington 1897 http://en.wikipedia.org/wiki/Synapse
  • 5. How the brain works? Cortex numbers: 1 mm^2 1 mm3 of cortex: 50,000 neurons 10,000 connections/neuron (=> 500 million connections) 4 km of axons whole brain (2 kg): 1011 neurons 1015 connections 8 million km of axons P. Latham P. Dayan
  • 6. How the brain really learns? Time & Learning: • You have about 1015 synapses. • If it takes 1 bit of information to set a synapse, you need 1015 bits to set all of them. • 30 years ≈ 109 seconds. • To set 1/10 of your synapses in 30 years, you must absorb 100,000 bits/second. Learning in the brain is almost completely unsupervised! P. Latham P. Dayan
  • 7. Neurons: Models E. Izhikevich
  • 8. Levels of details E. Izhikevich
  • 9. Something Smaller ... Mammalian thalamo-cortical System by E. Izhikevich: The simulation of a model that has the size of the human brain: a detailed large-scale thalamocortical model based on experimental measures in several mammalian species. The model exhibits behavioral regimes of normal brain activity that were not explicitly built-in but emerged spontaneously as the result of interactions among anatomical and dynamic processes. It describes spontaneous activity, sensitivity to changes in individual neurons, emergence of waves and rhythms, and functional connectivity on different scales. E. Izhikevich
  • 10. ...and Less Complicated E. Izhikevich
  • 11. ... and Even More ... E. Izhikevich
  • 12. Smaller? Large-Scale Model of Mammalian Thalamocortical System The model has 1011 neurons and almost 1015 synapses. It represents 300x300 mm2 of mammalian thalamo-cortical surface, and reticular thalamic nuclei, and spiking neurons with firing properties corresponding to those recorded in the mammalian brain. The model simulates one million multicompartmental spiking neurons calibrated to reproduce known types of responses recorded in vitro in rats. It has almost half a billion synapses with appropriate receptor kinetics, short-term plasticity, and long-term dendritic spike-timing-dependent synaptic plasticity (dendritic STDP). E. Izhikevich
  • 13. Why? <<Why did I do that?>> “Indeed, no significant contribution to neuroscience could be made by simulating one second of a model, even if it has the size of the human brain. However, I learned what it takes to simulate such a large-scale system. Implementation challenges: Since 2^32 < 10^11, a standard integer number cannot even encode the indices of all neurons. To store all synaptic weights, one needs 10,000 terabytes. How was the simulation done? Instead of saving synaptic connections, I regenerated the anatomy every time step (1 ms).” E. Izhikevich
  • 14. Time ... <<Why did I do that?>> “Question: When can we simulate the human brain in real time? Answer: The computational power to handle such a simulation will be available sooner than you think. “ His benchmark: 1 sec = 50 days on 27 3GHz processors “However, many essential details of the anatomy and dynamics of the mammalian nervous system would probably be still unknown.” Size doesn't matter; it's what you put into your model and how you embed it into the environment (to close the loop). E. Izhikevich
  • 15. ... and Space Spiking activity of the human brain model Connects three drastically different scales: o It is based on global (white-matter) thalamocortical anatomy obtained by means of diffusion tensor imaging (DTI) of a human brain. o It includes multiple thalamic nuclei and six-layered cortical microcircuitry based on in vitro labeling and three-dimensional reconstruction of single neurons of cat visual cortex. o It has 22 basic types of neurons with appropriate laminar distribution of their branching dendritic trees. E. Izhikevich
  • 16. Less Complicated? Spiking activity of the human brain model Connects three drastically different scales: o Single neurons with branching dendritic morphology (pyramidal, stellate, basket, non-basket, etc.); synaptic dynamics with GABA, AMPA, and NMDA kinetics, short-term and long-term synaptic plasticity (in the form of dendritic STDP); neuromodulation of plasticity by dopamine; firing patterns representing 21 basic types found in the rat cortex and thalamus (includes Regular Spiking, Intrinsically Bursting, Chattering, Fast Spiking, Late Spiking, Low-Threshold Spiking, Thamamic Bursting, etc., types); o 6-Layer thalamo-cortical microcircuitry based on quantitative anatomical studies of cat cortex (area 17) and on anatomical studies of thalamic circuitry in mammals; o Large-scale white-matter anatomy using human DTI (diffusion tensor imaging) and fibertraking methods. E. Izhikevich
  • 17. so - is it less complicated? E. Izhikevich
  • 18. Less Complicated? E. Izhikevich
  • 19. Goal Large-Scale Computer model of the whole human brain “This research has a more ambitious goal than the the Blue Brain Project conducted by IBM and EPFL (Lausanne, Switzerland). The Blue Brain Project builds a small part of the brain that represents a cortical column, though to a much greater detail than the models I develop. In contrast, my goal is a large-scale biologically acurate computer model of the whole human brain. At present, it has only cortex and thalamus; other subcortical structures, including hippocampus, cerebellum, basal ganglia, etc., will be added later. Spiking models of neurons in these structures have already being developed and fine-tuned.” E. Izhikevich
  • 20. Another Approach: Cortical Simulator C2 Simulator IBM Research D. Modha
  • 21. Cortical Simulator: C2 simulator C2 Simulator incorporates: 1.Phenomenological spiking neurons by Izhikevich, 2004; 2.Phenomenological STDP synapses model of spike-timing dependent plasticity by Song, Miller, Abbot, 2000; 3.axonal conductance delays 1-20 ms; 4.80% excitatory neurons and 20% inhibitory neurons; 5.a certain random graph of neuronal interconnectivity, like Mouse-scale by Braitenberg & Schuz, 1998:  16x106 neurons  8x103 synapses per neuron  0.09 local probability of connection D. Modha
  • 22. Cortical Simulator: Connectivity Micro Anatomy Gray matter, short-distance, statistical/random; • Binzeggar, Douglas, Martin, 2004 (cat anatomy); • 13% I + 87% E with 77% E->E; 10% E->I; 11%I->E; 2% I->I Macro Anatomy White matter, long-distance, specific; • CoCoMac www.cocomac.org (Collations of Connectivity data on the Macaque brain); • 413 papers; • two relationships: subset (20,000) and connectivity (40,000) • roughly 1,000 areas with 10,000 connections: The first neuro-anatomical graph of this size D. Modha Emerging dynamics, small world phenomena
  • 23. Cortical Simulator Cortical network puts together neurons connected via an interconnections. The goal: to understand the information processing capability of such networks. 1. For every neuron: a. For every clock step (say 1 ms): i. Update the state of each neuron ii. If the neuron fires, generate an event for each synapse that the neuron is post-synaptic to and pre-synaptic to. 2. For every synapse: When it receives a pre- or post-synaptic event, update its state and, if necessary, the state of the post-synaptic neuron D. Modha
  • 24. Cortical Simulator in action D. Modha
  • 25. Cortical Simulator in action C2 Simulator is demonstrating how information percolates and propagates. It is NOT learning! D. Modha
  • 26. yet, Cortical Simulator is not Brain The cortex is an analog, asynchronous, parallel, biophysical, fault-tolerant, and distributed memory machine. Therefore, a biophysically-realistic simulation is NOT the focus of C2! The goal is to simulate only those details that lead us towards insights into brain's high-level computational principles. C2 represents one logical abstraction of the cortex that is suitable for Its simulation on modern distributed memory multiprocessors. Simulated high-level principles will hopefully guide us to novel cognitive systems, computing architectures, programming paradigms, and numerous practical applications. D. Modha
  • 27. Brain: Physical Structure P. Latham P. Dayan
• 28. Brain: Connectivity Red - interhemispheric fibers projecting between the corpus callosum and frontal cortex. Green - interhemispheric fibers projecting between primary visual cortex and the corpus callosum. Yellow - interhemispheric fibers projecting from the corpus callosum and neither Red nor Green. Brown - fibers of the superior longitudinal fasciculus, connecting regions critical for language processing. Orange - fibers of the inferior longitudinal fasciculus and uncinate fasciculus, connecting regions to cortex responsible for memory. Purple - projections between the parietal lobe and lateral cortex. Blue - fibers connecting local regions of the frontal cortex. D. Modha
  • 29. but what about Brain in Action? P. Latham P. Dayan
  • 30. Very Long Story: Artificial Intelligence Q. What is artificial intelligence? A. It is the science and engineering of making intelligent machines, especially intelligent computer programs. It is related to the similar task of using computers to understand human intelligence, but AI does not have to confine itself to methods that are biologically observable. Q. Yes, but what is intelligence? A. Intelligence is the computational part of the ability to achieve goals in the world. Varying kinds and degrees of intelligence occur in people, many animals and some machines. Q. Isn't there a solid definition of intelligence that doesn't depend on relating it to human intelligence? A. Not yet. John McCarthy http://en.wikiversity.org/wiki/Artificial_Intelligence
• 31. Less Ambitious: Machine Learning The goal of machine learning is to build computer systems that can adapt and learn from their experience. Different learning techniques have been developed for different performance tasks. The primary tasks that have been investigated are SUPERVISED LEARNING for discrete decision-making, supervised learning for continuous prediction, REINFORCEMENT LEARNING for sequential decision making, and UNSUPERVISED LEARNING for discovering structure in unlabeled data. T. Dietterich
  • 32. Machine Learning Problems: • High dimensionality of data; • Complex rules; • Amount of data; • Automated data processing; • Statistical analysis; • Pattern construction; Examples: • Support Vector Machines • Artificial Neural Networks • Boosting • Hidden Markov Models • Random Forest • Trend Vectors ...
• 33. Machine Learning Tasks Learning Task • Given: a set of cases verified by experiments to be positives, and a given set of negatives. • Compute: a model distinguishing whether an item has the preferred characteristic or not. Classification Task • Given: calculated characteristics of a new case + a learned model. • Determine: whether the new case is a positive or not. (Diagram: training data -> black box learning machine -> predictive model.)
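A minimal concrete instance of these two tasks, using scikit-learn as the "black box" learning machine (an illustrative choice; the slide does not prescribe a library, and the toy data below is made up):

```python
from sklearn.ensemble import RandomForestClassifier

# Learning task: fit a model on experimentally verified positives/negatives.
X_train = [[5.1, 0.2], [4.9, 0.3], [7.0, 1.4], [6.4, 1.5]]  # case features
y_train = [1, 1, 0, 0]                                       # 1 = positive class
model = RandomForestClassifier(n_estimators=100).fit(X_train, y_train)

# Classification task: given the learned model, decide on a new case.
print(model.predict([[5.0, 0.25]]))   # -> [1]: predicted to be a positive
```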
• 34. Machine Learning Tasks Training data • Given: cases with known class membership (positives) + background preferences (negatives). Model • The model which can distinguish between active and non-active cases; can be used for prediction. Knowledge • Rules and additional information (databases and ontologies). (Diagram: training data + existing knowledge -> learning -> model -> discovery process -> acquired knowledge.) Emad El-Sebakhy: Ensemble Learning
• 35. A Single Model The goal is always to minimize the probability of model errors on future data! Motivation: build a single good model. • Models that don't adhere to Occam's razor: o Minimax Probability Machine (MPM) o Trees o Neural Networks o Nearest Neighbor o Radial Basis Functions • Occam's razor models (the best model is the simplest one!): o Support Vector Machines o Bayesian Methods o Other kernel-based methods (Kernel Matching Pursuit, ...) G. Grudic
• 36. An Ensemble of Models Motivation: a good single model is difficult (impossible?) to compute, so build many and combine them. Combining many uncorrelated models produces better predictors... • Models that don't use randomness, or use directed randomness: o Boosting: a specific cost function o Gradient Boosting: derive a boosting algorithm for any cost function • Models that incorporate randomness: o Bagging: bootstrap sample by uniform random sampling (with replacement) o Stochastic Gradient Boosting: bootstrap sample by uniform random sampling (with replacement) o Random Forests: uniform random sampling & randomized inputs G. Grudic
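For concreteness, the two randomness-based ensembles named above look like this in scikit-learn (an illustrative sketch; the hyperparameter values are arbitrary):

```python
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

# Bagging: each base tree is trained on a bootstrap sample drawn by
# uniform random sampling with replacement.
bagging = BaggingClassifier(DecisionTreeClassifier(), n_estimators=50)

# Random Forest: bootstrap sampling plus randomized inputs, i.e. only a
# random subset of features is considered at each split.
forest = RandomForestClassifier(n_estimators=50, max_features="sqrt")
# Both are used as usual: bagging.fit(X, y); bagging.predict(X_new)
```

The extra randomization is exactly what decorrelates the individual models, which is why the combined predictor improves on any single tree.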
• 37. Machine Learning Software • WEKA (University of Waikato, New Zealand) • YALE / RAPIDMINER (University of Dortmund, Germany; Rapid-I GmbH) • MiningMart (University of Dortmund, Germany) • SCRIPTELLA ETL (Extract-Transform-Load) • Orange (University of Ljubljana, Slovenia) • JAVA MACHINE LEARNING LIBRARY (machine learning library) • Rattle (Togaware) • IBM Intelligent MINER (IBM) • AlphaMiner (University of Hong Kong) • MALIBU (University of Illinois) • Databionic ESOM Tools (University of Marburg, Germany) • KNime (University of Konstanz) • MLC++ (SGI, USA) • SHOGUN (Friedrich Miescher Laboratory) • MLJ (Kansas State University) • ELEFANT (National ICT Australia) • BORGELT (University of Magdeburg, Germany) • PLearn (University of Montreal) • GNOME DATA MINE (Togaware) • TORCH (Institut de Recherche Idiap) • TANAGRA (University of Lyon 2) • PyML / MLPy (Colorado State University / Fondazione Bruno Kessler) • XELOPES (Prudsys) • SPAGOBI (ObjectWeb, Italy) • JASPERINTELLIGENCE (JasperSoft) • PENTAHO (Pentaho) http://mloss.org/
  • 38. Consensus Learning Consensus Approaches in Machine Learning Consensus: A general agreement about a matter of opinion. Learning: Knowledge obtained by study. Machine Learning: Discover the relationships between the variables of a system (input, output and hidden) from direct samples of the system. Brainstorming: A method of solving problems in which all the members of a group suggest ideas and then discuss them (a brainstorming session). Oxford Advanced Learner’s Dictionary
• 39. Consensus Learning Motivation Distributed artificial intelligence (DAI): there is a strong need to equip multi-agent systems with a learning mechanism so as to adapt them to a complex environment. Goal: improve the problem-solving abilities of individual agents and of the overall system in complex multi-agent systems. Using a learning approach, agents can exchange their ideas with each other and enrich their knowledge about the domain problem. Solution: the consensus learning approach.
• 40. Consensus Learning Different data representations + an ensemble of ML algorithms. Build the expert system using various input data representations & different learning algorithms (a minimal sketch follows this list): • Support Vector Machines • Artificial Neural Networks • Boosting • Hidden Markov Models • Decision Trees • Random Forest • ….
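As mentioned above, a minimal sketch of such a consensus: heterogeneous learners from the listed families vote on the final class. The scikit-learn stand-ins below are an illustrative choice, not the slide's actual system; HMMs are omitted since they require sequence data.

```python
from sklearn.ensemble import (AdaBoostClassifier, RandomForestClassifier,
                              VotingClassifier)
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# One learner per algorithm family; soft voting averages their class
# probabilities, so the ensemble output is the consensus prediction.
consensus = VotingClassifier([
    ("svm",    SVC(probability=True)),        # Support Vector Machine
    ("ann",    MLPClassifier(max_iter=1000)), # Artificial Neural Network
    ("boost",  AdaBoostClassifier()),         # Boosting
    ("tree",   DecisionTreeClassifier()),     # Decision Tree
    ("forest", RandomForestClassifier()),     # Random Forest
], voting="soft")
# consensus.fit(X_train, y_train); consensus.predict(X_new)
```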
• 41. Brainstorming (workflow diagram) INPUT: objects & features with annotations -> representation generation (structure/properties similarity, PCA/ICA feature decomposition, text mining tools, external annotations and databases, SQL training database) -> ensemble of machine learners (Support Vector Machine, Neural Networks, Random Forest, Decision Trees; scores, thresholds, similarity features) -> ML consensus (MLcons) -> OUTPUT: decision with reliability score.
• 42. Brainstorming Advantages • Statistical: if the training data set is very small, there are many hypotheses which describe it. Consensus reduces the chance of selecting a bad classifier. • Computational: each ML algorithm might get stuck in a local minimum of the training error. Consensus reduces the chance of getting stuck in a wrong minimum. • Representational: a consensus may represent a complex classifier which could not be formulated within the original set of hypotheses.
  • 43. Brainstorming Uwe Koch & Stephane Spieser Pawel G Sadowski, Tom Dariusz Plewczyński Kathryn S Lilley darman@icm.edu.pl Marcin von Grotthuss Krzysztof Ginalski kginal@icm.edu.pl Leszek Rychlewski Adrian Tkacz Jan Komorowski & Marcin Kierczak Lucjan Wyrwicz
• 44. Back to the Brain The cortical simulator puts together neurons connected via interconnections. The goal: to understand the information processing capability of such networks. 1. For every neuron: a. For every clock step (say 1 ms): i. Update the state of each neuron. ii. If the neuron fires, generate an event for each synapse that the neuron is post-synaptic to and pre-synaptic to. 2. For every synapse: when it receives a pre- or post-synaptic event, update its state and, if necessary, the state of the post-synaptic neuron. D. Modha
• 45. Cognitive Network A cognitive network (CN) similarly puts together single learning units connected via interconnections. The goal: to understand learning as the spatiotemporal information processing and storing capability of such networks. 1. For every LU (learning unit): a. For every clock step: i. Update the state of each LU if its data is changing. ii. If the LU retraining was performed with success, generate an event for each coupling that the LU is post-event coupled to and pre-event coupled to. 2. For every TC (time coupling): when it receives a pre- or post-event, update its state and, if necessary, the state of the post-event LUs.
• 46. Spatial Cognitive Networks A cellular automaton has only two states and does not retrain the individual LUs. Each cell in the static cellular-automaton model represents a single machine learning algorithm, or a certain combination of parameters and optimization conditions affecting the classification output of this particular method.
• 47. Spatial Cognitive Networks I call such a single learner a “learning agent”, each characterized, for example, by its prediction quality on a selected training dataset. The coupling of individual learners is described by a short-, medium- or long-range interaction strength, the so-called learning coupling. The actual structure, or topology, of the coupling between various learners is described using the term “learning space”, and can have different representations, such as a Cartesian space, or a fully connected or hierarchical geometry. The result of the evolution, i.e. the dynamics of such a system given by its stationary state, is defined here as the consensus equilibration. The majority of learners define, in the stationary limit, the “learning consensus” outcome of the meta-learning procedure.
• 48. Spatial Cognitive Networks 1) Binary logic I assume the binary logic of individual learners, i.e. we deal with a cellular automaton consisting of N agents, each holding one of two opposite states (“NO” or “YES”). These states are binary, similar to the spins in the Ising model of a ferromagnet. In most cases the machine learning algorithms that can model those agents, such as support vector machines, decision trees, trend vectors, artificial neural networks, or random forests, predict two classes for incoming data, based on previous experience in the form of trained models. The prediction of an agent answers a single question: is a query data item contained in class A (“YES”), or is it different from the items gathered in this class (“NO”)?
• 49. Spatial Cognitive Networks 2) Disorder and random strength parameters Each learner is characterized by two random parameters, persuasiveness and supportiveness, that describe how an individual agent interacts with others. Persuasiveness describes how effectively the individual state of an agent is propagated to neighboring agents, whereas supportiveness represents the self-supportiveness of a single agent.
• 50. Spatial Cognitive Networks 3) Learning space and learning metric Each agent is characterized by a location in the learning space; therefore one can calculate the abstract learning distance between two learners i and j. The strength of the coupling between two agents tends to decrease with the learning distance between them. Determination of the learning metric is a separate problem; the particular form of the metric and of the learning distance function should be empirically determined, and in principle can be a very peculiar geometry. For example, the fully connected learning space, where all distances between agents are equal, is useful in the case of a simple consensus between different, yet not organized, machine learning algorithms.
• 51. Spatial Cognitive Networks 4) Learning coupling Agents exchange their opinions by biasing others toward their own classification outcome. This influence can be described by the total learning impact I_i that the i-th agent experiences from all other learners. Within the cellular automata approach this impact is the difference between the positive coupling of those agents that hold the identical classification outcome and the negative influence of those who hold the opposite state; it can be formalized as a sum of two terms, where I_p and I_s are functions of the persuasiveness and supportiveness impact of the other agents on the i-th agent (the formula image did not survive the slide export; a hedged reconstruction follows).
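A hedged LaTeX reconstruction, consistent with the verbal description (distance-weighted support from agreeing agents minus distance-weighted persuasion from disagreeing ones); the exact published form may differ:

```latex
% Hedged reconstruction of the missing formula; g(d_{ij}) is an
% increasing function of the learning distance, P_j and S_j the
% persuasiveness and supportiveness, and s_j \in \{-1,+1\} the states.
I_i \;=\; \underbrace{\sum_{j:\, s_j = s_i} \frac{S_j}{g(d_{ij})}}_{I_s}
\;-\; \underbrace{\sum_{j:\, s_j \neq s_i} \frac{P_j}{g(d_{ij})}}_{I_p}
```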
• 52. Spatial Cognitive Networks 5) Meta-learning equations The equation of dynamics of the learning model defines the state of the i-th individual at the next time step as a function of its current state and a rescaled learning influence (the missing equations are reconstructed below). I assume synchronous dynamics, i.e. the states of all agents are updated in parallel. In comparison to standard serial Monte Carlo methods, the synchronous dynamics takes a shorter time to equilibrate, yet it can be trapped in periodic asymptotic states with oscillations between neighboring agents.
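The update-rule images are likewise missing from the export. A hedged reconstruction, consistent with the impact formula above (positive rescaled influence preserves the agent's state, negative influence flips it; the normalization shown is one plausible choice, not confirmed by the source):

```latex
% Hedged reconstruction of the synchronous meta-learning dynamics;
% \tilde{I}_i is the rescaled learning influence (normalization by the
% total coupling is an assumption).
s_i(t+1) \;=\; s_i(t)\,\operatorname{sign}\!\big(\tilde{I}_i(t)\big),
\qquad
\tilde{I}_i(t) \;=\; \frac{I_i(t)}{\sum_{j \neq i} g(d_{ij})^{-1}}
```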
• 53. Spatial Cognitive Networks 6) Presence of noise The randomness of state changes (a phenomenological model of various random elements in the learning system and in the training data) is introduced by adding a noise term h_i(t) inside the sign function of the update rule above (a toy simulation follows), where h is either a site-dependent white noise or a uniform white noise. In the first case the h are random variables independent for different agents and time instants, whereas in the second case the h are independent for different time instants only.
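Putting slides 48-53 together, here is a toy end-to-end simulation under the hedged equations reconstructed above. All concrete choices (ring geometry, coupling 1/(1+d), uniform noise amplitude, 50 steps) are illustrative assumptions, not values from the source:

```python
import numpy as np

# Toy spatial cognitive network: N binary agents on a ring, random
# persuasiveness/supportiveness, distance-decaying coupling, synchronous
# noisy dynamics; the surviving majority is the learning consensus.
rng = np.random.default_rng(0)
N = 100
s = rng.choice([-1, 1], size=N)           # binary agent states ("NO"/"YES")
P = rng.random(N)                         # persuasiveness
S = rng.random(N)                         # supportiveness
i, j = np.indices((N, N))
d = np.minimum(abs(i - j), N - abs(i - j))
w = 1.0 / (1.0 + d)                       # coupling ~ 1/g(d), decays with distance
np.fill_diagonal(w, 0.0)

for t in range(50):                       # synchronous parallel updates
    same = s[None, :] == s[:, None]       # which pairs currently agree
    impact = (w * S * same).sum(axis=1) - (w * P * ~same).sum(axis=1)
    noise = rng.uniform(-0.5, 0.5, N)     # uniform white noise h
    s = s * np.sign(impact + noise).astype(int)

print("learning consensus:", "YES" if s.sum() > 0 else "NO")
```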
• 54. Back to AI Cognitive Networks Cognitive networks (CN) are inspired by the brain's structure and the cognitive functions it performs. A CN puts together single machine learning units connected via interconnections. The goal is to understand learning as the spatiotemporal information processing and storing capability of such networks (meta-learning!). 1. Space: for every LU (learning unit): a. For every time step: i. Update the state of each LU using the changed training data. ii. If the LU learning was performed with success, generate an event for each coupling that the LU is post-event coupled to and pre-event coupled to. 2. Time: for every TC (time coupling): when it receives a pre- or post-event, update its state and, if necessary, the state of the post-event LUs.
• 55. and Cortical Simulator The cortical simulator puts together neurons connected via interconnections. The goal: to understand the information processing capability of such networks. 1. For every neuron: a. For every clock step (say 1 ms): i. Update the state of each neuron. ii. If the neuron fires, generate an event for each synapse that the neuron is post-synaptic to and pre-synaptic to. 2. For every synapse: when it receives a pre- or post-synaptic event, update its state and, if necessary, the state of the post-synaptic neuron. D. Modha
• 56. both are Information Processing NETWORKS A network puts together single units connected via interconnections. The goal is to understand the information processing of such a network, as well as its spatiotemporal characteristics and its ability to store information (learning).
• 57. Network: technical challenges Memory constraints: to achieve near real-time simulation, the state of all learning units (LUs) and time couplings (TCs) must fit in the random access memory of the system. In the real brain, synapses far outnumber the neurons; therefore the total available memory divided by the number of bytes per synapse limits the number of synapses that can be modeled. The rat-scale model (~55 million neurons, ~450 billion synapses) needs to store state for all synapses and neurons, the latter being negligible in comparison to the former. D. Modha
• 58. Network: technical challenges Communication constraints: on average, each LU fires once a second. Each neuron connects to roughly 8,000 other neurons, and therefore each neuron would generate 8,000 spikes (“messages”, or so-called neurobits) per second. This amounts to a total of 448 billion messages/neurobits per second. D. Modha
• 59. Network: technical challenges Computation constraints: on average, each LU fires once a second. In brain modelling, on average, each synapse would be activated twice: once when its pre-synaptic neuron fires and once when its post-synaptic neuron fires. This amounts to 896 billion synaptic updates per second. Let us assume that the state of each neuron is updated every millisecond. This amounts to 55 billion neuronal updates per second. Synapses thus dominate the computational cost (a back-of-envelope sketch of all three budgets follows). D. Modha
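These three budgets are simple arithmetic; a sketch, with numbers taken from slides 57-59 and the 16 bytes per synapse from the hardware slide that follows:

```python
# Rat-scale budget check: memory, messages, and update rates per second.
NEURONS  = 55e6          # ~55 million neurons
SYNAPSES = 448e9         # ~8,000 synapses per neuron
FIRE_HZ  = 1.0           # average firing rate
STEP_MS  = 1.0           # neuron update resolution

memory_tb   = SYNAPSES * 16 / 1e12       # 16 bytes of state per synapse (TC)
messages    = NEURONS * FIRE_HZ * 8000   # spike messages ("neurobits") per second
syn_updates = SYNAPSES * 2 * FIRE_HZ     # each synapse touched twice per spike
neu_updates = NEURONS * 1000 / STEP_MS   # clock-driven neuron updates per second

print(f"memory          ~{memory_tb:.1f} TB")   # ~7.2 TB, fits in the 8 TB machine
print(f"messages/sec    ~{messages:.2e}")       # ~4.5e11 (448 billion)
print(f"syn updates/sec ~{syn_updates:.2e}")    # ~9.0e11 (896 billion)
print(f"neu updates/sec ~{neu_updates:.2e}")    # ~5.5e10 (55 billion)
```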
• 60. Network: hardware Hardware solution: a state-of-the-art supercomputer, BlueGene/L, with 32,768 CPUs, 256 MB of memory per processor (8 TB in total), and 1.05 GB/sec of in/out communication bandwidth per node. To meet the above three constraints, one has to design data structures and algorithms that require no more than 16 bytes of storage per TC, 175 Flops per TC per second, and 66 bytes per neurobit (spike message). A rat-scale brain, near real-time learning network! D. Modha
• 61. Network: software Software solution: a massively parallel learning network simulator, LN, that runs on distributed memory multiprocessors. Algorithmic enhancements: 1. a computationally efficient way to simulate learning units in a clock-driven (“synchronous”) and couplings in an event-driven (“asynchronous”) fashion; 2. a memory-efficient representation to compactly represent the state of the simulation; 3. a communication-efficient way to minimize the number of messages sent, by aggregating them in several ways and by mapping message exchanges between processors onto judiciously chosen MPI primitives for synchronization. D. Modha
• 62. Computational Size of the problem BlueGene/L can simulate 1 sec of the model in 10 sec at a 1 Hz firing rate and 1 ms simulation resolution using a random stimulus. (Table, reconstructed:)
                                  MOUSE                 HUMAN               BlueGene/L
Neurons                           2x8x10^6              2x50x10^9           3x10^4 CPUs
Synapses                          128x10^9              10^15               10^9 CPU pairs
Communication (66 B/spike)        128x10^9 spikes/sec   10^15 spikes/sec    1.05 GB/sec in/out, 64x10^9 b/sec
Computation (350 F/synapse/sec)   2x128x10^9 syn/sec    2x10^15 syn/sec     45 TF (4x10^13 F), 8,192 CPUs
Memory (32 B/synapse)             O(128x10^9)           O(10^15)            4 TB, O(64x10^12)
(D. Modha: assuming neurons fire at 1 Hz)
• 63. Progress in large-scale cortical simulations (chart). D. Modha
  • 64. IBM BlueGene D. Modha
• 65. Future of cortical simulations (chart). D. Modha
• 66. C2 Simulator vs Human Brain: The human cortex has about 22 billion neurons, which is roughly a factor of 400 larger than the rat-scale model with its 55 million neurons. They used a BlueGene/L with 92 TF and 8 TB to carry out rat-scale simulations in near real-time [one tenth speed]. So, by naive extrapolation, one would require at least a machine with a computation capacity of 36.8 PF and a memory capacity of 3.2 PB. Furthermore, assuming that there are 8,000 synapses per neuron, that neurons fire at an average rate of 1 Hz, and that each spike message can be communicated in, say, 66 bytes, one would need an aggregate communication bandwidth of ~2 PBps. D. Modha
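The scaling is easy to check; this small calculation reproduces the compute and memory figures quoted on the slide (the bandwidth estimate depends on message aggregation details not given here, so it is left out):

```python
# Naive 400x extrapolation from the rat-scale run to the human cortex.
scale = 22e9 / 55e6                            # human cortical neurons / rat-scale neurons
print(f"scale factor ~{scale:.0f}x")           # ~400x
print(f"compute ~{92e12 * scale / 1e15} PF")   # 36.8 PF
print(f"memory  ~{8e12  * scale / 1e15} PB")   # 3.2 PB
```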
  • 67. SyNAPSE The goal is to create new electronics hardware and architecture that can understand, adapt and respond to an informative environment in ways that extend traditional computation to include fundamentally different capabilities found in biological brains. Stanford University: Brian A. Wandell, H.-S. Philip Wong Cornell University: Rajit Manohar Columbia University Medical Center: Stefano Fusi University of Wisconsin-Madison: Giulio Tononi University of California-Merced: Christopher Kello IBM Research: Rajagopal Ananthanarayanan, Leland Chang, Daniel Friedman, Christoph Hagleitner, Bulent Kurdi, Chung Lam, Paul Maglio, Stuart Parkin, Bipin Rajendran, Raghavendra Singh
• 68. NeuralEnsemble.org NeuralEnsemble is a multilateral effort to coordinate and organize neuroscience software development efforts into a larger meta-simulator software system, a natural, alternative approach to incrementally addressing what is known as the complexity bottleneck, presently a major roadblock for neural modelling. "Increasingly, the real limit on what computational scientists can accomplish is how quickly and reliably they can translate their ideas into working code." Gregory V. Wilson, Where's the Real Bottleneck in Scientific Computing?
• 69. Emergent A comprehensive, full-featured neural network simulator that allows for the creation and analysis of complex, sophisticated models of the brain. It has a high-level drag-and-drop programming interface, built on top of a scripting language with full introspective access to all aspects of networks and of the software itself; it allows one to write programs that seamlessly weave together the training of a network and the evolution of its environment without ever typing out a line of code. Networks and all of their state variables can be visually inspected in 3D, allowing for a quick "visual regression" of network dynamics and robot behavior.
• 70. Cognitive Computers The Future Today's algorithms and computers deal with structured data (age, salary, etc.) and semi-structured data (text and web pages). No mechanism exists that is able to act in a context-dependent fashion while integrating ambiguous information across different senses (sight, hearing, touch, taste, and smell) and coordinating multiple motor modalities. Cognitive computing targets the boundary between the digital and physical worlds, where raw sensory information abounds: for example, instrumenting the outside world with sensors and streaming this information in real time to a cognitive computer that may be able to detect spatio-temporal correlations.
• 71. Niels Bohr Predicting is very difficult, especially about the future…
  • 72. Brainstorming Uwe Koch & Stephane Spieser Pawel G Sadowski, Tom Dariusz Plewczyński Kathryn S Lilley darman@icm.edu.pl Marcin von Grotthuss Krzysztof Ginalski kginal@icm.edu.pl Leszek Rychlewski Adrian Tkacz Jan Komorowski & Marcin Kierczak Lucjan Wyrwicz