Cognitive Vision – After the hype
Nicolas Pugeault
n.pugeault@surrey.ac.uk
Centre for Vision, Speech and Signal Processing
University of Surrey
What is vision?
Example: detection/recognition
PASCAL Visual Object Classes Challenge 2007
● Given examples from
N classes, we want
to detect and
recognise new
instances of one
class in images
Detection/recognition
Some Limitations
● Domain adaptation
● Performance
depends on number
of classes
● Complexity grows
with number of
classes
● Hard to extend.
Example: Tracking
● A target is identified in a
video, we want the
system to follow its
location and pose over
time.
● Template based
– Template drift problem
– Template update strategies
● … We're pretty good at it
now.
Videos from the ALIEN tracker,
Z. Kalal, K. Mikolajczyk, and J. Matas,
“Tracking-Learning-Detection,” IEEE TPAMI 2011.
F. Pernici, “FaceHugger: The ALIEN Tracker Applied to Faces,”
ECCV 2012.
Robot Vision?
● Navigation (path planning,
obstacle avoidance, SLAM)
● Grasping, manipulation,
tool use.
● Planning (not strictly vision,
but connected)
● Human-robot interaction?
● Mostly a strong need for
precise 3D estimates of the
world and objects' shapes.
NAO robot (Aldebaran robotics)
Robot Vision: Grasping ?
● Grasping remains a challenging
task.
● Five-finger hands are complex
to control.
● Choosing (stable) points of
contact for fingers – depends on
texture, object's 3D shape and
weight...
● Precise 3D shape and 6D pose
estimation, motion planning,
obstacle detection...
● Hard to estimate from vision...
R. Detry, C. H. Ek, M. Madry, J. Piater and D. Kragic,
Generalizing Grasps Across Partly Similar Objects.
IEEE ICRA 2012.
Robot Vision: Affordances ??
● James J. Gibson The Theory
of Affordances (1977)
● Latent “action possibilities”
connected to objects.
● Affordance generalisation
across object classes...
● Neural evidence: Mirror
neurons (Rizzolatti, G., Craighero,
L.: The mirror-neuron system. Annual
Review of Neuroscience 27, 169–192,
2004)
Robot Vision: Tool use ???
● Using tools for solving
tasks is still a
challenge – especially
learning to!
● Primates (and even
some birds) can do it
(The Mentality of
Apes. Wolfgang
Köhler, 1925).
Tool use (cont'd)
Face detection/recognition...
So... what is vision?
● Loosely defined concept
● Pretty much vision is what we experience on a
daily basis
● A rich, vivid and complete representation of
the world...
● … except most of it is made up...
The truth about human vision
● Human eye:
– high resolution in a small, central area called the fovea
(cones).
– colour only in the fovea (cones).
– very coarse elsewhere.
– low light and motion sensitivity in the periphery (rods).
– we're virtually blind to static areas.
– Ah... and we have a significant blind spot in our field of view
(where the optic nerve leaves the retina).
– … never noticed all that?
Human vision: the dualist illusion
● Our intuition is similar to
Descartes' vision
● “The cartesian theatre”
● We now know this is not the
case (from neuroscience).
● There is no clear
delineation in the brain
between perception and
cognition.
[Diagram from Descartes' “Meditations”, annotated: Vision module → Cognition/consciousness → Action module]
Vision in the brain
Figure 25-12 from E.R. Kandel, J.H. Schwartz and T.M. Jessell, Eds.
Principles of Neural Science, 4th Edition.
Cognitive Vision
● The ideal vision of
vision as a separate
module feeding
information to
cognition does not
work.
● So, where do we put
the bar?
[Diagram: Low-level signal processing ↔ Cognitive vision (with feedback) ↔ High-level cognition, consciousness]
Today's roadmap
● A (non-)definition of cognitive
vision and its flavours
– Cognitivist/Symbolic AI approach and
its problems
● The frame problem
● Symbol grounding problem
– The emergent view
● Aside: Neural networks
– The embodiment question
● How to get there? Some insights
from representation learning and
deep architectures.
– Autoencoders
– Convolutional networks
What is Cognitive Vision?
● H.H. Nagel (2003):
– improving computer vision algorithms by
adding numerous consistency check
mechanisms, at a logical level.
● David Vernon (2008, first draft 2004):
– “... attempt to achieve more robust, resilient
and adaptable computer vision systems by
endowing them with cognitive capabilities”
– “... able to adapt to unforeseen changes in
the visual environment”
– “... in essence, a combination of computer
vision and cognition”
● Multiple approaches to Cog-V
– Symbolic AI
– Emergent view
– Embodied AI
[Diagram: Cognitive vision? Three flavours: Symbolic AI (dualist), Emergent, Embodied]
H.H. Nagel, Reflections on cognitive vision systems. In proc. of ICVS 2003.
D. Vernon. Cognitive Vision: The case for an embodied perception. Image and Vision Computing 26 (2008).
Example of Cognitive Architecture
The KnowRob system
KnowRob -- A Knowledge Processing Infrastructure for Cognition-enabled Robots.
Part 1: The KnowRob System (Moritz Tenorth, Michael Beetz), IJRR 2013.
Symbolic AI
● Cognition involves operations over symbolic representations.
● “Perception” is the process of abstracting symbolic representations from sensory signals.
● Mostly, the symbolic representation is the product of human design and choice.
● → problem when we move away from the domain of human experience (i.e., the “semantic gap”).
[Diagram: sensory signals → interpretation → symbolic representation → logical reasoning]
The symbol grounding problem
Searle's “Chinese room argument” (1980):
– The symbols do not have the same semantics attached to
them as for the designer...
Harnad (1990)
– Cognition is more than symbol manipulation
→ In other words, the system should learn its own symbols,
grounded in its own experiences...
Barsalou (1999)
– Cognition is inherently perceptual
– (and therefore, perception is inherently cognitive)
The frame problem in AI – part I
(Daniel C. Dennett)
● Once upon a time,
there was a robot,
called R1...
"Cognitive Wheels: The Frame Problem of AI,"
in C. Hookway, ed., Minds, Machines and Evolution,
Cambridge University Press 1984, 129-151.
The frame problem in AI – part II
(Daniel C. Dennett)
● A new robot was built to recognise, and handle side-effects: R1D1
[Diagram: R1D1 deliberating “Pull the wagon?” while deducing implications such as “Pulling the wagon does not change the wall colour”, “Pulling the wagon does not discharge the batteries”, ...]
The frame problem in AI – part III
(Daniel C. Dennett)
● The designers built a
third robot to assess
the relevance of
implications: Say
hello to R2D1.
...
The frame problem in AI – part III
(Daniel C. Dennett)
● In sum, any action requires a large, a priori
unknown, amount of world knowledge
● Hard to predict for the system designer
● Hard to deduce symbolically by the system
● → need for common sense associations
For vision: it is hard to predetermine a priori
the features and detectors that will be
required.
Issues with Symbolic AI
● Symbolic AI is an
efficient architecture
● Has solved successfully
some hard problems
● ...but faces some
complex limitations due
to the separation
between symbolic /
sub-symbolic
components.
[Diagram: Computer vision (low-level signal processing, detectors) passes symbols to AI (symbolic reasoning; high-level cognition, consciousness)]
Emergent Cognition
● The system develops its own
epistemology (set of symbols &
associations) from interacting with its
environment.
● Enactive view (Maturana, H., Varela, F. The
Tree of Knowledge – The Biological Roots of
Human Understanding. New Science Library,
Boston & London (1987))
– autonomous system
– can affect the environment
– is affected by the environment
(embodied)
– self-organised and self-generated.
● Central nervous system
– prediction & adaptation
Fig from Vernon, von Hofsten & Fadiga
“A Roadmap for Cognitive Development
in Humanoid Robots”. Springer, 2010.
Emergent Cognition: Shared
Epistemology
● Problem: different experience → different symbols!
● Shared epistemology comes from communication between
agents (my and your concept of “red” are shared, even if you're
colour blind)
● Note: communication between artificial systems can be a lot faster!
Artificial Neural Networks
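● An “artificial neuron” is in effect
– a linear transformation
– followed by a non-linear squashing function s
$$a_n = f_{w,b}(x) = s\Big(\sum_i w_i x_i + b\Big)$$
[Diagram: inputs x1, x2, x3 and a constant +1, weighted by w1, w2, w3 and the bias b, feed node n, which outputs a_n]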
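A minimal sketch of the single neuron above, in plain Python. The choice of the logistic sigmoid for s and the example weights are assumptions made for illustration; the slide only requires a squashing function.

```python
import math

def sigmoid(z):
    """Logistic squashing function s(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(x, w, b, s=sigmoid):
    """Activation a = s(sum_i w_i * x_i + b) of one artificial neuron."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b  # linear transformation
    return s(z)                                   # non-linear squashing

# Illustrative values (not from the slides): the weighted sum is 0 here,
# so the sigmoid sits exactly at its midpoint.
a = neuron(x=[1.0, 0.5, -1.0], w=[0.2, -0.4, 0.1], b=0.1)
```

Passing a different `s` (for instance `math.tanh`) swaps the squashing function without touching the linear part.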
Non-linearities
● Smooth squashing
functions.
● continuous and
differentiable.
● sigmoid → [0,1]: $s(x) = \frac{1}{1+e^{-x}}$, with derivative $s'(x) = (1 - s(x))\,s(x)$
● tanh → [-1,+1]: $s(x) = \tanh(x) = \frac{e^{x}-e^{-x}}{e^{x}+e^{-x}}$
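The derivative identity on this slide holds for the sigmoid and can be checked numerically. The sketch below does so with a central finite difference. The tanh derivative, 1 − tanh(x)², is a standard identity that is not on the slide and is included only for comparison.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_prime(x):
    """Slide identity: s'(x) = (1 - s(x)) * s(x)."""
    s = sigmoid(x)
    return (1.0 - s) * s

def tanh_prime(x):
    """Standard identity (not on the slide): tanh'(x) = 1 - tanh(x)^2."""
    return 1.0 - math.tanh(x) ** 2

def numerical_derivative(f, x, eps=1e-6):
    """Central finite difference, used only as a sanity check."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)

# Largest disagreement between analytic and numerical derivatives.
max_gap = max(
    max(abs(sigmoid_prime(x) - numerical_derivative(sigmoid, x)),
        abs(tanh_prime(x) - numerical_derivative(math.tanh, x)))
    for x in (-2.0, 0.0, 1.5)
)
```

Having the derivative expressed through the activation itself is what makes back-propagation cheap: the forward pass already computed s(x).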
Artificial Neural Network
(aka Multilayer perceptron)
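[Diagram: input layer (layer #1, N^1 = 3 inputs x1, x2, x3) → “hidden” layer (layer #2, N^2 = 2 nodes h1, h2) → output layer (layer #3, N^3 = 1 node r1); a constant +1 unit feeds the biases of layers #2 and #3]
Parameters:
$$\theta = (W^1, b^1, W^2, b^2)$$
Network output:
$$f_\theta(x) = s\Big(\sum_{j \in [1,N^2]} W^2_{j1}\, s\Big(\sum_{i \in [1,N^1]} W^1_{ij}\, x_i + b^1_j\Big) + b^2_1\Big)$$
Generic node activation:
$$z_j^{l+1} = \sum_{i \in [1,N^l]} W^l_{ij}\, a_i^l + b_j^l, \qquad a_j^l = s(z_j^l), \qquad a^1 = x$$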
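The forward pass of the 3-2-1 network on this slide can be sketched as follows. The convention `W[i][j]` = weight from node i of one layer to node j of the next, the sigmoid for s, and the numeric parameter values are all assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer(a, W, b):
    """One layer: z_j = sum_i W[i][j] * a[i] + b[j], then squash."""
    return [sigmoid(sum(W[i][j] * a[i] for i in range(len(a))) + b[j])
            for j in range(len(b))]

def forward(x, theta):
    """f_theta(x) for theta = (W1, b1, W2, b2)."""
    W1, b1, W2, b2 = theta
    h = layer(x, W1, b1)     # hidden layer (2 nodes)
    return layer(h, W2, b2)  # output layer (1 node)

# Illustrative parameters: W1 is 3x2, W2 is 2x1.
theta = (
    [[0.1, -0.2], [0.4, 0.3], [-0.5, 0.2]],
    [0.0, 0.1],
    [[0.7], [-0.6]],
    [0.05],
)
out = forward([1.0, 0.0, -1.0], theta)

# With every parameter at zero, each node outputs s(0) = 0.5.
zero_theta = ([[0.0, 0.0]] * 3, [0.0, 0.0], [[0.0], [0.0]], [0.0])
zero_out = forward([1.0, 2.0, 3.0], zero_theta)
```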
Learning by back-propagation
For a given datapoint (x, y), where y is the label, the network has an error:
$$E = \frac{1}{2}\,\lVert a^3 - y \rVert^2$$
TOP LAYER ERROR (layer L):
$$\delta_j^L = \frac{\partial E}{\partial a_j^L}\, s'(z_j^L) \quad\Leftrightarrow\quad \delta_j^L = (a_j^L - y_j)\, s'(z_j^L)$$
OTHER LAYERS ERROR:
$$\delta_j^l = \sum_i W_{ji}^l\, \delta_i^{l+1}\, s'(z_j^l)$$
[Diagram: the same 3-2-1 network, with the output a^3_1, the bias b^2_1 and the error terms δ^3_1 and δ^2_1 marked]
Learning by back-propagation
Finally we get the error derivative for all network parameters:
$$\frac{\partial E}{\partial W_{ij}^l} = a_i^l\, \delta_j^{l+1}, \qquad \frac{\partial E}{\partial b_j^l} = \delta_j^{l+1}$$
→ Update parameters with gradient descent
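The two back-propagation slides can be put together in a short sketch. Assumptions made here: a sigmoid squashing function, the squared error E = ½‖output − y‖², 0-based layer indices (so `a[0]` is the input), `W[l][i][j]` linking node i of layer l to node j of layer l+1, and an arbitrary learning rate `lr`. The parameter values are illustrative only.

```python
import math

def s(z):
    """Sigmoid squashing function."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, W, b):
    """Return the activations of every layer; a[0] is the input x."""
    a = [list(x)]
    for Wl, bl in zip(W, b):
        prev = a[-1]
        a.append([s(sum(Wl[i][j] * prev[i] for i in range(len(prev))) + bl[j])
                  for j in range(len(bl))])
    return a

def error(x, y, W, b):
    """E = 1/2 * ||output - y||^2."""
    out = forward(x, W, b)[-1]
    return 0.5 * sum((o - t) ** 2 for o, t in zip(out, y))

def backprop(x, y, W, b):
    """Return (dE/dW, dE/db) with the same nesting as W and b."""
    a = forward(x, W, b)
    # Top layer error: (a_j - y_j) * s'(z_j); for the sigmoid s'(z) = a * (1 - a).
    delta = [(aj - yj) * aj * (1.0 - aj) for aj, yj in zip(a[-1], y)]
    grad_W, grad_b = [None] * len(W), [None] * len(b)
    for l in range(len(W) - 1, -1, -1):
        grad_W[l] = [[a[l][i] * dj for dj in delta] for i in range(len(a[l]))]
        grad_b[l] = list(delta)
        if l > 0:
            # Other layers: delta_i = sum_j W[l][i][j] * delta_j * s'(z_i).
            delta = [sum(W[l][i][j] * delta[j] for j in range(len(delta)))
                     * a[l][i] * (1.0 - a[l][i]) for i in range(len(a[l]))]
    return grad_W, grad_b

def sgd_step(x, y, W, b, lr=0.5):
    """One in-place gradient descent update on a single datapoint."""
    grad_W, grad_b = backprop(x, y, W, b)
    for l in range(len(W)):
        for i in range(len(W[l])):
            for j in range(len(W[l][i])):
                W[l][i][j] -= lr * grad_W[l][i][j]
        for j in range(len(b[l])):
            b[l][j] -= lr * grad_b[l][j]

# Illustrative 3-2-1 network and datapoint.
W = [[[0.1, -0.2], [0.4, 0.3], [-0.5, 0.2]], [[0.7], [-0.6]]]
b = [[0.0, 0.1], [0.05]]
x, y = [1.0, 0.0, -1.0], [1.0]
before = error(x, y, W, b)
for _ in range(100):
    sgd_step(x, y, W, b)
after = error(x, y, W, b)
```

A useful habit when writing this by hand is to compare one entry of `grad_W` against a finite difference of `error`; a mismatch usually means an index was swapped.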
Embodiment
● Idea: Concepts can only be
learnt for and by a body
– → being affected by the
environment
– actions and perception are learnt
jointly.
– good perception is what allows
successful actions.
● Example of reaching with neural
network (Jamone, L.; Natale, L.; Metta,
G.; Nori, F.; Sandini, G. “Autonomous
Online Learning of Reaching Behavior in a
Humanoid Robot.” International Journal of
Humanoid Robotics 9(3), 2012.)
Do we need embodiment?
● If you buy the emergent thesis, it is required
– joint development of perception & action
– symbol grounding in experience
– → emergent epistemology
● What type of embodiment?
– strong: physical body (or even organic body!)
– weak: a system coupled with its environment
● it can affect its environment, and
● it is affected by it
Phylogeny vs. Ontogeny
● Phylogeny: the system's design (eg features like SIFT
or lines). High with cognitivist approach, more limited
in the emergent paradigm.
● Ontogeny: the system's development during its
lifetime, drawn from experiences with its environment.
● Challenges for artificial systems:
– hard to learn high level, abstract symbols autonomously.
– hard to generalise across experiences
– → how to learn abstract representations from experience?
Representation Learning
● Simple example: PCA
● Aim: identify dimensions that
vary jointly
● Components are axes of largest
variation.
● Linear transformation
● Orthogonal basis
● Applied on natural images,
generate filters similar to early
cortical cells (V1)
PJB Hancock, RJ Baddeley and LS Smith (1992)
The principal components of natural images
Network: computation in neural systems 3(1)
$$y = W^T x + \mu$$
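As a rough illustration of the PCA bullets, the sketch below finds the axis of largest variation of a toy 2-D dataset by power iteration on the covariance matrix. The data, the function names and the use of power iteration are assumptions for this example. The projection is written as wᵀ(x − mean), which matches the slide's form if the offset μ is taken to absorb the centring.

```python
def mean_vector(X):
    n = len(X)
    return [sum(row[d] for row in X) / n for d in range(len(X[0]))]

def covariance(X):
    """Sample covariance matrix of the rows of X."""
    mu = mean_vector(X)
    n, dims = len(X), len(mu)
    return [[sum((r[p] - mu[p]) * (r[q] - mu[q]) for r in X) / (n - 1)
             for q in range(dims)] for p in range(dims)]

def first_component(C, iters=200):
    """Power iteration: unit eigenvector of C with the largest eigenvalue."""
    w = [1.0] * len(C)
    for _ in range(iters):
        w = [sum(C[p][q] * w[q] for q in range(len(w))) for p in range(len(C))]
        norm = sum(v * v for v in w) ** 0.5
        w = [v / norm for v in w]
    return w

# Toy data spread mostly along the diagonal direction (1, 1).
X = [[2.0, 1.9], [1.0, 1.1], [0.0, 0.1], [-1.0, -0.9], [-2.0, -2.2]]
mu = mean_vector(X)
C = covariance(X)
w = first_component(C)

# 1-D code for each point: projection of the centred point onto w.
codes = [sum(wd * (xd - md) for wd, xd, md in zip(w, row, mu)) for row in X]
code_var = sum(c * c for c in codes) / (len(codes) - 1)
```

The variance of `codes` exceeds the variance along either original axis, which is what "axis of largest variation" means in practice.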
Arguments for deep hierarchies
● Feature sharing at intermediate
levels → sub-linear coding and
computation requirements (Fidler,
Boben & Leonardis. Evaluating multi-class
learning strategies in a generative hierarchical
framework for object detection. NIPS'09.)
● compact coding (Bengio, Courville, &
Vincent. Representation Learning: A Review and
New Perspectives. IEEE PAMI 35(8), 2013.)
● → Human visual system, estimated
to have 5-10 levels (Krüger et al., “Deep
Hierarchies in the Primate Visual Cortex: What
Can We Learn for Computer Vision?”, 2013)
● → NN, CART, SVM → 2 layers
Figure from Fidler, Boben & Leonardis 2009
Arguments for Deep Hierarchies
● Problem with linear
representations :
– A combination of any
number of linear
representations is
also a linear
representation...
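$$y = W_1^T x + \mu_1$$
$$z = W_2^T y + \mu_2$$
$$\Leftrightarrow\; z = W_2^T W_1^T x + W_2^T \mu_1 + \mu_2$$
$$\Leftrightarrow\; z = W_3^T x + \mu_3, \quad \text{with } W_3^T = W_2^T W_1^T,\; \mu_3 = W_2^T \mu_1 + \mu_2$$
[Diagram: x → y via (W1, μ1), y → z via (W2, μ2); equivalently x → z via (W3, μ3)]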
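The collapse of stacked linear layers can be verified numerically. In this sketch the matrices hold the transposed weights (W1ᵀ, W2ᵀ) as lists of rows, and all values are made up for illustration. Note that the first offset passes through the second matrix, so the combined offset is W2ᵀμ1 + μ2.

```python
def affine(Wt, mu, x):
    """Return Wt @ x + mu, with Wt given as a list of rows."""
    return [sum(row[k] * x[k] for k in range(len(x))) + m
            for row, m in zip(Wt, mu)]

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# Illustrative two-layer linear model: R^2 -> R^3 -> R^2.
W1t, mu1 = [[1.0, 2.0], [0.0, 1.0], [-1.0, 0.5]], [0.5, -1.0, 0.0]
W2t, mu2 = [[1.0, 0.0, 2.0], [0.5, -1.0, 1.0]], [1.0, 2.0]

# Equivalent single layer.
W3t = matmul(W2t, W1t)
mu3 = affine(W2t, mu2, mu1)  # W2t @ mu1 + mu2

x = [3.0, -2.0]
z_two_layers = affine(W2t, mu2, affine(W1t, mu1, x))
z_one_layer = affine(W3t, mu3, x)
```

Both routes give the same z for any x, which is why depth only adds expressive power once a non-linearity sits between the layers.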
Data driven hierarchies:
Autoencoders
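● Idea: learn jointly a pair of mappings $y = \phi(x)$ and $z = \psi(y)$
● that minimises information loss:
$$\operatorname*{argmin}_{\phi,\psi} \sum_x \lVert x - \psi(\phi(x)) \rVert_D$$
● often using a neural network formulation:
$$\phi(x) = s(Wx + b), \qquad \psi(y) = W'y + b'$$
[Diagram: encoder ϕ maps the inputs x1…x4 to the code y1…y3; decoder ψ maps the code back to the input space]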
Remember: ANNs
● An “artificial neuron” is in effect
– a linear transformation
– followed by a non-linear squashing function s
$$a_n = f_{w,b}(x) = s\Big(\sum_i w_i x_i + b\Big)$$
Data driven hierarchies:
Sparse Autoencoders
● Trivial solution when dim(Y) >= dim(X)!
● But overcomplete bases can be beneficial (Olshausen & Field 1996)
● Solution → sparse coding
$$\operatorname*{argmin}_{\phi,\psi} \sum_x \lVert x - \psi(\phi(x)) \rVert_D + g(\phi(x))$$
[Diagram: x → y via ϕ, y → x via ψ]
Olshausen, B. and Field, D. (1996). Emergence of simple-cell receptive field properties by learning
a sparse code for natural images. Nature, 381(6583):607–609.
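The sparse autoencoder objective can be evaluated with a few lines of Python. The slides leave the distance D and the penalty g open, so this sketch assumes a squared Euclidean distance for D and an L1 penalty g(y) = lam·‖y‖₁. The weight `lam` and all parameter values are hypothetical.

```python
import math

def encode(x, W, b):
    """phi(x) = s(W x + b), with a logistic s."""
    return [1.0 / (1.0 + math.exp(-(sum(wk * xk for wk, xk in zip(row, x)) + bj)))
            for row, bj in zip(W, b)]

def decode(y, W_dec, b_dec):
    """psi(y) = W' y + b' (linear decoder)."""
    return [sum(wk * yk for wk, yk in zip(row, y)) + bj
            for row, bj in zip(W_dec, b_dec)]

def objective(data, W, b, W_dec, b_dec, lam=0.0):
    """Sum over x of ||x - psi(phi(x))||^2 + lam * ||phi(x)||_1."""
    total = 0.0
    for x in data:
        y = encode(x, W, b)
        x_hat = decode(y, W_dec, b_dec)
        total += sum((xi - xh) ** 2 for xi, xh in zip(x, x_hat))  # reconstruction
        total += lam * sum(abs(v) for v in y)                     # sparsity penalty
    return total

# Degenerate illustration: a decoder that ignores the code and outputs the
# single training point reconstructs it perfectly, so only the penalty remains.
data = [[1.0, 2.0, 3.0, 4.0]]
W, b = [[0.0] * 4 for _ in range(3)], [0.0] * 3          # code is [0.5, 0.5, 0.5]
W_dec, b_dec = [[0.0] * 3 for _ in range(4)], [1.0, 2.0, 3.0, 4.0]
plain = objective(data, W, b, W_dec, b_dec, lam=0.0)
sparse = objective(data, W, b, W_dec, b_dec, lam=0.1)
```

Setting `lam=0.0` recovers the plain autoencoder objective from the previous slide.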
Stacked Auto-encoders
● you can stack
multiple layers of AE
● Trained layer-wise
● Note that the
structure is the same
as ANN.
● Can be fine-tuned
with backpropagation
[Diagram: x ↔ h via (ϕ1, ψ1); h ↔ y via (ϕ2, ψ2)]
Limitations of ANN
● Problem with ANNs: they don't work well with more than 2 layers.
● Problem with backprop: probably the gradient gets too diluted.
● Problem for emergent cognition: we want to learn higher levels of abstraction!
● More recently, several alternatives have been developed (Deep Learning): Restricted Boltzmann Machines (RBM), stacked autoencoders, convolutional nets.
Today's hot topic:
Convolutional Neural Nets
● CNNs are neural nets (of
course)
● sparse connectivity
● shared weight →
convolutional.
● receptive field span all
input dimensions
● typically alternating layers
of convolution and
max-pooling
[Diagram: a layer of input units x sparsely connected to a layer of hidden units h]
Fig from http://deeplearning.net/tutorial/lenet.html
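Sparse connectivity, shared weights and max-pooling can be illustrated in one dimension. This is a toy sketch with made-up input and kernel values, not the LeNet architecture: each hidden unit sees only a short window of the input, and every window uses the same kernel.

```python
def conv1d(x, kernel, bias=0.0):
    """Valid 1-D cross-correlation: the same kernel (shared weights)
    slides over x, so each output depends on len(kernel) inputs only."""
    k = len(kernel)
    return [sum(kernel[t] * x[i + t] for t in range(k)) + bias
            for i in range(len(x) - k + 1)]

def max_pool1d(h, size=2):
    """Non-overlapping max-pooling; a trailing partial window is dropped."""
    return [max(h[i:i + size]) for i in range(0, len(h) - size + 1, size)]

x = [0.0, 1.0, 3.0, 2.0, 0.0, -1.0, 4.0]
h = conv1d(x, kernel=[1.0, 0.0, -1.0])  # 3 weights shared by all 5 hidden units
p = max_pool1d(h, size=2)
```

A fully connected layer from 7 inputs to 5 hidden units would need 35 weights; the convolutional layer here uses 3.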
CNNs (cont'd)
● ex: LeNet (LeCun et al, 1998)
● Alternating convolution & subsampling (i.e., max-pooling) layers
● Top-layer is a typical ANN.
● Train using backprop & stochastic gradient descent
Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to
document recognition,” Proceedings of the IEEE, 1998.
Figure from http://deeplearning.net/tutorial/lenet.html
CNNs (cont'd)
● Problem: training deep networks is difficult with backprop (slow, requires LOTS of data)
● CNNs do better (because sparse), but it is still a problem.
● → Unsupervised pre-training of the network
– using, e.g., sparse autoencoders, layer-wise.
– refine the weights with supervised backprop afterwards.
● Top results in MNIST, ILSVRC, PASCAL VOC.
Other Part-based Hierarchies
● Deep belief networks (DBN): Restricted
Boltzmann Machines
(Hinton, Osindero, and Teh, “A Fast Learning Algorithm
for Deep Belief Nets,” Neural Computation 18, 2006.)
● Slow Feature Analysis (SFA) (Franzius,
Wilbert and Wiskott. “Invariant object recognition and
pose estimation with slow feature analysis”. Neural
Computation, 2011)
● Compositional hierarchies (Fidler, Boben &
Leonardis. “Evaluating multi-class learning strategies in
a generative hierarchical framework for object
detection”. NIPS, 2009)
● → Good review by Yoshua Bengio:
Yoshua Bengio, Aaron Courville, and
Pascal Vincent. “Representation
Learning: A Review and New
Perspectives.” IEEE PAMI 35(8), 2013.
Fidler, S., M. Boben, and A. Leonardis 2009a.
“Learning hierarchical compositional representations of object structure.”
Pp. 196-215 in Object categorization : computer and human vision perspectives,
edited by Sven J Dickinson, Aleš Leonardis, Bernt Schiele, and Michael J Tarr.
New York: Cambridge University Press.
Summary and conclusions
● There is no delineation
between cognition and vision.
● Reasoning on hand-crafted
symbols may be inadequate
(semantic gap) or brittle.
● Learning abstraction is hard,
but possible using deep
hierarchies.
● Unsupervised pre-training for
deep hierarchies is critical →
tells us something about
cognition.

 
Radiation physics in Dental Radiology...
Radiation physics in Dental Radiology...Radiation physics in Dental Radiology...
Radiation physics in Dental Radiology...navyadasi1992
 

Dernier (20)

User Guide: Capricorn FLX™ Weather Station
User Guide: Capricorn FLX™ Weather StationUser Guide: Capricorn FLX™ Weather Station
User Guide: Capricorn FLX™ Weather Station
 
Harmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms PresentationHarmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms Presentation
 
Pests of Blackgram, greengram, cowpea_Dr.UPR.pdf
Pests of Blackgram, greengram, cowpea_Dr.UPR.pdfPests of Blackgram, greengram, cowpea_Dr.UPR.pdf
Pests of Blackgram, greengram, cowpea_Dr.UPR.pdf
 
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
 
Citronella presentation SlideShare mani upadhyay
Citronella presentation SlideShare mani upadhyayCitronella presentation SlideShare mani upadhyay
Citronella presentation SlideShare mani upadhyay
 
Pests of safflower_Binomics_Identification_Dr.UPR.pdf
Pests of safflower_Binomics_Identification_Dr.UPR.pdfPests of safflower_Binomics_Identification_Dr.UPR.pdf
Pests of safflower_Binomics_Identification_Dr.UPR.pdf
 
Topic 9- General Principles of International Law.pptx
Topic 9- General Principles of International Law.pptxTopic 9- General Principles of International Law.pptx
Topic 9- General Principles of International Law.pptx
 
basic entomology with insect anatomy and taxonomy
basic entomology with insect anatomy and taxonomybasic entomology with insect anatomy and taxonomy
basic entomology with insect anatomy and taxonomy
 
ECG Graph Monitoring with AD8232 ECG Sensor & Arduino.pptx
ECG Graph Monitoring with AD8232 ECG Sensor & Arduino.pptxECG Graph Monitoring with AD8232 ECG Sensor & Arduino.pptx
ECG Graph Monitoring with AD8232 ECG Sensor & Arduino.pptx
 
Base editing, prime editing, Cas13 & RNA editing and organelle base editing
Base editing, prime editing, Cas13 & RNA editing and organelle base editingBase editing, prime editing, Cas13 & RNA editing and organelle base editing
Base editing, prime editing, Cas13 & RNA editing and organelle base editing
 
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
 
Pests of Bengal gram_Identification_Dr.UPR.pdf
Pests of Bengal gram_Identification_Dr.UPR.pdfPests of Bengal gram_Identification_Dr.UPR.pdf
Pests of Bengal gram_Identification_Dr.UPR.pdf
 
Volatile Oils Pharmacognosy And Phytochemistry -I
Volatile Oils Pharmacognosy And Phytochemistry -IVolatile Oils Pharmacognosy And Phytochemistry -I
Volatile Oils Pharmacognosy And Phytochemistry -I
 
Forensic limnology of diatoms by Sanjai.pptx
Forensic limnology of diatoms by Sanjai.pptxForensic limnology of diatoms by Sanjai.pptx
Forensic limnology of diatoms by Sanjai.pptx
 
Carbon Dioxide Capture and Storage (CSS)
Carbon Dioxide Capture and Storage (CSS)Carbon Dioxide Capture and Storage (CSS)
Carbon Dioxide Capture and Storage (CSS)
 
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 GenuineCall Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
 
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
 
Neurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 trNeurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 tr
 
Microphone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptxMicrophone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptx
 
Radiation physics in Dental Radiology...
Radiation physics in Dental Radiology...Radiation physics in Dental Radiology...
Radiation physics in Dental Radiology...
 

Cognitive Vision - After the hype

  • 1. Cognitive Vision – After the hype Nicolas Pugeault n.pugeault@surrey.ac.uk Centre for Vision, Speech and Signal Processing University of Surrey
  • 3. Example: detection/recognition PASCAL Visual Object Classes Challenge 2007 ● Given examples from N classes, we want to detect and recognise new instances of one class in images
  • 5. Some Limitations ● Domain adaptation ● Performance depends on number of classes ● Complexity grows with number of classes ● Hard to extend.
  • 6. Example: Tracking ● A target is identified in a video; we want the system to follow its location and pose over time. ● Template based – Template drift problem – Template update strategies ● … We're pretty good at it now. Videos from the ALIEN tracker. Z. Kalal, K. Mikolajczyk, and J. Matas, “Tracking-Learning-Detection,” IEEE TPAMI 2011. F. Pernici, “FaceHugger: The ALIEN Tracker Applied to Faces,” ECCV 2012.
  • 7. Robot Vision? ● Navigation (path planning, obstacle avoidance, SLAM) ● Grasping, manipulation, tool use. ● Planning (not strictly vision, but connected) ● Human-robot interaction? ● Mostly a strong need for precise 3D estimates of the world and objects' shapes. NAO robot (Aldebaran robotics)
  • 8. Robot Vision: Grasping ? ● Grasping remains a challenging task. ● Five-finger hands are complex to control. ● Choosing (stable) points of contact for fingers – depends on texture, object's 3D shape and weight... ● Precise 3D shape and 6D pose estimation, motion planning, obstacle detection... ● Hard to estimate from vision... R. Detry, C. H. Ek, M. Madry, J. Piater and D. Kragic, Generalizing Grasps Across Partly Similar Objects. IEEE ICRA 2012.
  • 9. Robot Vision: Affordances ?? ● James J. Gibson, The Theory of Affordances (1977) ● Latent “action possibilities” connected to objects. ● Affordance generalisation across object classes... ● Neural evidence: mirror neurons (Rizzolatti, G., Craighero, L.: The mirror-neuron system. Annual Review of Neuroscience 27, 169–192, 2004)
  • 10. Robot Vision: Tool use ??? ● Using tools to solve tasks is still a challenge, and learning to use them even more so! ● Primates (and even some birds) can do it (Wolfgang Köhler, The Mentality of Apes, 1925).
  • 13. So... what is vision? ● Loosely defined concept ● Pretty much vision is what we experience on a daily basis ● A rich, vivid and complete representation of the world... ● … except most of it is made up...
  • 14. The truth about human vision ● Human eye: – high resolution in a small, central area called the fovea (cones). – colour only in the fovea (cones). – very coarse elsewhere. – low light and motion sensitivity in the periphery (rods). – we're virtually blind to static areas. – Ah... and we have a significant blind spot in the middle of our field of view. – … never noticed all that?
  • 15. Human vision: the dualist illusion ● Our intuition is similar to Descartes' view: ● “The Cartesian theatre” ● We now know from neuroscience that this is not the case. ● There is no clear delineation in the brain between perception and cognition. [Diagram from Descartes' “Meditations”, labelled: vision module; cognition/consciousness; action module]
  • 16. Vision in the brain Figure 25-12 from E.R. Kandel, J.H. Schwartz and T.M. Jessell, Eds. Principles of Neural Science, 4th Edition.
  • 17. Cognitive Vision ● The idealised view of vision as a separate module feeding information to cognition does not work. ● So, where do we set the bar? [Diagram labels: low-level signal processing; high-level cognition, consciousness; cognitive vision; feedback]
  • 18. Today's roadmap ● A (non-)definition of cognitive vision and its flavours – Cognitivist/Symbolic AI approach and its problems ● The frame problem ● Symbol grounding problem – The emergent view ● Aside: Neural networks – The embodiment question ● How to get there? Some insights from representation learning and deep architectures. – Autoencoders – Convolutional networks
  • 19. What is Cognitive Vision? ● H.H. Nagel (2003): – improving computer vision algorithms by adding numerous consistency check mechanisms, at a logical level. ● David Vernon (2008, first draft 2004): – “... attempt to achieve more robust, resilient and adaptable computer vision systems by endowing them with cognitive capabilities” – “... able to adapt to unforeseen changes in the visual environment” – “... in essence, a combination of computer vision and cognition” ● Multiple approaches to cognitive vision: – Symbolic AI (dualist) – Emergent view – Embodied AI H.H. Nagel, Reflections on cognitive vision systems. In Proc. of ICVS 2003. D. Vernon, Cognitive Vision: The case for an embodied perception. Image and Vision Computing 26 (2008).
  • 20. Example of Cognitive Architecture: the KnowRob system. M. Tenorth and M. Beetz, “KnowRob -- A Knowledge Processing Infrastructure for Cognition-enabled Robots. Part 1: The KnowRob System,” IJRR 2013.
  • 21. Symbolic AI ● Cognition involves operations over symbolic representations. ● “Perception” is the process of abstracting symbolic representations from sensory signals. ● Mostly, the symbolic representation is the product of human design and choice. ● → This becomes a problem when we move away from the domain of human experience (i.e., the “semantic gap”). [Diagram labels: sensory signals; interpretation; symbolic representation; logical reasoning]
  • 22. The symbol grounding problem Searle's “Chinese room argument” (1980): – The symbols do not have the same semantics attached to them as for the designer... Harnad (1990) – Cognition is more than symbol manipulation → In other words, the system should learn its own symbols, grounded in its own experiences... Barsalou (1999) – Cognition is inherently perceptual – (and therefore, perception is inherently cognitive)
  • 23. The frame problem in AI – part I (Daniel C. Dennett) ● Once upon a time, there was a robot, called R1... "Cognitive Wheels: The Frame Problem of AI," in C. Hookway, ed., Minds, Machines and Evolution, Cambridge University Press 1984, 129-151. PULL WAGON
  • 24. The frame problem in AI – part I (Daniel C. Dennett) ● Once upon a time, there was a robot, called R1... "Cognitive Wheels: The Frame Problem of AI," in C. Hookway, ed., Minds, Machines and Evolution, Cambridge University Press 1984, 129-151.
  • 25. The frame problem in AI – part II (Daniel C. Dennett) ● A new robot was built to recognise, and handle side-effects: R1D1 Pulling wagon does not change wall colour Pull the wagon? Pulling the wagon does not discharge the batteries ... ...
  • 26. The frame problem in AI – part III (Daniel C. Dennett) ● The designers built a third robot to assess the relevance of implications: Say hello to R2D1. ...
  • 27. The frame problem in AI – part III (Daniel C. Dennett) ● In sum, any action requires a large, a priori unknown, amount of world knowledge ● Hard to predict for the system designer ● Hard for the system to deduce symbolically ● → need for common sense associations For vision: it is hard to determine in advance the features and detectors that will be required.
  • 28. Issues with Symbolic AI ● Symbolic AI is an efficient architecture ● It has successfully solved some hard problems ● ...but it faces some complex limitations due to the separation between symbolic and sub-symbolic components. [Diagram labels: low-level signal processing (computer vision); detectors; symbols; symbolic reasoning (AI); high-level cognition, consciousness]
  • 29. Emergent Cognition ● The system develops its own epistemology (set of symbols & associations) from interacting with its environment. ● Enactive view (Maturana, H., Varela, F. The Tree of Knowledge – The Biological Roots of Human Understanding. New Science Library, Boston & London (1987)) – autonomous system – can affect the environment – is affected by the environment (embodied) – self-organised and self-generated. ● Central nervous system – prediction & adaptation Fig. from Vernon, von Hofsten & Fadiga, “A Roadmap for Cognitive Development in Humanoid Robots”. Springer, 2010.
  • 30. Emergent Cognition: Shared Epistemology ● Problem: different experience → different symbols! ● Shared epistemology comes from communication between agents (my concept of “red” and yours are shared, even if you're colour blind) ● Note: communication between artificial systems can be a lot faster!
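  • 31. Artificial Neural Networks ● An “artificial neuron” is in effect – a linear transformation of its inputs – followed by a non-linear squashing function s: $a_n = f_{w,b}(x) = s\left(\sum_i w_i x_i + b\right)$ [Figure: neuron n with inputs x1, x2, x3 and a constant +1, weights w1, w2, w3, bias b, and output a_n]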
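The single neuron of slide 31 can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the input values, weights and the choice of the logistic function for s are assumptions made for the example, not taken from the slides.

```python
import math

def sigmoid(z):
    """Logistic squashing function, mapping any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(x, w, b, s=sigmoid):
    """Activation a = s(sum_i w_i * x_i + b) of a single artificial neuron."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b   # linear part
    return s(z)                                    # squashing part

# Hypothetical inputs and weights, chosen only for illustration.
x = [1.0, 2.0, 3.0]
w = [0.5, -0.25, 0.0]
b = 0.0
a = neuron(x, w, b)   # weighted sum is 0.5 - 0.5 + 0 = 0, so a = s(0) = 0.5
```

Passing a different `s` (for example `math.tanh`) swaps the squashing function without touching the linear part.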
  • 32. Non-linearities ● Smooth squashing functions. ● Continuous and differentiable. ● Sigmoid → [0,1]: $s(x) = \frac{1}{1+e^{-x}}$, with derivative $s'(x) = (1-s(x))\,s(x)$ ● tanh → [-1,+1]: $s(x) = \tanh(x) = \frac{e^{x}-e^{-x}}{e^{x}+e^{-x}}$
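The derivative identity for the sigmoid on slide 32 is easy to check numerically. The sketch below compares the closed form s'(x) = s(x)(1 - s(x)) with a central finite difference; the test points and step size are arbitrary choices for illustration. It also checks the standard relation tanh(x) = 2 s(2x) - 1, which is not on the slide but shows that the two squashing functions differ only by scaling and shifting.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_prime(x):
    """Closed-form derivative s'(x) = s(x) * (1 - s(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)

def numerical_derivative(f, x, h=1e-6):
    """Central finite difference, used only as an independent check."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

points = [-2.0, -0.5, 0.0, 0.5, 2.0]
max_gap = max(abs(sigmoid_prime(x) - numerical_derivative(sigmoid, x))
              for x in points)

# tanh is a rescaled, shifted sigmoid: tanh(x) = 2 * s(2x) - 1.
tanh_gap = abs(math.tanh(0.7) - (2.0 * sigmoid(1.4) - 1.0))
```

Both gaps should be tiny, limited only by floating-point and finite-difference error.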
  • 33. Artificial Neural Network (aka multilayer perceptron) ● Input layer: layer #1 ($N^1 = 3$ inputs $x_1, x_2, x_3$, plus a constant +1) ● “Hidden” layer: layer #2 ($N^2 = 2$ nodes $h_1, h_2$, plus a constant +1) ● Output layer: layer #3 ($N^3 = 1$ node $r_1$) ● Parameters: $\theta = (W^1, b^1, W^2, b^2)$ ● Network function: $f_\theta(x) = s\left(\sum_{j \in [1,N^2]} W^2_{j1}\, s\left(\sum_{i \in [1,N^1]} W^1_{ij} x_i + b^1_j\right) + b^2_1\right)$ ● Generic node activation: $z_j^{l+1} = \sum_{i \in [1,N^l]} W^l_{ij}\, a_i^l + b_j^l$, $a_j^l = s(z_j^l)$
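A forward pass through the 3-2-1 network of slide 33 might look as follows. This is a sketch, not the presenter's code: the parameter values are hypothetical, the sigmoid is assumed for s, and the indexing convention (W[i][j] connects node i of one layer to node j of the next) is chosen to match the generic node activation.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(a, W, b):
    """One layer: a_j(next) = s(sum_i W[i][j] * a[i] + b[j])."""
    return [sigmoid(sum(W[i][j] * a[i] for i in range(len(a))) + b[j])
            for j in range(len(b))]

def mlp_forward(x, params):
    """Apply every (W, b) pair in turn; params plays the role of theta."""
    a = x
    for W, b in params:
        a = layer_forward(a, W, b)
    return a

# Hypothetical parameters for a 3-2-1 network.
W1 = [[0.1, -0.2], [0.4, 0.3], [-0.5, 0.2]]   # 3 inputs -> 2 hidden nodes
b1 = [0.0, 0.1]
W2 = [[0.7], [-0.6]]                           # 2 hidden nodes -> 1 output
b2 = [0.05]
theta = [(W1, b1), (W2, b2)]

output = mlp_forward([1.0, 0.5, -1.0], theta)  # list with a single value in (0, 1)
```

Adding a layer only means appending another `(W, b)` pair to `theta`.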
  • 34. Learning by back-propagation ● For a given datapoint $(x, y)$, with input $x$ and label $y$, the network has an error $E = \frac{1}{2}\|a^3 - y\|^2$ ● Top layer error: $\delta_j^L = \frac{\partial E}{\partial a_j^L}\, s'(z_j^L)$ ($\Leftrightarrow \delta_j^L = (a_j^L - y_j)\, s'(z_j^L)$) ● Other layers' error: $\delta_j^l = \sum_i W^l_{ji}\, \delta_i^{l+1}\, s'(z_j^l)$ [Figure: the 3-2-1 network of slide 33, with the errors $\delta^3_1$ and $\delta^2_1$ propagated backwards from the output $a^3_1$]
  • 35. Learning by back-propagation ● Finally we get the error derivative for all network parameters: $\frac{\partial E}{\partial W^l_{ij}} = a_i^l\, \delta_j^{l+1}$, $\frac{\partial E}{\partial b_j^l} = \delta_j^{l+1}$ ● → Update parameters with gradient descent [Figure: the same 3-2-1 network as on slide 34]
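The back-propagation rules of slides 34 and 35 can be checked on a small network. The sketch below assumes a 3-2-1 sigmoid network, the squared error E = 1/2 ||a - y||^2, and hypothetical parameter values; it computes the deltas and gradients, compares one gradient against a finite difference, and takes a single gradient descent step with an arbitrarily chosen learning rate.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, W1, b1, W2, b2):
    """Return hidden and output activations of a two-layer sigmoid network."""
    h = [sigmoid(sum(W1[i][j] * x[i] for i in range(len(x))) + b1[j])
         for j in range(len(b1))]
    out = [sigmoid(sum(W2[j][k] * h[j] for j in range(len(h))) + b2[k])
           for k in range(len(b2))]
    return h, out

def error(x, y, W1, b1, W2, b2):
    _, out = forward(x, W1, b1, W2, b2)
    return 0.5 * sum((o - t) ** 2 for o, t in zip(out, y))

def backprop(x, y, W1, b1, W2, b2):
    h, out = forward(x, W1, b1, W2, b2)
    # Top layer: delta = (a - y) * s'(z), with s'(z) = a * (1 - a) for the sigmoid.
    d_out = [(o - t) * o * (1.0 - o) for o, t in zip(out, y)]
    # Hidden layer: delta_j = sum_k W2[j][k] * d_out[k] * s'(z_j).
    d_hid = [sum(W2[j][k] * d_out[k] for k in range(len(d_out))) * h[j] * (1.0 - h[j])
             for j in range(len(h))]
    # dE/dW[i][j] = activation feeding the weight * delta of the node it feeds.
    gW2 = [[h[j] * d_out[k] for k in range(len(d_out))] for j in range(len(h))]
    gW1 = [[x[i] * d_hid[j] for j in range(len(d_hid))] for i in range(len(x))]
    return gW1, d_hid, gW2, d_out   # the bias gradients are the deltas themselves

def perturbed(W, i, j, delta):
    """Copy of W with one entry shifted, for the finite-difference check."""
    copy = [row[:] for row in W]
    copy[i][j] += delta
    return copy

# Hypothetical datapoint and parameters.
x, y = [1.0, 0.5, -1.0], [1.0]
W1 = [[0.1, -0.2], [0.4, 0.3], [-0.5, 0.2]]
b1 = [0.0, 0.1]
W2 = [[0.7], [-0.6]]
b2 = [0.05]

gW1, gb1, gW2, gb2 = backprop(x, y, W1, b1, W2, b2)

eps = 1e-6
numeric = (error(x, y, perturbed(W1, 0, 1, eps), b1, W2, b2)
           - error(x, y, perturbed(W1, 0, 1, -eps), b1, W2, b2)) / (2 * eps)
gap = abs(numeric - gW1[0][1])

# One gradient descent step with an assumed learning rate.
lr = 0.1
e_before = error(x, y, W1, b1, W2, b2)
W1 = [[W1[i][j] - lr * gW1[i][j] for j in range(2)] for i in range(3)]
b1 = [b1[j] - lr * gb1[j] for j in range(2)]
W2 = [[W2[j][k] - lr * gW2[j][k] for k in range(1)] for j in range(2)]
b2 = [b2[k] - lr * gb2[k] for k in range(1)]
e_after = error(x, y, W1, b1, W2, b2)
```

A gradient check of this kind is a common way to catch index mix-ups between W_ij and W_ji before training anything larger.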
  • 36. Embodiment ● Idea: Concepts can only be learnt for and by a body – → being affected by the environment – actions and perception are learnt jointly. – good perception is what allows successful actions. ● Example of reaching with a neural network (Jamone, L.; Natale, L.; Metta, G.; Nori, F.; Sandini, G., “Autonomous Online Learning of Reaching Behavior in a Humanoid Robot.” International Journal of Humanoid Robotics 9(3), 2012.)
  • 37. Do we need embodiment? ● If you buy the emergent thesis, it is required – joint development of perception & action – symbol grounding in experience – → emergent epistemology ● What type of embodiment? – strong: physical body (or even organic body!) – weak: a system coupled with its environment ● it can affect its environment, and ● it is affected by it
  • 38. Phylogeny vs. Ontogeny ● Phylogeny: the system's design (e.g., features like SIFT or lines). Large in the cognitivist approach, more limited in the emergent paradigm. ● Ontogeny: the system's development during its lifetime, drawn from experiences with its environment. ● Challenges for artificial systems: – hard to learn high-level, abstract symbols autonomously. – hard to generalise across experiences – → how to learn abstract representations from experience?
  • 39. Representation Learning ● Simple example: PCA ● Aim: identify dimensions that vary jointly ● Components are the axes of largest variation. ● Linear transformation: $y = W^T x + \mu$ ● Orthogonal basis ● Applied to natural images, it generates filters similar to early cortical cells (V1). P.J.B. Hancock, R.J. Baddeley and L.S. Smith (1992), “The principal components of natural images,” Network: Computation in Neural Systems 3(1).
  • 40. Arguments for deep hierarchies ● Feature sharing at intermediate levels → sub-linear coding and computation requirements (Fidler, Boben & Leonardis. Evaluating multi-class learning strategies in a generative hierarchical framework for object detection. NIPS'09.) ● Compact coding (Bengio, Courville, & Vincent. Representation Learning: A Review and New Perspectives. IEEE PAMI 35(8), 2013.) ● → Human visual system, estimated to have 5-10 levels (Krüger et al., “Deep Hierarchies in the Primate Visual Cortex: What Can We Learn for Computer Vision?” 2013) ● → NN, CART, SVM: 2 layers. Figure from Fidler, Boben & Leonardis 2009.
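As a rough illustration of "components are axes of largest variation", the sketch below finds the first principal component of a toy 2-D dataset by power iteration on the covariance matrix. The data, the use of power iteration (rather than a full eigendecomposition), and the convention of centring the data before projecting are all assumptions made for this example.

```python
import math

def mean_vector(data):
    n = len(data)
    return [sum(col) / n for col in zip(*data)]

def covariance(data):
    """Sample covariance matrix of the rows of data."""
    mu = mean_vector(data)
    centred = [[v - m for v, m in zip(row, mu)] for row in data]
    n, d = len(data), len(mu)
    return [[sum(r[i] * r[j] for r in centred) / (n - 1) for j in range(d)]
            for i in range(d)]

def first_component(C, iterations=100):
    """Leading eigenvector of C by power iteration (unit length)."""
    v = [1.0] + [0.0] * (len(C) - 1)
    for _ in range(iterations):
        w = [sum(c * vi for c, vi in zip(row, v)) for row in C]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Toy data: most of the variation lies along the diagonal direction (1, 1).
data = [[2.0, 2.0], [-2.0, -2.0], [0.5, -0.5], [-0.5, 0.5]]
mu = mean_vector(data)
pc1 = first_component(covariance(data))

# Coordinate of the first point along the first component (after centring).
projection = sum(p * (x - m) for p, x, m in zip(pc1, data[0], mu))
```

For this dataset the first component points along the diagonal, so `pc1` has two equal entries.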
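  • 41. Arguments for Deep Hierarchies ● Problem with linear representations: – A combination of any number of linear representations is also a linear representation... $y = W_1^T x + \mu_1$, $z = W_2^T y + \mu_2$ $\Leftrightarrow$ $z = W_2^T W_1^T x + W_2^T \mu_1 + \mu_2$ $\Leftrightarrow$ $z = W_3^T x + \mu_3$, with $W_3 = W_1 W_2$ and $\mu_3 = W_2^T \mu_1 + \mu_2$ [Figure: x → y via $(W_1, \mu_1)$, y → z via $(W_2, \mu_2)$, and x → z directly via $(W_3, \mu_3)$]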
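The collapse of stacked linear layers can be verified numerically. Composing y = W1^T x + μ1 with z = W2^T y + μ2 gives z = W2^T W1^T x + W2^T μ1 + μ2, so a single layer with W3 = W1 W2 and μ3 = W2^T μ1 + μ2 does the same job. The matrices below are hypothetical 2×2 examples.

```python
def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def affine(W, mu, x):
    """One linear layer: W^T x + mu."""
    return [t + m for t, m in zip(matvec(transpose(W), x), mu)]

W1, mu1 = [[1.0, 2.0], [0.0, 1.0]], [0.5, -1.0]
W2, mu2 = [[2.0, 0.0], [1.0, 3.0]], [1.0, 1.0]

# Collapsed single layer: W3 = W1 W2 (so W3^T = W2^T W1^T), mu3 = W2^T mu1 + mu2.
W3 = matmul(W1, W2)
mu3 = [a + b for a, b in zip(matvec(transpose(W2), mu1), mu2)]

x = [3.0, -2.0]
two_layers = affine(W2, mu2, affine(W1, mu1, x))   # [11.0, 10.0]
one_layer = affine(W3, mu3, x)                     # [11.0, 10.0]
```

Inserting a non-linear squashing function between the two layers is what prevents this collapse, which is the argument for non-linear deep hierarchies.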
  • 42. Data driven hierarchies: Autoencoders ● Idea: jointly learn a pair of mappings $y = \phi(x)$ and $z = \psi(y)$ ● that minimise information loss: $\arg\min_{\phi,\psi} \sum_x \|x - \psi(\phi(x))\|_D$ ● often using a neural network formulation: $\phi(x) = s(Wx + b)$, $\psi(y) = W'y + b'$ [Figure: x → y via ϕ and y → x via ψ, drawn as a network with inputs x1 to x4 and code units y1 to y3]
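A tiny autoencoder in the neural network formulation of slide 42 can be trained with plain stochastic gradient descent. The sketch below assumes a sigmoid encoder, a linear decoder, the squared Euclidean distance as the loss, and a hypothetical toy dataset; in the code, `V` and `c` stand for W' and b'. The hyperparameters (code size, learning rate, number of epochs, seed) are arbitrary.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def encode(x, W, b):
    """phi(x) = s(W x + b); W has one row per code unit."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + bj)
            for row, bj in zip(W, b)]

def decode(y, V, c):
    """psi(y) = V y + c; V has one row per reconstructed input."""
    return [sum(v * yj for v, yj in zip(row, y)) + ci for row, ci in zip(V, c)]

def total_loss(data, W, b, V, c):
    """Sum over the data of 1/2 * ||x - psi(phi(x))||^2."""
    return sum(0.5 * sum((r - xi) ** 2
                         for r, xi in zip(decode(encode(x, W, b), V, c), x))
               for x in data)

def train(data, n_code=2, lr=0.1, epochs=200, seed=0):
    rng = random.Random(seed)
    n_in = len(data[0])
    W = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_code)]
    b = [0.0] * n_code
    V = [[rng.uniform(-0.1, 0.1) for _ in range(n_code)] for _ in range(n_in)]
    c = [0.0] * n_in
    for _ in range(epochs):
        for x in data:
            y = encode(x, W, b)
            r = decode(y, V, c)
            err = [ri - xi for ri, xi in zip(r, x)]          # dLoss/dr
            # Back-propagate through the linear decoder, then the sigmoid encoder.
            d_code = [sum(err[i] * V[i][j] for i in range(n_in)) * y[j] * (1.0 - y[j])
                      for j in range(n_code)]
            for i in range(n_in):
                for j in range(n_code):
                    V[i][j] -= lr * err[i] * y[j]
                c[i] -= lr * err[i]
            for j in range(n_code):
                for k in range(n_in):
                    W[j][k] -= lr * d_code[j] * x[k]
                b[j] -= lr * d_code[j]
    return W, b, V, c

# Hypothetical 4-D binary patterns, compressed to a 2-D code.
data = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0],
        [1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]

loss_before = total_loss(data, *train(data, epochs=0))   # untrained parameters
loss_after = total_loss(data, *train(data))              # after training
```

Because the code has fewer dimensions than the input, the network cannot simply copy its input and has to find a compressed description.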
  • 43. Remember: ANNs ● An “artificial neuron” is in effect – a linear transformation of its inputs – followed by a non-linear squashing function s: $a_n = f_{w,b}(x) = s\left(\sum_i w_i x_i + b\right)$ [Figure: neuron n with inputs x1, x2, x3 and a constant +1, weights w1, w2, w3, bias b, and output a_n]
  • 44. Data driven hierarchies: Autoencoders ● Idea: jointly learn a pair of mappings $y = \phi(x)$ and $z = \psi(y)$ ● that minimise information loss: $\arg\min_{\phi,\psi} \sum_x \|x - \psi(\phi(x))\|_D$ ● often using a neural network formulation: $\phi(x) = s(Wx + b)$, $\psi(y) = W'y + b'$ [Figure: x → y via ϕ and y → x via ψ, drawn as a network with inputs x1 to x4 and code units y1 to y3]
  • 45. Data driven hierarchies: Sparse Autoencoders ● Trivial solution when dim(Y) >= dim(X)! ● But overcomplete bases can be beneficial (Olshausen & Field 1996) ● Solution → sparse coding: $\arg\min_{\phi,\psi} \sum_x \|x - \psi(\phi(x))\|_D + g(\phi(x))$, where $g$ penalises non-sparse codes [Figure: x → y via ϕ and y → x via ψ] Olshausen, B. and Field, D. (1996). Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 381(6583):607–609.
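The effect of the sparsity term g on slide 45 can be illustrated with an overcomplete linear dictionary. In the sketch below, g is assumed to be an L1 penalty with a hypothetical weight `lam`, and the decoder is a fixed dictionary of three atoms for 2-D inputs; two codes that reconstruct the input equally well are then ranked by the penalty.

```python
def decode(code, atoms):
    """Linear decoder: weighted sum of dictionary atoms."""
    dim = len(atoms[0])
    return [sum(c * atom[d] for c, atom in zip(code, atoms)) for d in range(dim)]

def reconstruction_error(x, r):
    return sum((xi - ri) ** 2 for xi, ri in zip(x, r))

def l1_penalty(code, lam=0.1):
    """Assumed form of g: lam * sum of absolute code values."""
    return lam * sum(abs(c) for c in code)

def sparse_objective(x, code, atoms, lam=0.1):
    return reconstruction_error(x, decode(code, atoms)) + l1_penalty(code, lam)

# Overcomplete dictionary: three atoms for a two-dimensional input.
atoms = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
x = [1.0, 1.0]

dense_code = [1.0, 1.0, 0.0]    # uses two atoms, reconstructs x exactly
sparse_code = [0.0, 0.0, 1.0]   # uses one atom, also reconstructs x exactly

obj_dense = sparse_objective(x, dense_code, atoms)
obj_sparse = sparse_objective(x, sparse_code, atoms)
```

Without the penalty the two codes are indistinguishable; with it, the code that uses a single atom is preferred.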
  • 46. Stacked Auto-encoders ● you can stack multiple layers of AE ● Trained layer-wise ● Note that the structure is the same as ANN. ● Can be fine-tuned with backpropagation x h ϕ1 ψ1 y ϕ2 ψ2
  • 47. Limitations of ANN ● Problem with ANNs: they do not work well with more than 2 layers ● Problem with backprop: the gradient probably gets too diluted. ● Problem for emergent cognition: we want to learn higher levels of abstraction! ● More recently, several alternatives have been developed (Deep Learning): Restricted Boltzmann Machines (RBM), stacked autoencoders, convolutional nets.
  • 48. Today's hot topic: Convolutional Neural Nets ● CNNs are neural nets (of course) ● sparse connectivity ● shared weights → convolutional. ● receptive fields span all input dimensions ● typically alternating layers of convolution and max-pooling. Fig. from http://deeplearning.net/tutorial/lenet.html
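The two building blocks named on slide 48, convolution with shared weights and max-pooling, can be sketched in one dimension. The signal and the two-tap kernel below are hypothetical; the convolution is written as a cross-correlation without padding, which is the usual convention in CNN code.

```python
def conv1d(signal, kernel, bias=0.0):
    """Slide the same kernel over the signal: every output unit shares the
    same weights and only sees len(kernel) neighbouring inputs."""
    k = len(kernel)
    return [sum(kernel[j] * signal[i + j] for j in range(k)) + bias
            for i in range(len(signal) - k + 1)]

def max_pool1d(values, size=2):
    """Non-overlapping max-pooling: keep the strongest response per window."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]

signal = [0, 0, 1, 1, 0, 0, 1, 1]
kernel = [-1, 1]                      # responds to rising edges

feature_map = conv1d(signal, kernel)  # [0, 1, 0, -1, 0, 1, 0]
pooled = max_pool1d(feature_map)      # [1, 0, 1]
```

The kernel has two parameters regardless of the signal length, which is where the sparse connectivity and weight sharing come from.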
  • 49. CNNs (cont'd) ● ex: LeNet (LeCun et al., 1998) ● Alternating convolution & subsampling (i.e., max-pooling) layers ● The top layer is a typical ANN. ● Train using backprop & stochastic gradient descent Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” Proceedings of the IEEE, 1998. Figure from http://deeplearning.net/tutorial/lenet.html
  • 50. CNNs (cont'd) ● Problem: training deep networks is difficult with backprop (slow, requires LOTS of data) ● CNNs do better (because they are sparse), but it is still a problem. ● → Unsupervised pre-training of the network – using, e.g., sparse autoencoders, layer-wise. – refine the weights with supervised backprop afterwards. ● Top results in MNIST, ILSVRC, PASCAL VOC.
  • 51. Other Part-based Hierarchies ● Deep belief networks (DBN): Restricted Boltzmann Machines (Hinton, Osindero, and Teh, “A Fast Learning Algorithm for Deep Belief Nets,” Neural Computation 18, 2006.) ● Slow Feature Analysis (SFA) (Franzius, Wilbert and Wiskott. “Invariant object recognition and pose estimation with slow feature analysis”. Neural Computation, 2011) ● Compositional hierarchies (Fidler, Boben & Leonardis. “Evaluating multi-class learning strategies in a generative hierarchical framework for object detection”. NIPS, 2009) ● → Good review: Yoshua Bengio, Aaron Courville, and Pascal Vincent. “Representation Learning: A Review and New Perspectives.” IEEE PAMI 35(8), 2013. Fidler, S., M. Boben, and A. Leonardis 2009a. “Learning hierarchical compositional representations of object structure.” Pp. 196-215 in Object categorization: computer and human vision perspectives, edited by Sven J. Dickinson, Aleš Leonardis, Bernt Schiele, and Michael J. Tarr. New York: Cambridge University Press.
  • 52. Summary and conclusions ● There is no delineation between cognition and vision. ● Reasoning on hand-crafted symbols may be inadequate (semantic gap) or brittle. ● Learning abstraction is hard, but possible using deep hierarchies. ● Unsupervised pre-training for deep hierarchies is critical → tells us something about cognition.