An introduction to cognitive robotics
EMJD ICE Summer School - 2013
Lucio Marcenaro – University of
Genova (ITALY)
Cognitive robotics?
• Robots with intelligent behavior
– Learn and reason
– Complex goals
– Complex world
• Robots ideal vehicles for developing and
testing cognitive:
– Learning
– Adaptation
– Classification
Cognitive robotics
• Traditional behavior modeling approaches
problematic and untenable.
• Perception, action and the notion of symbolic
representation to be addressed in cognitive
robotics.
• Cognitive robotics views animal cognition as a
starting point for the development of robotic
information processing.
Cognitive robotics
• “Immobile” Robots and Engineering
Operations
– Robust space probes, ubiquitous computing
• Robots That Navigate
– Hallway robots, Field robots, Underwater
explorers, stunt air vehicles
• Cooperating Robots
– Cooperative Space/Air/Land/Underwater vehicles,
distributed traffic networks, smart dust.
Some applications (1)
Some applications (2)
Other examples
Outline
• Lego Mindstorms
• Simple Line Follower
• Advanced Line Follower
• Learning to follow the line
• Conclusions
The NXT Unit – an embedded system
• 64K RAM, 256K Flash
• 32-bit ARM7 microcontroller
• 100 x 64 pixel LCD graphical
display
• Sound channel with 8-bit
resolution
• Bluetooth wireless
communications
• Stores multiple programs
– Programs selectable using buttons
The NXT unit
(Motor ports)
(Sensor ports)
Motors and Sensors
NXT Motors
• Built-in rotation sensors
NXT Rotation Sensor
• Built in to motors
• Measure degrees
or rotations
• Reads + and -
• Degrees: accuracy
+/- 1
• 1 rotation =
360 degrees
Viewing Sensors
• Connect sensor
• Turn on NXT
• Choose “View”
• Select sensor type
• Select port
NXT Sound Sensor
• Sound sensor can measure in dB and dBA
– dB: in detecting standard [unadjusted]
decibels, all sounds are measured with
equal sensitivity. Thus, these sounds may
include some that are too high or too low
for the human ear to hear.
– dBA: in detecting adjusted decibels, the
sensitivity of the sensor is adapted to the
sensitivity of the human ear. In other words,
these are the sounds that your ears are able
to hear.
• Sound Sensor readings on the NXT are
displayed in percent [%]. The lower the percent
the quieter the sound.
http://mindstorms.lego.com/Overview/Sound_Sensor.aspx
NXT Ultrasonic/Distance Sensor
• Measures
distance/proximity
• Range: 0-255 cm
• Precision: +/- 3cm
• Can report in
centimeters or
inches
http://mindstorms.lego.com/Overview/Ultrasonic_Sensor.aspx
NXT Non-standard sensors:
HiTechnic.com
• Compass
• Gyroscope
• Accelerometer/tilt sensor
• Color sensor
• IRSeeker
• Prototype board with A/D converter
for the I2C bus
LEGO Mindstorms for NXT
(NXT-G)
NXT-G graphical programming
language
Based on the LabVIEW programming language G
Program by drawing a flow chart
NXT-G PC program interface
Interface elements (callouts in the screenshot): Toolbar, Workspace, Configuration Panel, Help & Navigation, Controller, Palettes, Tutorials Web Portal, Sequence Beam
Issues of the standard firmware
• Only one data type
• Unreliable Bluetooth communication
• Limited multi-tasking
• Complex motor control
• Simplistic memory management
• Not suitable for large programs
• Not suitable for development of own tools or
blocks
Other programming languages and
environments
– Java leJOS
– Microsoft Robotics Studio
– RobotC
– NXC - Not eXactly C
– NXT Logo
– Lego NXT Open source firmware and software
development kit
leJOS
• A Java Virtual Machine for NXT
• Freely available
– http://lejos.sourceforge.net/
• Replaces the NXT-G firmware
• LeJOS plug-in is available for the Eclipse free
development environment
• Faster than NXT-G
Example leJOS Program
UltrasonicSensor sonar = new UltrasonicSensor(SensorPort.S4);
Motor.A.forward();
Motor.B.forward();
while (true) {
    if (sonar.getDistance() < 25) {
        Motor.A.forward();
        Motor.B.backward();
    } else {
        Motor.A.forward();
        Motor.B.forward();
    }
}
Event-driven Control in leJOS
• The Behavior interface
– boolean takeControl()
– void action()
– void suppress()
• Arbitrator class
– Constructor gets an array of Behavior objects
• takeControl() checked for highest index first
– start() method begins event loop
Event-driven example
class Go implements Behavior {
    private UltrasonicSensor sonar =
        new UltrasonicSensor(SensorPort.S4);
    public boolean takeControl() {
        return sonar.getDistance() > 25;
    }
Event-driven example
    public void action() {
        Motor.A.forward();
        Motor.B.forward();
    }
    public void suppress() {
        Motor.A.stop();
        Motor.B.stop();
    }
}
Event-driven example
class Spin implements Behavior {
    private UltrasonicSensor sonar =
        new UltrasonicSensor(SensorPort.S4);
    public boolean takeControl() {
        return sonar.getDistance() <= 25;
    }
Event-driven example
    public void action() {
        Motor.A.forward();
        Motor.B.backward();
    }
    public void suppress() {
        Motor.A.stop();
        Motor.B.stop();
    }
}
Event-driven example
public class FindFreespace {
    public static void main(String[] a) {
        Behavior[] b = new Behavior[] {new Go(), new Spin()};
        Arbitrator arb = new Arbitrator(b);
        arb.start();
    }
}
Simple Line Follower
• Use light-sensor as a switch
• If measured value > threshold: ON state (white
surface)
• If measured value < threshold: OFF state
(black surface)
Simple Line Follower
• Robot not traveling inside the line but along
the edge
• Turning left until an “OFF” to “ON” transition
is detected
• Turning right until an “ON” to “OFF” transition
is detected
Simple Line Follower
NXTMotor rightM = new NXTMotor(MotorPort.A);
NXTMotor leftM = new NXTMotor(MotorPort.C);
ColorSensor cs = new ColorSensor(SensorPort.S2, Color.RED);
while (!Button.ESCAPE.isDown())
{
    int currentColor = cs.getLightValue();
    LCD.drawInt(currentColor, 5, 11, 3);
    if (currentColor < 30)
    {
        rightM.setPower(50);
        leftM.setPower(10);
    }
    else
    {
        rightM.setPower(10);
        leftM.setPower(50);
    }
}
Simple Line Follower
• DEMO
Advanced Line Follower
• Use light-sensor as an
Analog sensor
• Sensor value ranges between 0 and 100
• Takes the average light
detected over a small
area
Advanced Line Follower
• Subtract the current reading of the sensor
from what the sensor should be reading
– Use this value to directly control direction and
power of the wheels
• Multiply this value by a constant: how strongly should the wheels turn to correct the path?
• Add a base value so that the robot is always moving forward
Advanced Line Follower
NXTMotor rightM = new NXTMotor(MotorPort.A);
NXTMotor leftM = new NXTMotor(MotorPort.C);
int targetValue = 30;
int amplify = 7;
int targetPower = 50;
ColorSensor cs = new ColorSensor(SensorPort.S2, Color.RED);
rightM.setPower(targetPower);
leftM.setPower(targetPower);
while (!Button.ESCAPE.isDown())
{
    int currentColor = cs.getLightValue();
    int difference = currentColor - targetValue;
    int ampDiff = difference * amplify;
    int rightPower = ampDiff + targetPower;
    int leftPower = targetPower;
    rightM.setPower(rightPower);
    leftM.setPower(leftPower);
}
Advanced Line Follower
• DEMO
Learn how to follow
• Goal
– Make robots do what we want
– Minimize/eliminate programming
• Proposed Solution: Reinforcement Learning
– Specify desired behavior using rewards
– Express rewards in terms of sensor states
– Use machine learning to induce desired actions
• Target Platform
– Lego Mindstorms NXT
Example: Grid World
• A maze-like problem
– The agent lives in a grid
– Walls block the agent’s path
• Noisy movement: actions do not
always go as planned:
– 80% of the time, preferred action is
taken
(if there is no wall there)
– 10% of the time, North takes the agent
West; 10% East
– If there is a wall in the direction the
agent would have been taken, the agent
stays put
• The agent receives rewards each time
step
– Small “living” reward each step (can be
negative)
– Big rewards come at the end (good or
bad)
• Goal: maximize sum of rewards
Markov Decision Processes
• An MDP is defined by:
– A set of states s ∈ S
– A set of actions a ∈ A
– A transition function T(s,a,s’)
• Prob that a from s leads to s’
• i.e., P(s’ | s,a)
• Also called the model (or
dynamics)
– A reward function R(s, a, s’)
• Sometimes just R(s) or R(s’)
– A start state
– Maybe a terminal state
• MDPs are non-deterministic
search problems
– Reinforcement learning: MDPs
where we don’t know the
transition or reward functions
What is Markov about MDPs?
• “Markov” generally means that given the
present state, the future and the past are
independent
• For Markov decision processes, “Markov”
means:
Andrej Andreevič Markov
(1856-1922)
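The formula on the original slide is not present in the extracted text; in standard notation, the Markov property it refers to is (a reconstruction, not the author's exact wording):

P(S_{t+1} = s' \mid S_t = s_t, A_t = a_t, S_{t-1} = s_{t-1}, A_{t-1} = a_{t-1}, \dots, S_0 = s_0) = P(S_{t+1} = s' \mid S_t = s_t, A_t = a_t)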
Solving MDPs: policies
• In deterministic single-agent search problems, want an
optimal plan, or sequence of actions, from start to a goal
• In an MDP, we want an optimal policy π*: S → A
– A policy π gives an action for each state
– An optimal policy maximizes expected utility if followed
– An explicit policy defines a reflex agent
Optimal policy when
R(s, a, s’) = -0.03 for all
non-terminals s
Example Optimal Policies
[Four gridworld panels with living rewards R(s) = -0.01, R(s) = -0.03, R(s) = -0.4 and R(s) = -2.0]
MDP Search Trees
• Each MDP state gives an expectimax-like search tree
[Diagram: from a state s, action branches a lead to q-states (s, a), which lead to successor states s’]
• s is a state, (s, a) is a q-state
• (s,a,s’) is called a transition, with T(s,a,s’) = P(s’|s,a) and reward R(s,a,s’)
Utilities of Sequences
• In order to formalize
optimality of a policy,
need to understand
utilities of sequences of
rewards
• What preferences should
an agent have over
reward sequences?
• More or less?
– [1,2,2] or [2,3,4]
• Now or later?
– [1,0,0] or [0,0,1]
Discounting
• It’s reasonable to maximize the sum of
rewards
• It’s also reasonable to prefer rewards now to
rewards later
• One solution: values of rewards decay exponentially
Discounting
• Typically discount rewards by γ < 1 each time step
– Sooner rewards have higher
utility than later rewards
– Also helps the algorithms
converge
• Example: discount of 0.5:
– U([1,2,3])=1*1+0.5*2+0.25*3
– U([1,2,3])<U([3,2,1])
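The slide's example can be written out explicitly (a reconstruction, assuming the standard discounted-sum definition):

U([1,2,3]) = 1 + 0.5 \cdot 2 + 0.25 \cdot 3 = 2.75
U([3,2,1]) = 3 + 0.5 \cdot 2 + 0.25 \cdot 1 = 4.25
\Rightarrow U([1,2,3]) < U([3,2,1])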
Stationary Preferences
• Theorem: if we assume stationary preferences:
• Then: there are only two ways to define utilities
– Additive utility:
– Discounted utility:
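The two utility forms referenced above are omitted from the extracted slide text; their standard definitions are:

Additive utility:    U([r_0, r_1, r_2, \dots]) = r_0 + r_1 + r_2 + \dots
Discounted utility:  U([r_0, r_1, r_2, \dots]) = r_0 + \gamma r_1 + \gamma^2 r_2 + \dots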
Quiz: Discounting
• Given:
– Actions: East, West and Exit (available in exit states a, e)
– Transitions: deterministic
• Quiz 1: For =1, what is the optimal policy?
• Quiz 2: For =0.1, what is the optimal policy?
• Quiz 3: For which  are East and West equally good
when in state d?
[Figure: states a, b, c, d, e in a row; exiting at a gives reward 10, exiting at e gives reward 1]
Infinite Utilities?!
• Problem: infinite state sequences have infinite rewards
• Solutions:
– Finite horizon:
• Terminate episodes after a fixed T steps (e.g. life)
• Gives nonstationary policies (π depends on time left)
– Discounting: for 0 < γ < 1
• Smaller γ means smaller “horizon” – shorter term focus
• Absorbing state: guarantee that for every policy, a terminal
state will eventually be reached
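With discounting, the usual geometric-series bound makes the fix explicit (a standard fact, not shown in the extracted slide text): if rewards are bounded by R_max and 0 < γ < 1, then

\left| \sum_{t=0}^{\infty} \gamma^t r_t \right| \le \sum_{t=0}^{\infty} \gamma^t R_{\max} = \frac{R_{\max}}{1-\gamma}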
Recap: Defining MDPs
• Markov decision processes:
– States S
– Start state s0
– Actions A
– Transitions P(s’|s,a) (or T(s,a,s’))
– Rewards R(s,a,s’) (and discount γ)
• MDP quantities so far:
– Policy = Choice of action for each state
– Utility (or return) = sum of discounted rewards
Optimal Quantities
• Why? Optimal values define
optimal policies!
• Define the value (utility) of a
state s:
V*(s) = expected utility starting in s
and acting optimally
• Define the value (utility) of a
q-state (s,a):
Q*(s,a) = expected utility starting in
s, taking action a and thereafter
acting optimally
• Define the optimal policy:
π*(s) = optimal action from state s
Gridworld V*(s)
• Optimal value function V*(s)
Gridworld Q*(s,a)
• Optimal Q function Q*(s,a)
Values of States
• Fundamental operation: compute the value of
a state
– Expected utility under optimal action
– Average sum of (discounted) rewards
• Recursive definition of value
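The recursive definition referenced on this slide is the Bellman optimality equation; in standard form (reconstructed, not in the extracted text):

V^*(s) = \max_a Q^*(s,a)
Q^*(s,a) = \sum_{s'} T(s,a,s') \left[ R(s,a,s') + \gamma V^*(s') \right]
V^*(s) = \max_a \sum_{s'} T(s,a,s') \left[ R(s,a,s') + \gamma V^*(s') \right]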
Why Not Search Trees?
• We’re doing way too much work with
search trees
• Problem: States are repeated
– Idea: Only compute needed quantities once
• Problem: Tree goes on forever
– Idea: Do depth-limited computations, but with increasing depths until change is small
– Note: deep parts of the tree eventually don’t
matter if γ < 1
Time-limited Values
• Key idea: time-limited values
• Define Vk(s) to be the optimal value of s if the
game ends in k more time steps
– Equivalently, it’s what a depth-k search tree would
give from s
[Figure: gridworld value functions Vk for k = 0, 1, 2, 3, 4, 5, 6, 7, 100]
Value Iteration
• Problems with the recursive computation:
– Have to keep all the Vk*(s) around all the time
– Don’t know which depth k(s) to ask for when planning
• Solution: value iteration
– Calculate values for all states, bottom-up
– Keep increasing k until convergence
Value Iteration
• Idea:
– Start with V0*(s) = 0, which we know is right (why?)
– Given Vi*, calculate the values for all states for depth i+1:
– This is called a value update or Bellman update
– Repeat until convergence
• Complexity of each iteration: O(S²A)
• Theorem: will converge to unique optimal values
– Basic idea: approximations get refined towards optimal values
– Policy may converge long before values do
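The value update (Bellman update) referenced above, in its standard form (a reconstruction of the missing formula):

V_{i+1}(s) \leftarrow \max_a \sum_{s'} T(s,a,s') \left[ R(s,a,s') + \gamma V_i(s') \right]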
Practice: Computing Actions
• Which action should we choose from state s:
– Given optimal values V?
– Given optimal q-values Q?
– Lesson: actions are easier to select from Q’s!
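The “lesson” can be made explicit with the standard expressions (not in the extracted slide text): from Q-values the action is a plain argmax, while from V-values a one-step look-ahead with the model is needed:

\pi^*(s) = \arg\max_a Q^*(s,a)
\pi^*(s) = \arg\max_a \sum_{s'} T(s,a,s') \left[ R(s,a,s') + \gamma V^*(s') \right]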
Utilities for Fixed Policies
• Another basic operation: compute the
utility of a state s under a fixed (generally non-optimal) policy π
• Define the utility of a state s, under a fixed policy π:
Vπ(s) = expected total discounted rewards (return) starting in s and following π
• Recursive relation (one-step look-ahead
/ Bellman equation):
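The one-step look-ahead / Bellman equation for a fixed policy, omitted from the extracted text, in standard form:

V^{\pi}(s) = \sum_{s'} T(s,\pi(s),s') \left[ R(s,\pi(s),s') + \gamma V^{\pi}(s') \right]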
Policy Evaluation
• How do we calculate the V’s for a fixed policy π?
• Idea one: modify Bellman updates
• Efficiency: O(S²) per iteration
• Idea two: without the maxes it’s just a linear system,
solve with Matlab (or whatever)
Policy Iteration
• Problem with value iteration:
– Considering all actions each iteration is slow: takes |A| times longer than
policy evaluation
– But policy doesn’t change each iteration, time wasted
• Alternative to value iteration:
– Step 1: Policy evaluation: calculate utilities for a fixed policy (not optimal
utilities!) until convergence (fast)
– Step 2: Policy improvement: update policy using one-step look-ahead with
resulting converged (but not optimal!) utilities (slow but infrequent)
– Repeat steps until policy converges
• This is policy iteration
– It’s still optimal!
– Can converge faster under some conditions
Policy Iteration
• Policy evaluation: with fixed current policy π, find values with
simplified Bellman updates:
– Iterate until values converge
• Policy improvement: with fixed utilities, find the best action
according to one-step look-ahead
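The two updates referenced on this slide, in standard notation (a reconstruction of the missing formulas):

Policy evaluation (no max, fixed π):
V^{\pi}_{i+1}(s) \leftarrow \sum_{s'} T(s,\pi(s),s') \left[ R(s,\pi(s),s') + \gamma V^{\pi}_i(s') \right]

Policy improvement (one-step look-ahead):
\pi_{new}(s) = \arg\max_a \sum_{s'} T(s,a,s') \left[ R(s,a,s') + \gamma V^{\pi}(s') \right]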
Comparison
• In value iteration:
– Every pass (or “backup”) updates both utilities (explicitly, based on
current utilities) and policy (possibly implicitly, based on current
policy)
• In policy iteration:
– Several passes to update utilities with frozen policy
– Occasional passes to update policies
• Hybrid approaches (asynchronous policy iteration):
– Any sequences of partial updates to either policy entries or utilities
will converge if every state is visited infinitely often
Reinforcement Learning
• Basic idea:
– Receive feedback in the form of rewards
– Agent’s utility is defined by the reward function
– Must learn to act so as to maximize expected rewards
– All learning is based on observed samples of outcomes
Reinforcement Learning
• Reinforcement learning:
– Still assume an MDP:
• A set of states s  S
• A set of actions (per state) A
• A model T(s,a,s’)
• A reward function R(s,a,s’)
– Still looking for a policy (s)
– New twist: don’t know T or R
• I.e. don’t know which states are good or what the actions do
• Must actually try actions and states out to learn
Model-Based Learning
• Model-Based Idea:
– Learn the model empirically through experience
– Solve for values as if the learned model were correct
• Step 1: Learn empirical MDP model
– Count outcomes for each s,a
– Normalize to give estimate of T(s,a,s’)
– Discover R(s,a,s’) when we experience (s,a,s’)
• Step 2: Solve the learned MDP
– Iterative policy evaluation, for example
Example: Model-Based Learning
• Episodes:
[Gridworld figure with terminal rewards +100 and -100; γ = 1]
T(<3,3>, right, <4,3>) = 1 / 3
T(<2,3>, right, <3,3>) = 2 / 2
(1,1) up -1
(1,2) up -1
(1,2) up -1
(1,3) right -1
(2,3) right -1
(3,3) right -1
(3,2) up -1
(3,3) right -1
(4,3) exit +100
(done)
(1,1) up -1
(1,2) up -1
(1,3) right -1
(2,3) right -1
(3,3) right -1
(3,2) up -1
(4,2) exit -100
(done)
Model-Free Learning
• Want to compute an expectation weighted by P(x):
• Model-based: estimate P(x) from samples, compute expectation
• Model-free: estimate expectation directly from samples
• Why does this work? Because samples appear with the right frequencies!
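The expectation referred to above can be written with the standard sampling identity (reconstructed; the slide's formula is not in the extracted text):

E_{x \sim P}[f(x)] = \sum_x P(x) f(x) \approx \frac{1}{N} \sum_{i=1}^{N} f(x_i), \quad x_i \sim P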
Example: Direct Estimation
• Episodes:
(1,1) up -1
(1,2) up -1
(1,2) up -1
(1,3) right -1
(2,3) right -1
(3,3) right -1
(3,2) up -1
(3,3) right -1
(4,3) exit +100
(done)
(1,1) up -1
(1,2) up -1
(1,3) right -1
(2,3) right -1
(3,3) right -1
(3,2) up -1
(4,2) exit -100
(done)
V(2,3) ~ (96 + -103) / 2 = -3.5
V(3,3) ~ (99 + 97 + -102) / 3 = 31.3
γ = 1, R = -1 (terminal rewards +100 and -100)
Sample-Based Policy Evaluation?
• Who needs T and R? Approximate the
expectation with samples (drawn from T!)
Almost! But we only
actually make progress
when we move to i+1.
Temporal-Difference Learning
• Big idea: learn from every experience!
– Update V(s) each time we experience (s,a,s’,r)
– Likely s’ will contribute updates more often
• Temporal difference learning
– Policy still fixed!
– Move values toward value of whatever successor
occurs: running average!
Sample of V(s):
Update to V(s):
Same update:
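The formulas for these three lines are omitted in the extracted text; in the usual TD(0) notation they are (a reconstruction):

Sample of V(s):  sample = R(s,\pi(s),s') + \gamma V^{\pi}(s')
Update to V(s):  V^{\pi}(s) \leftarrow (1-\alpha)\, V^{\pi}(s) + \alpha \cdot sample
Same update:     V^{\pi}(s) \leftarrow V^{\pi}(s) + \alpha \left( sample - V^{\pi}(s) \right)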
Exponential Moving Average
• Exponential moving average
– Makes recent samples more important
– Forgets about the past (distant past values were wrong anyway)
– Easy to compute from the running average
• Decreasing learning rate can give converging averages
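The exponential moving average referenced here has the standard recursive form (reconstructed):

\bar{x}_n = (1-\alpha)\,\bar{x}_{n-1} + \alpha\, x_n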
Example: TD Policy Evaluation
Take γ = 1, α = 0.5
(1,1) up -1
(1,2) up -1
(1,2) up -1
(1,3) right -1
(2,3) right -1
(3,3) right -1
(3,2) up -1
(3,3) right -1
(4,3) exit +100
(done)
(1,1) up -1
(1,2) up -1
(1,3) right -1
(2,3) right -1
(3,3) right -1
(3,2) up -1
(4,2) exit -100
(done)
Problems with TD Value Learning
• TD value learning is a model-free way to do
policy evaluation
• However, if we want to turn values into a
(new) policy, we’re sunk:
• Idea: learn Q-values directly
• Makes action selection model-free too!
Active Learning
• Full reinforcement learning
– You don’t know the transitions T(s,a,s’)
– You don’t know the rewards R(s,a,s’)
– You can choose any actions you like
– Goal: learn the optimal policy
– … what value iteration did!
• In this case:
– Learner makes choices!
– Fundamental tradeoff: exploration vs. exploitation
– This is NOT offline planning! You actually take actions in the world and
find out what happens…
Detour: Q-Value Iteration
• Value iteration: find successive approx optimal values
– Start with V0*(s) = 0, which we know is right (why?)
– Given Vi*, calculate the values for all states for depth i+1:
• But Q-values are more useful!
– Start with Q0*(s,a) = 0, which we know is right (why?)
– Given Qi*, calculate the q-values for all q-states for depth i+1:
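The q-value update referenced here, in standard form (a reconstruction of the missing formula):

Q_{i+1}(s,a) \leftarrow \sum_{s'} T(s,a,s') \left[ R(s,a,s') + \gamma \max_{a'} Q_i(s',a') \right]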
Q-Learning
• Q-Learning: sample-based Q-value iteration
• Learn Q*(s,a) values
– Receive a sample (s,a,s’,r)
– Consider your old estimate:
– Consider your new sample estimate:
– Incorporate the new estimate into a running average:
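The sample-based update referenced above, in standard notation (reconstructed; the slide's formulas are not in the extracted text):

sample = r + \gamma \max_{a'} Q(s',a')
Q(s,a) \leftarrow (1-\alpha)\, Q(s,a) + \alpha \cdot sample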
Q-Learning Properties
• Amazing result: Q-learning converges to optimal policy
– If you explore enough
– If you make the learning rate small enough
– … but not decrease it too quickly!
– Basically doesn’t matter how you select actions (!)
• Neat property: off-policy learning
– learn optimal policy without following it (some caveats)
Q-Learning
• Discrete sets of states and actions
– States form an N-dimensional array
• Unfolded into one dimension in practice
– Individual actions selected on each time step
• Q-values
– 2D array (indexed by state and action)
– Expected rewards for performing actions
Q-Learning
• Table of expected rewards (“Q-values”)
– Indexed by state and action
• Algorithm steps
– Calculate state index from sensor values
– Calculate the reward
– Update previous Q-value
– Select and perform an action
• Q(s,a) ← (1 - α) Q(s,a) + α (r + γ maxa’ Q(s’,a’))
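A minimal Java sketch of the tabular update and action selection described above; this is not the author's code, and the Q-table layout, the epsilon-greedy selection and the parameter values are assumptions for illustration:

import java.util.Random;

public class QTable {
    private final double[][] q;         // Q-values indexed by [state][action]
    private final double alpha = 0.5;   // learning rate (assumed value)
    private final double gamma = 0.9;   // discount factor (assumed value)
    private final double epsilon = 0.1; // exploration rate (assumed value)
    private final Random rnd = new Random();

    public QTable(int numStates, int numActions) {
        q = new double[numStates][numActions];
    }

    // Q(s,a) = (1 - alpha) * Q(s,a) + alpha * (r + gamma * max_a' Q(s',a'))
    public void update(int s, int a, double r, int sPrime) {
        double best = q[sPrime][0];
        for (double v : q[sPrime]) best = Math.max(best, v);
        q[s][a] = (1 - alpha) * q[s][a] + alpha * (r + gamma * best);
    }

    // Epsilon-greedy: usually exploit the best known action, sometimes explore
    public int selectAction(int s) {
        if (rnd.nextDouble() < epsilon) return rnd.nextInt(q[s].length);
        int best = 0;
        for (int a = 1; a < q[s].length; a++) if (q[s][a] > q[s][best]) best = a;
        return best;
    }
}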
Q-Learning and Robots
• Certain sensors provide continuous values
– Sonar
– Motor encoders
• Q-Learning requires discrete inputs
– Group continuous values into discrete “buckets”
– [Mahadevan and Connell, 1992]
• Q-Learning produces discrete actions
– Forward
– Back-left/Back-right
Creating Discrete Inputs
• Basic approach
– Discretize continuous values into sets
– Combine each discretized tuple into a single index
• Another approach
– Self-Organizing Map
– Induces a discretization of continuous values
– [Touzet 1997] [Smith 2002]
Q-Learning Main Loop
• Select action
• Change motor speeds
• Inspect sensor values
– Calculate updated state
– Calculate reward
• Update Q values
• Set “old state” to be the updated state
Calculating the State (Motors)
• For each motor:
– 100% power
– 93.75% power
– 87.5% power
• Six motor states
Calculating the State (Sensors)
• No disparity: STRAIGHT
• Left/Right disparity
– 1-5: LEFT_1, RIGHT_1
– 6-12: LEFT_2, RIGHT_2
– 13+: LEFT_3, RIGHT_3
• Seven total sensor states
• 63 states overall
Calculating Reward
• No disparity => highest value
• Reward decreases with increasing disparity
Action Set for Line Follow
• MAINTAIN
– Both motors unchanged
• UP_LEFT, UP_RIGHT
– Accelerate motor by one motor state
• DOWN_LEFT, DOWN_RIGHT
– Decelerate motor by one motor state
• Five total actions
Q-learning line follower
Conclusions
• Lego Mindstorms NXT as a convenient platform for «cognitive robotics»
• Executing a task with «rules»
• Learning how to execute a task
– MDP
– Reinforcement learning
• Q-learning applied to Lego Mindstorms
Thank you!
• Questions?
Contenu connexe

En vedette

Ice summer school-5-7-2013
Ice summer school-5-7-2013Ice summer school-5-7-2013
Ice summer school-5-7-2013Jun Hu
 
Ice ss2013
Ice ss2013Ice ss2013
Ice ss2013Jun Hu
 
Printversion ice summer school 1 7-2013.key
Printversion ice summer school 1 7-2013.keyPrintversion ice summer school 1 7-2013.key
Printversion ice summer school 1 7-2013.keyJun Hu
 
Dice01 re life-ict-system-smartdiagn-pdw-27june2013
Dice01 re life-ict-system-smartdiagn-pdw-27june2013Dice01 re life-ict-system-smartdiagn-pdw-27june2013
Dice01 re life-ict-system-smartdiagn-pdw-27june2013Jun Hu
 
Engineering natural lighting experiences
Engineering natural lighting experiencesEngineering natural lighting experiences
Engineering natural lighting experiencesJun Hu
 
Sociale media voor fotografen: 4 basics en 10 quickwins
Sociale media voor fotografen: 4 basics en 10 quickwinsSociale media voor fotografen: 4 basics en 10 quickwins
Sociale media voor fotografen: 4 basics en 10 quickwinssimongryspeert
 
Mexicanos part dos
Mexicanos part dosMexicanos part dos
Mexicanos part dosysabelmedina
 
MEG Primary Injection Project
MEG Primary Injection ProjectMEG Primary Injection Project
MEG Primary Injection ProjectFrancesco Legname
 
CERTAMEN DE NAVIDAD. CIBER@AULA NAVALAGAMELLA
CERTAMEN DE NAVIDAD. CIBER@AULA NAVALAGAMELLACERTAMEN DE NAVIDAD. CIBER@AULA NAVALAGAMELLA
CERTAMEN DE NAVIDAD. CIBER@AULA NAVALAGAMELLAciberaulacso
 
Facebook voor bestuurders
Facebook voor bestuurdersFacebook voor bestuurders
Facebook voor bestuurderssimongryspeert
 
Semaforo Audiovisual
Semaforo AudiovisualSemaforo Audiovisual
Semaforo AudiovisualMurilo Santos
 

En vedette (14)

Ice summer school-5-7-2013
Ice summer school-5-7-2013Ice summer school-5-7-2013
Ice summer school-5-7-2013
 
Ice ss2013
Ice ss2013Ice ss2013
Ice ss2013
 
Printversion ice summer school 1 7-2013.key
Printversion ice summer school 1 7-2013.keyPrintversion ice summer school 1 7-2013.key
Printversion ice summer school 1 7-2013.key
 
Dice01 re life-ict-system-smartdiagn-pdw-27june2013
Dice01 re life-ict-system-smartdiagn-pdw-27june2013Dice01 re life-ict-system-smartdiagn-pdw-27june2013
Dice01 re life-ict-system-smartdiagn-pdw-27june2013
 
Engineering natural lighting experiences
Engineering natural lighting experiencesEngineering natural lighting experiences
Engineering natural lighting experiences
 
Week13
Week13Week13
Week13
 
Sociale media voor fotografen: 4 basics en 10 quickwins
Sociale media voor fotografen: 4 basics en 10 quickwinsSociale media voor fotografen: 4 basics en 10 quickwins
Sociale media voor fotografen: 4 basics en 10 quickwins
 
Mexicanos part dos
Mexicanos part dosMexicanos part dos
Mexicanos part dos
 
Week14
Week14Week14
Week14
 
MEG Primary Injection Project
MEG Primary Injection ProjectMEG Primary Injection Project
MEG Primary Injection Project
 
CERTAMEN DE NAVIDAD. CIBER@AULA NAVALAGAMELLA
CERTAMEN DE NAVIDAD. CIBER@AULA NAVALAGAMELLACERTAMEN DE NAVIDAD. CIBER@AULA NAVALAGAMELLA
CERTAMEN DE NAVIDAD. CIBER@AULA NAVALAGAMELLA
 
Facebook voor bestuurders
Facebook voor bestuurdersFacebook voor bestuurders
Facebook voor bestuurders
 
Semaforo Audiovisual
Semaforo AudiovisualSemaforo Audiovisual
Semaforo Audiovisual
 
Vek.od.ua Лидерство Доленко
Vek.od.ua Лидерство ДоленкоVek.od.ua Лидерство Доленко
Vek.od.ua Лидерство Доленко
 

Similaire à Lucio marcenaro tue summer_school

Computer-Vision based Centralized Multi-agent System on Matlab and Arduino Du...
Computer-Vision based Centralized Multi-agent System on Matlab and Arduino Du...Computer-Vision based Centralized Multi-agent System on Matlab and Arduino Du...
Computer-Vision based Centralized Multi-agent System on Matlab and Arduino Du...Aritra Sarkar
 
Autonomous robotics based on simple sensor inputs.
Autonomous robotics based on simplesensor inputs.Autonomous robotics based on simplesensor inputs.
Autonomous robotics based on simple sensor inputs. sathish sak
 
SERENE 2014 School: Daniel varro serene2014_school
SERENE 2014 School: Daniel varro serene2014_schoolSERENE 2014 School: Daniel varro serene2014_school
SERENE 2014 School: Daniel varro serene2014_schoolHenry Muccini
 
SERENE 2014 School: Incremental Model Queries over the Cloud
SERENE 2014 School: Incremental Model Queries over the CloudSERENE 2014 School: Incremental Model Queries over the Cloud
SERENE 2014 School: Incremental Model Queries over the CloudSERENEWorkshop
 
final report-4
final report-4final report-4
final report-4Zhuo Li
 
final presentation from William, Amy and Alex
final presentation from William, Amy and Alexfinal presentation from William, Amy and Alex
final presentation from William, Amy and AlexZiwei Zhu
 
Building Deep Reinforcement Learning Applications on Apache Spark with Analyt...
Building Deep Reinforcement Learning Applications on Apache Spark with Analyt...Building Deep Reinforcement Learning Applications on Apache Spark with Analyt...
Building Deep Reinforcement Learning Applications on Apache Spark with Analyt...Databricks
 
Efficient and Advanced Omniscient Debugging for xDSMLs (SLE 2015)
Efficient and Advanced Omniscient Debugging for xDSMLs (SLE 2015)Efficient and Advanced Omniscient Debugging for xDSMLs (SLE 2015)
Efficient and Advanced Omniscient Debugging for xDSMLs (SLE 2015)Benoit Combemale
 
Topology hiding Multipath Routing Protocol in MANET
Topology hiding Multipath Routing Protocol in MANETTopology hiding Multipath Routing Protocol in MANET
Topology hiding Multipath Routing Protocol in MANETAkshay Phalke
 
Megamodeling of Complex, Distributed, Heterogeneous CPS Systems
Megamodeling of Complex, Distributed, Heterogeneous CPS SystemsMegamodeling of Complex, Distributed, Heterogeneous CPS Systems
Megamodeling of Complex, Distributed, Heterogeneous CPS SystemsEugenio Villar
 
PRM-RL: Long-range Robotics Navigation Tasks by Combining Reinforcement Learn...
PRM-RL: Long-range Robotics Navigation Tasks by Combining Reinforcement Learn...PRM-RL: Long-range Robotics Navigation Tasks by Combining Reinforcement Learn...
PRM-RL: Long-range Robotics Navigation Tasks by Combining Reinforcement Learn...Dongmin Lee
 
HMM based Automatic Arabic Sign Language Translator using
HMM based Automatic Arabic Sign Language Translator usingHMM based Automatic Arabic Sign Language Translator using
HMM based Automatic Arabic Sign Language Translator usingعمر أمين
 
Cloudera Data Science Challenge
Cloudera Data Science ChallengeCloudera Data Science Challenge
Cloudera Data Science ChallengeMark Nichols, P.E.
 
Data Science Challenge presentation given to the CinBITools Meetup Group
Data Science Challenge presentation given to the CinBITools Meetup GroupData Science Challenge presentation given to the CinBITools Meetup Group
Data Science Challenge presentation given to the CinBITools Meetup GroupDoug Needham
 
Spark streaming for the internet of flying things 20160510.pptx
Spark streaming for the internet of flying things 20160510.pptxSpark streaming for the internet of flying things 20160510.pptx
Spark streaming for the internet of flying things 20160510.pptxPablo Francisco Pérez Hidalgo
 

Similaire à Lucio marcenaro tue summer_school (20)

Computer-Vision based Centralized Multi-agent System on Matlab and Arduino Du...
Computer-Vision based Centralized Multi-agent System on Matlab and Arduino Du...Computer-Vision based Centralized Multi-agent System on Matlab and Arduino Du...
Computer-Vision based Centralized Multi-agent System on Matlab and Arduino Du...
 
Autonomous robotics based on simple sensor inputs.
Autonomous robotics based on simplesensor inputs.Autonomous robotics based on simplesensor inputs.
Autonomous robotics based on simple sensor inputs.
 
AI Robotics
AI RoboticsAI Robotics
AI Robotics
 
Smart Room Gesture Control
Smart Room Gesture ControlSmart Room Gesture Control
Smart Room Gesture Control
 
SERENE 2014 School: Daniel varro serene2014_school
SERENE 2014 School: Daniel varro serene2014_schoolSERENE 2014 School: Daniel varro serene2014_school
SERENE 2014 School: Daniel varro serene2014_school
 
SERENE 2014 School: Incremental Model Queries over the Cloud
SERENE 2014 School: Incremental Model Queries over the CloudSERENE 2014 School: Incremental Model Queries over the Cloud
SERENE 2014 School: Incremental Model Queries over the Cloud
 
Dealing with the need for Infrastructural Support in Ambient Intelligence
Dealing with the need for Infrastructural Support in Ambient IntelligenceDealing with the need for Infrastructural Support in Ambient Intelligence
Dealing with the need for Infrastructural Support in Ambient Intelligence
 
Prestentation
PrestentationPrestentation
Prestentation
 
final report-4
final report-4final report-4
final report-4
 
final presentation from William, Amy and Alex
final presentation from William, Amy and Alexfinal presentation from William, Amy and Alex
final presentation from William, Amy and Alex
 
Building Deep Reinforcement Learning Applications on Apache Spark with Analyt...
Building Deep Reinforcement Learning Applications on Apache Spark with Analyt...Building Deep Reinforcement Learning Applications on Apache Spark with Analyt...
Building Deep Reinforcement Learning Applications on Apache Spark with Analyt...
 
Efficient and Advanced Omniscient Debugging for xDSMLs (SLE 2015)
Efficient and Advanced Omniscient Debugging for xDSMLs (SLE 2015)Efficient and Advanced Omniscient Debugging for xDSMLs (SLE 2015)
Efficient and Advanced Omniscient Debugging for xDSMLs (SLE 2015)
 
Topology hiding Multipath Routing Protocol in MANET
Topology hiding Multipath Routing Protocol in MANETTopology hiding Multipath Routing Protocol in MANET
Topology hiding Multipath Routing Protocol in MANET
 
Megamodeling of Complex, Distributed, Heterogeneous CPS Systems
Megamodeling of Complex, Distributed, Heterogeneous CPS SystemsMegamodeling of Complex, Distributed, Heterogeneous CPS Systems
Megamodeling of Complex, Distributed, Heterogeneous CPS Systems
 
Angular and Deep Learning
Angular and Deep LearningAngular and Deep Learning
Angular and Deep Learning
 
PRM-RL: Long-range Robotics Navigation Tasks by Combining Reinforcement Learn...
PRM-RL: Long-range Robotics Navigation Tasks by Combining Reinforcement Learn...PRM-RL: Long-range Robotics Navigation Tasks by Combining Reinforcement Learn...
PRM-RL: Long-range Robotics Navigation Tasks by Combining Reinforcement Learn...
 
HMM based Automatic Arabic Sign Language Translator using
HMM based Automatic Arabic Sign Language Translator usingHMM based Automatic Arabic Sign Language Translator using
HMM based Automatic Arabic Sign Language Translator using
 
Cloudera Data Science Challenge
Cloudera Data Science ChallengeCloudera Data Science Challenge
Cloudera Data Science Challenge
 
Data Science Challenge presentation given to the CinBITools Meetup Group
Data Science Challenge presentation given to the CinBITools Meetup GroupData Science Challenge presentation given to the CinBITools Meetup Group
Data Science Challenge presentation given to the CinBITools Meetup Group
 
Spark streaming for the internet of flying things 20160510.pptx
Spark streaming for the internet of flying things 20160510.pptxSpark streaming for the internet of flying things 20160510.pptx
Spark streaming for the internet of flying things 20160510.pptx
 

Plus de Jun Hu

IoT in the City
IoT in the CityIoT in the City
IoT in the CityJun Hu
 
201812 design research on social cyber-physical systems
201812 design research on social cyber-physical systems201812 design research on social cyber-physical systems
201812 design research on social cyber-physical systemsJun Hu
 
201811 csc-tue-ph d
201811 csc-tue-ph d201811 csc-tue-ph d
201811 csc-tue-ph dJun Hu
 
201812 tue id briefing short
201812 tue id briefing short201812 tue id briefing short
201812 tue id briefing shortJun Hu
 
Connectedness for enriching elderly care: Interactive Installation & System ...
Connectedness for enriching elderly care:  Interactive Installation & System ...Connectedness for enriching elderly care:  Interactive Installation & System ...
Connectedness for enriching elderly care: Interactive Installation & System ...Jun Hu
 
Closer to Nature: Interactive Systems for Seniors with Dementia in Long-term ...
Closer to Nature: Interactive Systems for Seniors with Dementia in Long-term ...Closer to Nature: Interactive Systems for Seniors with Dementia in Long-term ...
Closer to Nature: Interactive Systems for Seniors with Dementia in Long-term ...Jun Hu
 
Internet of Things: Social Applications
Internet of Things: Social ApplicationsInternet of Things: Social Applications
Internet of Things: Social ApplicationsJun Hu
 
Social Things
Social ThingsSocial Things
Social ThingsJun Hu
 
Participatory Media Arts: A TU/e DESIS Lab project
Participatory Media Arts: A TU/e DESIS Lab projectParticipatory Media Arts: A TU/e DESIS Lab project
Participatory Media Arts: A TU/e DESIS Lab projectJun Hu
 
Interaction and Fusion
Interaction and FusionInteraction and Fusion
Interaction and FusionJun Hu
 
Traditional Dynamic Arts and Interaction Design
Traditional Dynamic Arts and Interaction DesignTraditional Dynamic Arts and Interaction Design
Traditional Dynamic Arts and Interaction DesignJun Hu
 
How to do PhD at TU/e
How to do PhD at TU/eHow to do PhD at TU/e
How to do PhD at TU/eJun Hu
 
Publishing design
Publishing designPublishing design
Publishing designJun Hu
 
Alan young presentation
Alan young presentationAlan young presentation
Alan young presentationJun Hu
 
Desform2013 grip pdf_optim
Desform2013 grip pdf_optimDesform2013 grip pdf_optim
Desform2013 grip pdf_optimJun Hu
 
Elements for Interaction Design in Public Spaces: Learning from Traditional D...
Elements for Interaction Design in Public Spaces: Learning from Traditional D...Elements for Interaction Design in Public Spaces: Learning from Traditional D...
Elements for Interaction Design in Public Spaces: Learning from Traditional D...Jun Hu
 
De s form_presentation
De s form_presentationDe s form_presentation
De s form_presentationJun Hu
 
De s form2013_wuxi_steffen.ppt
De s form2013_wuxi_steffen.pptDe s form2013_wuxi_steffen.ppt
De s form2013_wuxi_steffen.pptJun Hu
 
Presentation experio des form 23 09-2013
Presentation experio des form 23 09-2013Presentation experio des form 23 09-2013
Presentation experio des form 23 09-2013Jun Hu
 
Frohlich framing4
Frohlich framing4Frohlich framing4
Frohlich framing4Jun Hu
 

Plus de Jun Hu (20)

IoT in the City
IoT in the CityIoT in the City
IoT in the City
 
201812 design research on social cyber-physical systems
201812 design research on social cyber-physical systems201812 design research on social cyber-physical systems
201812 design research on social cyber-physical systems
 
201811 csc-tue-ph d
201811 csc-tue-ph d201811 csc-tue-ph d
201811 csc-tue-ph d
 
201812 tue id briefing short
201812 tue id briefing short201812 tue id briefing short
201812 tue id briefing short
 
Connectedness for enriching elderly care: Interactive Installation & System ...
Connectedness for enriching elderly care:  Interactive Installation & System ...Connectedness for enriching elderly care:  Interactive Installation & System ...
Connectedness for enriching elderly care: Interactive Installation & System ...
 
Closer to Nature: Interactive Systems for Seniors with Dementia in Long-term ...
Closer to Nature: Interactive Systems for Seniors with Dementia in Long-term ...Closer to Nature: Interactive Systems for Seniors with Dementia in Long-term ...
Closer to Nature: Interactive Systems for Seniors with Dementia in Long-term ...
 
Internet of Things: Social Applications
Internet of Things: Social ApplicationsInternet of Things: Social Applications
Internet of Things: Social Applications
 
Social Things
Social ThingsSocial Things
Social Things
 
Participatory Media Arts: A TU/e DESIS Lab project
Participatory Media Arts: A TU/e DESIS Lab projectParticipatory Media Arts: A TU/e DESIS Lab project
Participatory Media Arts: A TU/e DESIS Lab project
 
Interaction and Fusion
Interaction and FusionInteraction and Fusion
Interaction and Fusion
 
Traditional Dynamic Arts and Interaction Design
Traditional Dynamic Arts and Interaction DesignTraditional Dynamic Arts and Interaction Design
Traditional Dynamic Arts and Interaction Design
 
How to do PhD at TU/e
How to do PhD at TU/eHow to do PhD at TU/e
How to do PhD at TU/e
 
Publishing design
Publishing designPublishing design
Publishing design
 
Alan young presentation
Alan young presentationAlan young presentation
Alan young presentation
 
Desform2013 grip pdf_optim
Desform2013 grip pdf_optimDesform2013 grip pdf_optim
Desform2013 grip pdf_optim
 
Elements for Interaction Design in Public Spaces: Learning from Traditional D...
Elements for Interaction Design in Public Spaces: Learning from Traditional D...Elements for Interaction Design in Public Spaces: Learning from Traditional D...
Elements for Interaction Design in Public Spaces: Learning from Traditional D...
 
De s form_presentation
De s form_presentationDe s form_presentation
De s form_presentation
 
De s form2013_wuxi_steffen.ppt
De s form2013_wuxi_steffen.pptDe s form2013_wuxi_steffen.ppt
De s form2013_wuxi_steffen.ppt
 
Presentation experio des form 23 09-2013
Presentation experio des form 23 09-2013Presentation experio des form 23 09-2013
Presentation experio des form 23 09-2013
 
Frohlich framing4
Frohlich framing4Frohlich framing4
Frohlich framing4
 

Dernier

Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 

Dernier (20)

Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 

Lucio marcenaro tue summer_school

  • 1. An introduction to cognitive robotics EMJD ICE Summer School - 2013 Lucio Marcenaro – University of Genova (ITALY)
  • 2. Cognitive robotics? • Robots with intelligent behavior – Learn and reason – Complex goals – Complex world • Robots ideal vehicles for developing and testing cognitive: – Learning – Adaptation – Classification
  • 3. Cognitive robotics • Traditional behavior modeling approaches problematic and untenable. • Perception, action and the notion of symbolic representation to be addressed in cognitive robotics. • Cognitive robotics views animal cognition as a starting point for the development of robotic information processing.
  • 4. Cognitive robotics • “Immobile” Robots and Engineering Operations – Robust space probes, ubiquitous computing • Robots That Navigate – Hallway robots, Field robots, Underwater explorers, stunt air vehicles • Cooperating Robots – Cooperative Space/Air/Land/Underwater vehicles, distributed traffic networks, smart dust.
  • 8. Outline • Lego Mindstorms • Simple Line Follower • Advanced Line Follower • Learning to follow the line • Conclusions
  • 9. The NXT Unit – an embedded system • 64K RAM, 256K Flash • 32-bit ARM7 microcontroller • 100 x 64 pixel LCD graphical display • Sound channel with 8-bit resolution • Bluetooth wireless communications • Stores multiple programs – Programs selectable using buttons
  • 10. The NXT unit (Motor ports) (Sensor ports)
  • 12. • Built-in rotation sensors NXT Motors
  • 13. NXT Rotation Sensor • Built in to motors • Measure degrees or rotations • Reads + and - • Degrees: accuracy +/- 1 • 1 rotation = 360 degrees
  • 14. Viewing Sensors • Connect sensor • Turn on NXT • Choose “View” • Select sensor type • Select port
  • 15. NXT Sound Sensor • Sound sensor can measure in dB and dBA – dB: in detecting standard [unadjusted] decibels, all sounds are measured with equal sensitivity. Thus, these sounds may include some that are too high or too low for the human ear to hear. – dBA: in detecting adjusted decibels, the sensitivity of the sensor is adapted to the sensitivity of the human ear. In other words, these are the sounds that your ears are able to hear. • Sound Sensor readings on the NXT are displayed in percent [%]. The lower the percent the quieter the sound. http://mindstorms.lego.com/Overview/Sound_Sensor.aspx
  • 16. NXT Ultrasonic/Distance Sensor • Measures distance/proximity • Range: 0-255 cm • Precision: +/- 3cm • Can report in centimeters or inches http://mindstorms.lego.com/Overview/Ultrasonic_Sensor.aspx
  • 17. 17 NXT Non-standard sensors: HiTechnic.com • Compass • Gyroscope • Accellerometer/tilt sensor, • Color sensor • IRSeeker • Prototype board with A/D converter for the I2C bus
  • 18. LEGO Mindstorms for NXT (NXT-G) NXT-G graphical programming language Based on the LabVIEW programming language G Program by drawing a flow chart
  • 19. NXT-G PC program interface Toolbar Workspace Configuration Panel Help & Navigation Controller Palettes Tutorials Web Portal Sequence Beam
  • 20. Issues of the standard firmware • Only one data type • Unreliable bluetooth communication • Limited multi-tasking • Complex motor control • Simplistic memory management • Not suitable for large programs • Not suitable for development of own tools or blocks
  • 21. Other programming languages and environments – Java leJOS – Microsoft Robotics Studio – RobotC – NXC - Not eXactly C – NXT Logo – Lego NXT Open source firmware and software development kit
  • 22. leJOS • A Java Virtual Machine for NXT • Freely available – http://lejos.sourceforge.net/ • Replaces the NXT-G firmware • LeJOS plug-in is available for the Eclipse free development environment • Faster than NXT-G
  • 23. Example leJOS Program sonar = new UltrasonicSensor(SensorPort.S4); Motor.A.forward(); Motor.B.forward(); while (true) { if (sonar.getDistance() < 25) { Motor.A.forward(); Motor.B.backward(); } else { Motor.A.forward(); Motor.B.forward(); } }
  • 24. Event-driven Control in leJOS • The Behavior interface – boolean takeControl() – void action() – void suppress() • Arbitrator class – Constructor gets an array of Behavior objects • takeControl() checked for highest index first – start() method begins event loop
  • 25. Event-driven example class Go implements Behavior { private Ultrasonic sonar = new Ultrasonic(SensorPort.S4); public boolean takeControl() { return sonar.getDistance() > 25; }
  • 26. Event-driven example public void action() { Motor.A.forward(); Motor.B.forward(); } public void suppress() { Motor.A.stop(); Motor.B.stop(); } }
  • 27. Event-driven example class Spin implements Behavior { private Ultrasonic sonar = new Ultrasonic(SensorPort.S4); public boolean takeControl() { return sonar.getDistance() <= 25; }
  • 28. Event-driven example public void action() { Motor.A.forward(); Motor.B.backward(); } public void suppress() { Motor.A.stop(); Motor.B.stop(); } }
  • 29. Event-driven example public class FindFreespace { public static void main(String[] a) { Behavior[] b = new Behavior[] {new Go(), new Spin()}; Arbitrator arb = new Arbitrator(b); arb.start(); } }
  • 30. Simple Line Follower • Use light-sensor as a switch • If measured value > threshold: ON state (white surface) • If measured value < threshold: OFF state (black surface)
  • 31. Simple Line Follower • Robot not traveling inside the line but along the edge • Turning left until an “OFF” to “ON” transition is detected • Turning right until an “ON” to “OFF” transition is detected
  • 32. Simple Line Follower NXTMotor rightM = new NXTMotor(MotorPort.A); NXTMotor leftM = new NXTMotor(MotorPort.C); ColorSensor cs = new ColorSensor(SensorPort.S2, Color.RED); while (!Button.ESCAPE.isDown()) { int currentColor = cs.getLightValue(); LCD.drawInt(currentColor, 5, 11, 3); if (currentColor < 30) { rightM.setPower(50); leftM.setPower(10); } else { rightM.setPower(10); leftM.setPower(50); } }
  • 34. Advanced Line Follower • Use light-sensor as an Analog sensor • Sensor ranges btween 0 – 100 • Takes the average light detected over a small area
  • 35. Advanced Line Follower • Subtract the current reading of the sensor from what the sensor should be reading – Use this value to directly control direction and power of the wheels • Multiply this value for a constant: how strongly the wheels should turn to correct its path? • Add a value to be sure that the robot is always moving forward
  • 36. Advanced Line Follower NXTMotor rightM = new NXTMotor(MotorPort.A); NXTMotor leftM = new NXTMotor(MotorPort.C); int targetValue = 30; int amplify = 7; int targetPower = 50; ColorSensor cs = new ColorSensor(SensorPort.S2, Color.RED); rightM.setPower(targetPower); leftM.setPower(targetPower); while (!Button.ESCAPE.isDown()) { int currentColor = cs.getLightValue(); int difference = currentColor - targetValue; int ampDiff = difference * amplify; int rightPower = ampDiff + targetPower; int leftPower = targetPower; rightM.setPower(rightPower); leftM.setPower(leftPower); }
  • 38. Learn how to follow • Goal – Make robots do what we want – Minimize/eliminate programming • Proposed Solution: Reinforcement Learning – Specify desired behavior using rewards – Express rewards in terms of sensor states – Use machine learning to induce desired actions • Target Platform – Lego Mindstorms NXT
  • 39. Example: Grid World • A maze-like problem – The agent lives in a grid – Walls block the agent’s path • Noisy movement: actions do not always go as planned: – 80% of the time, preferred action is taken (if there is no wall there) – 10% of the time, North takes the agent West; 10% East – If there is a wall in the direction the agent would have been taken, the agent stays put • The agent receives rewards each time step – Small “living” reward each step (can be negative) – Big rewards come at the end (good or bad) • Goal: maximize sum of rewards
  • 40. Markov Decision Processes • An MDP is defined by: – A set of states s  S – A set of actions a  A – A transition function T(s,a,s’) • Prob that a from s leads to s’ • i.e., P(s’ | s,a) • Also called the model (or dynamics) – A reward function R(s, a, s’) • Sometimes just R(s) or R(s’) – A start state – Maybe a terminal state • MDPs are non-deterministic search problems – Reinforcement learning: MDPs where we don’t know the transition or reward functions
• 41. What is Markov about MDPs? • “Markov” generally means that given the present state, the future and the past are independent • For Markov decision processes, “Markov” means that the outcome of an action depends only on the current state and action: P(St+1 = s’ | St = st, At = at, St-1 = st-1, …, S0 = s0) = P(St+1 = s’ | St = st, At = at) • Andrej Andreevič Markov (1856-1922)
• 42. Solving MDPs: policies • In deterministic single-agent search problems, want an optimal plan, or sequence of actions, from start to a goal • In an MDP, we want an optimal policy π*: S → A – A policy π gives an action for each state – An optimal policy maximizes expected utility if followed – An explicit policy defines a reflex agent • Optimal policy when R(s, a, s’) = -0.03 for all non-terminals s
• 43. Example Optimal Policies • Optimal policies shown for R(s) = -2.0, R(s) = -0.4, R(s) = -0.03 and R(s) = -0.01
• 44. MDP Search Trees • Each MDP state gives an expectimax-like search tree • s is a state, (s, a) is a q-state, and (s,a,s’) is called a transition, with T(s,a,s’) = P(s’|s,a) and reward R(s,a,s’)
  • 45. Utilities of Sequences • In order to formalize optimality of a policy, need to understand utilities of sequences of rewards • What preferences should an agent have over reward sequences? • More or less? – [1,2,2] or [2,3,4] • Now or later? – [1,0,0] or [0,0,1]
• 46. Discounting • It’s reasonable to maximize the sum of rewards • It’s also reasonable to prefer rewards now to rewards later • One solution: values of rewards decay exponentially
• 47. Discounting • Typically discount rewards by γ < 1 each time step – Sooner rewards have higher utility than later rewards – Also helps the algorithms converge • Example: discount of 0.5: – U([1,2,3]) = 1*1 + 0.5*2 + 0.25*3 – U([1,2,3]) < U([3,2,1])
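Completing the arithmetic of the discount-0.5 example above (a worked check, not on the original slide):
    U([1,2,3]) = 1 + 0.5·2 + 0.25·3 = 2.75
    U([3,2,1]) = 3 + 0.5·2 + 0.25·1 = 4.25
so U([1,2,3]) < U([3,2,1]): with discounting, it is better to receive the large reward first.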
• 48. Stationary Preferences • Theorem: if we assume stationary preferences, i.e. [a1, a2, …] ≻ [b1, b2, …] ⇔ [r, a1, a2, …] ≻ [r, b1, b2, …] • Then there are only two ways to define utilities – Additive utility: U([r0, r1, r2, …]) = r0 + r1 + r2 + … – Discounted utility: U([r0, r1, r2, …]) = r0 + γ r1 + γ² r2 + …
• 49. Quiz: Discounting • Given: – States a, b, c, d, e in a row – Actions: East, West and Exit (Exit available only in the end states a and e) – Exiting from a yields reward 10, exiting from e yields reward 1 – Transitions: deterministic • Quiz 1: For γ = 1, what is the optimal policy? • Quiz 2: For γ = 0.1, what is the optimal policy? • Quiz 3: For which γ are East and West equally good when in state d?
• 50. Infinite Utilities?! • Problem: infinite state sequences have infinite rewards • Solutions: – Finite horizon: • Terminate episodes after a fixed T steps (e.g. life) • Gives nonstationary policies (π depends on the time left) – Discounting: use 0 < γ < 1; the discounted sum is then bounded by Rmax / (1 − γ) • Smaller γ means smaller “horizon” – shorter term focus – Absorbing state: guarantee that for every policy, a terminal state will eventually be reached
• 51. Recap: Defining MDPs • Markov decision processes: – States S – Start state s0 – Actions A – Transitions P(s’|s,a) (or T(s,a,s’)) – Rewards R(s,a,s’) (and discount γ) • MDP quantities so far: – Policy = choice of action for each state – Utility (or return) = sum of discounted rewards
• 52. Optimal Quantities • Why? Optimal values define optimal policies! • Define the value (utility) of a state s: V*(s) = expected utility starting in s and acting optimally • Define the value (utility) of a q-state (s,a): Q*(s,a) = expected utility starting in s, taking action a and thereafter acting optimally • Define the optimal policy: π*(s) = optimal action from state s
  • 53. Gridworld V*(s) • Optimal value function V*(s)
  • 54. Gridworld Q*(s,a) • Optimal Q function Q*(s,a)
• 55. Values of States • Fundamental operation: compute the value of a state – Expected utility under optimal action – Average sum of (discounted) rewards • Recursive definition of value (written out below)
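The recursive definition referred to above is the standard Bellman optimality equation; written out here in LaTeX, since the slide showed it only as a diagram:
    V^*(s) = \max_a \sum_{s'} T(s,a,s')\,\bigl[R(s,a,s') + \gamma V^*(s')\bigr]
    Q^*(s,a) = \sum_{s'} T(s,a,s')\,\bigl[R(s,a,s') + \gamma \max_{a'} Q^*(s',a')\bigr]
    V^*(s) = \max_a Q^*(s,a)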
• 56. Why Not Search Trees? • We’re doing way too much work with search trees • Problem: states are repeated – Idea: only compute needed quantities once • Problem: the tree goes on forever – Idea: do depth-limited computations, but with increasing depths until the change is small – Note: deep parts of the tree eventually don’t matter if γ < 1
  • 57. Time-limited Values • Key idea: time-limited values • Define Vk(s) to be the optimal value of s if the game ends in k more time steps – Equivalently, it’s what a depth-k search tree would give from s
  • 58.–66. Value iteration on the gridworld: snapshots of the time-limited values Vk for k = 0, 1, 2, 3, 4, 5, 6, 7 and k = 100
• 67. Value Iteration • Problems with the recursive computation: – Have to keep all the Vk*(s) around all the time – Don’t know which depth k(s) to ask for when planning • Solution: value iteration – Calculate values for all states, bottom-up – Keep increasing k until convergence
• 68. Value Iteration • Idea: – Start with V0*(s) = 0 for all s, which we know is right (why?) – Given Vi*, calculate the values for all states for depth i+1: Vi+1(s) ← max_a Σ_s’ T(s,a,s’) [R(s,a,s’) + γ Vi(s’)] – This is called a value update or Bellman update – Repeat until convergence • Complexity of each iteration: O(S²A) • Theorem: will converge to unique optimal values – Basic idea: approximations get refined towards optimal values – Policy may converge long before values do
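A minimal tabular value-iteration sketch in Java; the array layout, state/action counts and the convergence threshold are illustrative assumptions, not taken from the slides:
  /** Tabular value iteration: T[s][a][s2] = P(s2|s,a), R[s][a][s2] = reward. */
  public class ValueIteration {
      public static double[] solve(double[][][] T, double[][][] R,
                                   double gamma, double epsilon) {
          int nS = T.length, nA = T[0].length;
          double[] V = new double[nS];          // V_0(s) = 0 for all s
          double delta;
          do {
              delta = 0;
              double[] Vnew = new double[nS];
              for (int s = 0; s < nS; s++) {
                  double best = Double.NEGATIVE_INFINITY;
                  for (int a = 0; a < nA; a++) {
                      double q = 0;             // one Bellman backup for (s,a)
                      for (int s2 = 0; s2 < nS; s2++) {
                          q += T[s][a][s2] * (R[s][a][s2] + gamma * V[s2]);
                      }
                      best = Math.max(best, q);
                  }
                  Vnew[s] = best;
                  delta = Math.max(delta, Math.abs(Vnew[s] - V[s]));
              }
              V = Vnew;
          } while (delta > epsilon);            // repeat until convergence
          return V;
      }
  }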
• 69. Practice: Computing Actions • Which action should we choose from state s? – Given optimal values V: do a one-step look-ahead, π*(s) = argmax_a Σ_s’ T(s,a,s’) [R(s,a,s’) + γ V*(s’)] – Given optimal q-values Q: simply π*(s) = argmax_a Q*(s,a) – Lesson: actions are easier to select from Q’s!
• 70. Utilities for Fixed Policies • Another basic operation: compute the utility of a state s under a fixed (generally non-optimal) policy π • Define the utility of a state s under a fixed policy π: Vπ(s) = expected total discounted rewards (return) starting in s and following π • Recursive relation (one-step look-ahead / Bellman equation): Vπ(s) = Σ_s’ T(s,π(s),s’) [R(s,π(s),s’) + γ Vπ(s’)]
• 71. Policy Evaluation • How do we calculate the Vπ’s for a fixed policy π? • Idea one: modify the Bellman updates – Vπk+1(s) ← Σ_s’ T(s,π(s),s’) [R(s,π(s),s’) + γ Vπk(s’)] – Efficiency: O(S²) per iteration • Idea two: without the maxes it’s just a linear system, solve with Matlab (or whatever)
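“Idea two” written out (a standard identity, not shown on the slide): with T^\pi the |S| \times |S| transition matrix under \pi and R^\pi the expected one-step reward vector, the fixed-policy Bellman equation is linear,
    V^\pi = R^\pi + \gamma T^\pi V^\pi \quad\Longrightarrow\quad V^\pi = (I - \gamma T^\pi)^{-1} R^\pi
so it can be solved in one shot instead of by iteration.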
  • 72. Policy Iteration • Problem with value iteration: – Considering all actions each iteration is slow: takes |A| times longer than policy evaluation – But policy doesn’t change each iteration, time wasted • Alternative to value iteration: – Step 1: Policy evaluation: calculate utilities for a fixed policy (not optimal utilities!) until convergence (fast) – Step 2: Policy improvement: update policy using one-step look-ahead with resulting converged (but not optimal!) utilities (slow but infrequent) – Repeat steps until policy converges • This is policy iteration – It’s still optimal! – Can converge faster under some conditions
• 73. Policy Iteration • Policy evaluation: with the current policy π fixed, find the values with simplified Bellman updates (no max over actions) – Iterate until the values converge • Policy improvement: with the utilities fixed, find the best action according to a one-step look-ahead (see the equations below)
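The two steps in standard notation (reconstructed; the slide showed them as images):
    \text{Evaluation: } V^{\pi_i}_{k+1}(s) \leftarrow \sum_{s'} T(s,\pi_i(s),s')\,\bigl[R(s,\pi_i(s),s') + \gamma V^{\pi_i}_{k}(s')\bigr]
    \text{Improvement: } \pi_{i+1}(s) = \arg\max_a \sum_{s'} T(s,a,s')\,\bigl[R(s,a,s') + \gamma V^{\pi_i}(s')\bigr]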
  • 74. Comparison • In value iteration: – Every pass (or “backup”) updates both utilities (explicitly, based on current utilities) and policy (possibly implicitly, based on current policy) • In policy iteration: – Several passes to update utilities with frozen policy – Occasional passes to update policies • Hybrid approaches (asynchronous policy iteration): – Any sequences of partial updates to either policy entries or utilities will converge if every state is visited infinitely often
  • 75. Reinforcement Learning • Basic idea: – Receive feedback in the form of rewards – Agent’s utility is defined by the reward function – Must learn to act so as to maximize expected rewards – All learning is based on observed samples of outcomes
• 76. Reinforcement Learning • Reinforcement learning: – Still assume an MDP: • A set of states s ∈ S • A set of actions (per state) A • A model T(s,a,s’) • A reward function R(s,a,s’) – Still looking for a policy π(s) – New twist: don’t know T or R • I.e. don’t know which states are good or what the actions do • Must actually try actions and states out to learn
• 77. Model-Based Learning • Model-based idea: – Learn the model empirically through experience – Solve for values as if the learned model were correct • Step 1: Learn the empirical MDP model – Count outcomes for each (s,a) – Normalize to give an estimate of T(s,a,s’) – Discover R(s,a,s’) when we experience (s,a,s’) • Step 2: Solve the learned MDP – Iterative policy evaluation, for example
• 78. Example: Model-Based Learning • Episodes (γ = 1): – Episode 1: (1,1) up -1; (1,2) up -1; (1,2) up -1; (1,3) right -1; (2,3) right -1; (3,3) right -1; (3,2) up -1; (3,3) right -1; (4,3) exit +100 (done) – Episode 2: (1,1) up -1; (1,2) up -1; (1,3) right -1; (2,3) right -1; (3,3) right -1; (3,2) up -1; (4,2) exit -100 (done) • Estimated model: T(<3,3>, right, <4,3>) = 1 / 3, T(<2,3>, right, <3,3>) = 2 / 2
  • 79. Model-Free Learning • Want to compute an expectation weighted by P(x): • Model-based: estimate P(x) from samples, compute expectation • Model-free: estimate expectation directly from samples • Why does this work? Because samples appear with the right frequencies!
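The expectation in question, written out (standard formulas; the slide showed them only graphically):
    E_P[f(x)] = \sum_x P(x)\, f(x)
    \text{Model-based: } \hat{P}(x) = \frac{\mathrm{count}(x)}{N}, \qquad E_P[f(x)] \approx \sum_x \hat{P}(x)\, f(x)
    \text{Model-free: } E_P[f(x)] \approx \frac{1}{N}\sum_{i=1}^{N} f(x_i), \qquad x_i \sim P(x)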
• 80. Example: Direct Estimation • Episodes (γ = 1, R = -1 per step): – Episode 1: (1,1) up -1; (1,2) up -1; (1,2) up -1; (1,3) right -1; (2,3) right -1; (3,3) right -1; (3,2) up -1; (3,3) right -1; (4,3) exit +100 (done) – Episode 2: (1,1) up -1; (1,2) up -1; (1,3) right -1; (2,3) right -1; (3,3) right -1; (3,2) up -1; (4,2) exit -100 (done) • Direct estimates: V(2,3) ≈ (96 + -103) / 2 = -3.5, V(3,3) ≈ (99 + 97 + -102) / 3 ≈ 31.3
• 81. Sample-Based Policy Evaluation? • Who needs T and R? Approximate the expectation with samples (drawn from T!) • Almost! But we only actually make progress when we move to i+1.
• 82. Temporal-Difference Learning • Big idea: learn from every experience! – Update V(s) each time we experience (s,a,s’,r) – Likely s’ will contribute updates more often • Temporal difference learning – Policy still fixed! – Move values toward the value of whatever successor occurs: running average! – Sample of V(s): sample = r + γ Vπ(s’) – Update to V(s): Vπ(s) ← (1 - α) Vπ(s) + α · sample – Same update: Vπ(s) ← Vπ(s) + α (sample - Vπ(s))
  • 83. Exponential Moving Average • Exponential moving average – Makes recent samples more important – Forgets about the past (distant past values were wrong anyway) – Easy to compute from the running average • Decreasing learning rate can give converging averages
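The exponential moving average, written out (the standard recursion the slide refers to):
    \bar{x}_n = \alpha\, x_n + (1-\alpha)\,\bar{x}_{n-1} = \alpha\, x_n + \alpha(1-\alpha)\, x_{n-1} + \alpha(1-\alpha)^2\, x_{n-2} + \dots
so recent samples carry exponentially more weight than older ones; letting \alpha decrease over time gives a converging average.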
• 84. Example: TD Policy Evaluation • Take γ = 1, α = 0.5 • Episode 1: (1,1) up -1; (1,2) up -1; (1,2) up -1; (1,3) right -1; (2,3) right -1; (3,3) right -1; (3,2) up -1; (3,3) right -1; (4,3) exit +100 (done) • Episode 2: (1,1) up -1; (1,2) up -1; (1,3) right -1; (2,3) right -1; (3,3) right -1; (3,2) up -1; (4,2) exit -100 (done)
• 85. Problems with TD Value Learning • TD value learning is a model-free way to do policy evaluation • However, if we want to turn values into a (new) policy, we’re sunk: picking the best action needs a one-step look-ahead, π(s) = argmax_a Σ_s’ T(s,a,s’) [R(s,a,s’) + γ V(s’)], which requires T and R • Idea: learn Q-values directly • Makes action selection model-free too!
  • 86. Active Learning • Full reinforcement learning – You don’t know the transitions T(s,a,s’) – You don’t know the rewards R(s,a,s’) – You can choose any actions you like – Goal: learn the optimal policy – … what value iteration did! • In this case: – Learner makes choices! – Fundamental tradeoff: exploration vs. exploitation – This is NOT offline planning! You actually take actions in the world and find out what happens…
• 87. Detour: Q-Value Iteration • Value iteration: find successive approximations to the optimal values – Start with V0*(s) = 0, which we know is right (why?) – Given Vi*, calculate the values for all states for depth i+1: Vi+1(s) ← max_a Σ_s’ T(s,a,s’) [R(s,a,s’) + γ Vi(s’)] • But Q-values are more useful! – Start with Q0*(s,a) = 0, which we know is right (why?) – Given Qi*, calculate the q-values for all q-states for depth i+1: Qi+1(s,a) ← Σ_s’ T(s,a,s’) [R(s,a,s’) + γ max_a’ Qi(s’,a’)]
• 88. Q-Learning • Q-Learning: sample-based Q-value iteration • Learn Q*(s,a) values – Receive a sample (s,a,s’,r) – Consider your old estimate: Q(s,a) – Consider your new sample estimate: sample = r + γ max_a’ Q(s’,a’) – Incorporate the new estimate into a running average: Q(s,a) ← (1 - α) Q(s,a) + α · sample
  • 89. Q-Learning Properties • Amazing result: Q-learning converges to optimal policy – If you explore enough – If you make the learning rate small enough – … but not decrease it too quickly! – Basically doesn’t matter how you select actions (!) • Neat property: off-policy learning – learn optimal policy without following it (some caveats)
  • 90. Q-Learning • Discrete sets of states and actions – States form an N-dimensional array • Unfolded into one dimension in practice – Individual actions selected on each time step • Q-values – 2D array (indexed by state and action) – Expected rewards for performing actions
• 91. Q-Learning • Table of expected rewards (“Q-values”) – Indexed by state and action • Algorithm steps – Calculate the state index from the sensor values – Calculate the reward – Update the previous Q-value – Select and perform an action • Q(s,a) ← (1 - α) Q(s,a) + α (r + γ max_a’ Q(s’,a’))
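A compact sketch of that update on a Q-table of plain Java arrays; ALPHA and GAMMA are illustrative values, not taken from the slides:
  /** One Q-learning update on a table indexed by [state][action]. */
  public class QTable {
      static final double ALPHA = 0.2;   // learning rate (assumed)
      static final double GAMMA = 0.9;   // discount factor (assumed)

      static void update(double[][] q, int s, int a, double r, int sNext) {
          double best = q[sNext][0];
          for (int a2 = 1; a2 < q[sNext].length; a2++) {
              best = Math.max(best, q[sNext][a2]);   // max_a' Q(s',a')
          }
          q[s][a] = (1 - ALPHA) * q[s][a] + ALPHA * (r + GAMMA * best);
      }
  }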
• 92. Q-Learning and Robots • Certain sensors provide continuous values • Sonar • Motor encoders • Q-Learning requires discrete inputs • Group continuous values into discrete “buckets” • [Mahadevan and Connell, 1992] • Q-Learning produces discrete actions • Forward • Back-left/Back-right
  • 93. Creating Discrete Inputs • Basic approach – Discretize continuous values into sets – Combine each discretized tuple into a single index • Another approach – Self-Organizing Map – Induces a discretization of continuous values – [Touzet 1997] [Smith 2002]
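A sketch of the basic approach: discretize each continuous reading, then combine the buckets into a single state index. The bucket boundaries and the choice of sensors here are hypothetical, for illustration only:
  /** Combine independently discretized readings into one state index. */
  public class StateEncoder {
      // Hypothetical bucket boundaries for a sonar reading in cm.
      static int sonarBucket(int cm) {
          if (cm < 10) return 0;
          if (cm < 25) return 1;
          if (cm < 50) return 2;
          return 3;                        // 4 sonar buckets
      }

      // Hypothetical buckets for a light reading in percent.
      static int lightBucket(int percent) {
          if (percent < 30) return 0;      // dark
          if (percent < 60) return 1;      // edge
          return 2;                        // bright: 3 light buckets
      }

      /** Tuple (sonarBucket, lightBucket) -> single index 0..11. */
      static int stateIndex(int cm, int percent) {
          return sonarBucket(cm) * 3 + lightBucket(percent);
      }
  }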
  • 94. Q-Learning Main Loop • Select action • Change motor speeds • Inspect sensor values – Calculate updated state – Calculate reward • Update Q values • Set “old state” to be the updated state
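One way this loop could look on the NXT with leJOS, reusing the motor and sensor classes from the earlier line-follower code. The state, reward and action-selection helpers (getState, getReward, chooseAction), the learning constants and the import paths follow common leJOS NXJ usage but are assumptions, not the code behind these slides:
  import java.util.Random;
  import lejos.nxt.Button;
  import lejos.nxt.ColorSensor;
  import lejos.nxt.MotorPort;
  import lejos.nxt.NXTMotor;
  import lejos.nxt.SensorPort;
  import lejos.robotics.Color;

  public class QLearningLoop {
      static final double ALPHA = 0.2, GAMMA = 0.9, EPSILON = 0.1;  // assumed values
      static final int NUM_STATES = 63, NUM_ACTIONS = 5;            // counts from the slides

      public static void main(String[] args) {
          NXTMotor rightM = new NXTMotor(MotorPort.A);
          NXTMotor leftM  = new NXTMotor(MotorPort.C);
          ColorSensor cs  = new ColorSensor(SensorPort.S2, Color.RED);
          double[][] q = new double[NUM_STATES][NUM_ACTIONS];
          Random rnd = new Random();
          int leftPower = 50, rightPower = 50;                      // assumed base power
          leftM.forward();
          rightM.forward();
          int oldState = getState(cs);

          while (!Button.ESCAPE.isDown()) {
              // 1. Select an action (epsilon-greedy) and change the motor speeds
              int action = chooseAction(q, oldState, rnd);
              if (action == 1) leftPower  = Math.min(100, leftPower + 10);   // UP_LEFT
              if (action == 2) rightPower = Math.min(100, rightPower + 10);  // UP_RIGHT
              if (action == 3) leftPower  = Math.max(0, leftPower - 10);     // DOWN_LEFT
              if (action == 4) rightPower = Math.max(0, rightPower - 10);    // DOWN_RIGHT
              leftM.setPower(leftPower);                                     // action 0 = MAINTAIN
              rightM.setPower(rightPower);

              // 2. Inspect sensor values: updated state and reward
              int newState = getState(cs);
              double reward = getReward(cs);

              // 3. Update the Q value of the previous (state, action) pair
              double best = q[newState][0];
              for (int a = 1; a < NUM_ACTIONS; a++) best = Math.max(best, q[newState][a]);
              q[oldState][action] = (1 - ALPHA) * q[oldState][action]
                                  + ALPHA * (reward + GAMMA * best);

              // 4. The old state becomes the updated state
              oldState = newState;
          }
      }

      // Hypothetical state: bucket the light reading (0-100) into NUM_STATES indices.
      static int getState(ColorSensor cs) {
          return Math.min(NUM_STATES - 1, cs.getLightValue() * NUM_STATES / 101);
      }

      // Hypothetical reward: highest when the reading is near the assumed edge value (30).
      static double getReward(ColorSensor cs) {
          return -Math.abs(cs.getLightValue() - 30);
      }

      // Epsilon-greedy action selection over the Q table.
      static int chooseAction(double[][] q, int s, Random rnd) {
          if (rnd.nextDouble() < EPSILON) return rnd.nextInt(NUM_ACTIONS);
          int best = 0;
          for (int a = 1; a < NUM_ACTIONS; a++) if (q[s][a] > q[s][best]) best = a;
          return best;
      }
  }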
  • 95. Calculating the State (Motors) • For each motor: – 100% power – 93.75% power – 87.5% power • Six motor states
  • 96. Calculating the State (Sensors) • No disparity: STRAIGHT • Left/Right disparity – 1-5: LEFT_1, RIGHT_1 – 6-12: LEFT_2, RIGHT_2 – 13+: LEFT_3, RIGHT_3 • Seven total sensor states • 63 states overall
  • 97. Calculating Reward • No disparity => highest value • Reward decreases with increasing disparity
  • 98. Action Set for Line Follow • MAINTAIN – Both motors unchanged • UP_LEFT, UP_RIGHT – Accelerate motor by one motor state • DOWN_LEFT, DOWN_RIGHT – Decelerate motor by one motor state • Five total actions
• 100. Conclusions • Lego Mindstorms NXT as a convenient platform for “cognitive robotics” • Executing a task with “rules” • Learning how to execute a task – MDP – Reinforcement learning • Q-learning applied to Lego Mindstorms