Keywords
artificial intelligence
reinforcement learning
machine learning
model-free methods
lstdq
lspi
tile coding
fourier-cosine basis
polynomial approximation
value function approximation
semi-gradient descent
stochastic gradient descent
tabular methods
simple black jack
monte carlo
sarsa
q learning
temporal difference learning
canonical maze
bellman equations
markov decision process
dynamic programming
Presentations (3)
Personal Information
Organization / Workplace
Helsinki, Finland