I do not speak Chinese!

● And my English is extremely French (when native English speakers
  listen to my English, they sometimes believe that they suddenly, by
  miracle, understand French).
● For the moment, if I gave a talk in Chinese it would be boring, with only:
  ● hse-hse
  ● nirao
  ● pukachi
● Interrupt me as much as you want to facilitate understanding :-)
High-Scale Power Systems: Simulation & Optimization

Olivier Teytaud + Inria-Tao + Artelys
TAO project-team, INRIA Saclay Île-de-France

O. Teytaud, Research Fellow
olivier.teytaud@inria.fr
http://www.lri.fr/~teytaud/
Ilab METIS (www.lri.fr/~teytaud/metis.html)

● Metis = Tao + Artelys
● TAO (tao.lri.fr), Machine Learning & Optimization
  ● Joint INRIA / CNRS / Univ. Paris-Sud team
  ● 12 researchers, 17 PhDs, 3 post-docs, 3 engineers
● Artelys (www.artelys.com), an SME
  - France / US / Canada
  - 50 people
  ==> collaboration through a common platform
● Activities
  ● Optimization (uncertainties, sequential)
  ● Application to power systems

O. Teytaud, Research Fellow
olivier.teytaud@inria.fr
http://www.lri.fr/~teytaud/
Importantly, it is not a lie.

● It is a tradition, in research institutes, to claim some links with
  industry.
● I don't claim that having such links is necessary, or always a great
  achievement in itself.
● But I do claim that in my case it is true that I have links with
  industry.
● My four students here in Taiwan, and others in France, all have real
  salaries based on industrial funding.
All in one slide

Consider an electric system. Decisions =
● Strategic decisions (a few time steps), chosen based on simulations
  of the tactical level:
  ● building a nuclear power plant
  ● building a Spain-Morocco connection
  ● building a wind farm
● Tactical decisions (many time steps), which depend on the strategic
  level:
  ● switching on hydroelectricity plant #7
  ● switching on thermal plant #4
  ● ...
A bit more precisely: the strategic level

Brute-force approach for the strategic level:
● I simulate
  ● each possible strategic decision (e.g. 20 000);
  ● 1000 times;
  ● each of them with optimal tactical decisions
  ==> 20 000 optimizations, 1000 simulations each.
● I choose the best one.

Better: more simulations on the best strategic decisions.
However, this talk will not focus on that part.
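Still, to make the brute-force loop concrete, here is a minimal sketch
(not the project's code); `simulate_tactical` is a hypothetical
stand-in for one tactical-level optimization on one sampled scenario,
with a made-up toy cost model:

```python
import random

def simulate_tactical(strategic_decision, scenario_seed):
    """Hypothetical stand-in for the tactical optimizer: returns the cost
    of one simulated scenario under the given strategic decision."""
    rng = random.Random(strategic_decision * 100003 + scenario_seed)
    return 0.1 * strategic_decision + rng.gauss(0.0, 1.0)  # toy cost model

def choose_strategic(candidates, n_simulations=1000):
    """Brute force: average the simulated cost of each candidate over
    n_simulations scenarios, then keep the cheapest one."""
    def mean_cost(c):
        return sum(simulate_tactical(c, s)
                   for s in range(n_simulations)) / n_simulations
    return min(candidates, key=mean_cost)

print(choose_strategic(range(20), n_simulations=200))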
A bit more precisely: the tactical level

Brute-force approach for the tactical level: simplify.
● Replace each random process by its expectation.
● Optimize decisions deterministically.

But reality is stochastic:
● Water inflows
● Wind farms

Better: optimize a policy (i.e. reactive, closed-loop).
Specialization on Power Systems

● Planning/control (tactical level)
  ● Pluriannual planning: evaluate marginal costs of hydroelectricity
  ● Taking into account stochasticity and uncertainties
  ==> IOMCA (ANR)
● High-scale investment studies (e.g. Europe + North Africa)
  ● Long term (2030 - 2050)
  ● Huge (non-stochastic) uncertainties
  ● Investments: interconnections, storage, smart grids, power plants...
  ==> POST (ADEME)
● Moderate scale (cities, factories) (tactical level simpler)
  ● Master plan optimization
  ● Stochastic uncertainties
  ==> Citines project (FP7)
Example: interconnection studies
(demand levelling, stabilized supply)
The POST project: supergrids, simulation and optimization

Mature technology: HVDC links (high-voltage direct current).

European subregions:
- Case 1: electric corridor France / Spain / Morocco
- Case 2: south-west (France / Spain / Italy / Tunisia / Morocco)
- Case 3: Maghreb - Central West Europe
==> towards a European supergrid

Related ideas in Asia.
Tactical level: unit commitment at the scale of a country: it looks
like a game.

● Many time steps.
● Many power plants.
● Some of them have stocks (hydroelectricity).
● Many constraints (rules).
● Uncertainties (water inflows, temperature, ...)

==> make decisions:
● When should I switch on? (for each power plant)
● At which power?
Investment decisions through simulations

● Issues
  - Demand varying in time, limited predictability
  - Transportation introduces constraints
  - Renewables ==> more variability
● Methods
  - Markovian assumptions ==> wrong
  - Simplified models ==> model error >> optimization error
● Our approach
  ● Machine Learning on top of Mathematical Programming
Hybridization: reinforcement learning / mathematical programming

● Math programming (mathematicians doing discrete-time control)
  - Nearly exact solutions for a simplified problem
  - High-dimensional constrained action space
  - But small state space & not anytime
● Reinforcement learning (artificially intelligent people doing
  discrete-time control :-) )
  - Unstable
  - Small model bias
  - Small / simple action space
  - But high-dimensional state space & anytime
Now the technical part:

Model Predictive Control,
Stochastic Dynamic Programming,
Direct Policy Search,
and Direct Value Search (new),
which combines Direct Policy Search and Stochastic Dynamic Programming.

(3/4 of this talk is about the state of the art, only 1/4 our work.)
Many optimization tools (SDP, MPC):
● Strong constraints on forecasts
● Strong constraints on model structure.

Direct Policy Search:
● Arbitrary forecasts, arbitrary structure
● But not scalable w.r.t. the number of decision variables.

→ merge: Direct Value Search

Jean-Joseph.Christophe@inria.fr
Jeremie.Decock@inria.fr
Pierre.Defreminville@artelys.com
Olivier.Teytaud@inria.fr
Outline:
● Stochastic Dynamic Optimization
● Classical solutions: Bellman (old & new)
  ● Markov chains
  ● Overfitting
  ● Anticipativity
  ● SDP, SDDP
● Alternate solution: Direct Policy Search
  ● No problem with anticipativity
  ● Scalability issue
● The best of both worlds: Direct Value Search
[Diagram: stochastic control loop. A random process feeds random values
into the system; a controller with memory observes the state and sends
commands; the system returns a cost.]

● For an optimal representation, you need access to the whole archive,
  or to forecasts (generative model / probabilistic forecasts)
  (Astrom 1965).
(Outline reminder: classical Bellman solutions; next, anticipativity,
the "dirty solution".)
Anticipative solutions:
● Maximum over strategic decisions
● of average over random processes
● of optimized decisions, given random processes & strategic decisions.

Pros/Cons:
● Much simpler (deterministic optimization)
● But in real life you cannot guess November rains in January
● Rather optimistic decisions
MODEL PREDICTIVE CONTROL

● Anticipative solutions:
  ● Maximum over strategic decisions
  ● of pessimistic forecasts (e.g. a quantile)
  ● of optimized decisions, given forecasts & strategic decisions.
● Pros/Cons:
  ● Much simpler (deterministic optimization)
  ● But in real life you cannot guess November rains in January
  ● Not so optimistic, convenient, simple
Ok, we have done one of the four targets: model predictive control.
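A minimal receding-horizon sketch of this recipe (an assumed toy inflow
model and a greedy toy dispatch rule, not the talk's implementation):

```python
import numpy as np

def pessimistic_forecast(horizon, rng, quantile=0.2):
    """Hypothetical forecaster: a low quantile of sampled inflow scenarios."""
    scenarios = rng.normal(10.0, 3.0, size=(100, horizon))
    return np.quantile(scenarios, quantile, axis=0)

def mpc_first_decision(stock, demand, horizon=24, seed=0):
    """(i) forecast; (ii) replace the random process by the pessimistic
    forecast; (iii) optimize a deterministic plan (greedy toy rule here);
    then apply only the first decision and re-plan at the next step."""
    inflows = pessimistic_forecast(horizon, np.random.default_rng(seed))
    plan, s = [], stock
    for t in range(horizon):
        s += inflows[t]
        release = min(s, demand)  # deterministic toy dispatch
        plan.append(release)
        s -= release
    return plan[0]  # receding horizon: only the first action is applied

print(mpc_first_decision(stock=5.0, demand=8.0))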
(Outline reminder: next, Markov chains and stochastic dynamic
programming.)
Markov solution

Representation as a Markov process (a tree): this is the representation
of the random process. Let us see how to represent the rest.
How to solve, simple case, binary stock, one day

[Diagram: it is December 30th and I have water. The edge "I use water
(cost = 0)" leads to "no more water, December 31st"; the edge "I do not
use" leads to "I have water, December 31st". The future cost at the
final states is 0.]
How to solve, simple case, binary stock, 3 days, no random process

[Diagram, shown in three build steps: a lattice of states over 3 days
with a cost on each transition edge; backward induction fills in, at
each node, the minimal cost-to-go (the Bellman value) toward the final
day.]
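As a hedged sketch of what the diagram computes, here is backward
induction on a tiny binary-stock problem, with made-up costs rather
than the slide's numbers:

```python
def backward_induction(n_steps, states, transitions):
    """transitions[t][s] is a list of (cost, next_state) options.
    Returns value[t][s] = minimal cost-to-go from state s at time t."""
    value = [{s: 0.0 for s in states}]  # terminal values: 0
    for t in reversed(range(n_steps)):
        v_next = value[0]
        value.insert(0, {s: min(c + v_next[s2] for c, s2 in transitions[t][s])
                         for s in states})
    return value

# Toy binary stock over 3 days: state 1 = "has water", 0 = "empty".
# Each day: either release the water (cost 0, stock emptied) or
# produce thermally at the day's price.
prices = [2.0, 3.0, 1.0]
transitions = [{1: [(0.0, 0), (p, 1)], 0: [(p, 0)]} for p in prices]
print(backward_induction(3, [0, 1], transitions))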
This was deterministic.

● How to add a random process?
● Just multiply nodes :-)
How to solve, simple case, binary stock, 3 days, random parts

[Diagram: the deterministic 3-day lattice is duplicated for each
outcome of the random process, with branches of probability 1/3 and
2/3; Bellman values are then averaged over the branches.]
Markov solution: ok, you have understood stochastic dynamic
programming (Bellman).

Representation as a Markov process (a tree): this is the representation
of the random process. In each node, there are the state-nodes with
decision-edges.
Ok, we have done the 2nd of the four targets: stochastic dynamic
programming.
Markov solution

Representation as a Markov process (a tree):
● Optimize decisions for each state. This means you are not cheating.
● But difficult to use: the strategy is optimized for very specific
  forecasting models.
● Might be ok for your problem?
(Outline reminder: next, overfitting.)
Overfitting

● Representation as a Markov process (a tree): how do you actually make
  decisions when the random values are not exactly those observed?
  (heuristics...)
● Check on random realizations which have not been used for building
  the tree. Does it work correctly?
● Overfitting = when it works only on scenarios used in the
  optimization process.
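A minimal sketch of this out-of-sample check; `policy_cost` is a
hypothetical stand-in for evaluating the optimized policy on one
scenario:

```python
import numpy as np

def overfitting_gap(policy_cost, train_scenarios, test_scenarios):
    """A test cost much above the train cost signals overfitting to the
    scenarios used for building the tree."""
    train = float(np.mean([policy_cost(s) for s in train_scenarios]))
    test = float(np.mean([policy_cost(s) for s in test_scenarios]))
    return train, test

rng = np.random.default_rng(0)
print(overfitting_gap(lambda s: s ** 2,
                      rng.normal(size=50), rng.normal(size=50)))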
(Outline reminder: next, SDP and SDDP.)
SDP / SDDP: Stochastic (Dual) Dynamic Programming

● Representation of the controller with Linear Programming
  (value function as piecewise linear)
● → ok for 100 000 decision variables per time step
  (tens of time steps, hundreds of plants, several decisions each)
● But solving by expensive SDP/SDDP (curse of dimensionality:
  exponential in the number of state variables)
● Constraints:
  ● Needs an LP approximation: ok for you?
  ● SDDP requires convex Bellman values: ok for you?
  ● Needs Markov random processes: ok for you?
    (possibly after some random process extension...)
● Goal: keep scalability, but get rid of SDP/SDDP solving.
Summary

● Most classical solution = SDP and variants.
● Or MPC (model predictive control), replacing the stochastic parts by
  deterministic pessimistic forecasts.
● Statistical modeling is "cast" into a tree model & (probabilistic)
  forecasting modules are essentially lost.
(Outline reminder: next, Direct Policy Search.)
Direct Policy Search

● Requires a parametric controller
● Principle: optimize the parameters on simulations
● Unusual in large-scale power systems (we will see why)
● Usual in other areas (finance, evolutionary robotics)
[Diagram: the stochastic control loop again, now with a parametric
controller.]

Optimize the controller thanks to a simulator:
● Command = Controller(w, state, forecasts)
● Simulate(w) = stochastic loss with parameter w
● w* = argmin Simulate(w)
Ok, we have done the 3rd of the four targets: direct policy search.
Direct Policy Search (DPS)

● Requires a parametric controller, e.g. a neural network:
  Controller(w,x) = W3 + W2.tanh(W1.x + W0)
● Noisy black-box optimization (a sketch follows below)
● Advantages: non-linear ok, forecasts included
  (the strategy is optimized given the real forecasting module you
  have: forecasts are inputs)
● Issue: too slow
  (hundreds of parameters for even 20 decision variables, depending on
  the structure)
● Idea: a special structure for DPS (inspired from SDP)
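Putting the pieces together, a minimal DPS sketch: the slide's neural
controller, a hypothetical one-stock simulator, and plain random search
standing in for the evolution strategies used in practice:

```python
import numpy as np

def controller(w, x):
    """The slide's neural controller: W3 + W2.tanh(W1.x + W0)."""
    W0, W1, W2, W3 = w
    return (W3 + W2 @ np.tanh(W1 @ x + W0)).item()

def simulate(w, rng, n_steps=50):
    """Hypothetical one-stock simulator: the stochastic loss of running
    the parametric controller on one sampled inflow scenario."""
    stock, loss = 5.0, 0.0
    for _ in range(n_steps):
        inflow = rng.exponential(1.0)
        u = controller(w, np.array([stock, inflow]))
        release = min(max(u, 0.0), stock + inflow)
        stock = stock + inflow - release
        loss += (2.0 - release) ** 2  # toy cost: track a demand of 2
    return loss

def direct_policy_search(n_iters=300, seed=0):
    """Noisy black-box optimization of w (plain random search here)."""
    rng = np.random.default_rng(seed)
    best_w, best_cost = None, float("inf")
    for _ in range(n_iters):
        w = (rng.normal(size=4), rng.normal(size=(4, 2)),
             rng.normal(size=(1, 4)), rng.normal(size=1))
        cost = np.mean([simulate(w, np.random.default_rng(s))
                        for s in range(10)])
        if cost < best_cost:
            best_w, best_cost = w, cost
    return best_w, best_cost

print(direct_policy_search()[1])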
(Outline reminder: finally, the best of both worlds: Direct Value
Search.)
Direct Value Search

SDP representation in DPS:
  Controller(state) = argmin Cost(decision) + V(next state)
● V(nextState) = alpha x nextState  (LP)
● alpha = NeuralNetwork(w, state)  (not LP)
  (or a more sophisticated LP)

==> given w, decision making is solved as an LP
==> non-linear mapping for choosing the parameters of the LP from the
    current state
Drawback: it requires the optimization of w
( = a noisy black-box optimization problem; a sketch follows).
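A minimal sketch of one Direct Value Search decision, using scipy's
linprog; the one-stock model (release d, nextState = stock - d,
cost = price * (demand - d)) is an assumed toy, not the real value
function and constraints:

```python
import numpy as np
from scipy.optimize import linprog

def alpha_net(w, state):
    """The slide's non-linear mapping alpha = NeuralNetwork(w, state):
    it outputs the slope of the linear value function V(s) = alpha * s."""
    W0, W1, W2, W3 = w
    return (W3 + W2 @ np.tanh(W1 @ state + W0)).item()

def dvs_decision(w, state):
    """Given w, one decision = one LP: min_d cost(d) + alpha * nextState(d).
    Toy: one release d in [0, min(stock, demand)]."""
    stock, price, demand = state
    alpha = alpha_net(w, state)
    c = np.array([-(price + alpha)])  # objective coefficient of d
    res = linprog(c, bounds=[(0.0, min(stock, demand))])
    return res.x[0]

rng = np.random.default_rng(0)
w = (rng.normal(size=4), rng.normal(size=(4, 3)),
     rng.normal(size=(1, 4)), rng.normal(size=1))
print(dvs_decision(w, np.array([5.0, 3.0, 2.0])))  # stock, price, demand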
Summary: the best of both worlds

Controller(w, state): the structure of the controller (fast, scalable
by structure).
● V(w, state, .) is non-linear
● Optimizing Cost(dec) + V(w, state, nextState) is an LP

Simul(w): a simulator (you can put anything you want in it, even if it
is not linear, nothing Markovian...).
● Do a simulation with w
● Return the cost

DirectValueSearch: the optimization (it will do its best, given the
simulator and the structure).
● optimize w* = argmin Simul(w)
● Return the Controller with w*
3 optimizers for w:
● SAES (self-adaptive evolution strategy)
● Fabian (gradient descent by redundant finite differences)
● a Newton version
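For SAES, a minimal self-adaptive (1, lambda)-ES sketch on a noisy
objective (an assumed textbook variant, not necessarily the exact
optimizer used in the project):

```python
import numpy as np

def saes(f, dim, n_iters=100, lam=10, seed=0):
    """Self-adaptive (1, lambda)-ES minimizing a noisy objective f."""
    rng = np.random.default_rng(seed)
    x, sigma = np.zeros(dim), 1.0
    tau = 1.0 / np.sqrt(2.0 * dim)  # usual self-adaptation rate
    for _ in range(n_iters):
        # Each offspring first mutates its own step size, then its position.
        sigmas = sigma * np.exp(tau * rng.normal(size=lam))
        xs = x + sigmas[:, None] * rng.normal(size=(lam, dim))
        costs = [f(xi) for xi in xs]
        best = int(np.argmin(costs))
        x, sigma = xs[best], sigmas[best]
    return x

# Noisy sphere as a stand-in for Simulate(w):
print(saes(lambda z: float(z @ z + np.random.normal(0.0, 0.1)), dim=5))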
Ok, we have done the 4th of the four targets: direct value search.
State of the art in discrete-time control, a few tools:

● Model Predictive Control: for making a decision in a given state,
  (i) do forecasts;
  (ii) replace random processes by pessimistic forecasts;
  (iii) optimize as if the problem were deterministic.
● Stochastic Dynamic Programming:
  ● Markov model
  ● Compute the "cost to go" backwards
● Direct Policy Search:
  ● Parametric controller
  ● Optimized on simulations
Conclusion

● Still rather preliminary (less tested than MPC or SDDP) but promising:
  ● Forecasts naturally included in the optimization
  ● Anytime algorithm (the user immediately gets approximate results)
  ● No convexity constraints
  ● Room for detailed simulations (e.g. with a very small time scale,
    for volatility)
  ● No random process constraints (not necessarily Markov)
  ● Can handle large state spaces (as DPS)
  ● Can handle large action spaces (as SDP)

==> can work on the "real" problem, without the "cast"
Bibliography

● Dynamic Programming and Suboptimal Control: A Survey from ADP to MPC.
  D. Bertsekas, 2005. (MPC = deterministic forecasts)
● Astrom, 1965.
● Renewable energy forecasts ought to be probabilistic! P. Pinson, 2013
  (WIPFOR talk).
● Training a neural network with a financial criterion rather than a
  prediction criterion. Y. Bengio, 1997. (A quite practical application
  of direct policy search, with convincing experiments.)
Questions?
Appendix
SDP / SDDP: Stochastic (Dual) Dynamic Programming

● Representation of the controller:
  decision(current state) = argmin Cost(decision) + Bellman(next state)
● Linear programming (LP) if:
  - for a given current state, next state = LP(decision)
  - Cost(decision) = LP(decision)
● → 100 000 decision variables per time step