Presentation by Arjen Markus (Deltares) at the Data Science Symposium 2018, during Delft Software Days - Edition 2018. Thursday 15 November 2018, Delft.
2. Purpose of the experiment
"Algorithmic differentiation" is a venerable technique:
- Quite intrusive if done manually
- However, the toolkit by NAG makes it easy to apply to C++ and Fortran programs
- Built into the compiler(s): very little change to the source code is needed
This presentation is about our experiences with the tool.
1. Introduction November 15, 2018 2 / 20
3. Not only smooth problems ... BLOOM
To get acquainted with the method we chose a module from our water quality model that is not too extensive and not too trivial: BLOOM.
BLOOM uses an optimisation algorithm based on linear programming: how much algae of various species can exist given light, nutrients, ...
4. Algorithmic differentiation (1)
In many applications of our numerical models we need to answer questions such as:
- Sensitivity analysis: which parameters influence the outcome the most?
- Calibration: how can we get as close as possible to the measurements?
- Data assimilation: how can we use the limited observations we have to estimate a good starting point?
- ...
5. Algorithmic differentiation (2)
To answer such questions we have many (mathematical) methods at our disposal.
Some, however, are fairly naïve. To determine the sensitivity of the outcome to the parameters we can use the following method:
- Try different values of the relevant parameters
- Determine the difference with the "nominal" result

[Figure: model outcome for increasing parameter values]
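The "try different values" approach above can be sketched in a few lines of Python (an illustrative one-at-a-time finite-difference scheme, not code from the presentation; the toy model and parameter values are made up):

```python
def sensitivity(model, params, h=1e-6):
    """One-at-a-time finite-difference sensitivities of a scalar model:
    perturb each parameter in turn and compare with the nominal result."""
    nominal = model(params)
    result = []
    for i in range(len(params)):
        perturbed = list(params)
        perturbed[i] += h
        result.append((model(perturbed) - nominal) / h)
    return result

# Toy "model": the outcome depends on two parameters
toy = lambda p: p[0] ** 2 + 3.0 * p[1]
print(sensitivity(toy, [2.0, 1.0]))  # close to [4.0, 3.0]
```

Note that this costs one extra model run per parameter, which is exactly why the approach breaks down for large parameter sets.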
6. Algorithmic differentiation (3)
This works fine if you have a small number of parameters. It also assumes the response is more or less linear.
Two alternatives:
- The tangent linear method: ∂U/∂x, ∂U/∂y, ∂U/∂z, ...
- The adjoint method: ∂x/∂U, ∂y/∂U, ∂z/∂U, ...

The number of parameters can be large indeed: think of the number of grid cells in the case of data assimilation.
7. How it works – tangent linear method
Some technical details for the tangent linear method:
- A special compiler takes the source code and transforms it.
- Each arithmetic operation now calculates both the value and the derivative:

  w = x · y  →  (w, dw/dz) = (x · y, x · dy/dz + y · dx/dz)    (1)

- Each mathematical function does the same.
- The result is the precise derivative.
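The transformation above amounts to arithmetic on (value, derivative) pairs, so the idea can be mimicked with a small dual-number class. A minimal Python sketch (illustrative only; the NAG compiler performs this transformation automatically, at the source level):

```python
class Dual:
    """A value together with its derivative w.r.t. one chosen input."""
    def __init__(self, val, der=0.0):
        self.val = val
        self.der = der
    def __add__(self, other):
        return Dual(self.val + other.val, self.der + other.der)
    def __mul__(self, other):
        # product rule: d(x*y) = x*dy + y*dx
        return Dual(self.val * other.val,
                    self.val * other.der + other.val * self.der)

x = Dual(3.0, 1.0)   # seed der = 1: differentiate w.r.t. x
y = Dual(2.0, 0.0)
w = x * y + x * x    # w = x*y + x^2, so dw/dx = y + 2x
print(w.val, w.der)  # 15.0 8.0
```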
8. How it works – adjoint method
For the adjoint method, things are slightly more complicated:
- A special compiler takes the source code and transforms it so that all operations are stored: the so-called tape.
- The calculation is done both forward (normal) and backward to calculate the derivative.

Advantage: much faster, and you get a direct answer as to how to change the parameters.
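A tape can be mimicked in a few lines as well: record, for every operation, the indices of its inputs and the local partial derivatives, then walk the recording backwards. A minimal Python sketch (illustrative; the actual dco/Fortran API looks quite different, as the code fragment later in the presentation shows):

```python
tape = []   # per variable: list of (input_index, local_partial)
vals = []   # per variable: its value

def var(v):
    vals.append(v)
    tape.append([])          # a leaf has no recorded inputs
    return len(vals) - 1

def add(i, j):
    k = var(vals[i] + vals[j])
    tape[k] = [(i, 1.0), (j, 1.0)]
    return k

def mul(i, j):
    k = var(vals[i] * vals[j])
    tape[k] = [(i, vals[j]), (j, vals[i])]   # local partials
    return k

def interpret_adjoint(out):
    """Walk the tape backwards, accumulating adjoints."""
    adjoint = [0.0] * len(vals)
    adjoint[out] = 1.0                       # seed the output
    for k in range(out, -1, -1):
        for i, partial in tape[k]:
            adjoint[i] += adjoint[k] * partial
    return adjoint

x, y = var(3.0), var(2.0)
w = add(mul(x, y), mul(x, x))                # w = x*y + x^2
adj = interpret_adjoint(w)
print(adj[x], adj[y])                        # 8.0 3.0
```

One backward sweep yields the derivatives with respect to all inputs at once, which is the advantage mentioned above when the number of parameters is large.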
9. The tools from NAG
NAG, the Numerical Algorithms Group, offers:
- HPC consulting and services
- Software services in general
- An extensive library of numerical algorithms
- A Fortran compiler that strictly checks for conformance
- The AD compiler for C++ and Fortran
10. Cooperation with NAG
Given their experience with numerical algorithms and high-performance computing, and their extensive library and other tools, NAG is an interesting party. This experiment was a first opportunity to cooperate closely with them.
For the purpose of this presentation, I will focus on two simpler applications. The BLOOM experiment showed that smoothness is not a prerequisite.
11. Linear programming – BLOOM
Constraints:
  x + 0.4 y ≤ 10    (3)
  x + 1.8 y ≤ 5     (4)
Optimise: x + y     (5)

- The result: the optimum depends on several parameters
- Determine the Jacobian matrix to identify them
- Information specific to the solution

[Figure: the constraints and the optimum in the X-Y plane]
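How the optimum reacts to the problem data can be probed by re-solving the small problem above with perturbed bounds; such derivatives are what the AD tool delivers directly. A pure-Python sketch for the two-variable case (illustrative; vertex enumeration only works for tiny problems like this one):

```python
def solve_lp(b1, b2):
    """Maximise x + y subject to x + 0.4*y <= b1, x + 1.8*y <= b2
    and x, y >= 0, by enumerating the candidate vertices."""
    candidates = [(0.0, 0.0), (b1, 0.0), (b2, 0.0),
                  (0.0, b1 / 0.4), (0.0, b2 / 1.8)]
    yc = (b2 - b1) / 1.4            # intersection of the two lines
    candidates.append((b1 - 0.4 * yc, yc))
    eps = 1e-9
    feasible = [(x, y) for x, y in candidates
                if x >= -eps and y >= -eps
                and x + 0.4 * y <= b1 + eps and x + 1.8 * y <= b2 + eps]
    return max(x + y for x, y in feasible)

h = 1e-6
f0 = solve_lp(10.0, 5.0)
print(f0)                                  # optimum for the nominal bounds
print((solve_lp(10.0 + h, 5.0) - f0) / h)  # sensitivity to the first bound
print((solve_lp(10.0, 5.0 + h) - f0) / h)  # sensitivity to the second bound
```

A slack constraint yields a zero sensitivity and a binding one does not, which identifies the parameters that actually matter for the solution.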
12. Simple example: Streeter-Phelps
The classical model of BOD and DO in a river:

  dBOD/dt = −k · BOD
  dDO/dt  = −k · BOD + r · (DOsat − DO) / H

Five parameters:
- Initial conditions for BOD and DO
- Saturation concentration of DO
- Decay rate of BOD
- Reaeration rate of DO
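The two equations above are easy to integrate explicitly. A minimal Euler sketch in Python (illustrative; the parameter values match the "true" values given on the final-result slide, except the depth H, which is an assumed value since it is not stated):

```python
def streeter_phelps(bod0, do0, do_sat, k, r, depth, dt=0.01, t_end=20.0):
    """Explicit Euler integration of
    dBOD/dt = -k*BOD,  dDO/dt = -k*BOD + r*(DOsat - DO)/H."""
    bod, do = bod0, do0
    series = [(0.0, bod, do)]
    t = 0.0
    while t < t_end:
        dbod = -k * bod
        ddo = -k * bod + r * (do_sat - do) / depth
        bod += dt * dbod
        do += dt * ddo
        t += dt
        series.append((t, bod, do))
    return series

# 'True' values from the calibration slide; depth H = 1 m is assumed
series = streeter_phelps(bod0=10.0, do0=8.0, do_sat=7.8, k=0.4, r=2.5, depth=1.0)
print(series[-1])   # BOD has decayed away; DO has relaxed towards saturation
```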
13. Simple example: Streeter-Phelps – the data
Artificial data with noise:

[Figure: "observed" Oxygen and BOD time series, t = 0 to 20]
14. Simple example: Streeter-Phelps – error
Using a simple line search algorithm and the results of the AD tool:

[Figure: misfit versus iteration (0 to 40), decreasing on a logarithmic scale from roughly 10^7]
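The calibration loop can be sketched for a single parameter, the BOD decay rate, with a finite-difference gradient standing in for the AD tool (a simplified illustration with made-up synthetic data, not the actual five-parameter fit):

```python
def misfit(k, data, dt=0.1):
    """Sum of squared deviations between modelled and 'observed' BOD."""
    bod, err = 10.0, 0.0
    for obs in data:
        err += (bod - obs) ** 2
        bod += dt * (-k * bod)        # one Euler step of dBOD/dt = -k*BOD
    return err

# synthetic observations generated with the 'true' decay rate k = 0.4
observations = []
bod = 10.0
for _ in range(200):
    observations.append(bod)
    bod += 0.1 * (-0.4 * bod)

k, step, h = 1.0, 1e-4, 1e-6
for _ in range(200):
    grad = (misfit(k + h, observations) - misfit(k, observations)) / h
    k -= step * grad                  # steepest-descent update
print(round(k, 3))                    # converges towards the true value 0.4
```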
15. Simple example: Streeter-Phelps – final result
Using a simple line search algorithm and the results of the AD tool:

  BOD initial: 9.7 mg/l (10)
  DO initial:  9.2 mg/l (8.0)
  Saturation:  7.9 mg/l (7.8)
  Decay rate:  0.5 d⁻¹ (0.4)
  Reaeration:  2.0 d⁻¹ (2.5)

(In parentheses: the values used to generate the data.)

[Figure: Oxygen and BOD, data versus fitted model]
16. Backtracking an oil spill
The idea:
- We have observed an oil spill somewhere.
- It was released some two days before.
- Can we trace it back to its origins?

A form of inverse modelling!
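For a linear transport model the adjoint gradient is simply the transpose of the forward operator applied to the residual, so the backtracking idea can be demonstrated on a toy problem (an illustrative sketch on a periodic 1-D grid, not the actual model):

```python
def forward(c, steps=20):
    """Toy transport: shift one cell per step (constant flow)
    plus a little numerical diffusion, on a periodic grid."""
    n = len(c)
    for _ in range(steps):
        s = [c[(i - 1) % n] for i in range(n)]           # advection
        c = [0.8 * s[i] + 0.1 * (s[(i - 1) % n] + s[(i + 1) % n])
             for i in range(n)]                          # diffusion
    return c

def adjoint(g, steps=20):
    """Transpose of the forward operator: diffusion is symmetric,
    the shift is reversed, and the steps run in reverse order."""
    n = len(g)
    for _ in range(steps):
        g = [0.8 * g[i] + 0.1 * (g[(i - 1) % n] + g[(i + 1) % n])
             for i in range(n)]
        g = [g[(i + 1) % n] for i in range(n)]
    return g

n = 30
true_init = [1.0 if 3 <= i <= 5 else 0.0 for i in range(n)]
observed = forward(true_init)

guess = [0.1] * n                    # a very rough first estimate
for _ in range(200):
    residual = [a - b for a, b in zip(forward(guess), observed)]
    gradient = adjoint(residual)     # gradient of 0.5*||F(c) - obs||^2
    guess = [max(0.0, c - 0.5 * g) for c, g in zip(guess, gradient)]

print(max(range(n), key=lambda i: guess[i]))   # peak near the true release site
```

Each iteration plays the role of the update step in the Fortran fragment on the source-code slide: run forward, compute the deviation, interpret the adjoint, and correct the initial condition.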
17. Backtracking an oil spill – set-up
Very simple grid: rectangular, constant flow.
But: we seek the initial condition that gives us the following patch after two days:

[Figure: initial patch and "observed" patch on the grid; green: a very rough estimate ...]
18. Backtracking an oil spill – result
Use the adjoint gradient of the final result with respect to the initial condition: determine a new initial condition that will yield a result that is closer to the observation.

Result:

[Figure: reconstructed initial condition; concentration contours from < 0.001 up to > 0.6]
19. A bit of source code ...
do iteration = 1,100
    ! Update the initial condition
    conc_init = max(0.0, conc_init - conc_init_adjoint * 0.01)

    call dco_a1s_tape_create
    call dco_a1s_tape_register_variable( conc_init )

    ... calculate concentration over time
    ... calculate deviation from "observed" concentration

    ! Examine the adjoint results
    call dco_a1s_tape_register_output_variable( deviation )
    call dco_a1s_tape_switch_to_passive
    call dco_a1s_set( deviation, 1.0, -1 )
    call dco_a1s_tape_interpret_adjoint
    call dco_a1s_get( conc_init, conc_init_adjoint, -1 )
    call dco_a1s_tape_remove
enddo
20. Conclusions and recommendations
- The technique of algorithmic differentiation is very promising (and actually well-established).
- The tool provided by NAG is easy to use, even if there are some complications you need to deal with.
- One immediate benefit is with OpenDA.