# Foreword

*Written by Patrick E. Farrell*

## Why care about adjoints?

Far too often, maths books launch into their subject without explaining to the novice reader why he or she should care about it in the first place. So, before diving into the details, let’s take a few minutes to motivate why adjoint techniques were invented.

Suppose an aeronautical engineer wishes to design a wing. The wing is
parametrised by a vector \(m\); for example, suppose each entry of
\(m\) is the coefficient of a Bézier curve. For any potential wing
design \(m\), the Euler equations can be solved, and the
lift-to-drag ratio \(J\) of the design computed. With an adjoint,
the engineer can do far more: the adjoint computes *the derivative of
the lift-to-drag ratio with respect to the design parameters*. This can
be used to guide a human designer, or can be passed to an optimisation
algorithm to compute an optimal shape automatically
[1M-Jam88] [1M-GP00]. In the literature, this
concept is referred to as adjoint design optimisation.

Suppose a meteorologist wishes to improve a forecast by constraining
the weather model to match atmospheric observations. The state of the
atmosphere at the initial time is partially known (from weather
stations), but in order to initialise the model an initial condition
for the whole world is required. For any guess of the (unknown)
initial state of the atmosphere \(m\), the Navier-Stokes and
related equations can be solved, and the weighted misfit \(J\)
between the observed values and the simulation results can be
computed. With an adjoint, the meteorologist can *systematically
update their guess for the initial state of the atmosphere* to match
the observations [1M-LDT86] [1M-TC87]. In the
literature, this concept is referred to as variational data
assimilation, 3D-Var and 4D-Var.

Suppose an oceanographer wishes to understand the impact of bottom
topography on transport through the Drake Passage. Bottom topography
(the shape of the sea floor) is quite poorly known; many areas of the
world are sparsely observed, and observations from over a century ago
are still used in some places. The bottom topography is represented as
a scalar field \(m\), the Navier-Stokes and related equations are
solved, and the average net transport through the Drake Passage
\(J\) computed. With an adjoint, the oceanographer can see *where
the transport is most sensitive to the topography*, and so quantify
where the uncertainty matters most [1M-LH07]. In the
literature, this concept is referred to as sensitivity analysis.

Suppose a nuclear engineer working for a government regulator wishes
to examine a proposed new nuclear reactor design. To do this, a
forward model of the Boltzmann transport equations will be used to
simulate the proposed design and verify its safety. However, all
simulations inherently come with discretisation errors, and unless
those errors are quantified, the simulations cannot be relied upon to
make decisions upon which millions of lives and billions of pounds
depend. With an adjoint, the engineer can *quantify the impact of
discretisation errors* on the criticality rate, and decide to what
extent the simulations may be trusted [1M-BR01]. In the
literature, this concept is referred to as goal-based error
estimation, or goal-based adaptivity.

Suppose a mathematician wishes to understand the stability of some
physical system. The traditional approach to this problem is to
linearise the operator and investigate its eigenvalues, which
determine the long-term behaviour of the system (as
\(t \rightarrow \infty\)). However, systems that are eigenvalue-stable can
exhibit unexpected transient growth of small perturbations, which in
turn can cause the system to become unstable (through nonlinear
effects) [1M-TTRD93]. By computing the singular value
decomposition of the tangent linear model, *the transient growth of
such perturbations can be quantified, and the optimally
growing perturbations identified* [1M-FI96]. The
computation of the singular value decomposition in turn requires the
adjoint. In the literature, this approach is referred to as
generalised stability theory.

As you can see, adjoints show up in many applications, and in many computational techniques. One of the reasons why adjoints have a reputation for being difficult is that they are discussed in many different areas of science, each usually with its own specialised terminology. Reading the literature, there are almost as many ways to approach the topic as there are practitioners! With this introduction, I hope to strike to the heart of the matter, and clear up some of the confusion with the minimum of application- or technique-specific lingo.

## A note on the exposition

I have chosen to motivate adjoints via a discussion of
*PDE-constrained optimisation* for two reasons. The first is that this
approach encapsulates many important applications of adjoints in a
general way, and so the reader will be well-equipped to understand
much adjoint-related mathematics in the literature. The second is the
elegance of the result: most people are amazed when they first learn
that it is possible to compute the gradient of a functional
\(\widehat{J}(m)\) at a cost *independent of the number of
parameters* \(\textrm{dim}(m)\)! The topic of adjoints is
intriguing, counterintuitive and beautiful; any exposition should try
to live up to that.

The focus of the exposition will be on getting the core ideas across, and for this reason the discussion will sometimes neglect technicalities. For example, I will implicitly assume that all problems are well-posed, that all necessary derivatives exist and are sufficiently smooth, etc. Occasionally, to build intuition, I will refer to objects as matrices and vectors, although the exposition holds in exactly the same way for their analogues in functional analysis. For an advanced in-depth technical treatment of PDE-constrained optimisation, see the excellent book of Hinze et al. [1M-HPUU09].

## Notation

The notation is mostly inspired by Gunzburger [5M-Gun03].

| Symbol | Meaning |
|---|---|
| \(m\) | the vector of parameters |
| \(u\) | the solution of the PDE |
| \(F(u, m)\) | the PDE relating \(u\) and \(m\): \(F \equiv 0\) |
| \(J(u, m)\) | a functional of interest |
| \(\widehat{J}(m)\) | the functional considered as a pure function of \(m\): \(\widehat{J}(m) = J(u(m), m)\) |
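
To make the notation concrete, here is a minimal sketch in Python with NumPy. It assumes a small linear stand-in for the PDE, \(F(u, m) = Au - Bm\), and a least-squares functional \(J\). The matrices `A` and `B`, the target `u_target` and all function names are hypothetical choices made for illustration; they are not taken from any library or from the applications above. The sketch also previews the claim about cost: the gradient of \(\widehat{J}\) is obtained from a single extra linear solve with the transposed matrix, whereas the finite-difference check needs two solves per parameter.

```python
import numpy as np

rng = np.random.default_rng(0)
n_state, n_param = 5, 3

# Hypothetical linear "PDE": F(u, m) = A u - B m = 0.
# The shift by n_state * I is only there to keep A comfortably invertible.
A = rng.standard_normal((n_state, n_state)) + n_state * np.eye(n_state)
B = rng.standard_normal((n_state, n_param))
u_target = rng.standard_normal(n_state)  # made-up data for the functional


def F(u, m):
    """Residual of the model equation; zero when u solves the 'PDE' for m."""
    return A @ u - B @ m


def solve_state(m):
    """Return u(m), the solution of F(u, m) = 0."""
    return np.linalg.solve(A, B @ m)


def J(u, m):
    """Functional of interest: misfit between u and a target state."""
    return 0.5 * np.sum((u - u_target) ** 2)


def J_hat(m):
    """Functional as a pure function of m: J_hat(m) = J(u(m), m)."""
    return J(solve_state(m), m)


def gradient_adjoint(m):
    """Gradient of J_hat using one forward and one transposed solve."""
    u = solve_state(m)
    lam = np.linalg.solve(A.T, u - u_target)  # adjoint variable
    return B.T @ lam


def gradient_fd(m, eps=1e-6):
    """Central finite differences: two forward solves per parameter."""
    g = np.zeros_like(m)
    for i in range(m.size):
        e = np.zeros_like(m)
        e[i] = eps
        g[i] = (J_hat(m + e) - J_hat(m - e)) / (2 * eps)
    return g
```

Calling `gradient_adjoint(m)` and `gradient_fd(m)` on the same `m` should give matching vectors up to rounding error. The adjoint version performs the same number of solves whether `n_param` is 3 or 3 million, which is the property the exposition sets out to explain.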

In the next section, we introduce the PDE-constrained optimisation problem and give a broad overview of how it may be tackled.

## References

- [1M-BR01] R. Becker and R. Rannacher. An optimal control approach to a posteriori error estimation in finite element methods. *Acta Numerica*, 10:1–102, 2001. doi:10.1017/S0962492901000010.
- [1M-FI96] B. F. Farrell and P. J. Ioannou. Generalized stability theory. Part I: Autonomous operators. *Journal of the Atmospheric Sciences*, 53(14):2025–2040, 1996. doi:10.1175/1520-0469(1996)053<2025:GSTPIA>2.0.CO;2.
- [1M-GP00] M. B. Giles and N. A. Pierce. An introduction to the adjoint approach to design. *Flow, Turbulence and Combustion*, 65(3-4):393–415, 2000. doi:10.1023/A:1011430410075.
- [1M-HPUU09] M. Hinze, R. Pinnau, M. Ulbrich, and S. Ulbrich. *Optimization with PDE constraints*. Volume 23 of Mathematical Modelling: Theory and Applications. Springer, 2009. ISBN 978-1-4020-8838-4.
- [1M-Jam88] A. Jameson. Aerodynamic design via control theory. *Journal of Scientific Computing*, 3(3):233–260, 1988. doi:10.1007/BF01061285.
- [1M-LDT86] F.-X. Le Dimet and O. Talagrand. Variational algorithms for analysis and assimilation of meteorological observations: theoretical aspects. *Tellus A*, 38A(2):97–110, 1986. doi:10.1111/j.1600-0870.1986.tb00459.x.
- [1M-LH07] M. Losch and P. Heimbach. Adjoint sensitivity of an ocean general circulation model to bottom topography. *Journal of Physical Oceanography*, 37(2):377–393, 2007. doi:10.1175/JPO3017.1.
- [1M-TC87] O. Talagrand and P. Courtier. Variational assimilation of meteorological observations with the adjoint vorticity equation. I: Theory. *Quarterly Journal of the Royal Meteorological Society*, 113(478):1311–1328, 1987. doi:10.1002/qj.49711347812.
- [1M-TTRD93] L. N. Trefethen, A. E. Trefethen, S. C. Reddy, and T. A. Driscoll. Hydrodynamic stability without eigenvalues. *Science*, 261(5121):578–584, 1993. doi:10.1126/science.261.5121.578.