Preconditioning for PDE-constrained OptimizationMarch 1, 2010
Since the advent of computers, scientists have been finding better and better ways to calculate numerical solutions of partial differential equations. Back in the early years of the 20th century, L.F. Richardson envisioned teams of people with hand calculating machines working in the Albert Hall in London to solve diffusion equations for problems relating to the prediction of weather. By the end of the first decade of the 21st century, finite difference, finite element, spectral, pseudo-spectral, boundary element, and probably many other less well-known methods are used in applications across science and engineering, and increasingly in medicine, biology, and even the social sciences, for the approximate solution of problems posed as PDEs.
Some of these methods are routinely used in important applications---one wonders what practical civil engineering would be like today without finite element codes to calculate stresses and displacements in structures! Others remain exotic research objects, as we continue to grapple with the challenge of finding effective and accurate numerical approximation schemes for ever more challenging PDE problems. As new numerical schemes are designed, linear algebraists seek efficient ways to process the large sets of equations produced by the new methods, in particular for steady-state and diffusive (parabolic) PDE problems. The development of numerical algorithms for the solution of these large systems of linear or linearized equations remains an active and thriving research area. Whereas Richardson was dreaming of systems with tens or hundreds of discrete variables, today's application scientists are more likely to want a million or more!
Sparsity is a key aspect of the equation systems derived from grid-based approximation methods, and the development and refinement of sparse direct solvers since their introduction in the 1960s have been a highly successful research enterprise. Excellent software implementations are available and widely used, particularly for the approximation of PDE problems on two-dimensional domains. Three-dimensional spatial problems provide a particular challenge for sparse direct methods because of increased bandwidth (non-zero entries farther from the matrix diagonal), which leads generally to significant "fill-in" (of zero entries by non-zeros during the Gaussian elimination steps) and the consequent---often unsatisfiable---growth in storage requirements.
Iterative solution methods generally do not have this difficulty and so provide alternatives. Nobody is interested in a method that is grindingly slow to converge, however, and it is often such methods that can be easily achieved for those PDE problems that give rise to inherently poorly conditioned matrices. Preconditioning is clearly the route to take here: finding an approximation of the system matrix that both makes it easier to solve associated linear systems and provides a reasonable representation of the original problem.
The idea of matrix splitting, associated initially with Jacobi and Gauss–Seidel, is now seen as a way of providing simple---if often not very effective---preconditioners. Frequently more potent are incomplete factorization methods (carrying out Gaussian elimination as in a direct method, but keeping only a few of the entries) and preconditioners based on an understanding of properties of the underlying PDE; important examples include multigrid and domain decomposition--the latter being particularly appropriate in the context of parallel computation. The combination of an effective preconditioner and an appropriate iterative method---Krylov subspace methods, such as conjugate gradients and the GMRES method, are overwhelmingly popular---can be an effective and efficient way to achieve fast convergence with acceptable memory requirements and, hence, enable the solution of the very high-dimensional discrete systems arising from PDEs. Much has been achieved in research on preconditioning techniques in the past two or three decades, and effective preconditioning strategies and algorithms are available for many individual PDEs and coupled sets of PDEs, such as the incompressible Navier–Stokes equations describing the flow of a viscous fluid.
One aspect of the success of numerical methods and the associated linear solvers for various classes of PDE problems is the opportunity it gives for more direct attack on the issues for which the solutions of PDEs are desired in the first place. One such issue arises in the context of design. An example is the design of the shape of an aircraft wing: PDEs describe the external flow of air, given the geometry and the speed and orientation of the aircraft, but what the aeronautical engineer wants is a shape that can be built to be strong and light, that provides the required lift, and that minimizes drag and, hence, fuel consumption. In this and many other such design problems, PDEs describe the physics that constrains the set of good or optimal designs. Problems of this type are thus characterized by the optimization of design criteria in the presence of PDE constraints---constraints that are themselves PDE problems.
The need to solve such PDE-constrained optimization problems is not new--a significant part of control theory concerns conditions for optimality in the presence of constraints described by ordinary or partial differential equations. What is a little newer is the realization that such fully coupled problems can be solved by "all-at-once" or "one-shot" methods, which employ the framework of preconditioned iterative solvers for the block-structured systems comprising the optimality conditions and the discretized PDEs together. That is, an effective preconditioner for the relevant PDE problem can be employed as part of a block preconditioner that enables the iterative solution of a linearized, but fully coupled PDE-constrained optimization problem as a whole. Moreover, the conditions for optimality always involve the adjoint of the PDE in question, and a preconditioner required for this is often available by simple transposition. In this fully coupled framework, uncoupled approaches that alternate between solves of the PDE and of its adjoint, with these solutions used to update the relevant optimality criteria in a repeating cycle, can often be viewed as a sort of Gauss–Seidel iteration for the fully coupled system. The actual computational work and storage required for an iteration of the all-at-once method are essentially no more than those for one cycle of the uncoupled approach; typically, however, convergence is achieved much more rapidly.
As often in mathematics, it transpires that once one adopts the more encompassing viewpoint produced by the successes of the all-at-once solution methods, problems hitherto approached in a different way are seen to fit into the PDE-constrained optimization paradigm. Much work remains to be done, but even for Richardson's beloved weather forecasts, which today are achieved with some of the most advanced simulation software for PDEs, it seems that there may be utility in this approach!
For Further Reading
Omar Ghattas, "PDE-Constrained Optimization at the 2005 SIAM Conferences on CS&E and Optimization," SIAM News, Volume 38, Number 6, July/August 2005.
Fredi Tröltzsch, Optimale Steuerung partieller Differentialgleichungen Theorie, Verfahren und Anwendungen, Vieweg Verlag, 2005 (soon to appear in English translation).
Tyrone Rees, Martin Stoll, and A.J. Wathen, All-at-once preconditioning in PDE-constrained optimization, 2009, Oxford eprints archive: http://eprints.maths.ox.ac.uk/839/.
Andrew Wathen is a reader in numerical analysis at Oxford University. This article, written at the request of SIAM News, is based on the invited talk he gave at SIAM's 2009 conference on linear algebra.