## Panel Discusses Research Directions and Enabling Technologies for the Future

**May 1, 2007**

A Highlight of the 2007 SIAM Conference on CSE

**Daniel R. Reynolds and Ryan Szypowski**

*Recognizing the importance of a discussion of shared research and technology needs to the computational science and engineering community, the organizers of the 2007 SIAM Conference on CSE made space in the program for a panel discussion on the topic. Because CSE is a broad subject area, representing a large number of application areas and encompassing individuals from educational, governmental, and industrial sectors, the panel focused on mathematical and computer science requirements of the CSE community as a whole, rather than on enabling research for specific application areas.*

*Invited to discuss their views, needs, and ideas were five panelists, working in a range of areas: Terence Critchlow (Lawrence Livermore National Lab), data management; Eric de Sturler (Virginia Tech), inverse problems and optimization; Robert Falgout (LLNL), scalable algorithms for linear systems; Michael Holst (UC San Diego), multiscale modeling and spatially adaptive methods; and John Shadid (Sandia National Labs), large-scale simulation of coupled multiphysics systems.*

One topic touched on by all the panelists is the need for improved realism in computational models. In many areas of CSE, we are concerned with complex, interacting, nonlinear models, often acting on multiple time and length scales. For such systems, separate processes can combine to produce steady-state behavior, dynamics that vary slowly relative to the component time scales, or even short time scales arising from a single dominant process. For many nonlinear problems, the temporal behavior of the physical system transitions among these three categories over the course of a simulation. Traditional approaches for simulating multiphysics processes have included operator-split, time-lagged, linearized, semi-implicit, or fixed-point methods, in which one physical process is treated at a time. Unfortunately, such decoupling of processes results in simulations with questionable stability, accuracy, convergence, or time-step controls.
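The accuracy cost of treating one process at a time can be seen on a toy problem. The sketch below is an illustration only, not an example given by the panel: it assumes the scalar model y' = -y - y^2, splits it into a "decay" process and a "reaction" process (both names are hypothetical), and advances each with its own exact flow. Even though each subproblem is solved exactly, the split solution carries an error that comes purely from the decoupling.

```python
import math

def exact(y0, t):
    """Exact solution of the coupled problem y' = -y - y**2."""
    return 1.0 / ((1.0 / y0 + 1.0) * math.exp(t) - 1.0)

def flow_decay(y, h):
    """Exact flow of the first process alone: y' = -y."""
    return y * math.exp(-h)

def flow_reaction(y, h):
    """Exact flow of the second process alone: y' = -y**2."""
    return y / (1.0 + h * y)

def lie_split(y0, t_end, n_steps):
    """First-order (Lie) splitting: full decay step, then full reaction step."""
    h = t_end / n_steps
    y = y0
    for _ in range(n_steps):
        y = flow_reaction(flow_decay(y, h), h)
    return y

def strang_split(y0, t_end, n_steps):
    """Second-order (Strang) splitting: half reaction, full decay, half reaction."""
    h = t_end / n_steps
    y = y0
    for _ in range(n_steps):
        y = flow_reaction(y, 0.5 * h)
        y = flow_decay(y, h)
        y = flow_reaction(y, 0.5 * h)
    return y

def splitting_errors(n_steps, y0=1.0, t_end=1.0):
    """Return (Lie error, Strang error) at t_end against the exact solution."""
    ref = exact(y0, t_end)
    return (abs(lie_split(y0, t_end, n_steps) - ref),
            abs(strang_split(y0, t_end, n_steps) - ref))
```

Calling `splitting_errors(10)` and `splitting_errors(20)` shows the expected pattern for this model: halving the step roughly halves the Lie error and roughly quarters the Strang error. For stiff or strongly coupled systems the picture is less benign, which is the panel's concern.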

One conclusion reached on this topic is that additional frameworks are required for analyzing the properties of these "practical" multiphysics iterations, including error propagation, convergence, and computational complexity, as the problems increase in size. Additionally, these approaches must be re-evaluated for robustness, accuracy, scalability, efficiency, and predictability, in light of more recent developments toward large-scale fully coupled simulation solver frameworks. Related to the issue of multiphysics models is that of multiscale modeling, which aims to capture physical properties at disparate spatial scales, and even at differing levels of modeling accuracy, within the same numerical solution. With such systems, there is a critical need for the development and analysis of multiscale techniques capable of capturing continuum-to-continuum and continuum-to-atomistic processes. The interactions between deterministic and stochastic subcomponent integrators need to be defined, along with theory for proper incorporation of stochastic processes and uncertain physical measurements into deterministic models.

In addition to these emerging research areas for multiphysics and multiscale models, much remains to be done on basic needs for continuum-level modeling. Applications of computational science are rarely as simple as the problems for which most numerical methods were devised. For even mildly nonideal problems, many of these techniques may not prove adequate. Such problems continue to require stable, higher-order, efficient time integration methods with error estimation and control, along with stable, higher-order spatial discretization approaches with error estimation and adaptivity. Many of these applications, including electromagnetics, relativity, and conservation laws, involve constrained evolution problems, which require increased research on symplectic numerical methods designed to preserve geometric solution properties.
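To make "time integration with error estimation and control" concrete, here is a minimal sketch of an adaptive integrator for a scalar ODE. It is an assumption-laden illustration rather than a method endorsed by the panel: it uses the low-order embedded Heun/Euler pair, takes the difference between the two solutions as the local error estimate, and applies a simple step-size controller with a safety factor of 0.9. The function name `adaptive_heun` and its parameters are hypothetical.

```python
import math

def adaptive_heun(f, t0, y0, t_end, tol=1e-6, h0=0.1):
    """Integrate y' = f(t, y) from t0 to t_end with local error control.

    Returns (y at t_end, number of accepted steps, number of rejected steps).
    """
    t, y, h = t0, y0, h0
    accepted = rejected = 0
    while t < t_end:
        # Clip the step so the final one lands exactly on t_end.
        last = h >= t_end - t
        if last:
            h = t_end - t
        k1 = f(t, y)
        k2 = f(t + h, y + h * k1)
        y_low = y + h * k1                 # explicit Euler, order 1
        y_high = y + 0.5 * h * (k1 + k2)   # Heun's method, order 2
        err = abs(y_high - y_low)          # local error estimate
        if err <= tol:
            t = t_end if last else t + h
            y = y_high
            accepted += 1
        else:
            rejected += 1
        # The estimate scales like h**2, hence the square root in the controller.
        factor = 0.9 * math.sqrt(tol / err) if err > 0.0 else 2.0
        h *= min(2.0, max(0.2, factor))
    return y, accepted, rejected
```

For example, `adaptive_heun(lambda t, y: -2.0 * y, 0.0, 1.0, 1.0)` integrates a simple decay problem; tightening `tol` increases the number of accepted steps. Production integrators use higher-order pairs and more careful controllers, but the accept/reject structure is the same.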

Such simulation capabilities will contribute to scientific advances only if they can be verified and validated, i.e., if it can be shown that the simulation code properly approximates the model equations and that the model itself represents the physical processes under study. Traditional statistical techniques for such studies often require huge numbers of forward simulations, which remain prohibitive for many large-scale problems. Thus, we must seek more efficient ways to perform such studies in the large-scale modeling context. Analytic solutions, which have been devised for many simple problems, are sorely needed for multiphysics systems as well. Lastly, additional analysis techniques need to be devised for estimating the propagation of discretization and modeling errors through coupled, decoupled, and operator-split solution algorithms.
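One inexpensive form of code verification, where an analytic solution is available, is to check that the observed convergence order matches the theoretical one. The sketch below assumes a model problem chosen purely for illustration: -u'' = f on (0, 1) with zero boundary values and the known solution u(x) = sin(pi x). The helper names are hypothetical, and the solver is a plain second-order central-difference scheme with a tridiagonal (Thomas) solve.

```python
import math

def solve_poisson_1d(n, f):
    """Solve -u'' = f on (0, 1), u(0) = u(1) = 0, with n interior points."""
    h = 1.0 / (n + 1)
    x = [(i + 1) * h for i in range(n)]
    # Tridiagonal system with stencil (-1, 2, -1), right-hand side scaled by h^2.
    diag = [2.0] * n
    rhs = [h * h * f(xi) for xi in x]
    # Thomas algorithm: forward elimination.
    for i in range(1, n):
        m = -1.0 / diag[i - 1]
        diag[i] -= m * (-1.0)
        rhs[i] -= m * rhs[i - 1]
    # Thomas algorithm: back substitution.
    u = [0.0] * n
    u[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        u[i] = (rhs[i] + u[i + 1]) / diag[i]
    return x, u

def max_error(n):
    """Maximum nodal error against the analytic solution u(x) = sin(pi x)."""
    x, u = solve_poisson_1d(n, lambda s: math.pi ** 2 * math.sin(math.pi * s))
    return max(abs(ui - math.sin(math.pi * xi)) for xi, ui in zip(x, u))

def observed_order(n_coarse):
    """Observed order from two grids whose spacings differ by a factor of 2."""
    n_fine = 2 * n_coarse + 1
    return math.log2(max_error(n_coarse) / max_error(n_fine))
```

With `observed_order(15)` the result should be close to 2, the design order of the scheme; a value well below that would point to a bug in the discretization or the solver. The panel's point is that analytic solutions suitable for this kind of check are scarce for coupled multiphysics systems.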

The aims of modern computational science extend far beyond realistic and verified forward simulations. The eventual goal for many fields is not just the ability to simulate, but to use those simulation capabilities to meet additional goals---including optimal design, optimal control, parameter estimation, uncertainty quantification, and bifurcation analysis. Here, robust methods are needed for large-scale, nonlinear, and possibly ill-posed problems. Standard gradient-based methods, while leading to optimal efficiency, may slow down or fail given ill-posed or non-smooth problems. In many of these contexts, robustness is sometimes viewed as more important for a simulation than optimal scalability, and fixed-point, continuation, or homotopy-based methods may provide increased robustness. For many of these problems, derivative-free optimization methods, or even derivative-based methods using automatic differentiation tools, warrant further investigation.
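As a small illustration of what automatic differentiation tools do, the sketch below implements forward-mode differentiation with dual numbers. It is a toy under stated assumptions, not one of the tools the panel had in mind: the `Dual` class, the `sin` wrapper, and the `derivative` helper are hypothetical names, and only addition, multiplication, and sine are supported.

```python
import math

class Dual:
    """Number of the form value + deriv * e, where e**2 == 0."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    @staticmethod
    def lift(x):
        """Promote a plain number to a Dual with zero derivative part."""
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual.lift(other)
        # Product rule carried along with the value.
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__

def sin(x):
    """Sine with the chain rule applied to the derivative part."""
    x = Dual.lift(x)
    return Dual(math.sin(x.value), math.cos(x.value) * x.deriv)

def derivative(func, x):
    """Derivative of func at x, exact up to floating-point rounding."""
    return func(Dual(x, 1.0)).deriv
```

For instance, `derivative(lambda x: x * x * x + 2.0 * x, 3.0)` evaluates the derivative 3x^2 + 2 at x = 3 without any finite-difference step size to tune. Large-scale optimization more often relies on reverse-mode (adjoint) differentiation, which this sketch does not cover.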

The development of optimal solution algorithms as problem sizes increase remains at the forefront of much research in computational science. For fully implicit simulations of large multiphysics models, scalable solution methods for nonlinear problems typically rely on Newton–Krylov-based methods. Such approaches have been shown to ensure stability, accuracy, and robustness for many problems, but optimal scalability continues to depend on the development of optimal preconditioning approaches for the inner linear solves. To this end, research continues to be needed on so-called "physics-based" or "operator-based" preconditioners for various problem classes. More general algebraic approaches have also demonstrated optimal scalability, including algebraic multigrid (AMG) and adaptive-AMG, which attempt to improve scalability by automatically adjusting to a given linear operator. Further research is needed before such methods can be extended to the nonideal systems that better represent complex computational science applications, including nonelliptic and nonsymmetric problems, or even very high-dimensional problems, such as those arising in radiation transport or relativity. For true scalability, many problems may require multilevel methods in both time and space.
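The structure of a Newton–Krylov solver can be sketched in a few dozen lines. The example below makes several assumptions that are not from the panel: the model problem is a 1D discretization of -u'' + u^3 = f with zero boundary values, the Jacobian-vector product is approximated by a forward difference so the Jacobian is never formed, and the inner Krylov solver is unpreconditioned conjugate gradients, which is applicable here only because this particular Jacobian is symmetric positive definite. General multiphysics problems typically call for GMRES and, as the text stresses, a good preconditioner.

```python
import math

def residual(u, f, h):
    """F(u)_i = (-u[i-1] + 2 u[i] - u[i+1]) / h^2 + u[i]^3 - f[i]."""
    n = len(u)
    out = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        out.append((-left + 2.0 * u[i] - right) / (h * h) + u[i] ** 3 - f[i])
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def jac_vec(u, v, f, h, f_u):
    """Matrix-free approximation of J(u) v by a forward difference."""
    nv = norm(v)
    if nv == 0.0:
        return [0.0] * len(v)
    eps = 1e-7 * (1.0 + norm(u)) / nv
    f_pert = residual([ui + eps * vi for ui, vi in zip(u, v)], f, h)
    return [(a - b) / eps for a, b in zip(f_pert, f_u)]

def cg(matvec, b, rel_tol, max_iter):
    """Unpreconditioned conjugate gradients for an SPD operator."""
    x = [0.0] * len(b)
    r = list(b)
    p = list(r)
    rr = dot(r, r)
    stop = (rel_tol * math.sqrt(rr)) ** 2
    for _ in range(max_iter):
        if rr <= stop:
            break
        ap = matvec(p)
        alpha = rr / dot(p, ap)
        x = [x_i + alpha * p_i for x_i, p_i in zip(x, p)]
        r = [r_i - alpha * ap_i for r_i, ap_i in zip(r, ap)]
        rr_new = dot(r, r)
        p = [r_i + (rr_new / rr) * p_i for r_i, p_i in zip(r, p)]
        rr = rr_new
    return x

def newton_krylov(f, h, tol=1e-8, max_newton=20):
    """Inexact Newton iteration; returns (solution, Newton iterations used)."""
    u = [0.0] * len(f)
    for it in range(max_newton):
        f_u = residual(u, f, h)
        if norm(f_u) < tol:
            return u, it
        # Solve J(u) step = -F(u) approximately with the Krylov method.
        step = cg(lambda v: jac_vec(u, v, f, h, f_u),
                  [-x for x in f_u], 1e-4, 200)
        u = [u_i + s_i for u_i, s_i in zip(u, step)]
    return u, max_newton
```

A call such as `newton_krylov([1.0] * 20, 1.0 / 21)` solves the model problem on 20 interior points. The inner solve is where a physics-based or multigrid preconditioner would enter; without one, the number of Krylov iterations grows as the mesh is refined, which is precisely the scalability issue raised above.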

Optimally scalable algorithms are certainly needed, but in many problems it may be possible to exploit special structure, such as scale separation or smoothness, to achieve simulations in "better than *O*(*N*)" operations. With multiscale or multiphysics models, as mentioned earlier, the use of multiple models on multiple domains could lead to algorithms that need not globally simulate using the most accurate or detailed model or resolution, consequently enabling accurate simulations with far less computational work. Additional approaches for many applications include model reduction or reduced-space methods for complex multiparameter nonlinear systems. Spatially adaptive methods promise high accuracy with less work. Convergence theory for many of these approaches continues to be based on simplified model problems, however, and additional analysis is required for more complicated systems. Adaptive methods also remain less scalable to large problem sizes than uniformly refined simulations; low-complexity algorithms, scalable data structures, and optimally scalable parallel solvers for adaptive methods require further investigation. While many applications have traditionally focused on spatial adaptivity for increased efficiency, significant work remains to be done on combined time and space adaptivity.

Given the primary focus of many computational scientists on solution algorithms, we as a community must not lose sight of the reality that the overall efficiency and utility of our simulations depend on more than operation counts. Both data management and hardware optimization must be addressed as well. While current computational platforms are designed for high-flop performance on standard (and possibly antiquated) benchmarks, the overall complexity of algorithms may now be not so much flop-related as data-movement sensitive. With the current trend toward multicore computing technology, research is needed on how to adjust modern numerical methods for such continually changing computing hardware. A truly holistic solution would incorporate data management needs into applications from the ground up.

Modern applications have proved remarkably adept at producing huge quantities of data, but with the increases in simulation data have come new problems: How is that data to be appropriately mined for scientifically relevant information? Methods for integrating multimodal information from distributed, heterogeneous sources are a priority. We also need to develop efficient and automated filtering mechanisms that can create indices and query data to eliminate uninteresting information. And because we typically perform multiple simulations with different problem or solver parameters, record keeping needs to be automated, with the information to be retained including where each set came from, what the values of key parameters are within each set, and how the resulting data compares with other data sets.
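Automated record keeping of this kind can start very simply. The sketch below is one possible shape for a per-run provenance record, offered as an assumption rather than a system described by the panel: the `RunRecord` class, its fields, and the helper functions are all hypothetical. It stores where a data set came from (a code version string), the key parameters, and summary outputs, and derives a reproducible identifier from the inputs.

```python
import hashlib
import json
from dataclasses import dataclass, field, asdict

@dataclass
class RunRecord:
    """Provenance for one simulation run (all field names are illustrative)."""
    code_version: str
    parameters: dict
    outputs: dict = field(default_factory=dict)

    def run_id(self):
        """Short identifier derived from the inputs, independent of key order."""
        payload = json.dumps(
            {"code_version": self.code_version, "parameters": self.parameters},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]

def diff_parameters(a, b):
    """Map each parameter that differs between two runs to its pair of values."""
    keys = sorted(set(a.parameters) | set(b.parameters))
    return {
        k: (a.parameters.get(k), b.parameters.get(k))
        for k in keys
        if a.parameters.get(k) != b.parameters.get(k)
    }

def to_json(record):
    """Serialize a record, including its identifier, for storage or indexing."""
    doc = asdict(record)
    doc["run_id"] = record.run_id()
    return json.dumps(doc, sort_keys=True)
```

Two runs created with the same code version and parameters receive the same identifier, and `diff_parameters` answers the "how does this set differ from that one" question directly. A real deployment would also record input files, machine configuration, and timestamps, and would write the records automatically from within the application.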

In summary, applications in CSE require the development of improved algorithms---we cannot depend on increases in computational speed alone to solve modern complex problems using traditional algorithms. For nearly all enabling technologies, we require extension of mathematical methods and analysis from idealized model problems to nonlinear, coupled systems. Such investigations will require that applied mathematicians, computer scientists, and application scientists form significantly closer partnerships than those of the past. We must critically reevaluate "state-of-the-art" simulation capabilities, moving them decidedly forward in utility, accuracy, robustness, and efficiency, without requiring the reuse of old technology. As a result, we must be willing to move from traditional loosely coupled, poorly analyzed codes to newer, more optimal simulation codes. Many of the conclusions presented here have been expressed in the CSE community for several years, but we must collectively expand on these ideas in order to help shape the future of computational science and engineering research.

*The ideas presented here are those discussed by the panel; they do not constitute an exhaustive set of research directions for the CSE community, and further discussion is certainly needed. Interested readers are encouraged to visit **http://cam.ucsd.edu/cse_2007**, where panel organizer Daniel Reynolds has posted the presentation slides and a discussion summary.*

*Daniel Reynolds, a postdoc in mathematics and astrophysics at the University of California, San Diego, organized the panel. Ryan Szypowski is a mathematics graduate student at UCSD.*

*True scalability will require parallel multilevel methods in time. As we refine the mesh, we also refine the time step. To date, we have relied on increases in processor speed. This "solution" probably won't work indefinitely.---Robert Falgout*

*One of the main challenges now in CSE is to combine multiresolution methods for particular models with new methodologies for coupling different models together to allow for accurate modeling of phenomena in which widely varying length and time scales are important.---Michael Holst*

*Identifying the most interesting information requires complex analysis; automated and semi-automated tools are needed to identify interesting information quickly. Data comes in a wide variety of formats (simulations, text, images, sensors, GIS) and analysis techniques need to be useable on all of them.---Terence Critchlow*