Mathematical Modelers Look to an Uncertain Future
December 16, 1998
Error enters ineluctably at every stage of a mathematical model's life cycle, says James M. Hyman, organizer of a minisymposium on uncertainty in modeling and simulation at SIAM's Toronto meeting.
"We need new tools," James M. (Mac) Hyman told the audience packed into a room at the 1998 SIAM Annual Meeting in Toronto. Hyman, group leader for mathematical modeling and analysis in the Theoretical Division at Los Alamos National Laboratory, was speaking at a minisymposium on quantifying uncertainty and predictability in mathematical models. He is one of a growing number of researchers who believe that the time has come for mathematical models of complex systems to take seriously the question of their own limitations.
Global climate modeling, multiphase flow problems, and studies of urban traffic patterns are among the numerous areas in which uncertainty is a prominent, even dominant feature. The traditional mindset of minimizing uncertainty and eliminating sources of error is increasingly unhelpful in the face of problems that are inherently stochastic or highly nonlinear (or both). Instead, researchers need to study uncertainty in its own right, Hyman and others say.
The study of predictability "is going to reshape the scientific method in a very significant way," predicts James Glimm of the State University of New York at Stony Brook. In particular, he says, scientists are coming to "an entirely new point of view, that error is the leading term in something"---namely "the probability that you took the right model in the first place."
Error In/Error Out
Mathematical modelers have always known to take the garbage that comes out of their computers with a grain of salt. Savvy practitioners develop an instinct for how large the error bars need to be. But instinct alone is not enough to deal with the hugely complicated models that arise in the applications made possible by today's computers and supercomputers. What's more, the trend is toward taking humans out of the loop---the brain's ponderous approach to decision-making is much too slow, not to mention its penchant for boredom and mistakes. Modelers need to automate the way they weigh the reliability of their results, instructing machines on how to take answers with a grain of silicon.
Error enters ineluctably at every stage of a mathematical model's life cycle, Hyman explains. It begins with decisions as to the relevance of potential factors. "There's no way you can include everything that can possibly affect the system [you're modeling]," Hyman points out. But how do you know for sure when something is irrelevant? "Do you ignore the phase of the moon, for example?" Hyman asks. "If you're considering tide calculations, you can't!"
Even when you can be confident that you haven't tossed out the baby with the bathwater, model simplification is a gamble. "By making these reductions in the size of your model, you introduce errors that are going to propagate all the way through to your predictions," Hyman says.
Then come the purely mathematical assumptions. How do you deal with nonlinearities in your model? If noise is present, is it white? If there's more than one random factor, how are they correlated? Numerical uncertainties also enter in. How do you track the propagation of truncation and round-off error? How do you describe your confidence in a simulation done on a grid that's coarser than you'd like, with initial conditions that are incomplete and imprecise? (This last problem is particularly vexing in climate and ocean models, Hyman points out. For example, it's known that most of the energy in oceans is concentrated in eddies that are about 10 kilometers in diameter, but the finest-scale models today use grid points 40 kilometers apart.)
Modelers must also worry about computer bugs. And, finally, there's opportunity for error and uncertainty in the way results are displayed and interpreted: Even if everything were calculated perfectly, you would still have to decide which results are pertinent.
What's lacking are rigorous and robust methods for quantifying uncertainty. "One of the problems," Hyman says, "is that we have very few tools." Sensitivity analysis, for example, works well for low-dimensional systems of ordinary differential equations, but not for the high-dimensional systems of partial differential equations that dominate in many applications. Finding appropriate norms is another problem: Judging the accuracy of a weather prediction by taking an L^2 norm of its difference from the actual weather can be grossly unfair, unless, say, it's tremendously significant that it rained at 4:00 rather than at 4:15.
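The point about norms can be made concrete with a toy calculation. The sketch below is only an illustration under invented assumptions: rain is modeled as a 10-minute rectangular shower of unit intensity, time is measured in minutes after noon, and the names `rain_pulse` and `l2_distance` are hypothetical. It compares two forecasts against an "observed" shower at 4:00: one that predicts the same shower at 4:15, and one that predicts no rain at all.

```python
import math

def rain_pulse(t, start, duration=10.0, rate=1.0):
    """Rain rate at time t (minutes after noon): a rectangular shower."""
    return rate if start <= t < start + duration else 0.0

def l2_distance(f, g, t0, t1, dt=0.5):
    """Discrete L^2 distance between two functions of time on [t0, t1)."""
    n = int(round((t1 - t0) / dt))
    total = sum((f(t0 + i * dt) - g(t0 + i * dt)) ** 2 for i in range(n))
    return math.sqrt(total * dt)

def observed(t):
    return rain_pulse(t, start=240.0)   # shower begins at 4:00

def late_forecast(t):
    return rain_pulse(t, start=255.0)   # same shower, predicted for 4:15

def dry_forecast(t):
    return 0.0                          # predicts no rain at all

err_late = l2_distance(observed, late_forecast, 0.0, 480.0)
err_dry = l2_distance(observed, dry_forecast, 0.0, 480.0)
print(f"late forecast: {err_late:.3f}, dry forecast: {err_dry:.3f}")
```

Because the two showers do not overlap, the L^2 norm charges the late forecast for both a missed shower and a false one, so it scores worse than the forecast that misses the rain entirely. A norm that tolerates small shifts in time would rank the two forecasts the other way around.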
Programming languages need new, error-oriented datatypes, such as probability density functions, Hyman adds. "If you look at Fortran or C or C++, we have reals and scalars, integers, characters, and functions. Nowhere in there can you find a datatype that will handle a pdf. Everyone builds them themselves, and they're completely incompatible."
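To suggest what such a datatype might look like, here is a minimal sketch in Python. It is not a proposal from the minisymposium: the class name `DiscretePDF` and its methods are hypothetical, it handles only finite discrete distributions, and it assumes that quantities being added are independent. The idea is simply that an uncertain quantity carries its whole distribution through arithmetic, rather than a single number.

```python
from collections import defaultdict
from fractions import Fraction

class DiscretePDF:
    """Hypothetical datatype: a finite probability mass function."""

    def __init__(self, masses):
        total = sum(masses.values())
        if total <= 0:
            raise ValueError("total probability mass must be positive")
        # Normalize so the masses sum to exactly 1.
        self.masses = {x: Fraction(p) / total for x, p in masses.items() if p}

    def __add__(self, other):
        """Distribution of X + Y, assuming X and Y are independent."""
        out = defaultdict(Fraction)
        for x, p in self.masses.items():
            for y, q in other.masses.items():
                out[x + y] += p * q
        return DiscretePDF(out)

    def mean(self):
        return sum(x * p for x, p in self.masses.items())

    def variance(self):
        m = self.mean()
        return sum((x - m) ** 2 * p for x, p in self.masses.items())

# Usage: two independent inputs, each uniform on 1..6.
die = DiscretePDF({k: 1 for k in range(1, 7)})
total = die + die
print(total.mean(), total.variance())
```

A production-quality type would also need continuous densities, correlated inputs, and nonlinear operations, which is where home-built versions tend to diverge.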
Linda Petzold of the University of California at Santa Barbara described an effort she's led, using "reduced" models of chemical kinetics, to quantify uncertainty in computer simulations. Chemical engineers and environmental chemists frequently deal with systems involving thousands of reactions among hundreds of chemical species. The problem is to identify a manageably sized subset of reactions that account for as much of the dynamics as possible.
"I'm in the business of making uncertainty," she jokingly explains. The question is, "How good are my bad answers?" Abstractly, Petzold's approach amounts to a problem in integer programming. If the full system is described by an equation x' = R_1(x) + R_2(x) + . . . + R_N(x) (x being a vector of, say, concentrations of chemical species), the goal is to find a vector of 0's and 1's, (k_1, k_2, . . . , k_N), with k_1 + k_2 + . . . + k_N = K << N, such that the solution y to the "reduced" equation y' = k_1 R_1(y) + k_2 R_2(y) + . . . + k_N R_N(y) is as close as possible to the "true" solution, subject to the restriction on the number of nonzero k's.
"I'd like to have the reduced mechanism be able to give insights," Petzold says. "But at the same time it should be simpler and much cheaper to evaluate than the original system."
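The selection problem described above can be illustrated on a toy system small enough to solve by exhaustive search. The sketch below is not Petzold's method: the four reaction terms, their rate constants, the forward-Euler integrator, and the choice of error measure (the largest gap between trajectories) are all assumptions made for illustration. It tries every 0/1 vector with exactly K ones and keeps the one whose reduced solution stays closest to the full solution.

```python
from itertools import combinations
import math

# Hypothetical reaction terms R_i(x) for two species, x = (a, b).
def r1(x): return (-1.0 * x[0], 1.0 * x[0])      # a -> b, fast
def r2(x): return (0.01 * x[1], -0.01 * x[1])    # b -> a, slow
def r3(x): return (0.0, -0.5 * x[1] * x[1])      # nonlinear loss of b
def r4(x): return (-0.001 * x[0], 0.0)           # very slow loss of a

REACTIONS = [r1, r2, r3, r4]

def simulate(mask, x0=(1.0, 0.0), dt=0.01, steps=500):
    """Forward-Euler trajectory of y' = sum_i k_i R_i(y), with k = mask."""
    x, path = x0, [x0]
    for _ in range(steps):
        da = db = 0.0
        for k, reaction in zip(mask, REACTIONS):
            if k:
                ra, rb = reaction(x)
                da += ra
                db += rb
        x = (x[0] + dt * da, x[1] + dt * db)
        path.append(x)
    return path

def error(mask, reference):
    """Largest Euclidean gap between the reduced and reference paths."""
    return max(math.hypot(p[0] - q[0], p[1] - q[1])
               for p, q in zip(simulate(mask), reference))

def best_reduction(k_keep):
    """Exhaustively search all masks with exactly k_keep ones."""
    n = len(REACTIONS)
    reference = simulate((1,) * n)
    candidates = []
    for kept in combinations(range(n), k_keep):
        mask = tuple(1 if i in kept else 0 for i in range(n))
        candidates.append((error(mask, reference), mask))
    return min(candidates)

best_error, best_mask = best_reduction(2)
print(best_mask, best_error)
```

Exhaustive search examines every subset of size K, so its cost grows combinatorially with N. With thousands of reactions it is hopeless, which is why the lack of theory for the nonlinear case matters.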
The problem is that, although there's a well-developed theory for linear integer programming, the component reactions in chemical kinetics tend to be nonlinear. "There's been amazingly little work on nonlinear integer programming problems," Petzold observes. The uncertainty toolchest, it seems, has little in it beyond a hammer, a pair of pliers, and a few scraps of sandpaper.
Barry A. Cipra is a mathematician and writer based in Northfield, Minnesota.