Stochastic Models for Natural Images
May 3, 2002
It is exciting that SIAM has now recognized imaging science as a major area, has canonized it as an official branch of applied mathematics. Looking over the program for the Boston meeting, we see some of the major themes: (a) image enhancement (deblurring, denoising, super-resolution), (b) image compression, (c) 3D scene reconstruction from images, and (d) object recognition. These areas represent a merging of the fields of image processing and computer vision. What is not so apparent is where statistics fits in. One can make the case that imaging science is really a branch of applied Bayesian statistics, an area that we at Brown like to call "pattern theory," following the pioneering work of Ulf Grenander.
Statistics has dominated the field of speech recognition since the 70s, with the early success of hidden Markov models. It began to appear in vision in the mid-80s, with, for example, S. and D. Geman's application of statistical mechanical ideas like the Ising model to image analysis. But only recently have people begun to assemble the massive datasets needed to study the raw statistics of images properly, and to craft stochastic models to fit subtle patterns like the characteristic deformations of shapes and the grammatical assembly of image primitives into whole objects.
The first part of my talk described some of the complexities of raw image statistics, including (a) the high-kurtosis, non-Gaussian nature of image filter responses, (b) the scaling self-similarity of image statistics, and (c) the characteristic local geometric structures (edges, bars, and blobs) of images. The scaling self-similarity of the probability measure defining the class of natural images has a remarkable consequence: In the continuous limit, images cannot be modeled as functions at all, but must be taken as generalized functions (distributions in the sense of Schwartz). This is so because there is no self-similar stationary probability measure on the space of locally integrable functions (even mod constants). This theorem reflects a nasty fact about the world: It is cluttered. The high kurtosis, on the other hand, reflects another basic fact: The world is made up of discrete objects, generating signals with discontinuities. Finally, the stereotyped local structure is an idea that goes back to Julesz's "textons" and Marr's "primal sketch." But only now can we begin to quantify this intuition. Work of Ann Lee, which begins to carry out this quantification, was also described in my talk.
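The link between discrete objects and high kurtosis can be seen in a toy experiment. The sketch below is an illustration only, not the analysis from the talk: it assumes a synthetic "blocky" image made of randomly placed constant rectangles as a stand-in for a world of occluding objects, and it uses a simple horizontal pixel difference as the filter. The function names and all parameters (`blocky_image`, `n_rects`, the rectangle sizes) are hypothetical choices made for this example. The same filter applied to Gaussian white noise gives a kurtosis close to 3, while the blocky image should give a much larger value because most differences are zero and a few, at the edges, are large.

```python
import numpy as np


def kurtosis(x):
    """Plain (non-excess) kurtosis E[(x - mu)^4] / sigma^4; equals 3 for a Gaussian."""
    x = np.asarray(x, dtype=float).ravel()
    d = x - x.mean()
    return float(np.mean(d ** 4) / np.mean(d ** 2) ** 2)


def blocky_image(size=256, n_rects=200, seed=0):
    """Toy image of discrete objects: constant rectangles that occlude one another.

    This is an assumed stand-in for natural images, not real image data.
    """
    rng = np.random.default_rng(seed)
    img = np.zeros((size, size))
    for _ in range(n_rects):
        r0, c0 = rng.integers(0, size, size=2)   # top-left corner
        h, w = rng.integers(8, 64, size=2)       # height and width in pixels
        img[r0:r0 + h, c0:c0 + w] = rng.normal() # later rectangles overwrite earlier ones
    return img


def horizontal_difference(img):
    """Simplest derivative-like filter: difference of horizontally adjacent pixels."""
    return img[:, 1:] - img[:, :-1]


# Gaussian white noise as the reference: its filter response is again Gaussian.
noise = np.random.default_rng(1).normal(size=(256, 256))

k_blocky = kurtosis(horizontal_difference(blocky_image()))
k_noise = kurtosis(horizontal_difference(noise))

print(f"kurtosis, blocky image:   {k_blocky:.2f}")
print(f"kurtosis, Gaussian noise: {k_noise:.2f}")
```

Replacing `blocky_image()` with a grayscale photograph loaded as a NumPy array would give the corresponding measurement on real data; the rectangle model is used here only so that the example runs without any external files.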
The second part of the talk dealt with one approach to constructing stochastic models of "shape," based on fluid mechanics. In the 60s, Arnold discovered that the solutions of Euler's equation for incompressible ideal fluids can be interpreted as geodesics on the infinite-dimensional group of volume-preserving diffeomorphisms. Grenander and Miller proposed the use of modified Riemannian metrics on the full group of diffeomorphisms to model the deformations encountered in anatomy, e.g., in MRI images. Their metric gives rise to flows that solve a regularized form of the Euler equation for compressible fluids. This approach connects with work of Kendall and Bookstein on the statistical theory of shape, but gives it a better mathematical foundation.
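For readers who want the equations behind Arnold's observation, the following block writes them out in standard textbook form. The notation is an assumption of this sketch and does not come from the talk: $v$ is the velocity field, $p$ the pressure, $\varphi_t$ the flow of diffeomorphisms generated by $v$, and $D$ the fluid domain.

```latex
% Euler's equations for an incompressible ideal fluid
\frac{\partial v}{\partial t} + (v \cdot \nabla)\, v = -\nabla p,
\qquad \nabla \cdot v = 0 .

% The flow \varphi_t generated by v
\frac{\partial \varphi_t}{\partial t}(x) = v\bigl(\varphi_t(x),\, t\bigr),
\qquad \varphi_0 = \mathrm{id} .

% Arnold: solutions correspond to geodesics, i.e. critical points of the
% kinetic energy among paths of volume-preserving diffeomorphisms
E(\varphi) = \frac{1}{2} \int_0^1 \!\! \int_D |v(x,t)|^2 \, dx \, dt .
```

In this picture, a modified metric of the kind mentioned above can be thought of as replacing the $L^2$ norm of $v$ in the energy by a different norm and dropping the constraint $\nabla \cdot v = 0$, so that the paths range over the full group of diffeomorphisms. The specific norm is not stated in this article and is left unspecified here.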
David Mumford is a professor of applied mathematics at Brown University.