Spectral Techniques for Random GraphsNovember 27, 1998
Noga Alon's presentation on efficient methods for finding structural properties of graphs, in particular the largest clique in a graph, was one of two joint plenary talks given in Toronto as part of both the SIAM Annual Meeting and the Ninth SIAM Conference on Discrete Mathematics.
The construction of fast algorithms for determining structural properties of graphs is a classic challenge in discrete mathematics and theoretical computer science. Such problems are easy to state and illustrate, but they are often provably difficult, in the sense of being NP-hard. (The class of NP-hard problems consists of problems for which there are no known algorithms whose computation time grows only as a polynomial in problem size. If just one NP-hard problem could be solved in polynomial time, however, they all could.)
Information about how the nodes of a graph are connected is important in the analysis and design of circuits, VLSI chips, and networks of all sorts. The minimum bisection width of a graph, for example---the division of the nodes into two equal groups that minimizes the number of edges with an end in each group---affects the efficiency of divide-and-conquer algorithms, which operate by decomposing large networks into smaller, more tractable pieces. For both its intellectual challenge and its descriptive name, one of the best known of these structural problems is that of finding the largest clique in a graph. A clique is a set of mutually adjacent vertices. In a computer network, for example, a clique is a set of machines, each of which can communicate directly with any other without an intermediary.
Structure from Eigenvectors
One route to efficient algorithms is through the structural clues provided by eigenvectors of matrices associated with the graph. Noga Alon of Tel-Aviv University described some of these spectral techniques in Toronto in July, in one of the two joint plenary presentations at the combined 1998 SIAM Annual Meeting and SIAM Conference on Discrete Mathematics.
For the largest-clique problem, the spectral approach is far better than visual inspection. "Intuitively, a large clique should be easy to see in a picture of a graph," Alon said. "It is the big black part! But as the problem of finding the largest clique or even that of approximating its size is NP-hard, this intuition is misleading." In general, only a clique a small fraction of the size of the largest one can be found efficiently.
A more manageable, and perhaps more practical, challenge is the analysis of algorithms for solving such problems on random graphs. Making a tongue-in-cheek case for considering random graphs, Alon said, "Mathematicians tend to prefer a worst-case analysis, a kind of paranoia that is especially understandable if you live in Israel! But algorithms that use information about the spectra of matrices associated with the graph can work especially well with random graphs."
The Probabilistic View
Richard Karp, a former SIAM John von Neumann lecturer and, coincidentally, the second joint invited speaker at the Toronto meetings, had introduced the probabilistic view of discrete algorithms in an influential 1976 paper. In that paper, Karp distinguished his notion---drawing instances of problems from specified probability distributions and showing that "certain polynomial-time algorithms almost surely find near-optimal solutions"---from an earlier idea of Ronald Graham's, finding polynomial-time algorithms that produce approximate solutions for all instances of a given NP-hard problem.
Karp was motivated by his observation that "a proof that P ≠ NP would have only limited negative implications. . . . There would still be a possibility of finding a polynomial-time algorithm that gives approximately optimal solutions for all inputs, or optimal solutions for almost all inputs, or approximately optimal solutions for almost all inputs." Although it is now known that the first alternative is unlikely, the other two are certainly possible.
Alon and his colleagues work in the middle category, developing algorithms that find, for example, what is almost surely the largest clique in a random graph drawn from a certain class. They exploit the surprising structure of random graphs, a study initiated in 1959 by Paul Erdös and Alfred Renyi, and they mine the load of structural information contained in the eigenvalues and eigenvectors of matrices associated with the graph.
There is a pleasing synergy between the structure of a graph and the spectrum of its adjacency matrix, a relation that links an active area of discrete mathematics and theoretical computer science with classical analysis and scientific computing. Few speakers could describe in one breath work whose roots were cultivated by both Paul Erdös and James Wilkinson!
The adjacency matrix A of a graph is the (symmetric) matrix in which the element ij is 1 if the nodes i and j are connected by an edge and 0 otherwise. If the graph is itself a clique---if every node is connected to every other node---then A is a matrix with 1's off the diagonal and 0's on the diagonal. A vector of 1's is an eigenvector of that matrix. The corresponding eigenvalue-the largest eigenvalue-is just the common degree of the nodes. For graphs with more varied structure, that vector of 1's is no longer an eigenvector of the adjacency matrix. However, it can be substituted into the classic Rayleigh quotient (x, Ax)/(x, x) to show that the largest eigenvalue is at least the average degree of the nodes of the graph. (The maximum of the Rayleigh quotient is the largest eigenvalue of A.)
If the graph contains a clique, the adjacency matrix of that clique is a submatrix of A, and the vector with 1's at the nodes of the clique and an appropriate small real number elsewhere is almost an eigenvector of A. For the class of random graphs considered by Alon, the corresponding eigenvalue is close to half the size of the clique.
If the clique is large enough, that eigenvalue will be the second largest. A corresponding eigenvector will tend to have large components at the nodes of the clique and small components elsewhere. An eigenvector associated with the second-largest eigenvalue of a graph, then, can identify most of the nodes in a sufficiently large clique---that is, it can order the nodes according to the size of their coordinates in the eigenvector.
To attack random graphs with large cliques, Alon began with a graph in which each pair of nodes is connected with probability 1/2 by an edge. As the size of such a graph grows, a known function gives with probability approaching 1 the size of the largest clique. (The number of nodes in the largest clique is proportional to the logarithm of the size of the graph.) Smaller cliques, up to about half the maximum size, can be found almost surely by a well-known polynomial-time algorithm.
But Alon and his colleagues can do better, by randomly adding a larger clique of specified size to random graphs of this type. If the added clique is too large, then its nodes are almost surely the nodes of the graph that have largest degree and the problem of finding the clique is trivial.
In intermediate cases, spectral techniques have led Alon to a polynomial-time algorithm that can reliably find cliques whose size is a small multiple of the square root of the number of nodes in the graph. Spectral information, Alon points out, gives an essential head start: "One can almost surely extract a big portion of the hidden clique using the eigenvector of the second-largest eigenvalue of the adjacency matrix." Alon's proof of the reliability of the algorithm couples spectral properties of the adjacency matrix with estimates for the binomial distribution that determines the structural properties of the random graph.
An important open problem is the design of reliable algorithms for finding smaller maximal cliques-those, say, whose size is a power somewhat less than half the number of nodes. Alon also described similar ideas and challenges that arise in vertex and edge expansions, in graph bisection, and in graph coloring.
Having seen Alon demonstrate the power of spectral methods for analyzing the structure of graphs, that rare mathematician who is given to puns might be driven to suggest that almost surely, eigenvectors will continue to point the way to faster algorithms for random graphs. Fortunately, the probability of occurrence of such individuals is vanishingly small!
Paul Davis is a professor in the Department of Mathematical Sciences at Worcester Polytechnic Institute.