Sparse Matrix Algorithm Drives SPICE Performance GainsMay 1, 2007
Readers may remember spending hours in an engineering lab, painstakingly breadboarding a circuit to test a design: carefully positioning resistors and leads, applying voltage in the right place, measuring currents, and making adjustments to the design until the circuit behaved correctly.
Although helpful for testing relatively simple discrete designs, the method can't be used for complex integrated circuits, with their multiple layers generated by photo-lithography. In that case, researchers turn to circuit simulation software to test their designs before they are sent to be manufactured.
SPICE---Simulation Program with Integrated Circuit Emphasis---was created at Berkeley in the 70s, and has since evolved and sparked the development of numerous versions, some open-source and others commercial. The various programs generate sets of differential equations based on the layout and components of a circuit, then use one of several methods to solve the equations, thereby "simulating" the circuit.
One method used in many circuit simulation programs involves sparse matrix techniques. For small circuits, most solvers work very quickly, with little discernible difference in running times. As circuits get larger, however, the sparse matrix factorization algorithm used increasingly affects performance.
Tim Davis, an associate professor of computer science at the University of Florida, has developed a high-performance algorithm for factoring matrices arising in circuit simulation. He presented some of the underlying background in an invited talk in Boston last summer at the SIAM Annual Meeting. The algorithm has been implemented in several versions of SPICE, both commercial and open-source. Davis calls the LU decomposition-based algorithm KLU---LU for its expression of matrices as products of lower and upper triangular matrices, K for Clark Kent.
The KLU algorithm, according to Davis, is "somewhere between Gilbert/Peierls' left-looking LU and Li, Demmel, and Gilbert's SuperLU." As to Clark Kent, Davis thinks of KLU as "what SuperLU was before it became super."
The key step of the Gilbert/Peierls algorithm (GPLU) is the solution of Lx = b, where L, x, and b are sparse. In this step, the columns of L and U are computed, one at a time. GPLU does not specify a fill-reducing ordering, a choice that is crucial for circuit simulation matrices. SuperLU takes advantage of supernodes---collections of adjacent columns that have the same nonzero pattern. Circuit matrices are typically too sparse to allow efficient exploitation of supernodes, however, which means that the "super" nature of SuperLU is not very useful in circuit simulation.
KLU builds on GPLU by using preordering strategies not included in GPLU or SuperLU, but avoids the supernode-based strategy of SuperLU. Very little fill-in occurs with the ordering used by KLU. The resulting algorithm has improved the speed of circuit simulation in many cases.
One notable implementation of KLU is in Xyce, Sandia National Laboratories' version of SPICE. Created in 1999 to fill the need for a distributed-memory parallel circuit simulation tool at Sandia, Xyce was designed as a parallel code. KLU was first implemented in Xyce in 2004; on its official release in 2005, the software performed substantially better in certain cases.
"Xyce with KLU has been used to perform critical variability studies wherein hundreds of versions of Xyce are run, in serial on a parallel computer cluster," says Scott Hutchinson, manager of electrical and microsystems modeling at Sandia. "The KLU solver's improved performance . . . [has decreased] the linear solution times anywhere from 20% to an order of magnitude as the problem size increases."
The algorithm could have a huge impact on overall simulation performance, Hutchinson explains, because of the order of operations performed during a linear solve. "With most linear problems, what makes them time consuming is that the number of operations . . . scales superlinearly with the number of unknowns." The KLU solve, as "the ‘inner kernel' of a nonlinear and time-integration solve, can get called millions of times. Thus, speed improvements at this level can have a big impact on the overall time to solution."
KLU is a strong example of specialized improvement: Davis developed it expressly for circuit simulation, and the improvements over SuperLU are limited to circuits. KLU also improves on another factorization method, the Cholesky decomposition, in the case of circuit simulation: While Cholesky can be twice as efficient as KLU and other methods for solving linear equations, it typically fails in circuit simulation.
"Cholesky is good only if the matrix is symmetric and positive-definite. It fails otherwise," Davis says. "Circuit matrices are usually not symmetric."
Additionally, Davis points out, SuperLU and UMFPACK (another sparse linear system solver, which is used in MATLAB) are better for discretizations of, say, two- and three-dimensional differential equations, but inferior to KLU for circuits.
"Circuits do not have a 2D or 3D geometry; they look more like networks, not meshes," he says, explaining KLU's advantage for circuits.
Despite being created for circuit simulation, KLU may also have specialized applications elsewhere. Davis suggests that it might also be an improvement over SuperLU and other methods when used for chemical process simulation, where the very sparse, irregular network-like structure of the matrices involved is similar to that of circuit simulation matrices.
In the meantime, however, Davis is still working with commercial developers of other SPICE-type simulators to further improve KLU's performance in circuit simulation. Hutchinson and his colleagues at Sandia currently use KLU only as a serial solver, but would like to expand its use for larger-scale problems.
Currently, "at some problem size---say 100,000 devices---we switch to using parallel computers and iterative solvers" instead of KLU, Hutchinson says. "We are working with Tim on a parallel version of KLU to improve things further as the problems scale."
MPEG and WMV versions of Davis's talk, as well as an abstract and two series of slides (in PDF format), are available on his Web site at http://www.cise.ufl.edu/research/sparse/SIAM06/.
Tim Davis, a computer scientist at the University of Florida, is the author of Direct Methods for Sparse Linear Systems, which was published by SIAM last year.
Michelle Sipics is a contributing editor at SIAM News.