Avoiding Communication in Numerical Linear Algebra
Algorithms have two kinds of costs: arithmetic and communication, by which we mean moving data either between levels of a memory hierarchy (in the sequential case) or between processors over a network (in the parallel case). Communication costs can already exceed arithmetic costs by orders of magnitude, and the gap is growing exponentially over time, so our goal is to design linear algebra algorithms that minimize communication.First, we describe lower bounds on communication that any O(n^3)-like algorithm must satisfy, almost none of which are attained by conventional algorithms. Second, we describe new algorithms that do attain them, communicating asymptotically less than conventional ones, and demonstrate large modeled and measured speedups. Third, we do the same for iterative solvers for sparse matrices. In all cases there is a large design space of new algorithms to explore, for which we use autotuning.
James Demmel, University of California at Berkeley, USA