A Shot in the Arm for Integer ProgrammingNovember 16, 2008
Figure 1. In 1995, the Centers for Disease Control required that children be immunized for five disease groups. Courtesy of Sheldon Jacobson.
Barry A. Cipra
Twelve years ago, Sheldon Jacobson and Edward Sewell started to work on an operations research problem in public health that did not exist at the time---and is barely problematic even today. The tools they and colleagues have developed, however, could soon be essential to the white coats and bean-counting bureaucrats who are responsible for deciding how best to meet the schedule of mandatory childhood immunizations from a growing list of vaccines for a growing list of diseases. Jacobson, an OR expert at the University of Illinois at Urbana–Champaign, described the problem and tools in a mini-symposium on optimization at the SIAM Conference on Discrete Mathematics, held June 16–19 at the University of Vermont.
In 1995, the Centers for Disease Control, which sets the schedule for pediatric care, listed five groups of diseases for which immunization was required, with multiple doses required for most (see Figure 1). Off-hand, there was no problem at all. The parent simply took the infant to the doctor, who simply gave two or three or four injections per visit, until all required doses of each vaccine had been administered. Where it got slightly complicated was that vaccine manufacturers, of which there were several at the time, were developing combination vaccines that could, in effect, deliver two or occasionally three single doses for more than one disease group. This meant that a cost-conscious doctor or public health official had more than one way to meet the schedule, some more economical than others. Still, the options were so limited that even a math-phobic family physician could easily figure out the best way to proceed.
But Bruce Weniger, a physician, and Robert Deuson, an economist, both at the CDC, saw an oncoming computational crisis: Both the number of diseases for which vaccines were available and the number of different combination vaccines were going to grow, and as they did, the number of possible ways to meet the schedule was going to exceed the capabilities of eyeball optimization. That prospect has become reality. The current CDC schedule covers 11 disease groups (see Figure 2), and four combination vaccines are available, two of which are "pentavalent": GlaxoSmithKline's Pediarix, which immunizes simultaneously for diphtheria, tetanus, pertussis, poliovirus, and hep-atitis B, and Pentacel, manufactured by Sanofi Pasteur, for diphtheria, tetanus, pertussis, poliovirus, and Haemophilus influenzae type b.
Figure 2. Recommended immunization schedule for children from 0 to 6 years of age---United States, 2008. From the Centers for Disease Control; more information can be found at http://www.cdc.gov/vaccines/recs/schedules/child-schedule.htm.
Another significant complicating factor is the overlapping of timeframes for the delivery of doses for individual diseases. If the CDC set a schedule calling for delivery of each dose for each disease in a single, specified month, there would, again, be little problem and life would be easier for the operations researchers. But the CDC's job is not to make the mathematics easy---it's to make life better for people. As it is, the problem breaks into a sum of subproblems whenever one month's options don't overlap with the next month's, as happens in the current schedule for months 2 and 4. But months 6 through 18 are interlinked.
(It could be worse, Jacobson points out. In one respect, the CDC has simplified the mathematics: It used to have overlapping timeframes for successive doses for certain individual diseases, such as hepatitis B, with dose 1 to be given in months 0 (birth) to 2, and dose 2 in months 1, 2, or 4, with the stipulation that both doses could not be given in months 1 or 2 and that a minimum amount of time had to elapse between the doses. The current schedule has eliminated that option---but for reasons that have nothing to do with computational complexity.)
Deuson met Jacobson at an INFORMS (Institute for Operations Research and the Management Sciences) conference in 1996 and got him interested in the problem. Jacobson went to work on it with Weniger, Edward Sewell of Southern Illinois University at Edwardsville, and, over the years, a number of graduate students (including Shane Hall, who worked on mathematical aspects of the problem for his PhD dissertation and is now at the Air Force Institute of Technology) and others. Over the last 10 years, the group has refined the model and its software implementation, which is freely available at www.vaccineselection.com.
The Web site has been averaging a healthy 200 hits per month in the last few years, Jacobson says. The expectation, he adds, is that "the use of it will explode" once Pentacel, the second of the pentavalent vaccines, becomes widely available through the Vaccines for Children program.
The problem lends itself to integer programming: In any given month, an infant either receives a given vaccine or does not; each vaccine covers a certain set of diseases; constraints are imposed by the CDC schedule; and there are costs to be minimized, associated with the vaccine and its delivery (i.e., the doctor or nurse who administers it). The current model allows for arbitrary cost factors, such as pain and suffering of the patient (i.e., combination vaccines are "good" because they require fewer injections) and opportunity costs to the parent, who has to take time off from work or play to take the infant to the clinic; the doctor might not care about these factors, although public health officials should and often do consider them. Another important potential cost is "extraimmunization": giving the patient an un-called-for dose against some disease simply because it's cheaply bundled in a combination that covers other requirements. While generally innocuous from a scientific standpoint, such inoculations may be deemed undesirable for other reasons; the group's model facilitates their suppression or elimination.
The number of variables and constraints posited by the current CDC schedule and formulary of available vaccines does not pose a daunting task for commonly used integer programming solvers---CPLEX, say---but it does argue for machine computation over eyeball estimates. As Jacobson puts it, "It's hard to solve an integer program in your head." His group's aim is to keep pace with a growing schedule and formulary---on a problem that is known, as might be expected, to be NP-complete.
With Sewell taking the algorithmic lead, the group has recast the integer programming problem as a problem in dynamic programming--in particular, as a network shortest-path problem. "We created our own solver," Jacobson says, using a method they call "branch, bound, and remember." "Branch and bound" is a standard technique in search problems; "remembering" is at the heart of dynamic programming. In essence, after considering all the different vaccination states a child might progress through in growing up, and the costs associated with advancement from one state to another, the algorithm picks out the path of least cost (see Figure 3).
Figure 3. Shortest-path (i.e., minimum-cost) computations are pretty easy with only two diseases and four time periods. They get more complicated as the graph grows. Courtesy of Sheldon Jacobson.
What is the value of finding the optimal formulary over, say, a random selection of vaccines that satisfy the CDC's schedule? It depends, of course, on the various cost parameters, but Jacobson estimates that optimization could shave about 10% from a total price tag that starts at about $2000 per child. With about 4 million infants born each year, that translates into a pretty penny.
Jacobson sees general lessons to be learned from the 12-year experience. The vaccination-schedule work, he says, began as an informal, unfunded effort to provide a practical answer to a relatively straightforward integer programming problem. Today, with funding from the National Science Foundation, project researchers are working to understand the structure of a wide class of problems. Jacobson expects that applications of the current research will range from military operations to environmental concerns, such as reducing carbon footprints. Optimizing over the CDC immunization schedule, he says, "is really a very elegant combinatorial problem."
Barry A. Cipra is a mathematician and writer based in Northfield, Minnesota.