## HIV Modeling Emerges as Paradigm for New Statistics/Math Sciences Institute

**December 8, 2002**

"There has been a bifurcation between applied mathematics and statistics," says H.T. Banks (right) of North Carolina State University. "PDE modelers can't handle the uncertainty of their models, and statisticians aren't familiar with certain modeling considerations." SAMSI---the Statistical and Applied Mathematical Sciences Institute---was created to get the two groups working together, usually with domain scientists studying large complex systems. SAMSI has three co-directors: Banks and statisticians Alan Karr (left) of the National Institute of Statistical Sciences and Steve Marron (center) of the University of North Carolina. SAMSI's director is James Berger of Duke University.

Banks, a professor of mathematics at North Carolina State University, is teaching the course with Marie Davidian, a colleague from the NCSU statistics department. It is part of a much larger project in which they, and the university, are involved: the Statistical and Applied Mathematical Sciences Institute, SAMSI for short.

Officially up and running for only a few months, SAMSI is one of six institutes in the mathematical sciences funded by the National Science Foundation's Division of Mathematical Sciences. It joins the well-established and long-running Institute for Mathematics and Its Applications at the University of Minnesota and the Mathematical Sciences Research Institute in Berkeley, the two-year-old Institute for Pure and Applied Mathematics at UCLA, and its two new counterparts: the Mathematical Biosciences Institute at Ohio State University (see *SIAM News*, October 2002) and the American Institute of Mathematics (AIM) Research Conference Center, in Palo Alto.

SAMSI's mission, as stated on its Web site (www.samsi.info/), is to "forge a new synthesis of the statistical sciences and the applied mathematical sciences with disciplinary science to confront the very hardest and most important data- and model-driven scientific challenges."

Funded by NSF at $10 million for five years, SAMSI is a consortium of four North Carolina institutions: NCSU, Duke University, the University of North Carolina, and the National Institute of Statistical Sciences (NISS). The overall director is James Berger of Duke; the co-directors are Banks, representing NCSU (and applied mathematics), and statisticians Alan F. Karr of NISS and J.S. Marron of UNC. Like its co-directors, SAMSI is two thirds statistics, one third mathematics. Its home is the top floor of the NISS building in Research Triangle Park.

**The Ubiquity of Inverse Problems**

In conversation with *SIAM News*, Banks portrays himself as the most reluctant of directors. Having spent the last ten years as director of the Center for Research in Scientific Computation at NCSU, he has hands-on awareness of the rewards and challenges of running a large center funded primarily by the government (Defense Department agencies in the case of CRSC). And so two and a half years ago, when NSF released a solicitation for new institutes in the mathematical sciences, he was ambivalent. While supportive of the idea that the three to four hundred mathematicians and statisticians in North Carolina's Research Triangle had a collective contribution to make, Banks didn't initially see how he could be part of the venture.

At the same time, though, his own research was moving into a new area, HIV modeling, that had him working with an extended multidisciplinary team: faculty, students, and industry researchers in applied mathematics, statistics, biostatistics, and the biomedical sciences. What the group needed, he tells *SIAM News*, was "a place for the sorts of interactions that drive such a project." The place he was describing turned out to be a perfect match for the institute he eventually joined his North Carolina colleagues in proposing, and then running.

Two months into SAMSI's first program, he tells *SIAM News* how the HIV modeling effort has "emerged as a sort of paradigm for SAMSI." From his perspective, it was in 1999, at a SIAM Mathematics in Industry workshop in Raleigh, that the project was launched. During the workshop, Sarah Holte, a biostatistician from the Fred Hutchinson Cancer Research Center in Seattle, told him about problems arising in her work on HIV models. She had access to huge amounts of longitudinal data for individuals being treated for HIV, but was having trouble fitting mathematical models to the data.

Intrigued, Banks began to look into the problems Holte described, especially that of a controversial therapeutic strategy called structured treatment interruption, or STI. Based on inexplicably low virus counts seen in HIV-infected patients who stopped and later (in some cases much later) restarted their treatment regimens, it appeared that it might be possible to train the immune system to clear HIV without drugs, at least for substantial parts of the patients' lives.

Not everyone is convinced that STI could become an effective treatment regimen, and its proponents don't know why the observed effects occur. But it seems likely that progress in designing treatment regimens that would allow people to go off therapy, maybe for several years, will depend on good mathematical models. How HIV attacks the immune response system, how patients respond to various treatment regimens---it is questions of this type that mathematical modelers can address.

Holte had tried difference equation models, and later ordinary differential equation models; delay differential equations, she thought, might be the answer. Banks had worked with delay equations, albeit twenty years earlier; he and a graduate student, David Bortz (now at the University of Michigan), added some sophisticated delays to the models. Because each cell is different and because any data will be aggregate, Banks explains, the delays had to be random, which is where statisticians come in, providing estimates of probability distributions. When the random delays were included, he says, "the models started to look like the data."
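The idea of a random delay can be illustrated with a toy example. The Python sketch below is not the model Banks and Bortz developed; it assumes a simple linear delay equation, x'(t) = -a Σ_j w_j x(t - τ_j), in which the weights w_j form a discrete probability distribution over the delays τ_j. The function name `simulate_delay`, the rate `a`, and all numerical values are hypothetical choices made for illustration only. With a single delay, every cell responds after the same lag; with a spread of delays, the aggregate response is an average over lags, and that distribution is the kind of quantity a statistician would be asked to estimate from data.

```python
import numpy as np


def simulate_delay(a, delays, weights, t_end=20.0, dt=0.01):
    """Forward-Euler solution of the toy equation
        x'(t) = -a * sum_j w_j * x(t - tau_j),   x(t) = 1 for t <= 0.
    `delays` holds the tau_j and `weights` the (unnormalized) w_j.
    This is an illustrative sketch, not an HIV model.
    """
    delays = np.asarray(delays, dtype=float)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()           # make the weights a probability distribution
    lags = np.rint(delays / dt).astype(int)     # each delay expressed in whole time steps
    n_hist = int(lags.max())                    # number of history steps needed before t = 0
    n_steps = int(round(t_end / dt))
    x = np.ones(n_hist + n_steps + 1)           # entries 0..n_hist hold the history x = 1
    for i in range(n_hist, n_hist + n_steps):
        delayed = np.dot(weights, x[i - lags])  # weighted average over the delayed states
        x[i + 1] = x[i] - dt * a * delayed
    t = dt * np.arange(n_steps + 1)
    return t, x[n_hist:]                        # trajectory on 0 <= t <= t_end


if __name__ == "__main__":
    # One fixed delay versus a spread of delays centered on the same value
    # (all numbers are assumed, for illustration).
    t, x_fixed = simulate_delay(a=1.0, delays=[1.0], weights=[1.0])
    taus = np.linspace(0.5, 1.5, 11)
    w = np.exp(-0.5 * ((taus - 1.0) / 0.25) ** 2)  # bell-shaped weights, normalized inside
    _, x_spread = simulate_delay(a=1.0, delays=taus, weights=w)
    print("max difference between trajectories:", np.max(np.abs(x_fixed - x_spread)))
```

Forward Euler is used here only because it keeps the delayed lookup easy to read; a production code would more likely use a dedicated delay-equation solver and would estimate the weights from data rather than fix them in advance.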

"People should not try to work on problems like this by themselves," Banks says; they're sufficiently challenging to require expertise in a variety of areas. "STI is a control problem," he points out. "When you want to start controlling what goes on at the cellular level with drugs, you need a model that describes the dynamics, that lets you see what happens with drugs, administered in different ways." And, of course, "with complex phenomena, when you write down equations, you're going to have uncertainty."

In the beginning, Holte, Banks, and some students were the extent of the project team. But given the prominence of the statistical issues, Banks sought the advice of Marie Davidian, the NCSU colleague with whom he's now teaching the interdisciplinary SAMSI course mentioned earlier; the two are also co-chairs of the first semester-long SAMSI research program, "Inverse Problem Methodology in Complex Stochastic Models," which began in September and continues through January 2003.

"Inverse problems are a great topic to start with," Banks says, "because they're ubiquitous." Although mentioned explicitly only in the first semester's program, inverse problems will arise in all the SAMSI research programs. In fact, Banks sees in them the "value-added element" that makes SAMSI "far more than a mathematics department and a statistics department, side by side."

**Focus on the Next Generation of Interdisciplinary Scientists**

Sarah Holte was a keynote speaker at the opening workshop for SAMSI's program on inverse problems, held September 21-24. (She was also a problem presenter at NCSU's annual graduate student modeling workshop in the summer of 2001.) The approximately 120 participants in the SAMSI workshop were from applied mathematics, statistics, and some of the application domains. Along with HIV dynamics, pharmacokinetics, polymers, and dielectric materials were established as "testbed" examples for the program, whose ambitious goal is the joint development of mathematical and statistical frameworks for estimation in complex nonlinear systems.

Banks and Davidian were especially encouraged, and somewhat surprised, by the strong response to the workshop tutorial (which they presented jointly, in PDE modeling and statistics); about three quarters of the workshop participants attended the tutorial, and about 70% of that group were people whose degrees had been awarded within the last five years. "We hit the mother lode," Banks says; "that's exactly the group we need to reach."

Also encouraging was the enthusiasm generated by Holte and the other domain scientists who gave plenary lectures, in application areas including, in addition to HIV, electromagnetics, polymers, pharmacokinetics, bread dough. . . . Inspired by the talks, many of the young people in attendance will go back to their institutions and seek out scientists with whom they can work on a variety of problems, Banks says. "And that's what SAMSI is good at. . . . An important question we need to answer is, How do you train a generation of young people to work on these problems?"

Already fully operational, SAMSI has six postdocs and 12 graduate fellows, four from each of the three universities. As the co-director responsible for education and outreach, Banks is now coordinating the development of programs for minorities, college juniors and seniors, and high school teachers.

The SAMSI program scheduled for the spring semester (January to June, 2003) is "Large-Scale Computer Models for Environmental Systems." Working with environmental scientists, applied mathematicians will bring their focus on deterministic models and statisticians their emphasis on the use of extensive data to joint efforts to develop more generic stochastic models. Research programs planned for 2003-04 are "Multiscale Modeling/Control" and "The Internet."

Along with its research programs, SAMSI will conduct year-long "synthesis programs." Under way now is the first such program, "Stochastic Computation." Participants will spend the year looking at the available methods, Banks says; the expected outcome is a publication describing the methods that work best in various settings. The synthesis program scheduled for next year is "Data Mining."

"What it really takes to work on all these problems is an open mind," Banks says in conclusion. "You have to be willing to learn new things." The work "requires a level of trust in your colleagues, and every one of you has to be willing to forage in new areas."

Information about all SAMSI programs can be found at www.samsi.info/.