SIAM Undergraduate Research Online

Volume 4


Spectral Clustering and Visualization: A Novel Clustering of Fisher's Iris Data Set

Published electronically February 23, 2011
DOI: 10.1137/10S010752

Authors: David Benson-Putnins (University of Michigan), Margaret Bonfardin (Washington University), Meagan E. Magnoni (Rensselaer Polytechnic Institute), and Daniel Martin (Davidson College)
Sponsors: Carl D. Meyer (North Carolina State University) and Charles D. Wessell (North Carolina State University)

Abstract: Clustering is the act of partitioning a set of elements into subsets, or clusters, so that elements in the same cluster are, in some sense, similar. Determining an appropriate number of clusters in a particular data set is an important issue in data mining and cluster analysis. Another important issue is visualizing the strength, or connectivity, of clusters.

We begin by creating a consensus matrix using multiple runs of the clustering algorithm k-means. This consensus matrix can be interpreted as a graph, which we cluster using two spectral clustering methods: the Fiedler Method and the MinMaxCut Method. To determine if increasing the number of clusters from k to k+1 is appropriate, we check whether an existing cluster can be split. Finally, we visualize the strength of clusters by using the consensus matrix and the clustering obtained through one of the aforementioned spectral clustering techniques.

Using these methods, we then investigate Fisher's Iris data set. Our methods support the existence of four clusters, instead of the generally accepted three clusters in this data.
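
The consensus-matrix step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the k-means runs have already produced one label list per run, the function name `consensus_matrix` is hypothetical, and storing the fraction of runs (rather than a raw count) is an assumed normalization.

```python
def consensus_matrix(labelings):
    """Return C where C[i][j] is the fraction of runs in which
    points i and j were assigned to the same cluster."""
    n = len(labelings[0])
    counts = [[0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    runs = len(labelings)
    return [[c / runs for c in row] for row in counts]


# Three hypothetical clustering runs over four points.
# Cluster ids are arbitrary: only co-membership matters.
runs = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
]
C = consensus_matrix(runs)
```

Each entry of `C` can be read as an edge weight in a graph on the data points, which is the object a spectral method would then partition.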

Parameter Estimation in Differential Equations: A Numerical Study of Shooting Methods

Published electronically February 23, 2011
DOI: 10.1137/10S010739

Author: Franz Hamilton (George Mason University)
Sponsor: Timothy Sauer (George Mason University)

Abstract: Differential equation modeling is central to applications of mathematics to science and engineering. When a particular system of equations is used in an application, it is often important to determine unknown parameters. We compare the traditional shooting method to versions of multiple shooting methods in chaotic systems with noise added and conduct numerical experiments as to the reliability and accuracy of both methods.

On Numerical Methods for Elliptic Transmission Eigenvalue Problems

Published electronically March 4, 2011
DOI: 10.1137/10S010806

Author: Anirban Roy (Pennsylvania State University)
Sponsors: Anna L. Mazzucato (Pennsylvania State University) and Victor Nistor (Pennsylvania State University)

Abstract: The use of numerical tools to solve challenging problems in mathematics has exploded in the past several decades. The purpose of this paper is to compare the results of two different types of numerical methods in finding solutions to the eigenvalue problem for a second-order elliptic partial differential equation (PDE) with boundary and transmission conditions. Transmission properties result from jumps in the coefficients of the equation and require more complex numerical methods to solve the eigenvalue problem than when the coefficients are continuous. We present the setup of both the bisection method to solve the exact equation satisfied by the eigenvalues and an application of the power method on a Finite Element Method discretization to find the largest eigenvalues and eigenfunction. We also provide some numerical evidence as to which method is more efficient given the complexities of our problem.
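
As a rough illustration of the power method mentioned in this abstract, the sketch below iterates on a small symmetric tridiagonal matrix. The matrix is an assumed stand-in for a discretized operator, not the paper's Finite Element discretization, and the function name `power_method` and the starting vector are hypothetical choices.

```python
def power_method(matrix, iterations=200):
    """Estimate the dominant eigenvalue and eigenvector of a square matrix."""
    n = len(matrix)
    # Starting vector; it must not be orthogonal to the dominant eigenvector.
    v = [1.0 / (i + 1) for i in range(n)]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        scale = max(abs(x) for x in w)  # assumes matrix * v is nonzero
        v = [x / scale for x in w]
    # Rayleigh quotient v^T A v / v^T v gives the eigenvalue estimate.
    av = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(a * b for a, b in zip(av, v)) / sum(x * x for x in v)
    return eigenvalue, v


# Assumed test matrix: tridiagonal with 2 on the diagonal and -1 off it.
A = [
    [2.0, -1.0, 0.0],
    [-1.0, 2.0, -1.0],
    [0.0, -1.0, 2.0],
]
lam, vec = power_method(A)
```

For this particular matrix the eigenvalues are known in closed form, so the estimate `lam` can be compared against 2 + sqrt(2).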

Macroscopic Cross-Diffusion Models Derived from Spatially Discrete Continuous Time Microscopic Models

Published electronically May 12, 2011
DOI: 10.1137/10S010818

Author: Stephen Ostrander (McMaster University)
Sponsor: Hermann J. Eberl (University of Guelph)

Abstract: We formulate a continuous time, discrete in space model for two spatially interacting species. The spatial interaction is described in terms of a measure for the desire or ability of a population to move from one location into a neighboring site. This can depend on local densities of both populations in the current and the target site. Refining the spatial resolution and passing to a continuous in space model, one obtains a system of partial differential equations with cross diffusion terms. We show that certain cross-diffusion models that have been used in the literature to describe interacting species can be derived as special cases with our approach.

Attractors: Nonstrange to Chaotic

Published electronically June 21, 2011
DOI: 10.1137/10S01079X

Author: Robert L. V. Taylor (The College of Wooster)
Sponsor: John David (The College of Wooster)

Abstract: The theory of chaotic dynamical systems can be a tricky area of study for a non-expert to break into. Because the theory is relatively recent, the new student finds himself immersed in a subject with very few clear and intuitive definitions. This paper aims to carve out a small section of the theory of chaotic dynamical systems---that of attractors---and outline its fundamental concepts from a computational mathematics perspective. The motivation for this paper is primarily to define what an attractor is and to clarify what distinguishes its various types (nonstrange, strange nonchaotic, and strange chaotic). Furthermore, by providing some examples of attractors and explaining how and why they are classified, we hope to provide the reader with a good feel for the fundamental connection between fractal geometry and the existence of chaos.

Computational Methods for a One-Dimensional Plasma Model with Transport Field

Published electronically August 18, 2011
DOI: 10.1137/11S010906

Author: Dustin W. Brewer (The University of Texas at Arlington)
Sponsor: Stephen Pankavich (United States Naval Academy)

Abstract: The electromagnetic behavior of a collisionless plasma is described by a system of partial differential equations known as the Vlasov-Maxwell system. From a mathematical standpoint, little is known about this physically accurate three-dimensional model, but a one-dimensional toy model of the equations can be studied much more easily. Knowledge of the dynamics of solutions to this reduced system, which computer simulation can help to determine, would be useful in predicting the behavior of solutions to the unabridged Vlasov-Maxwell system. Hence, we design, construct, and implement a novel algorithm that couples efficient finite-difference methods with a particle-in-cell code. Finally, we draw conclusions regarding their accuracy and efficiency, as well as the behavior of solutions to the one-dimensional plasma model.

Moody's Mega Math Challenge 2011 Champion Paper-Colorado River Water: Good to the Last Acre-Foot

Published electronically November 11, 2011
DOI: 10.1137/11S011249
M3 Challenge Introduction

Authors: Caroline Bowman, Patrick Braga, Anthony Grebe, Alex Kiefer, and Jason Oettinger (Pine View School, Osprey, FL)
Sponsor: Ann Hankinson (Pine View School, Osprey, FL)

Summary: The arid region of the Southwestern United States holds one of the most important bodies of water in the nation: the Colorado River, which provides water to nearly 30 million people. The Colorado River Basin has been divided into Upper and Lower Basin regions since the signing of an interstate compact in 1922, and further agreements have specified the amount of water allocated to each state. Lake Powell, the reservoir formed by the Glen Canyon Dam, facilitates the sharing of water between the two basins by providing long-term storage for the Upper Basin's water and water to be sent to the Lower Basin.

With our first model, we develop a simplified geometric model of the shape of Lake Powell to simulate the effects of drought on the volume of the water in the reservoir. We conclude that in the worst-case scenario, when inflow equals 39% of the historic average, the lake would run dry in 3.2 years. If inflow equals the probable value of 83% of the average, then the lake's volume would reach about four-fifths of its capacity, and the high inflow of 137% of the average would yield maximum capacity.

From the second model, we conclude that the Glen Canyon Dam produces more energy if the reservoir is full, and that there is a large difference in the power generated between the three provided scenarios. This is due in part to the height of the reservoir as a direct result of the inflow and also to the fact that the outflow through the dam is dependent on the inflow if the reservoir becomes empty or full.

In our third model, we analyze the agricultural data related to the economy of the states that make up the Basin, examining the correlation between each state's water allocation and its agricultural GSP (Gross State Product). We consider how much water is allocated to each state as a result of the 1922 Compact and how this affects each state's GSP. We finally make recommendations of potential reductions to the amount of water that might be removed from the Colorado River to maintain a minimum capacity in Lake Powell.

Analysis of a Co-Epidemic Model

Published electronically November 15, 2011
DOI: 10.1137/11S010852

Author: Quinn A. Morris (Wake Forest University)
Sponsor: Stephen Robinson (Wake Forest University)

Abstract: Solutions to systems of differential equations which model disease transmission are of particular use and importance to epidemiologists who wish to study effective means to slow and prevent the spread of disease. In this paper, we examine a system that models two related diseases within a population, which is of particular importance to those studying co-infection and partial cross-immunity phenomena. Criteria for stability of equilibria are improved upon from previous research by Long, Vaidya, and Brandeau (2008).

Interval Estimates for Predictive Values in Diagnostic Testing with Three Outcomes

Published electronically November 17, 2011
DOI: 10.1137/11S010888

Authors: Scott Clark, Lauren Mondin, Courtney Weber, and Jessica Winborn (Sam Houston State University)
Sponsor: Melinda Miller Holt (Sam Houston State University)

Abstract: In disease testing, patients and doctors are interested in estimates for positive predictive value (PPV) and negative predictive value (NPV). The PPV of a test is the probability that a patient actually has the disease, given a positive test result. The NPV is the probability that a patient actually does not have the disease, given a negative test result. Here we consider diagnostic tests in which the disease state remains uncertain, so the uncertain predictive value (UPV) is also of interest. UPV is the probability that, given an uncertain test result, follow-up testing will remain inconclusive. We derive classical Wald-type and Bayesian interval estimates of PPV, NPV, and UPV. Performance of these intervals is compared through simulation studies of interval coverage and width.
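
The definitions of PPV and NPV given in this abstract follow from Bayes' theorem. The sketch below covers only the standard two-outcome case (no uncertain result, so no UPV) and computes point values rather than the interval estimates the paper derives. The function name `predictive_values` and the numerical inputs are hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a two-outcome test via Bayes' theorem."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)  # P(disease | positive result)
    npv = true_neg / (true_neg + false_neg)  # P(no disease | negative result)
    return ppv, npv


# Hypothetical test: 90% sensitivity, 95% specificity, 10% prevalence.
ppv, npv = predictive_values(0.90, 0.95, 0.10)
```

With a low prevalence, the PPV can sit well below the sensitivity even for an accurate test, which is one reason interval estimates for these quantities matter.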

Understanding the Impact of Boundary and Initial Condition Errors on the Solution to a Thermal Diffusivity Inverse Problem

Published electronically November 18, 2011
DOI: 10.1137/11S011237

Authors: Xiaojing Fu and Brian Leventhal (Clarkson University)
Sponsor: Kathleen Fowler (Clarkson University)

Abstract: In this work, we consider simulation of heat flow in the shallow subsurface. As sunlight heats up the surface of soil, the thermal energy received dissipates downward into the ground. This process can be modeled using a partial differential equation known as the heat equation. The spatial distribution of soil thermal conductivities is a key factor in the modeling process. Prior to this study, temperature profiles were recorded at different depths at various times. This work is motivated by trying to match these temperature profiles using a simulation-based approach in the context of an inverse problem. Specifically, we determine soil thermal conductivities using derivative-free optimization to minimize the nonlinear least-squares errors between simulation and data profiles. We also study how errors in the initial and boundary conditions propagate over time using a numerical approach.

European Option Pricing Using a Combined Inverse Congruential Generator

Published electronically November 28, 2011
DOI: 10.1137/10S010776

Authors: Yered Pita-Juarez and Steven Melanson (California State University, Sacramento)
Sponsor: Coskun Cetin (California State University, Sacramento)

Abstract: One of the main problems in mathematical finance is to find the fair price of various contracts that convey a right, known as options, which depend on the price of other financial assets like stocks, known as the underlying assets. A "fair" price for some of these contracts may not be obtained analytically. In such cases, Monte Carlo simulations offer a convenient way to compute the fair price numerically, relying on the approximation of an expected value by the average of the simulated values. We briefly discuss some common random number generators, including a combined inverse congruential random number generator, in Monte Carlo simulations to compute the fair price of a European call option and to analyze the sensitivity of the price with respect to the changes in the key model parameters.
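
The Monte Carlo idea in this abstract, approximating an expected value by an average of simulated values, can be sketched for a European call. This sketch makes several assumptions that are not taken from the paper: the underlying follows geometric Brownian motion under the risk-neutral measure, random numbers come from Python's standard generator rather than a combined inverse congruential generator, and all function names and parameter values are hypothetical. The Black-Scholes closed form is included only as a check.

```python
import math
import random


def mc_european_call(s0, strike, rate, sigma, maturity, n_paths, seed=0):
    """Discounted average payoff over simulated terminal prices."""
    rng = random.Random(seed)  # standard-library generator, not an ICG
    drift = (rate - 0.5 * sigma ** 2) * maturity
    vol = sigma * math.sqrt(maturity)
    total = 0.0
    for _ in range(n_paths):
        s_t = s0 * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        total += max(s_t - strike, 0.0)
    return math.exp(-rate * maturity) * total / n_paths


def black_scholes_call(s0, strike, rate, sigma, maturity):
    """Closed-form price, used here only to check the simulation."""
    def cdf(x):
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    root_t = math.sqrt(maturity)
    d1 = (math.log(s0 / strike) + (rate + 0.5 * sigma ** 2) * maturity) / (sigma * root_t)
    d2 = d1 - sigma * root_t
    return s0 * cdf(d1) - strike * math.exp(-rate * maturity) * cdf(d2)


estimate = mc_european_call(100.0, 100.0, 0.05, 0.2, 1.0, n_paths=100_000)
exact = black_scholes_call(100.0, 100.0, 0.05, 0.2, 1.0)
```

Repeating the estimate with perturbed inputs such as `sigma` or `rate` gives a crude view of the price sensitivity the abstract refers to.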

Choosing Basis Functions and Shape Parameters for Radial Basis Function Methods

Published electronically December 2, 2011
DOI: 10.1137/11S010840

Author: Michael Mongillo (Illinois Institute of Technology)
Sponsor: Greg Fasshauer (Illinois Institute of Technology)

Abstract: Radial basis function (RBF) methods have broad applications in numerical analysis and statistics. They have found uses in the numerical solution of PDEs, data mining, machine learning, and kriging methods in statistics. This work examines the use of radial basis functions in scattered data approximation. In particular, the experiments in this paper test the properties of shape parameters in RBF methods, as well as methods for finding an optimal shape parameter. Locating an optimal shape parameter is a difficult problem and a topic of current research. Some experiments also consider whether the same methods can be applied to the more general problem of selecting basis functions.
