Mining Science and Engineering Data

Chandrika Kamath
Center for Applied Scientific Computing
Lawrence Livermore National Laboratory

Data analysis techniques have long been used to analyze scientific and engineering data. With sensors becoming ubiquitous, computers simulating complex processes at an unprecedented pace, and ever-improving data storage capabilities, petabyte-scale datasets are becoming routine. This growth in data volume has led to the innovative application of data mining techniques to novel and challenging problems.

In this tutorial, we will first give a brief introduction to data mining in the context of scientific and engineering applications. Using examples from diverse fields such as astronomy, biology, physics, and remote sensing, we will identify the common threads that run through the mining of scientific and engineering data. We will consider practical solutions to issues, such as feature extraction and data fusion, that differentiate scientific data mining from its commercial counterpart. In addition, we will describe the challenges faced in mining spatio-temporal data from video and computer simulations, and discuss several approaches being used to analyze such data. Our goal is to show that the diversity of applications, the richness of the problems faced by practitioners, and the opportunity to borrow ideas from other, more established areas of data analysis make scientific data mining an exciting and challenging field.

Chandrika Kamath received the Ph.D. degree in computer science from the University of Illinois at Urbana-Champaign in 1986. Prior to joining Lawrence Livermore National Laboratory in 1997, Chandrika was a Consulting Software Engineer at Digital Equipment Corporation. Her research interests are in large-scale data mining and pattern recognition, including image processing, feature extraction, dimension reduction, and classification and clustering algorithms. She is also interested in the practical application of these techniques. Chandrika is currently the project lead and an individual contributor for Sapphire, a project in large-scale data mining.

More information is available at http://www.llnl.gov/CASC/people/kamath/.
