SIAM International Conference on Data Mining (2003)
Cathedral Hill Hotel, San Francisco, CA
May 1-3, 2003

Co-Sponsored by
Army High Performance Computing Research Center (AHPCRC)
and University of Illinois, Chicago


About the Conference

Advances in information technology and data collection methods have led to the availability of large data sets in commercial enterprises and in a wide variety of scientific and engineering disciplines. We have an unprecedented opportunity to analyze this data and extract intelligent and useful information from it. The field of data mining draws upon extensive work in areas such as statistics, machine learning, pattern recognition, databases, and high performance computing to discover interesting and previously unknown information in data.

This conference will provide a forum for the presentation of recent results in data mining, including applications, algorithms, software, and systems. There will be peer reviewed, contributed papers as well as invited talks, tutorials and a panel. Best paper awards will be given for papers in different categories. Proceedings of the conference will be available both online at the SIAM Web site and in hard copy form. In addition, several workshops on topics of current interest will be held on the final day of the conference.

Organizing Committee

Steering Committee

Vipin Kumar, Chair
AHPCRC, University of Minnesota
Robert Grossman
The Laboratory for Advanced Computing, University of Illinois, Chicago
Tom Mitchell
Carnegie Mellon University
Steven Ashby
Lawrence Livermore National Laboratory
Jiawei Han
Simon Fraser University
Andrew Odlyzko
Digital Technology Center, University of Minnesota
Umeshwar Dayal
Hewlett-Packard Laboratories
David Hand
Imperial College, UK
N. Radhakrishnan
Army Research Laboratory
Usama Fayyad
Heikki Mannila
Jeffrey Ullman
Stanford University

Conference Co-Chairs:
Michael Berry, University of Tennessee
Rajeev Rastogi, Bell Labs, Lucent Technologies

Program Chairs:
Daniel Barbara, George Mason University
Chandrika Kamath, Lawrence Livermore National Laboratory

Tutorial Chair:
Joydeep Ghosh, University of Texas

Workshop Chair:
Hillol Kargupta, University of Maryland, Baltimore County

Publicity Chair:
Aleksandar Lazarevic, AHPCRC, University of Minnesota

Sponsorship Chair:
Mohammed Zaki, Rensselaer Polytechnic Institute

Program Committee

Chid Apte
IBM T.J. Watson Research Center
Lars Asker
Stockholm University/KTH
Sabyasachi Basu
The Boeing Company
Daniel Boley
University of Minnesota
Kevin W. Bowyer
University of Notre Dame
Paul Bradley
digiMine, Inc.
Amy Braverman
Jet Propulsion Laboratory, California Institute of Technology
Donald E. Brown
University of Virginia, Charlottesville
Michael C. Burl
University of Colorado, Boulder
Philip Chan
Florida Institute of Technology
Alok N. Choudhary
Northwestern University
Corinna Cortes
AT&T Labs-Research
Dennis D. Cox
Rice University
Umeshwar Dayal
Hewlett-Packard Laboratories
Dennis DeCoste
Jet Propulsion Laboratory / Caltech
Inderjit S. Dhillon
University of Texas, Austin
Martin Ester
Simon Fraser University
Ronen Feldman
Clear Forest Corporation
Peter A. Flach
University of Bristol
Johannes Gehrke
Cornell University
Sara James Graves
University of Alabama in Huntsville
Dimitrios Gunopulos
University of California, Riverside
Yike Guo
Imperial College
Jiawei Han
University of Illinois, Urbana-Champaign
Howard Ho
IBM Almaden Research Center
Bala Iyer
IBM Silicon Valley Laboratory
Edward Ip
University of Southern California
Anil Jain
Michigan State University
George Karypis
University of Minnesota
Helene E. Kulsrud
Center for Communications Research
Diane Lambert
Bell Laboratories
Yann LeCun
NEC Research Institute
Bing Liu
University of Illinois, Chicago
Christopher Meek
Microsoft Research
Nina Mishra
Hewlett-Packard Laboratories
Dunja Mladenic
J. Stefan Institute, Slovenia
Doug Nychka
National Center for Atmospheric Research
Zoran Obradovic
Temple University
Gregory Piatetsky-Shapiro
Greg Ridgeway
The RAND Corporation
John Riedl
University of Minnesota
John Roddick
Flinders University, Australia
Ted E. Senator
Kenneth C. Sevcik
University of Toronto, Canada
Arno Siebes
Utrecht University, The Netherlands
David Skillicorn
Queen's University, Canada
Padhraic Smyth
University of California, Irvine
Myra Spiliopoulou
Leipzig Graduate School of Management, Germany
Jaideep Srivastava
University of Minnesota
Prof. Domenico Talia
Università della Calabria, Italy
Prof. Takao Terano
University of Tsukuba, Japan
Hannu Toivonen
University of Helsinki
Shusaku Tsumoto
Shimane Medical University
Michael Turmon
Jet Propulsion Laboratory/Caltech
Prof. Lyle H. Ungar
University of Pennsylvania
Dr. Ramasamy Uthurusamy
General Motors Corporation
Shivakumar Vaithyanathan
IBM Almaden Research Center
Grace Wahba
University of Wisconsin-Madison
Jason T. L. Wang
New Jersey Institute of Technology
Sally Wood
University of New South Wales, Australia
Xintao Wu
University of North Carolina, Charlotte
Osmar R. Zaiane
University of Alberta
Mohammed Zaki
Rensselaer Polytechnic Institute

Topics of Interest

Methods and algorithms:

· Query/Constraint-based Data Mining
· Probabilistic/Statistical Methods
· Mining Spatial, Temporal and Heterogeneous data
· Trend and Periodicity Analysis
· Parallel/Distributed/Agent Techniques
· Integration: Mining, Warehousing and OLAP
· Mining of Data Streams
· Scalable Algorithms
· Data Reduction/Pre-processing
· Feature Extraction and Selection
· Post-processing
· Collaborative Filtering/Personalization
· Cost-based Decision Making
· Visual Data Mining

Applications:

· Intelligence Analysis
· Remote Sensing and Earth Sciences
· Non-destructive Evaluation
· Text, Video, and Multi-media Mining
· Astronomy
· Intrusion Detection
· Genomics, Bioinformatics, and Biometrics
· E-Commerce and Web Data
· Financial Data Analysis
· Medical and Health Industry
· Case Studies
· Novel Applications
· Benchmarks

Human Factors and Social Issues:

· Languages/User Interface for Data Mining
· Security
· Privacy of Information
· Risk Analysis
· Intellectual Ownership


RFtools--two-eyed Algorithms
Leo Breiman, University of California, Berkeley

The Mathematics of Privacy
George Cybenko, Dartmouth College

Data Mining Evolved: Trends and Challenges
Usama M. Fayyad, President & CEO, digiMine, Inc.

Algorithmic Issues in Monitoring and Mining Network Data Streams
Muthu Muthukrishnan, Rutgers University


The conference will feature workshops and tutorials on several special topics. The deadline for submissions has passed, and the following workshops and tutorials will be held:


High Performance, Pervasive, and Data Stream Mining

Discrete Mathematics and Data Mining

Text Mining

Scientific Data Mining

Data Mining for Counter Terrorism and Security

Clustering High Dimensional Data and its Applications


Mining Science and Engineering Data
Dr. Chandrika Kamath, Lawrence Livermore National Laboratory

Data Mining: Technology and Practice in the Real World
Monte F. Hancock, Jr., Chief Scientist, CSI Corporation

Relational Data Mining
Prof. Saso Dzeroski, Josef Stefan Institute, Ljubljana, Slovenia

Visual Data Mining
Dr. Mihael Ankerst, Boeing, Seattle, and Prof. Daniel A. Keim, University of Constance, Germany


The SDM-2003 organizing committee is seeking high quality workshop proposals. Selected workshops will focus on new challenges and initiatives in data mining research and applications. They will foster the discussion of exciting research directions and works in progress through paper presentations, discussions, and invited talks. Each workshop will be a daylong event.

The responsibilities of the workshop organizers include (1) writing the call for papers and publicizing it, (2) selecting the workshop organizing and program committees, (3) deciding the workshop program content, (4) selecting the papers through a peer review process, and (5) delivering the proceedings to the press on time.

Workshop Submission Instructions

Workshop proposals should be sent via email to the SDM-2003 Workshops Chair, Hillol Kargupta, before September 1, 2002.

A workshop proposal should include the following information:

a) Workshop title
b) Workshop organizers with full contact information
c) Description of the workshop including objectives, content, and format of the workshop
d) List of potential attendees
e) List of potential authors of workshop contributions
f) A short biography of each organizer

Workshop Deadlines

Deadline for proposal submission: September 1, 2002, PASSED
Decision notification: September 15, 2002, PASSED
Call for workshop papers: October 1, 2002, PASSED
Paper Submission Deadline: February 1, 2003
Acceptance notification to the authors: March 1, 2003
Camera-ready workshop proceedings: March 30, 2003

For any questions regarding workshops for SDM-2003, please contact:
Hillol Kargupta
Assistant Professor, Department of Computer Science and Electrical Engineering
1000 Hilltop Circle, University of Maryland, Baltimore County
Baltimore, MD 21250
Voice: (410) 455-3969


For SDM-2003, we are seeking proposals for tutorials on all topics related to data mining.
A tutorial may be a theme-oriented comprehensive survey, may discuss novel data mining techniques, or may center on successful applications of data mining.

Tutorials are open to all conference attendees without any extra fees. The typical tutorial will be 2 hours long and held in parallel with two paper presentation tracks during the main conference program. This format encourages participation: previous SDM conferences attracted 50 to 100 attendees per tutorial.

If interested, please email a two- or three-page proposal providing:

1. Title and intended audience.
2. Amount of time intended. The recommended slot is 2 hours, but longer tutorials may be accommodated.
3. A short description and outline, to provide a sense of both the scope and depth of the tutorial.
You may optionally specify the material to be covered for a 2-hour, 3-hour and/or 4-hour period.
4. A short biography of each tutor (or a URL pointer to one). The tutor must NOT focus mainly on his/her research results. SDM tutorials are not the forum for promoting one's research or product. If for certain parts of the tutorial, the material comes directly from the tutor's own research or product, please indicate that in the proposal. Also, any pointers to related tutorials that you may have recently presented will be appreciated.

Timeline for Tutorials

Deadline for proposal submission: October 1, 2002, PASSED
Decision notification: November 1, 2002, PASSED
Complete set of tutorial viewgraphs: March 15, 2003
(Tutorial notes will be available to all attendees for a nominal charge)

For any questions regarding tutorials for SDM-2003, please contact:
Joydeep Ghosh
Professor, Dept. of Electrical & Computer Engineering
ACES 3.118, Univ. of Texas, Austin, TX 78712-1084
Phone: (512) 471-8980; Fax: 471-5907



Presenters of accepted papers should IMMEDIATELY review and respond to the posted instructions.


***Please complete and return the copyright transfer form to SIAM IMMEDIATELY. (pdf file)***

Papers submitted to the conference should not be under consideration by any other conference with published proceedings or by a journal. The work may be either theoretical or applied, but should make a significant contribution to the field. Papers should have a maximum of 12 pages (single-spaced, 2-column, 10-point font, with margins of at least 0.75 inch on each side), not counting the title page and references, but including tables and figures. Please use US Letter (8.5" x 11") paper size. Papers must have a keyword list with no more than 6 keywords and an abstract with a maximum of 250 words.

Authors are strongly encouraged to submit their papers electronically in PDF format. For MS Word users, please convert your document to the PDF format.


LaTeX macros are available; please use the SODA and Data Mining Proceedings Macro.

Summary of Important Dates

Timeline for Conference Papers

Conference Paper Manuscripts Due: October 1, 2002, PASSED
Author Notification: January 3, 2003, PASSED
Final Version of Papers (Camera Ready): January 24, 2003

Workshop Deadlines

Deadline for proposal submission: September 1, 2002, PASSED
Decision notification: September 15, 2002, PASSED
Call for workshop papers: October 1, 2002, PASSED
Paper Submission Deadline: February 1, 2003
Acceptance notification to the authors: March 1, 2003
Camera-ready workshop proceedings: March 30, 2003

Timeline for Tutorials

Deadline for proposal submission: October 1, 2002, PASSED
Decision notification: November 1, 2002, PASSED
Complete set of tutorial viewgraphs: March 15, 2003


Conference Registration and Hotel Registration are now available!



Registration Information, Registration Form, Hotel Information, Hotel Form, Transportation Information


The Program is now Available!



Sponsor information will be available in the future.
