SIAM Conference on Data Mining

Conference Schedule

Thursday, April 22, 2004
7:30am - 8:15am Continental Breakfast
8:15am - 8:30am Welcoming Remarks
8:30am - 9:45am Keynote 1 - Christopher M. Bishop (Microsoft Research Cambridge)
Recent Advances in Bayesian Inference Techniques
Session Chair: Michael W. Berry (Univ. of Tennessee)
9:45am - 10:00am Coffee Break
10:00am - 12:00pm

Tutorial 1: Top Ten Data Mining Mistakes - and How to Avoid Them (Elder)

Stream and Sequence Mining
Chair: Sheng Ma (IBM T.J. Watson Research Center)

Mining Relationships Between Interacting Episodes (Cancelled)
*Carl Mooney*, John Roddick (Flinders U., Australia)

Making Time-series Classification More Accurate using Learned Constraints
*Chotirat Ratanamahatana*, Eamonn Keogh (U. of California, Riverside)

GRM: A New Model for Clustering Linear Sequences
Hansheng Lei (SUNY Buffalo)

Non-linear Manifold Learning for Data Stream
*Martin H.C. Law*, Nan Zhang, Anil Jain (U. of Michigan)

Text and Spatial Mining
Chair: William Ferng (The Boeing Company)

Text Mining from Site Invariant and Dependent Features for Information Extraction Knowledge Adaptation
*Wai Lam*, Tak-Lam Wong (Chinese U. of Hong Kong)

Constructing Time Decompositions for Analyzing Time-stamped Documents
*Parvathi Chundi* (U. of Nebraska), Daniel Rosenkrantz (SUNY Albany)

Equivalence of Several Two-stage Methods for Linear Discriminant Analysis
*Peg Howland*, Haesun Park (U. of Minnesota)

A Framework for Discovering Co-location Patterns in Data Sets with Extended Spatial Objects
*Hui Xiong*, Shashi Shekhar (U. of Minnesota), Yan Huang (U. of North Texas), Vipin Kumar, Xiaobin Ma, Jin Soung Yoo (U. of Minnesota)

12:00pm - 1:45pm Lunch (attendees on their own)
1:45pm - 3:00pm Keynote 2 - Sara Graves (University of Alabama in Huntsville)
Data Mining and Data Usability
Session Chair: Chandrika Kamath (LLNL)
3:00pm - 4:30pm

Industry/Government Session
A special session with speakers addressing applications of data mining to problems of interest to industry as well as those of national interest (e.g., Homeland Defense). Overview, Boyack, Coughlan, Ma, Meyer

Genomics and Bioinformatics
Chair: Haesun Park (Univ. of Minnesota)

A Top-down Method for Mining Most-specific Frequent Patterns in Biological Sequences
Martin Ester, *Xiang Zhang* (Simon Fraser U., Canada)

Using Support Vector Machines for Classifying Large Sets of Multi-represented Objects
Hans-Peter Kriegel, Peer Kroeger, Alexej Pryakhin, *Matthias Schubert* (U. of Munich, Germany)

Minimum Sum-squared Residue Co-clustering of Gene Expression Data
*Suvrit Sra*, Hyuk Cho, Inderjit Dhillon, Yuqiang Guan (U. of Texas-Austin)

Scalable Algorithms I
Chair: Hillol Kargupta (Univ. of Maryland, Baltimore County)

Training Support Vector Machines Using Adaptive Clustering
Daniel Boley, *Dongwei Cao* (U. of Minnesota)

IREP++, A Faster Rule Learning Algorithm
*Oliver Dain*, Robert Cunningham, Stephen Boyer (MIT Lincoln Lab.)

GenIc: A Single-pass Generalized Incremental Algorithm for Clustering
*Chetan Gupta*, Robert Grossman (U. of Illinois at Chicago)

4:30pm - 4:45pm Coffee Break
4:45pm - 5:45pm

Industry/Government Session continues

Pre-processing and Data Reduction
Chair: Srinivasan Parthasarathy (The Ohio State University)

Conquest: A Distributed Tool for Constructing Summaries of High-Dimensional Discrete Attribute Data Sets
*Jie Chi*, Mehmet Koyuturk, Ananth Grama (Purdue U.)

Basic Association Rules
*Guichong Li*, Howard Hamilton (U. of Regina, Canada)

Chair: Peg Howland (University of Minnesota)

Hierarchical Clustering for Thematic Browsing and Summarization of Large Sets of Association Rules
Alipio Jorge (Universidade do Porto, Portugal)

Analytical Evaluation of Clustering Results Using Computational Negative Controls
*Ronald K. Pearson*, Tom Zylkin, James Schwaber, Gregory Gonye (Thomas Jefferson U.)

6:15pm - 8:00pm Poster Session and Welcome Reception
Friday, April 23, 2004
7:30am - 8:15am Continental Breakfast
8:15am - 8:30am Welcoming Remarks
8:30am - 9:45am Keynote 3 - C. David Page Jr. (University of Wisconsin Medical School)
Data Mining Research Questions Raised by Biological Data
Session Chair: David Skillicorn (Queen's University)
10:00am - 12:00pm

Tutorial 2: Analyzing Medical Patient Data: Challenges, Results, and Future Directions (Rao)

Probabilistic/statistical Methods I
Chair: Inderjit Dhillon (Univ. of Texas, Austin)

An Abstract Weighting Framework for Clustering Algorithms
*Richard Nock* (U. Antilles-Guyane), Frank Nielsen (Sony CS Labs.)

RBA: An Integrated Framework for Regression Based on Association Rules
*Aysel Ozgur* (U. of Minnesota), Pang-Ning Tan (Michigan State U.), Vipin Kumar (U. of Minnesota)

Privacy-preserving Multi-variate Statistical Analysis: Linear Regression and Classification
*Wenliang Du*, Yunghsiang S. Han (Syracuse U.), Shigang Chen (U. of Florida)

Clustering with Bregman Divergences
*Arindam Banerjee*, Srujana Merugu, Inderjit Dhillon, Joydeep Ghosh (U. of Texas-Austin)

Chair: Morgan C. Wang (University of Central Florida)

Density-connected Subspace Clustering for High-dimensional Data
*Peer Kroeger*, Hans-Peter Kriegel, Karin Kailing (U. of Munich, Germany)

Tessellation and Clustering by Mixture Models and their Parallel Implementations
*Qiang Du*, Xiaoqiang Wang (Pennsylvania U.)

Clustering Categorical Data using the Correlated-force Ensemble
*Ming-Syan Chen*, Kun-Ta Chuang (National Taiwan U.)

HICAP: Hierarchical Clustering with Pattern Preservation
*Hui Xiong*, Michael Steinbach (U. of Minnesota), Pang-Ning Tan (Michigan State U.), Vipin Kumar (U. of Minnesota)

12:00pm - 1:45pm Lunch (attendees on their own)
1:45pm - 3:00pm Keynote 4 - Ted Senator (Defense Advanced Research Projects Agency or DARPA)
Data Mining for Connecting the Dots
Session Chair: Umeshwar Dayal (Hewlett-Packard Laboratories)
3:00pm - 4:30pm

Tutorial 3: Data Mining for Computer Security (Brodley, Chan)

Novel Applications
Chair: Sanjay Ranka (University of Florida)

Enhancing Communities of Interest using Bayesian Stochastic Blockmodels
*Deepak Agarwal*, Daryl Pregibon (AT&T Labs.)

VEDAS: A Mobile and Distributed Data Stream Mining System for Real-time Vehicle Monitoring
Hillol Kargupta (U. of Maryland, Baltimore County)

DOMISA: DOM-based Information Space Adsorption of Web Information Hierarchy Mining
Hung-Yu Kao (National Taiwan U.), Jan-Min Ho (Academia Sinica), *Ming-Syan Chen* (National Taiwan U.)

Scalable Algorithms II
Chair: S. Muthu Muthukrishnan (Rutgers Univ. and AT&T Research)

CREDOS: Classification using Ripple Down Structure (a case for rare classes)
*Mahesh V. Joshi* (IBM Almaden), Vipin Kumar (U. of Minnesota)

Active Semi-supervision for Pairwise Constrained Clustering
*Sugato Basu*, Arindam Banerjee, Raymond Mooney (U. of Texas-Austin)

Finding Frequent Patterns in a Large Sparse Graph
*Michihiro Kuramochi*, George Karypis (U. of Minnesota)

4:30pm - 4:45pm Coffee Break
4:45pm - 6:15pm Tutorial 3 continues

Probabilistic/statistical Methods II
Chair: Jacob Kogan (Univ. of Maryland, Baltimore County)

A General Probabilistic Framework for Mining Labeled Ordered Trees
Nobuhisa Ueda, Kiyoko Aoki, *Hiroshi Mamitsuka* (Kyoto U., Japan)

Mixture Density Mercer Kernels: A Method to Learn Kernels Directly from Data
Ashok Srivastava (RIACS/NASA Ames Research Center)

A Mixture Model for Clustering Ensembles
*Alexander Topchy*, Anil Jain, William Punch (Michigan State U.)

Visual Mining
Chair: Rahul Ramachandran (Univ. of Alabama in Huntsville)

Visualizing RFM Segmentation
Ron Kohavi, *Rajesh Parekh* (Blue Martini Software)

Visually Mining through Cluster Hierarchies
Stefan Brecheisen, Hans-Peter Kriegel, *Peer Kroeger*, Martin Pfeifle (U. of Munich, Germany)

Class-specific Ensembles for Active Learning
*Amit Mandvikar*, Huan Liu (Arizona State U.)

Saturday, April 24, 2004 (Workshops)
7:30am - 8:15am Continental Breakfast
8:30am - 10:00am Workshops Begin
10:00am - 10:30am Coffee Break
10:30am - 12:00pm Sessions Resume
12:00pm - 1:45pm Lunch (attendees on their own)
1:45pm - 3:15pm Sessions Resume
3:15pm - 3:45pm Coffee Break
3:45pm - 5:15pm Sessions Resume
5:15pm Conference Adjourns