Proceedings of the 2004 SIAM International Conference on Data Mining

Each link below is to a PDF of the paper as it was submitted. Papers are listed in program order. PDF file names consist of the proceedings identifier (DM and year 04), followed by the paper's order in the printed version (e.g. 001) and the first author's last name and first initial.
Message from the Conference Co-Chairs
1 Mining Relationships between Interacting Episodes
Carl Mooney and John F. Roddick
11 Making Time-Series Classification More Accurate Using Learned Constraints
Chotirat Ann Ratanamahatana and Eamonn Keogh
23 GRM: A New Model for Clustering Linear Sequences
Hansheng Lei and Venu Govindaraju
33 Nonlinear Manifold Learning for Data Stream
Martin H. C. Law, Nan Zhang, and Anil K. Jain
45 Text Mining from Site Invariant and Dependent Features for Information Extraction Knowledge Adaptation
Tak-Lam Wong and Wai Lam
57 Constructing Time Decompositions for Analyzing Time Stamped Documents
Parvathi Chundi and Daniel J. Rosenkrantz
69 Equivalence of Several Two-Stage Methods for Linear Discriminant Analysis
Peg Howland and Haesun Park
78 A Framework for Discovering Co-location Patterns in Data Sets with Extended Spatial Objects
Hui Xiong, Shashi Shekhar, Yan Huang, Vipin Kumar, Xiaobin Ma, and Jin Soung Yoo
90 A Top-Down Method for Mining Most Specific Frequent Patterns in Biological Sequences
Martin Ester and Xiang Zhang
102 Using Support Vector Machines for Classifying Large Sets of Multi-Represented Objects
Hans-Peter Kriegel, Peer Kröger, Alexej Pryakhin, and Matthias Schubert
114 Minimum Sum-Squared Residue Co-Clustering of Gene Expression Data
Hyuk Cho, Inderjit S. Dhillon, Yuqiang Guan, and Suvrit Sra
126 Training Support Vector Machine Using Adaptive Clustering
Daniel Boley and Dongwei Cao
138 IREP++, A Faster Rule Learning Algorithm
Oliver Dain, Robert K. Cunningham, and Stephen Boyer
147 GenIc: A Single Pass Generalized Incremental Algorithm for Clustering
Chetan Gupta and Robert Grossman
154 Conquest: A Distributed Tool for Constructing Summaries of High-Dimensional Discrete Attributed Datasets
Jie Chi, Mehmet Koyutürk, and Ananth Grama
166 Basic Association Rules
Guichong Li and Howard J. Hamilton
178 Hierarchical Clustering for Thematic Browsing and Summarization of Large Sets of Association Rules
Alípio Jorge
188 Quantitative Evaluation of Clustering Results Using Computational Negative Controls
Ronald K. Pearson, Tom Zylkin, James S. Schwaber, and Gregory E. Gonye
200 An Abstract Weighting Framework for Clustering Algorithms
Richard Nock and Frank Nielsen
210 RBA: An Integrated Framework for Regression Based on Association Rules
Aysel Ozgur, Pang-Ning Tan, and Vipin Kumar
222 Privacy-Preserving Multivariate Statistical Analysis: Linear Regression and Classification
Wenliang Du, Yunghsiang S. Han, and Shigang Chen
234 Clustering with Bregman Divergences
Arindam Banerjee, Srujana Merugu, Inderjit Dhillon, and Joydeep Ghosh
246 Density-Connected Subspace Clustering for High-Dimensional Data
Karin Kailing, Hans-Peter Kriegel, and Peer Kröger
257 Tessellation and Clustering by Mixture Models and Their Parallel Implementations
Qiang Du and Xiaoqiang Wang
269 Clustering Categorical Data Using the Correlated-Force Ensemble
Kun-Ta Chuang and Ming-Syan Chen
279 HICAP: Hierarchical Clustering with Pattern Preservation
Hui Xiong, Michael Steinbach, Pang-Ning Tan, and Vipin Kumar
291 Enhancing Communities of Interest Using Bayesian Stochastic Blockmodels
Deepak Agarwal and Daryl Pregibon
300 VEDAS: A Mobile and Distributed Data Stream Mining System for Real-Time Vehicle Monitoring
Hillol Kargupta, Ruchita Bhargava, Kun Liu, Michael Powers, Patrick Blair, Samuel Bushra, James Dull, Kakali Sarkar, Martin Klein, Mitesh Vasa, and David Handy
312 DOMISA: DOM-Based Information Space Adsorption for Web Information Hierarchy Mining
Hung-Yu Kao, Jan-Ming Ho, and Ming-Syan Chen
321 CREDOS: Classification Using Ripple Down Structure (A Case for Rare Classes)
Mahesh V. Joshi and Vipin Kumar
333 Active Semi-Supervision for Pairwise Constrained Clustering
Sugato Basu, Arindam Banerjee, and Raymond J. Mooney
345 Finding Frequent Patterns in a Large Sparse Graph
Michihiro Kuramochi and George Karypis
357 A General Probabilistic Framework for Mining Labeled Ordered Trees
Nobuhisa Ueda, Kiyoko F. Aoki, and Hiroshi Mamitsuka
369 Mixture Density Mercer Kernels: A Method to Learn Kernels Directly from Data
Ashok N. Srivastava
379 A Mixture Model for Clustering Ensembles
Alexander Topchy, Anil K. Jain, and William Punch
391 Visualizing RFM Segmentation
Ron Kohavi and Rajesh Parekh
400 Visually Mining through Cluster Hierarchies
Stefan Brecheisen, Hans-Peter Kriegel, Peer Kröger, and Martin Pfeifle
412 Class-Specific Ensembles for Active Learning in Digital Imagery
Amit Mandvikar and Huan Liu
422 Mining Text for Word Senses Using Independent Component Analysis
Reinhard Rapp
427 A Kernel-Based Semi-Naive Bayesian Classifier Using P-Trees
Anne Denton and William Perrizo
432 BAMBOO: Accelerating Closed Itemset Mining by Deeply Pushing the Length-Decreasing Support Constraint
Jianyong Wang and George Karypis
437 A General Framework for Adaptive Anomaly Detection with Evolving Connectionist Systems
Yihua Liao, V. Rao Vemuri, and Alejandro Pasos
442 R-MAT: A Recursive Model for Graph Mining
Deepayan Chakrabarti, Yiping Zhan, and Christos Faloutsos
447 Lazy Learning by Scanning Memory Image Lattice
Yiqiu Han and Wai Lam
452 Text Mining Using Non-negative Matrix Factorizations
V. Paul Pauca, Farial Shahnaz, Michael W. Berry, and Robert J. Plemmons
457 Active Mining of Data Streams
Wei Fan, Yi-an Huang, Haixun Wang, and Philip S. Yu
462 Learning to Read Between the Lines: The Aspect Bernoulli Model
A. Kabán, E. Bingham, and T. Hirsimäki
467 Exploiting Hierarchical Domain Values in Classification Learning
Yiqiu Han and Wai Lam
472 IFD: Iterative Feature and Data Clustering
Tao Li and Sheng Ma
477 Adaptive Filtering for Efficient Record Linkage
Lifang Gu and Rohan Baxter
482 A Foundational Approach to Mining Itemset Utilities from Databases
Hong Yao, Howard J. Hamilton, and Cory J. Butz
487 The Discovery of Generalized Causal Models with Mixed Variables Using MML Criterion
Gang Li and Honghua Dai
492 Reservoir-Based Random Sampling with Replacement from Data Stream
Byung-Hoon Park, George Ostrouchov, Nagiza F. Samatova, and Al Geist
497 Principal Component Analysis and Effective K-Means Clustering
Chris Ding and Xiaofeng He
502 Classifying Documents without Labels
Daniel Barbará, Carlotta Domeniconi, and Ning Kang
507 Data Reduction in Support Vector Machines by a Kernelized Ionic Interaction Model
Hyunsoo Kim and Haesun Park
512 Continuous-Time Bayesian Modeling of Clinical Data
Sathyakama Sandilya and R. Bharat Rao
517 Subspace Clustering of High Dimensional Data
Carlotta Domeniconi, Dimitris Papadopoulos, Dimitrios Gunopulos, and Sheng Ma
522 Privacy Preserving Naïve Bayes Classifier for Vertically Partitioned Data
Jaideep Vaidya and Chris Clifton
527 Resource-Aware Mining with Variable Granularities in Data Streams
Wei-Guang Teng, Ming-Syan Chen, and Philip S. Yu
532 Mining Patterns of Activity from Video Data
Michael C. Burl
