Proceedings of the 2003 SIAM International Conference on Data Mining

Cathedral Hill Hotel, San Francisco, CA
May 1-3, 2003
Each link below is to a PDF of the paper as it was submitted. Papers are listed in program order. PDF file names consist of the Proceedings identifier (DM and year 03), followed by the order in the printed version (e.g. 001) and the first author's last name and first initial.
Message from the Conference Co-Chairs
Part I: Full Papers
3 Decision Tree Classification of Spatial Data Patterns from Videokeratography using Zernike Polynomials
M. D. Twa, S. Parthasarathy, and T. W. Raasch
13 Feature Mining Paradigms for Scientific Data
Ming Jiang, Tat-Sang Choy, Sameep Mehta, Matt Coatney, Steve Barr, Kaden Hazzard, David Richie, Srinivasan Parthasarathy, Raghu Machiraju, David Thompson, John Wilkins, and Boyd Gatlin
25 A Comparative Study of Anomaly Detection Schemes in Network Intrusion Detection
Aleksandar Lazarevic, Levent Ertöz, Vipin Kumar, Aysel Ozgur, and Jaideep Srivastava
37 Fast Online SVD Revisions for Lightweight Recommender Systems
Matthew Brand
47 Finding Clusters of Different Sizes, Shapes, and Densities in Noisy, High Dimensional Data
Levent Ertöz, Michael Steinbach, and Vipin Kumar
59 Hierarchical Document Clustering using Frequent Itemsets
Benjamin C. M. Fung, Ke Wang, and Martin Ester
71 Scalable, Balanced Model-based Clustering
Shi Zhong and Joydeep Ghosh
83 A New Gravitational Clustering Algorithm
Jonathan Gomez, Dipankar Dasgupta, and Olfa Nasraoui
95 Mining Changes of Classification by Correspondence Tracing
Ke Wang, Senqiang Zhou, Chee Ada Fu, and Jeffrey Xu Yu
107 Dynamic Classification of Online Customers
Dimitris J. Bertsimas, Adam J. Mersereau, and Nitin R. Patel
119 Communication and Memory Efficient Parallel Decision Tree Construction
Ruoming Jin and Gagan Agrawal
130 ATLaS: A Native Extension of SQL for Data Mining
Haixun Wang and Carlo Zaniolo
142 Approximate Query Answering by Model Averaging
Dmitry Pavlov and Padhraic Smyth
154 On using Page Cooccurrences for Computing Clickstream Similarity
Ravi Kothari, Parul Mittal, Vivek Jain, and Mukesh Mohania
166 CloSpan: Mining Closed Sequential Patterns in Large Databases
Xifeng Yan, Jiawei Han, and Ramin Afshar
178 StarClass: Interactive Visual Classification using Star Coordinates
Soon Tee Teoh and Kwan-Liu Ma
186 Anytime Query-Tuned Kernel Machines via Cholesky Factorization
Dennis DeCoste
194 Estimation of Topological Dimension
D. R. Hundley and M. J. Kirby
203 Nonparametric Density Estimation: Toward Computational Tractability
Alexander G. Gray and Andrew W. Moore
212 Generalized Sensitivity Analysis: A Framework for Evaluating Data Analysis Results
Ronald K. Pearson
224 STAMP: On Discovery of Statistically Important Pattern Repeats in Long Sequential Data
Jiong Yang, Wei Wang, and Philip S. Yu
Part II: Poster Presentations
239 Efficient Unsupervised Mining from Noisy Data Sets: Application to Clustering Co-occurrence Data
Hiroshi Mamitsuka
244 Active Sampling: An Effective Approach to Feature Selection
Huan Li, Hongjun Lu, and Lei Yu
249 PageRank, HITS and a Unified Framework for Link Analysis
Chris Ding, Xiaofeng He, Parry Husbands, Hongyuan Zha, and Horst Simon
254 The Application of Text Mining Software to Examine Coded Information
Patricia B. Cerrito and James Cox
259 Extracting Cyber Communities through Patterns
Tassos Argyros, Charis Ermopoulos, Vassiliki Pavlaki, and Nidal Al-Said
264 On the Techniques for Data Clustering with Numerical Constraints
Bi-Ru Dai, Cheng-Ru Lin, and Ming-Syan Chen
269 The Analysis of Asthma and Exposure Data using Geographic Information Systems and Data Mining Information
Patricia B. Cerrito, George R. Barnes, and Robert W. Forbes
274 Detecting Periodicity in Nonideal Datasets
R. K. Pearson, H. Lähdesmäki, H. Huttunen, and O. Yli-Harja
279 Detection of Underrepresented Biological Sequences using Class-Conditional Distribution Models
Slobodan Vucetic, Dragoljub Pokrajac, Hongbo Xie, and Zoran Obradovic
284 Learning Bayesian Network Structure from Distributed Data
R. Chen, K. Sivakumar, and H. Kargupta
289 Mixture Models and Frequent Sets: Combining Global and Local Methods for 0-1 Data
Jaakko Hollmén, Jouni K. Seppänen, and Heikki Mannila
294 Field-Theoretic Methods for Intractable Probabilistic Models
Dennis Lucarelli, Cheryl Resch, I-Jeng Wang, and Fernando J. Pineda
299 Data-Mining of a Large Virtual Community: Relationship between Users DB and the Web-Log File
S. M. Savaresi, Simone Garatti, Sergio Bittanti, and Luca La Brocca
304 Cube Lattices: A Framework for Multidimensional Data Mining
Alain Casali, Rosine Cicchetti, and Lotfi Lakhal
Part III: Student Papers
311 ApproxMAP: Approximate Mining of Consensus Sequential Patterns
Hye-Chung (Monica) Kum, Jian Pei, Wei Wang, and Dean Duncan
316 Mining Frequent Sequential Patterns under Regular Expressions: A Highly Adaptive Strategy for Pushing Constraints
Hunor Albert-Lorincz and Jean-François Boulicaut
321 Sort-Merge Feature Selection for Video Data
Yan Liu and John R. Kender
326 An Outlier-based Data Association Method for Linking Criminal Incidents
Song Lin and Donald E. Brown
331 CPAR: Classification based on Predictive Association Rules
Xiaoxin Yin and Jiawei Han
336 Mining Temporal Databases for Subsequence Patterns
Wen Niu and Raj Bhatnagar
341 Using Low-Memory Representations to Cluster Very Large Data Sets
David Littau and Daniel Boley
