Proceedings: Data Mining 2005

Proceedings of the 2005 SIAM International Conference on Data Mining

Each link below is to a PDF of the paper as it was submitted. Papers are listed in program order. PDF file names consist of the Proceedings identifier (DM and the year, 05), followed by the paper's order in the printed version (e.g., 001) and the first author's last name and first initial.

Message from the Conference Co-Chairs

Preface

1 Computational Developments of ψ-learning
Sijin Liu, Xiaotong Shen, and Wing Hung Wong

12 A Random Walks Perspective on Maximizing Satisfaction and Profit
Matthew Brand

20 Surveying Data for Patchy Structure
Ronald K. Pearson

32 2-Dimensional Singular Value Decomposition for 2D Maps and Images
Chris Ding and Jieping Ye

44 Summarizing and Mining Skewed Data Streams
Graham Cormode and S. Muthukrishnan

56 Online Analysis of Community Evolution in Data Streams
Charu C. Aggarwal and Philip S. Yu

68 Mining Frequent Itemsets from Data Streams with a Time-Sensitive Sliding Window
Chih-Hsiang Lin, Ding-Ying Chiu, Yi-Hung Wu, and Arbee L. P. Chen

80 On Abnormality Detection in Spuriously Populated Data Streams
Charu C. Aggarwal

92 Privacy-Preserving Classification of Customer Data without Loss of Accuracy
Zhiqiang Yang, Sheng Zhong, and Rebecca N. Wright

103 Privacy-Aware Market Basket Data Set Generation: A Feasible Approach for Inverse Frequent Set Mining
Xintao Wu, Ying Wu, Yongge Wang, and Yingjiu Li

115 On Variable Constraints in Privacy Preserving Data Mining
Charu C. Aggarwal and Philip S. Yu

126 Clustering with Model-Level Constraints
David Gondek, Shivakumar Vaithyanathan, and Ashutosh Garg

138 Clustering with Constraints: Feasibility Issues and the k-Means Algorithm
Ian Davidson and S. S. Ravi

150 A Cutting Algorithm for the Minimum Sum-of-Squared Error Clustering
Jiming Peng and Yu Xia

161 Dynamic Classification of Defect Structures in Molecular Dynamics Simulation Data
Sameep Mehta, Steve Barr, Tat-Sang Choy, Hui Yang, Srinivasan Parthasarathy, Raghu Machiraju, and John Wilkins

173 Striking Two Birds with One Stone: Simultaneous Mining of Positive and Negative Spatial Patterns
Bavani Arunasalam, Sanjay Chawla, and Pei Sun

183 Finding Young Stellar Populations in Elliptical Galaxies from Independent Components of Optical Spectra
Ata Kabán, Louisa A. Nolan, and Somak Raychaudhury

195 Hybrid Attribute Reduction for Classification Based on a Fuzzy Rough Set Technique
Qinghua Hu, Daren Yu, and Zongxia Xie

205 HARMONY: Efficiently Mining the Best Rules for Classification
Jianyong Wang and George Karypis

217 On Error Correlation and Accuracy of Nearest Neighbor Ensemble Classifiers
Carlotta Domeniconi and Bojun Yan

227 Lazy Learning for Classification Based on Query Projections
Yiqiu Han and Wai Lam

239 Mining Non-derivable Association Rules
Bart Goethals, Juho Muhonen, and Hannu Toivonen

250 Depth-First Non-derivable Itemset Mining
Toon Calders and Bart Goethals

262 Exploiting Relationships for Domain-Independent Data Cleaning
Dmitri V. Kalashnikov, Sharad Mehrotra, and Zhaoqi Chen

274 A Spectral Clustering Approach to Finding Communities in Graphs
Scott White and Padhraic Smyth

286 Mining Behavior Graphs for "Backtrace" of Noncrashing Bugs
Chao Liu, Xifeng Yan, Hwanjo Yu, Jiawei Han, and Philip S. Yu

298 Learning to Refine Ontology for a New Web Site Using a Bayesian Approach
Tak-Lam Wong and Wai Lam

310 Exploiting Parameter Related Domain Knowledge for Learning in Graphical Models
Radu S. Niculescu, Tom M. Mitchell, and R. Bharat Rao

322 Exploiting Geometry for Support Vector Machine Indexing
Navneet Panda and Edward Y. Chang

334 Parallel Computation of RBF Kernels for Support Vector Classifiers
Shibin Qiu and Terran Lane

346 Loadstar: A Load Shedding Scheme for Classifying Data Streams
Yun Chi, Philip S. Yu, Haixun Wang, and Richard R. Muntz

358 Topic-Driven Clustering for Document Datasets
Ying Zhao and George Karypis

370 Variational Learning for Noisy-OR Component Analysis
Tomas Singliar and Milos Hauskrecht

380 Summarizing Sequential Data with Closed Partial Orders
Gemma Casas-Garriga

392 SUMSRM: A New Statistic for the Structural Break Detection in Time Series
Kwok Pan Pang and Kai Ming Ting

404 Markov Models for Identification of Significant Episodes
Robert Gwadera, Mikhail Atallah, and Wojciech Szpankowski

415 Efficient Mining of Maximal Sequential Patterns Using Multiple Samples
Congnan Luo and Soon M. Chung

427 Gaussian Processes for Active Data Mining of Spatial Aggregates
Naren Ramakrishnan, Chris Bailey-Kellogg, Satish Tadepalli, and Varun N. Pandey

439 Correlation Clustering for Learning Mixtures of Canonical Correlation Models
Xiaoli Z. Fern, Carla E. Brodley, and Mark A. Friedl

449 On Periodicity Detection and Structural Periodic Similarity
Michail Vlachos, Philip Yu, and Vittorio Castelli

461 Cross Table Cubing: Mining Iceberg Cubes from Data Warehouses
Moonjung Cho, Jian Pei, and David W. Cheung

466 Decision Tree Induction in High Dimensional, Hierarchically Distributed Databases
Amir Bar-Or, Assaf Schuster, Ran Wolff, and Daniel Keren

471 Slope One Predictors for Online Rating-Based Collaborative Filtering
Daniel Lemire and Anna Maclachlan

476 Sparse Fisher Discriminant Analysis for Computer Aided Detection
M. Murat Dundar, Glenn Fung, Jinbo Bi, Sandilya Sathyakama, and Bharat Rao

481 Expanding the Training Data Space Using Emerging Patterns and Genetic Methods
Hamad Alhammady and Kotagiri Ramamohanarao

486 Making Data Mining Models Useful to Model Non-paying Customers of Exchange Carriers
Wei Fan, Janak Mathuria, and Chang-tien Lu

491 Matrix Condition Number Prediction with SVM Regression and Feature Selection
Shuting Xu and Jun Zhang

496 Cluster Validity Analysis of Alternative Results from Multi-objective Optimization
Yimin Liu, Tansel Özyer, Reda Alhajj, and Ken Barker

501 ClosedPROWL: Efficient Mining of Closed Frequent Continuities by Projected Window List Technology
Kuo-Yu Huang, Chia-Hui Chang, and Kuo-Zui Lin

506 Three Myths about Dynamic Time Warping Data Mining
Chotirat Ann Ratanamahatana and Eamonn Keogh

511 PCA without Eigenvalue Calculations: A Case Study on Face Recognition
E. Kokiopoulou and Y. Saad

516 Mining Top-K Itemsets over a Sliding Window Based on Zipfian Distribution
Raymond Chi-Wing Wong and Ada Wai-Chee Fu

521 Hierarchical Document Classification Using Automatically Generated Hierarchy
Tao Li and Shenghuo Zhu

526 On Clustering Binary Data
Tao Li and Shenghuo Zhu

531 Time-Series Bitmaps: A Practical Visualization Tool for Working with Large Time Series Databases
Nitin Kumar, Venkata Nishanth Lolla, Eamonn Keogh, Stefano Lonardi, Chotirat Ann Ratanamahatana, and Li Wei

536 Pushing Feature Selection Ahead of Join
Rong She, Ke Wang, Yabo Xu, and Philip S. Yu

541 Discarding Insignificant Rules During Impact Rule Discovery in Large, Dense Databases
Shiying Huang and Geoffrey I. Webb

546 SPID4.7: Discretization Using Successive Pseudo Deletion at Maximum Information Gain Boundary Points
Somnath Pal and Himika Biswas

551 Iterative Mining for Rules with Constrained Antecedents
Zheng Sun, Philip S. Yu, and Xiang-Yang Li

556 Influence in Ratings-Based Recommender Systems: An Algorithm-Independent Approach
Al Mamunur Rashid, George Karypis, and John Riedl

561 Mining Unconnected Patterns in Workflows
Gianluigi Greco, Antonella Guzzo, Giuseppe Manco, and Domenico Saccà

566 The Best Nurturers in Computer Science Research
Bharath Kumar M. and Y. N. Srikant

571 Knowledge Discovery from Heterogeneous Dynamic Systems Using Change-Point Correlations
Tsuyoshi Idé and Keisuke Inoue

576 Building Decision Trees on Records Linked through Key References
Ke Wang, Yabo Xu, Philip S. Yu, and Rong She

581 Efficient Allocation of Marketing Resources Using Dynamic Programming
Giuliano Tirenni, Abderrahim Labbi, André Elisseeff, and Cèsar Berrospi

586 Near-Neighbor Search in Pattern Distance Spaces
Haixun Wang, Chang-Shing Perng, and Philip S. Yu

591 An Algorithm for Lattice-Structured Subspace Clusters
Haiyun Bian and Raj Bhatnagar

596 CBS: A New Classification Method by Using Sequential Patterns
Vincent S. M. Tseng and Chao-Hui Lee

601 SeqIndex: Indexing Sequences by Sequential Pattern Analysis
Hong Cheng, Xifeng Yan, and Jiawei Han

606 On the Equivalence of Nonnegative Matrix Factorization and Spectral Clustering
Chris Ding, Xiaofeng He, and Horst D. Simon

611 Kronecker Factorization for Speeding up Kernel Machines
Gang Wu, Zhihua Zhang, and Edward Chang

616 Symmetric Statistical Translation Models for Automatic Image Annotation
Feng Kang and Rong Jin

621 Correcting Sampling Bias in Structural Genomics through Iterative Selection of Underrepresented Targets
Kang Peng, Slobodan Vucetic, and Zoran Obradovic

626 Statistical Models for Unequally Spaced Time Series
Emre Erdogan, Sheng Ma, Alina Beygelzimer, and Irina Rish

631 CLSI: A Flexible Approximation Scheme from Clustered Term-Document Matrices
Dimitrios Zeimpekis and Efstratios Gallopoulos

636 WFIM: Weighted Frequent Itemset Mining with a Weight Range and a Minimum Weight
Unil Yun and John J. Leggett

641 Model-Based Clustering with Probabilistic Constraints
Martin H. C. Law, Alexander Topchy, and Anil K. Jain
