Proceedings of the 2006 SIAM International Conference on Data Mining
Each link below is to a PDF of the paper as it was submitted. Papers are listed in program order. PDF file names consist of the Proceedings identifier (DM and the year, 06), followed by the paper's order in the printed version (e.g. 001) and the first author's last name and first initial.
1 Area
Under ROC Optimisation using a Ramp Approximation
Alan Herschtal, Bhavani Raskutti, Peter K. Campbell
12 On
the Necessary and Sufficient Conditions of a Meaningful Distance Function
for High Dimensional Data Space
Chih-Ming Hsu, Ming-Syan Chen
24 CPM:
A Covariance-preserving Projection Method
Jieping Ye, Tao Xiong, Ravi Janardan
35 Transform
Regression and the Kolmogorov Superposition Theorem
Edwin Pednault
47 A
Latent Dirichlet Model for Unsupervised Entity Resolution
Indrajit Bhattacharya, Lise Getoor
59 Deriving
Private Information from Randomly Perturbed Ratings
Sheng Zhang, James Ford, and Fillia Makedon
70 Name
Reference Resolution in Organizational Email Archives
Christopher P. Diehl, Lise Getoor, Galileo Namata
82 Automated
Knowledge Discovery from Simulators
M.C. Burl, D. DeCoste, B.L. Enke, D. Mazzoni, W.J. Merline, L. Scharenbroich
94 Mining
for Outliers in Sequential Databases
Pei Sun, Sanjay Chawla, Bavani Arunasalam
106 Mining
Control Flow Abnormality for Logic Error Isolation
Chao Liu, Xifeng Yan, Jiawei Han
118 Scan
Detection: A Data Mining Approach
György J. Simon, Hui Xiong, Eric Eilertson, Vipin Kumar
130 Learning
Bayesian Networks from Incomplete Data: An Efficient Method for Generating Approximate
Predictive Distributions
Carsten Riggelsen
141 Efficient
Markov Network Structure Discovery using Independence Tests
Facundo Bromberg, Dimitris Margaritis, Vasant Honavar
153 K-Means
Clustering Over a Large, Dynamic Network
Souptik Datta, Chris Giannella, Hillol Kargupta
165 Adapting
K-Medians to Generate Normalized Cluster Centers
Benjamin J. Anderson, Deborah S. Gross, David R. Musicant, Anna M. Ritz, Thomas G. Smith, Leah E. Steinberg
176 Advanced
Prototype Machines: Exploring Prototypes for Classification
Hans-Peter Kriegel, Matthias Schubert
188 Toward
Semantic XML Clustering
Andrea Tagarelli, Sergio Greco
200 A
Semantic Approach for Mining Hidden Links from Complementary and Non-interactive
Biomedical Literature
Xiaohua Hu, Xiaodan Zhang, Illhoi Yoo, Yanqing Zhang
210 Representation
is Everything: Towards Efficient and Adaptable Similarity Measures for Biological
Data
Charu C. Aggarwal
222 Mining
Frequent Agreement Subtrees in Phylogenetic Databases
Sen Zhang and Jason T. L. Wang
234 Trend
Relational Analysis and Grey-Fuzzy Clustering Method
Zhijie Chen, Weizhen Chen, Qile Chen and Mian-Yun Chen
246 Joint
Cluster Analysis of Attribute Data and Relationship Data: the Connected k-Center
Problem
Martin Ester, Rong Ge, Byron J. Gao, Zengjian Hu, Boaz Ben-Moshe
258 Weighted
Clustering Ensembles
Muna Al-Razgan, Carlotta Domeniconi
270 Clustering
in the Presence of Bridge-Nodes
Jerry Scripps and Pang-Ning Tan
282 Mining
Interesting Patterns from Very High Dimensional Data: A Top-Down Row Enumeration
Approach
Hongyan Liu, Jiawei Han, Dong Xin, Zheng Shao
294 Mining
Frequent Patterns by Differential Refinement of Clustered Bitmaps
Jianwei Li, Alok Choudhary, Nan Jiang, Wei-keng Liao
306 Discovery
of Co-evoluting Spatial Co-located Event Sets
Jin Soung Yoo, Shashi Shekhar, Sangho Kim, Mete Celik
316 Efficient
Algorithms for Sequence Segmentation
Evimaria Terzi, Panayiotis Tsaparas
328 Density-Based
Clustering over an Evolving Data Stream with Noise
Feng Cao, Martin Ester, Weining Qian, Aoying Zhou
340 A
Random Walks Method for Text Classification
Yunpeng Xu, Xing Yi, Changshui Zhang
348 Efficient
Mining of Temporally Annotated Sequences
Fosca Giannotti, Mirco Nanni, Dino Pedreschi
360 A
Framework for Local Supervised Dimensionality Reduction of High Dimensional Data
Charu C. Aggarwal
372 Segmentation
and dimensionality reduction
Ella Bingham, Aristides Gionis, Niina Haiminen, Heli Hiisilä, Heikki Mannila and Evimaria Terzi
384 Probabilistic
Multi-State Split-Merge Algorithm for Coupling Parameter Estimates
Juan K. Lin
395 Item
Sets that Compress
Arno Siebes, Jilles Vreeken, Matthijs van Leeuwen
407 Mining
Approximate Frequent Itemsets In the Presence of Noise: Algorithm and Analysis
Jinze Liu, Susan Paulsen, Xing Sun, Wei Wang, Andrew Nobel, Jan Prins
419 Mining
frequent closed itemsets out-of-core
Claudio Lucchese, Salvatore Orlando, Raffaele Perego
430 Local
L2-Thresholding Based Data Mining in Peer-to-Peer Systems
Ran Wolff, Kanishka Bhaduri, Hillol Kargupta
442 Collaborative
Information Extraction and Mining from Multiple Web Documents
Tak-Lam Wong, Wai Lam, Shing-Kit Chan
453 Collaborative
Document Clustering
Khaled Hammouda, Mohamed Kamel
464 Cluster
Description Formats, Problems and Algorithms
Byron J. Gao and Martin Ester
469 Positive
Borders or Negative Borders: How to Make Lossless Generator Based Representations
Concise
Guimei Liu, Jinyan Li, Limsoon Wong, Wynne Hsu
474 Bayesian
K-Means as a "Maximization-Expectation" Algorithm
Max Welling, Kenichi Kurihara
479 A
Framework for Clustering Massive Text and Categorical Data Streams
Charu C. Aggarwal and Philip S. Yu
484 Cone
Cluster Labeling for Support Vector Clustering
Sei-Hyung Lee and Karen M. Daniels
489 Semi-Supervised
Clustering with Partial Background Information
Jing Gao, Pang-Ning Tan, Haibin Cheng
494 A
New Privacy-Preserving Distributed k-Clustering Algorithm
Geetha Jagannathan, Krishnan Pillaipakkamnatt, Rebecca N. Wright
499 ODAC:
Hierarchical Clustering of Time Series Data Streams
Pedro Pereira Rodrigues, João Gama, João Pedro Pedroso
504 Detecting
the Change of Clustering Structure in Categorical Data Streams
Keke Chen, Ling Liu
509 Dissimilarity
Measures for Detecting Hepatotoxicity in Clinical Trial Data
Matthew Eric Otey, Srinivasan Parthasarathy, Donald C. Trost
514 Transductive
De-Noising and Dimensionality Reduction using Total Bregman Regression
Sreangsu Acharyya
519 Robust
Estimation for Mixture of Probability Tables based on beta-likelihood
Yu Fujimoto and Noboru Murata
524 Fast
optimal bandwidth selection for kernel density estimation
Vikas Chandrakant Raykar, Ramani Duraiswami
529 Risk-Sensitive
Learning via Expected Shortfall Minimization
Hisashi Kashima
534 On
Approximate Solutions to Support Vector Machines
Dongwei Cao, Daniel Boley
539 Confidence
Estimation Methods for Partially Supervised Information Extraction
Eugene Agichtein
544 Inference
of Node Replacement Recursive Graph Grammars
Jacek P. Kukluk, Lawrence B. Holder, Diane J. Cook
549 Learning
from Incomplete Ratings Using Non-negative Matrix Factorization
Sheng Zhang, Weihong Wang, James Ford, Fillia Makedon
554 Health
monitoring of a shaft transmission system via hybrid models of PCR and PLS
Yi Fang, Hyun-Woo Cho, Myong Kee Jeong
559 Modeling
Evolutionary Behaviors for Community-based Dynamic Recommendation
Xiaodan Song, Ching-Yung Lin, Belle L. Tseng, Ming-Ting Sun
564 A
Systematic Cross-Comparison of Sequence Classifiers
Binyamin Rozenfeld, Ronen Feldman, Moshe Fresko
569 Data-Enhanced
Predictive Modeling for Sales Targeting
Saharon Rosset, Richard D. Lawrence
574 Graph-based
Methods for Orbit Classification
Abraham Bagherjeiran, Chandrika Kamath
579 Mining
and Validating Localized Frequent Itemsets with Dynamic Tolerance
Olfa Nasraoui, Suchandra Goswami
584 Profiling
Protein Families from Partially Aligned Sequences
Saikat Mukherjee, Chang Zhao, I.V. Ramakrishnan
589 Personalized
Knowledge Discovery: Mining Novel Association Rules from Text
Xin Chen, Yi-Fang Wu
594 A
Novel Framework for Incorporating Labeled Examples into Anomaly Detection
Jing Gao, Haibin Cheng, Pang-Ning Tan
599 Towards
the Prediction of Protein Abundance from Tandem Mass Spectrometry Data
Anthony J. Bonner, Han Liu
604 Using
Compression to Identify Classes of Inauthentic Texts
Mehmet M. Dalkilic, Wyatt T. Clark, James C. Costello, Predrag Radivojac
609 Fast
Mining of Distance-Based Outliers in High Dimensional Datasets
Amol Ghoting, Srinivasan Parthasarathy, and Matthew Eric Otey
614 Spatial
Weighted Outlier Detection
Yufeng Kou, Chang-Tien Lu, Dechang Chen
619 Robust
Clustering for Tracking Noisy Evolving Data Streams
Olfa Nasraoui, Carlos Rojas
624 WIP:
mining Weighted Interesting Patterns with a strong weight and/or support affinity
Unil Yun and John J. Leggett
629 Discovering
Frequent Tree Patterns over Data Streams
Mark Cheng-Enn Hsieh, Yi-Hung Wu, Arbee L.P. Chen
634 Finding
Sequential Patterns from a Massive Number of Spatio-Temporal Events
Yan Huang, Liqin Zhang, and Pusheng Zhang
639 Mining
Minimal Contrast Subgraph Patterns
Roger Ming Hieng Ting, James Bailey

