Area Under ROC Optimisation using a Ramp Approximation
Alan Herschtal, Bhavani Raskutti, Peter K. Campbell

On the Necessary and Sufficient Conditions of a Meaningful Distance Function for High Dimensional Data Space
Chih-Ming Hsu, Ming-Syan Chen

CPM: A Covariance-preserving Projection Method
Jieping Ye, Tao Xiong, Ravi Janardan

Transform Regression and the Kolmogorov Superposition Theorem
Edwin Pednault

A Latent Dirichlet Model for Unsupervised Entity Resolution
Indrajit Bhattacharya, Lise Getoor

Deriving Private Information from Randomly Perturbed Ratings
Sheng Zhang, James Ford, and Fillia Makedon

Name Reference Resolution in Organizational Email Archives
Christopher P. Diehl, Lise Getoor, Galileo Namata

Automated Knowledge Discovery from Simulators
M.C. Burl, D. DeCoste, B.L. Enke, D. Mazzoni, W.J. Merline, L. Scharenbroich

Mining for Outliers in Sequential Databases
Pei Sun, Sanjay Chawla, Bavani Arunasalam

Mining Control Flow Abnormality for Logic Error Isolation
Chao Liu, Xifeng Yan, Jiawei Han

Scan Detection: A Data Mining Approach
György J. Simon, Hui Xiong, Eric Eilertson, Vipin Kumar

Learning Bayesian Networks from Incomplete Data: An Efficient Method for Generating Approximate Predictive Distributions
Carsten Riggelsen

Efficient Markov Network Structure Discovery using Independence Tests
Facundo Bromberg, Dimitris Margaritis, Vasant Honavar

K-Means Clustering Over a Large, Dynamic Network
Souptik Datta, Chris Giannella, Hillol Kargupta

Adapting K-Medians to Generate Normalized Cluster Centers
Benjamin J. Anderson, Deborah S. Gross, David R. Musicant, Anna M. Ritz, Thomas G. Smith, Leah E. Steinberg

Advanced Prototype Machines: Exploring Prototypes for Classification
Hans-Peter Kriegel, Matthias Schubert

Toward Semantic XML Clustering
Andrea Tagarelli, Sergio Greco

A Semantic Approach for Mining Hidden Links from Complementary and Non-interactive Biomedical Literature
Xiaohua Hu, Xiaodan Zhang, Illhoi Yoo, Yanqing Zhang

Representation is Everything: Towards Efficient and Adaptable Similarity Measures for Biological Data
Charu C. Aggarwal

Mining Frequent Agreement Subtrees in Phylogenetic Databases
Sen Zhang and Jason T. L. Wang

Trend Relational Analysis and Grey-Fuzzy Clustering Method
Zhijie Chen, Weizhen Chen, Qile Chen and Mian-Yun Chen

Joint Cluster Analysis of Attribute Data and Relationship Data: the Connected k-Center Problem
Martin Ester, Rong Ge, Byron J. Gao, Zengjian Hu, Boaz Ben-Moshe

Weighted Clustering Ensembles
Muna Al-Razgan, Carlotta Domeniconi

Clustering in the Presence of Bridge-Nodes
Jerry Scripps and Pang-Ning Tan

Mining Interesting Patterns from Very High Dimensional Data: A Top-Down Row Enumeration Approach
Hongyan Liu, Jiawei Han, Dong Xin, Zheng Shao

Mining Frequent Patterns by Differential Refinement of Clustered Bitmaps
Jianwei Li, Alok Choudhary, Nan Jiang, Wei-keng Liao

Discovery of Co-evoluting Spatial Co-located Event Sets
Jin Soung Yoo, Shashi Shekhar, Sangho Kim, Mete Celik

Efficient Algorithms for Sequence Segmentation
Evimaria Terzi, Panayiotis Tsaparas

Density-Based Clustering over an Evolving Data Stream with Noise
Feng Cao, Martin Ester, Weining Qian, Aoying Zhou

A Random Walks Method for Text Classification
Yunpeng Xu, Xing Yi, Changshui Zhang

Efficient Mining of Temporally Annotated Sequences
Fosca Giannotti, Mirco Nanni, Dino Pedreschi

A Framework for Local Supervised Dimensionality Reduction of High Dimensional Data
Charu C. Aggarwal

Segmentation and dimensionality reduction
Ella Bingham, Aristides Gionis, Niina Haiminen, Heli Hiisilä, Heikki Mannila and Evimaria Terzi

Probabilistic Multi-State Split-Merge Algorithm for Coupling Parameter Estimates
Juan K. Lin

Item Sets that Compress
Arno Siebes, Jilles Vreeken, Matthijs van Leeuwen

Mining Approximate Frequent Itemsets In the Presence of Noise: Algorithm and Analysis
Jinze Liu, Susan Paulsen, Xing Sun, Wei Wang, Andrew Nobel, Jan Prins

Mining frequent closed itemsets out-of-core
Claudio Lucchese, Salvatore Orlando, Raffaele Perego

Local L2-Thresholding Based Data Mining in Peer-to-Peer Systems
Ran Wolff, Kanishka Bhaduri, Hillol Kargupta

Collaborative Information Extraction and Mining from Multiple Web Documents
Tak-Lam Wong, Wai Lam, Shing-Kit Chan

Collaborative Document Clustering
Khaled Hammouda, Mohamed Kamel

Cluster Description Formats, Problems and Algorithms
Byron J. Gao and Martin Ester

Positive Borders or Negative Borders: How to Make Lossless Generator Based Representations Concise
Guimei Liu, Jinyan Li, Limsoon Wong, Wynne Hsu

Bayesian K-Means as a "Maximization-Expectation" Algorithm
Max Welling, Kenichi Kurihara

A Framework for Clustering Massive Text and Categorical Data Streams
Charu C. Aggarwal and Philip S. Yu

Cone Cluster Labeling for Support Vector Clustering
Sei-Hyung Lee and Karen M. Daniels

Semi-Supervised Clustering with Partial Background Information
Jing Gao, Pang-Ning Tan, Haibin Cheng

A New Privacy-Preserving Distributed k-Clustering Algorithm
Geetha Jagannathan, Krishnan Pillaipakkamnatt, Rebecca N. Wright

ODAC: Hierarchical Clustering of Time Series Data Streams
Pedro Pereira Rodrigues, João Gama, João Pedro Pedroso

Detecting the Change of Clustering Structure in Categorical Data Streams
Keke Chen, Ling Liu

Dissimilarity Measures for Detecting Hepatotoxicity in Clinical Trial Data
Matthew Eric Otey, Srinivasan Parthasarathy, Donald C. Trost

Transductive De-Noising and Dimensionality Reduction using Total Bregman Regression
Sreangsu Acharyya

Robust Estimation for Mixture of Probability Tables based on beta-likelihood
Yu Fujimoto and Noboru Murata

Fast optimal bandwidth selection for kernel density estimation
Vikas Chandrakant Raykar, Ramani Duraiswami

Risk-Sensitive Learning via Expected Shortfall Minimization
Hisashi Kashima

On Approximate Solutions to Support Vector Machines
Dongwei Cao, Daniel Boley

Confidence Estimation Methods for Partially Supervised Information Extraction
Eugene Agichtein

Inference of Node Replacement Recursive Graph Grammars
Jacek P. Kukluk, Lawrence B. Holder, Diane J. Cook

Learning from Incomplete Ratings Using Non-negative Matrix Factorization
Sheng Zhang, Weihong Wang, James Ford, Fillia Makedon

Health monitoring of a shaft transmission system via hybrid models of PCR and PLS
Yi Fang, Hyun-Woo Cho, Myong Kee Jeong

Modeling Evolutionary Behaviors for Community-based Dynamic Recommendation
Xiaodan Song, Ching-Yung Lin, Belle L. Tseng, Ming-Ting Sun

A Systematic Cross-Comparison of Sequence Classifiers
Binyamin Rozenfeld, Ronen Feldman, Moshe Fresko

Data-Enhanced Predictive Modeling for Sales Targeting
Saharon Rosset, Richard D. Lawrence

Graph-based Methods for Orbit Classification
Abraham Bagherjeiran, Chandrika Kamath

Mining and Validating Localized Frequent Itemsets with Dynamic Tolerance
Olfa Nasraoui, Suchandra Goswami 

Profiling Protein Families from Partially Aligned Sequences
Saikat Mukherjee, Chang Zhao, I.V. Ramakrishnan

Personalized Knowledge Discovery: Mining Novel Association Rules from Text
Xin Chen, Yi-Fang Wu

A Novel Framework for Incorporating Labeled Examples into Anomaly Detection
Jing Gao, Haibin Cheng, Pang-Ning Tan

Towards the Prediction of Protein Abundance from Tandem Mass Spectrometry Data
Anthony J. Bonner, Han Liu

Using Compression to Identify Classes of Inauthentic Texts
Mehmet M. Dalkilic, Wyatt T. Clark, James C. Costello, Predrag Radivojac

Fast Mining of Distance-Based Outliers in High Dimensional Datasets
Amol Ghoting, Srinivasan Parthasarathy, and Matthew Eric Otey

Spatial Weighted Outlier Detection
Yufeng Kou, Chang-Tien Lu, Dechang Chen

Robust Clustering for Tracking Noisy Evolving Data Streams
Olfa Nasraoui, Carlos Rojas

WIP: mining Weighted Interesting Patterns with a strong weight and/or support affinity
Unil Yun and John J. Leggett

Discovering Frequent Tree Patterns over Data Streams
Mark Cheng-Enn Hsieh, Yi-Hung Wu, Arbee L.P. Chen

Finding Sequential Patterns from a Massive Number of Spatio-Temporal Events
Yan Huang, Liqin Zhang, and Pusheng Zhang

Mining Minimal Contrast Subgraph Patterns
Roger Ming Hieng Ting, James Bailey 


