SDM'05 Preliminary Program

Thursday, April 21, 2005

7:00am - 8:15am Continental Breakfast
8:15am – 8:30am Welcoming Remarks
8:30am – 9:45am

Keynote Speaker

Strategies for Visual Data Mining
Ed Wegman, George Mason University

9:45am – 10:00am Coffee Break
10:00am – 12:00pm

Tutorial 1: Segmentation Algorithms for Time Series and Sequence Data

Track: Statistics in Data Mining
Chair: David Skillicorn

Computational Developments of Ψ-learning
Authors: Sijin Liu, Xiaotong Shen, Wing Hung Wong

A Random Walks Perspective on Maximizing Satisfaction and Profit
Authors: Matthew Brand

Surveying Data for Patchy Structure
Authors: Ronald Pearson

2-Dimensional Singular Value Decomposition for 2D Maps and Images
Authors: Chris Ding, Jieping Ye

Track: Stream Data Mining
Chair: Hiroshi Motoda

Summarizing and Mining Skewed Data Streams
Authors: Graham Cormode, S. Muthukrishnan

Online Analysis of Community Evolution in Data Streams
Authors: Charu Aggarwal, Philip Yu

Mining Frequent Itemsets from Data Streams with a Time-Sensitive Sliding Window
Authors: Chih-Hsiang Lin, Ding-Ying Chiu, Yi-Hung Wu, Arbee L. P. Chen

On Abnormality Detection in Spuriously Populated Data Streams
Authors: Charu Aggarwal

12:00pm - 1:30pm Lunch
1:30pm - 2:45pm

Keynote Speaker

Embedded
Mark Hansen, University of California, Los Angeles

2:45pm - 3:15pm Coffee Break
3:15pm - 4:45pm

Special Session 1: Statistics and Data Mining

Mining Earth Science Data for Geophysical Structure: A Case Study in Cloud Detection
Bin Yu, University of California, Berkeley

Using a Bayesian Distance Measure to Combine Rare Event Definitions
William DuMouchel, Lincoln Technologies

Highly Structured Models in High Energy Astrophysics
David van Dyk, UCI

Track: Privacy Preserving Data Mining
Chair: Pradeep Dubey

Privacy-Preserving Classification of Customer Data without Loss of Accuracy
Authors: Zhiqiang Yang, Sheng Zhong, Rebecca Wright

Privacy Aware Market Basket Data Set Generation: A Feasible Approach for Inverse Frequent Set Mining
Authors: Xintao Wu, Ying Wu, Yongge Wang, Yingjiu Li

On Variable Constraints in Privacy Preserving Data Mining
Authors: Charu Aggarwal, Philip Yu

Track: Clustering
Chair: George Karypis

Clustering with Model-level Constraints
Authors: David Gondek, Shivakumar Vaithyanathan, Ashutosh Garg

Clustering with Constraints: Feasibility Issues and the k-Means Algorithm
Authors: Ian Davidson, S.S. Ravi

A Cutting Algorithm for the Minimum Sum-of-Squared Error Clustering
Authors: Yu Xia, Jiming Peng

4:45pm - 5:00pm Intermission
5:00pm - 6:30pm

Special Session 1 (Continued)

Model Based Detection of Distribution Changes in Temporal Data Sets
Igor Cadez, GCS, Inc.

A Sequential Monte Carlo Method for Bayesian Analysis of Massive Datasets
Greg Ridgeway, The RAND Corporation

Bayesian Analysis of the Power Spectrum of the Cosmic Microwave Background
Jeff Jewell, Data Understanding Systems Group, JPL

Track: Scientific Data Mining
Chair: Mehran Sahami

Dynamic Classification of Defect Structures in Molecular Dynamics Simulation Data
Authors: Sameep Mehta, Steve Barr, Alex Choy, Hui Yang, Srinivasan Parthasarathy, Raghu Machiraju, John Wilkins

Striking Two Birds With One Stone: Simultaneous Mining of Positive and Negative Spatial Patterns
Authors: Bavani Arunasalam, Sanjay Chawla, Pei Sun

Finding Young Stellar Populations in Elliptical Galaxies from Independent Components of Optical Spectra
Authors: Ata Kaban, Louisa Nolan, Somak Raychaudhury

Track: Classifiers and Ensembles
Chair: Rebecca Wright

Hybrid Data Reduction with Fuzzy Rough Set Theory for Classification
Authors: Qinghua Hu

HARMONY: Efficiently Mining the Best Rules for Classification
Authors: Jianyong Wang, George Karypis

On Error Correlation and Accuracy of Nearest Neighbor Ensemble Classifiers
Authors: Carlotta Domeniconi, Bojun Yan

6:30pm - 8:15pm Poster Session & Welcome Reception

Friday, April 22, 2005

7:00am – 8:15am Continental Breakfast
8:15am – 8:30am Welcoming Remarks
8:30am – 9:45am

Keynote Speaker

Simple Models for Customer-Based Analysis: Linking RFM with CLV
Peter Fader, University of Pennsylvania

9:45am – 10:00am Coffee Break
10:00am – 12:00pm

Tutorial 2: Pattern Discovery in Biosequences

Track: Association Rules and Database Issues
Chair: Huan Liu

Lazy Learning for Classification Based on Query Projections
Authors: Yiqiu Han, Wai Lam

Mining Non-Derivable Association Rules
Authors: Bart Goethals, Juho Muhonen, Hannu Toivonen

Depth-First Non-Derivable Itemset Mining
Authors: Toon Calders, Bart Goethals

Exploiting Relationships for Domain-Independent Data Cleaning
Authors: Dmitri Kalashnikov, Sharad Mehrotra, Zhaoqi Chen

Track: Graphs and Graphical Models
Chair: Ee-Peng Lim

A Spectral Clustering Approach To Finding Communities in Graphs
Authors: Scott White, Padhraic Smyth

Mining Behavior Graphs for "Backtrace" of Noncrashing Bugs
Authors: Chao Liu, Xifeng Yan, Hwanjo Yu, Jiawei Han, Philip Yu

Learning to Refine Ontology for a New Web Site Using a Bayesian Approach
Authors: Tak-Lam Wong, Wai Lam

Exploiting Parameter Related Domain Knowledge for Learning in Graphical Models
Authors: Radu Stefan Niculescu, Tom Mitchell, Bharat Rao

12:00pm - 1:30pm Lunch
1:30pm - 2:45pm

Keynote Speaker

The Practice of Cluster Analysis
Jon Kettenring, Retired

2:45pm - 3:15pm Coffee Break
3:15pm – 4:45pm

Special Session 2: Industry/Government Applications

Online Learning by Projecting: From Theory to Large Scale Web-spam Filtering
Yoram Singer (Google Inc.)

Density Estimation with Mercer Kernels
William Macready (NASA Ames Research Center)

Track: SVM and Classification
Chair: Ramasamy Uthurusamy

Exploiting Geometry for Support Vector Machine Indexing
Authors: Navneet Panda, Edward Chang

Parallel Computation of RBF Kernels for Support Vector Classifiers
Authors: Shibin Qiu, Terran Lane

Loadstar: A Load Shedding Scheme for Classifying Data Streams
Authors: Yun Chi, Philip Yu, Haixun Wang, Richard Muntz

Track: Complex Data Types: Text, Images, and Sequences
Chair: Michael W. Berry

Topic-driven Clustering for Document Datasets
Authors: Ying Zhao, George Karypis

Variational Learning for Noisy-OR Component Analysis
Authors: Tomas Singliar, Milos Hauskrecht

Summarizing Sequential Data with Closed Partial Orders
Authors: Gemma Casas-Garriga

4:45pm - 5:00pm Intermission
5:00pm – 6:30pm

Special Session 2 (Continued)

Automated Text Analysis of Unstructured Text in NASA's Aviation Safety Reporting System
Thomas A. Ferryman, Battelle Pacific Northwest National Laboratory

Toward Automated Diagnosis and Forecasting of Performance Problems in Enterprise IT Infrastructures: A Pattern Recognition Approach
Moises Goldszmidt (Hewlett-Packard Labs)

Track: Statistics in Data Mining
Chair: Joydeep Ghosh

SUMSRM: A New Statistic for the Structural Break Detection in Time Series
Authors: Kwok Pan Pang

Markov Models for Identification of Significant Episodes
Authors: Robert Gwadera, Mikhail Atallah, Wojciech Szpankowski

Efficient Mining of Maximal Sequential Patterns Using Multiple Samples
Authors: Congnan Luo, Soon Chung

Track: Scientific Data Mining
Chair: Srinivasan Parthasarathy

Gaussian Processes for Active Data Mining of Spatial Aggregates
Authors: Naren Ramakrishnan, Chris Bailey-Kellogg, Satish Tadepalli, Varun Pandey

Correlation Clustering for Learning Mixtures of Canonical Correlation Models
Authors: Xiaoli Fern, Carla Brodley, Mark Friedl

On Periodicity Detection and Structural Periodic Similarity
Authors: Michail Vlachos, Philip Yu, Vittorio Castelli

Saturday, April 23, 2005

7:00am – 8:15am Continental Breakfast
8:30am – 10:00am Workshops Begin
10:00am – 10:30am Coffee Break
10:30am – 12:00pm Sessions Resume
12:00pm – 1:45pm Lunch
1:45pm – 2:45pm Sessions Resume
2:45pm – 3:15pm Coffee Break
3:15pm – 4:30pm Sessions Resume
4:30pm Conference Adjourns

Poster Papers

Cross Table Cubing: Mining Iceberg Cubes from Data Warehouses
Authors: Jian Pei, Moonjung Cho, David Cheung

Decision Tree Induction in High Dimensional, Hierarchically Distributed Databases
Authors: Amir Bar-Or, Ran Wolff, Assaf Schuster, Daniel Keren

Slope One Predictors for Online Rating-Based Collaborative Filtering
Authors: Daniel Lemire, Anna Maclachlan

Sparse Fisher Discriminant Analysis for Computer Aided Detection
Authors: Mehmet Dundar, Glenn Fung, Jinbo Bi, Sathyakama Sandilya, Bharat Rao

Expanding the Training Data Space Using Emerging Patterns and Genetic Methods
Authors: Hamad Alhammady, Kotagiri Ramamohanarao

Making Data Mining Models Useful to Model Non-paying Customers of Exchange Carriers
Authors: Wei Fan, Janek Mathuria, Chang-tien Lu

Matrix Condition Number Prediction with SVM Regression and Feature Selection
Authors: Shuting Xu, Jun Zhang

Cluster Validity Analysis of Alternative Results from Multi-Objective Optimization
Authors: Reda Alhajj

ClosedPROWL: Efficient Mining of Closed Frequent Continuities by Projected Window List Technology
Authors: Kuo-Yu Huang

Three Myths about Dynamic Time Warping Data Mining
Authors: Chotirat Ann Ratanamahatana, Eamonn Keogh

PCA and Kernel PCA using Polynomial Filtering: A Case Study on Face Recognition
Authors: Effrosyni Kokiopoulou, Yousef Saad

Mining Top-K Itemsets over a Sliding Window Based on Zipfian Distribution
Authors: Raymond Chi-Wing Wong, Ada Wai-Chee Fu

Hierarchical Document Classification Using Automatically Generated Hierarchy
Authors: Tao Li

On Clustering Binary Data
Authors: Tao Li

Time-series Bitmaps: a Practical Visualization Tool for Working with Large Time Series Databases
Authors: Nitin Kumar, Venkata Nishanth Lolla, Eamonn Keogh, Stefano Lonardi, Chotirat Ann Ratanamahatana

Pushing Feature Selection Ahead Of Join
Authors: Rong She, Ke Wang, Yabo Xu, Philip Yu

Discarding Insignificant Rules during Impact Rule Discovery in Large, Dense Databases
Authors: Shiying Huang, Geoffrey Webb

SPID4.7: Discretization Using Successive Pseudo Deletion at Maximum Information Gain Boundary Points
Authors: Himika Biswas, Somnath Pal

Iterative Mining for Rules with Constrained Antecedents
Authors: Zheng Sun, Philip Yu, Xiang-Yang Li

Influence in Ratings-Based Recommender Systems: An Algorithm-Independent Approach
Authors: Al Rashid, George Karypis, John Riedl

Mining Unconnected Patterns in Workflows
Authors: Gianluigi Greco, Antonella Guzzo, Giuseppe Manco, Domenico Saccà

The Best Nurturers in Computer Science Research
Authors: Bharath Kumar Mohan

Knowledge Discovery from Heterogeneous Dynamic Systems using Change-Point Correlations
Authors: Tsuyoshi Ide, Keisuke Inoue

Building Decision Trees on Records Linked through Key References
Authors: Ke Wang, Yabo Xu, Philip Yu, Rong She

Efficient Allocation of Marketing Resources using Dynamic Programming
Authors: Giuliano Tirenni, Abderrahim Labbi, Andre Elisseeff, Cesar Berrospi

Near-Neighbor Search in Pattern Distance Spaces
Authors: Haixun Wang, Chang-Shing Perng, Philip Yu

An Algorithm for Well Structured Subspace Clusters
Authors: Haiyun Bian, Raj Bhatnagar

CBS: A New Classification Method by Using Sequential Patterns
Authors: Vincent Shin-Mu Tseng, Chao-Hui Lee

SeqIndex: Indexing Sequences by Sequential Pattern Analysis
Authors: Hong Cheng, Xifeng Yan, Jiawei Han

On the Equivalence of Nonnegative Matrix Factorization and Spectral Clustering
Authors: Chris Ding, Xiaofeng He

Kronecker Factorization for Speeding up Kernel Machines
Authors: Gang Wu, Zhihua Zhang, Edward Chang

Symmetric Statistical Translation Models for Automatic Image Annotation
Authors: Feng Kang, Rong Jin

Correcting Sampling Bias in Structural Genomics through Iterative Selection of Underrepresented Targets
Authors: Kang Peng, Slobodan Vucetic, Zoran Obradovic

Statistical Models for Unequally Spaced Time Series
Authors: Alina Beygelzimer, Emre Erdogan, Sheng Ma, Irina Rish

CLSI: A Flexible Approximation Scheme from Clustered Term-Document Matrices
Authors: Efstratios Gallopoulos, Dimitrios Zeimpekis

WFIM: Weighted Frequent Itemset Mining with a Weight Range and a Minimum Weight
Authors: Unil Yun, John Leggett

Model-based Clustering With Probabilistic Constraints
Authors: Martin Hiu Chung Law, Alexander Topchy, Anil K. Jain

 


Last Edited: 3/13/05