SIAM: Program

Wednesday, April 25, 2007

5:00PM – 7:00PM Registration Opens

Thursday, April 26, 2007

7:00AM – 7:30PM Registration

7:00AM – 5:30PM Internet Café

7:30AM – 8:00AM Continental Breakfast

8:00AM – 8:15AM Welcome Remarks

8:15AM – 9:30AM Invited Keynote
Machine Learning and Analyzing Human Brain Activity
Tom M. Mitchell, Carnegie Mellon University
Session Chair: Chid Apte

9:30AM – 10:00AM Coffee Break

10:00AM - 12:00PM Three parallel sessions S1, S2, S3

-----------------------------------------------------------------

S1. Classification (chair: Jaideep Srivastava)

Title: A General Framework for Mining Concept-Drifting Data Streams with Skewed Distributions
Authors: Jing Gao, Wei Fan, Jiawei Han and Philip S. Yu

Title: Fast Counting with AV-Space for Efficient Rule Induction
Authors: Linyan Wang and Aijun An

Title: Maximizing the Area under the ROC Curve with Decision Lists and Rule Sets
Authors: Henrik Bostrom

Title: Maximum Margin Classifiers with Specified False Positive and False Negative Error Rates
Authors: J. Saketha Nath and C. Bhattacharyya

-----------------------------------------------------------------

S2. Theoretical Foundations (chair: Michael Berry)

Title: An Analysis of Logistic Models: Exponential Family Connections and Online Performance
Authors: Arindam Banerjee

Title: Bandits for Taxonomies: A Model-based Approach
Authors: Sandeep Pandey, Deepak Agarwal, Deepayan Chakrabarti and Vanja Josifovski

Title: Boosting Optimal Logical Patterns Using Noisy Data
Authors: Noam Goldberg and Chung-chieh Shan

Title: Constraint-Based Pattern Set Mining
Authors: Luc De Raedt and Albrecht Zimmermann

-----------------------------------------------------------------

S3. Clustering

Title: Adaptive Concept Learning through Clustering and Aggregation of Relational Data
Authors: Hichem Frigui and Cheul Hwang

Title: RCMap: Efficiently Creating High-Quality Euclidean Embeddings
Authors: Arun Qamra and Edward Chang

Title: Active Learning of Constraints for Semi-supervised Text Clustering
Authors: Ruizhang Huang, Wai Lam and Zhigang Zhang

Title: Mining Naturally Smooth Evolution of Clusters from Dynamic Data
Authors: Yi Wang, Shi-Xia Liu, Jianhua Feng, and Lizhu Zhou

-----------------------------------------------------------------

12:00PM - 1:30PM Lunch Break on your own

1:30PM - 2:45PM Invited Keynote
Predictive Learning via Rule Ensembles
Jerome H. Friedman, Stanford University
Session Chair: Vipin Kumar

2:45PM - 3:15PM Coffee Break

3:15PM - 4:45PM Two parallel sessions (S4 and S5) and Invited Session (IS)

-----------------------------------------------------------------

S4: Graphs (chair: Wei Wang)

Title: Clustering by weighted cuts in directed graphs
Authors: Marina Meila and William Pentney

Title: Multi-way Clustering on Relation Graphs
Authors: Arindam Banerjee, Sugato Basu and Srujana Merugu

Title: Fast Multilevel Transduction on Graphs
Authors: Fei Wang and Changshui Zhang

-----------------------------------------------------------------

S5: Applications (chair: Hui Yang)

Title: Harmonium-Based Models for Semantic Video Representation and Classification
Authors: Jun Yang, Yan Liu, Eric Xing and Alexander Hauptmann

Title: Identifying Bundles of Product Options using Mutual Information Clustering
Authors: Claudia Perlich and Saharon Rosset

Title: Lattice based Clustering of Temporal Gene-Expression Matrices
Authors: Yang Huang and Martin Farach-Colton

-----------------------------------------------------------------

IS: Invited Session on Statistical Learning: Joe Verducci (chair)

A Large Margin Method for Semi-supervised Learning
Authors: Xiaotong Shen, Junhui Wang and Wei Pan

Improved Centroids Estimation for the Nearest Shrunken Centroid Classifier
Authors: Sijian Wang and Ji Zhu*

Classification with Reject Option
Authors: Radu Herbei

-----------------------------------------------------------------

4:45PM – 5:00PM Organizational Break

5:00PM – 6:20PM Poster Spotlights (Plenary) Chair: Dan Boley

Robust, Complete, and Efficient Correlation Clustering - Elke Achtert, Christian Böhm, Hans-Peter Kriegel, Peer Kröger and Arthur Zimek
On Anonymization of String Data - Charu Aggarwal and Philip Yu
Load Shedding in Classifying Multi-Source Streaming Data: A Bayes Risk Approach - Yijian Bai, Haixun Wang and Carlo Zaniolo
Topic Models over Text Streams: A Study of Batch and Online Unsupervised Learning - Arindam Banerjee and Sugato Basu
Are Approximation Algorithms for Consensus Clustering Worthwhile? - Michael Bertolacci and Anthony Wirth
Learning from Time-Changing Data with Adaptive Windowing - Albert Bifet and Ricard Gavaldà
WAT: Finding Top-K Discords in Time Series Database - Yingyi Bu, Tat-Wing Leung, Ada Wai-Chee Fu, Eamonn Keogh, Jian Pei and Sam Meshkin
A PAC Bound for Approximate Support Vector Machines - Dongwei Cao and Daniel Boley
Localized Support Vector Machine and Its Efficient Algorithm - Haibin Cheng, Pang-Ning Tan and Rong Jin
Understanding and Utilizing the Hierarchy of Abnormal BGP Events - Dejing Dou, Jun Li, Han Qin, Shiwoong Kim and Sheng Zhong
Distributed Top-K Outlier Detection from Astronomy Catalogs using the DEMAC System - Haimonti Dutta, Chris Giannella, Kirk Borne and Hillol Kargupta
Mining Visual and Textual Data for Constructing a Multi-Modal Thesaurus - Hichem Frigui and Joshua Caudill
HP2PC: Scalable Hierarchically-Distributed Peer-to-Peer Clustering - Khaled Hammouda and Mohamed Kamel
Bursty Feature Representation for Clustering Text Streams - Qi He, Kuiyu Chang, Ee-Peng Lim and Jun Zhang
Flexible Anonymization For Privacy Preserving Data Publishing: A Systematic Search Based Approach - Bijit Hore, Ravi Chandra Jammalamadaka and Sharad Mehrotra
A System for Keyword Search on Textual Streams - Vagelis Hristidis, Oscar Valdivia, Michail Vlachos and Philip S. Yu
Co-Preserving Patterns in Bipartite Partitioning for Topic Identification - Tianming Hu, Hui Xiong and Sam Yuan Sung
Change-Point Detection using Krylov Subspace Learning - Tsuyoshi Ide and Koji Tsuda
Approximating Representations for Large Numerical Databases - Szymon Jaroszewicz and Marcin Korzen
Distance Preserving Dimension Reduction for Manifold Learning - Hyunsoo Kim, Haesun Park and Hongyuan Zha
Stacked Graphical Models for Efficient Inference in Markov Random Fields - Zhenzhen Kou and William W. Cohen
Summarizing Review Scores of "Unequal" Reviewers - Hady W. Lauw, Ee-Peng Lim and Ke Wang
A Better Alternative to Piecewise Linear Time Series Segmentation - Daniel Lemire
Patterns of Cascading Behavior in Large Blog Graphs - Jure Leskovec, Mary McGlohon, Christos Faloutsos, Natalie Glance and Matthew Hurst
PoClustering: Lossless Clustering of Dissimilarity Data - Jinze Liu, Qi Zhang, Wei Wang, Leonard McMillan and Jan Prins
An incremental data-stream sketch using sparse random projections - Aditya Krishna Menon, Gia Vinh Anh Pham, Sanjay Chawla and Anastasios Viglas
Performance of Recommendation Systems in Dynamic Streaming Environments - Olfa Nasraoui, Jeff Cerwinske, Carlos Rojas and Fabio Gonzalez
Scalable Name Disambiguation using Multi-level Graph Partition - Byung-Won On and Dongwon Lee
Dynamic Algorithm for Graph Clustering Using Minimum Cut Tree - Barna Saha and Pabitra Mitra
Rank Aggregation for Similar Items - D. Sculley
Sketching Landscapes of Page Farms - Bin Zhou and Jian Pei
Estimating False Negatives for Classification Problems with Cluster Structure - Gyorgy J. Simon, Vipin Kumar, and Zhi-Li Zhang
Discriminating Subsequence Discovery for Sequence Clustering - Jianyong Wang, Yuzhou Zhang, Lizhu Zhou, George Karypis and Charu Aggarwal
Fast Best-Match Shape Searching in Rotation Invariant Metric Spaces - Dragomir Yankov, Eamonn Keogh, Li Wei, Xiaopeng Xi and Wendy Hodges
HACS: Heuristic Algorithm for Clustering Subsets - Ding Yuan and Nick Street
On Demand Phenotype Ranking through Subspace Clustering - Xiang Zhang, Wei Wang and Jun Huan
Semi-Supervised Dimensionality Reduction - Daoqiang Zhang, Zhi-Hua Zhou and Songcan Chen
Computing Statistical Profiles of Active Sites in Proteins - Chang Zhao, Jalal Mahmud, I.V. Ramakrishnan and Subramanyam Swaminathan
Semi-supervised Feature Selection via Spectral Analysis - Zheng Zhao and Huan Liu

6:30PM – 8:30PM Welcome Reception and Poster Session

Friday, April 27, 2007

7:00AM – 4:00PM Registration

7:00AM – 5:30PM Internet Café

7:30AM – 8:00AM Continental Breakfast

8:00AM – 8:15AM Announcements

8:15AM – 9:30AM Invited Keynote
Deep Computing in Biology: Challenges and Progress
Dr. Ajay Royyuru, IBM Research
Session Chair: David Skillicorn

9:30AM – 10:00AM Break

10:00AM - 12:00PM Three parallel sessions S6, S7, S8

-----------------------------------------------------------------

S6: Privacy and Security

Title: AC-Framework for Privacy-Preserving Collaboration
Authors: Wei Jiang and Chris Clifton

Title: On Privacy-Preservation of Text and Sparse Binary Data with Sketches
Authors: Charu Aggarwal and Philip Yu

Title: Preventing Information Leaks in Email
Authors: Vitor Carvalho and William Cohen

Title: Towards Attack-Resilient Geometric Data Perturbation
Authors: Keke Chen, Gordon Sun, and Ling Liu

-----------------------------------------------------------------

S7: Spatial and Temporal Mining (chair: Sanjay Chawla)

Title: Finding Motifs in Database of Shapes
Authors: Xiaopeng Xi, Eamonn Keogh, Li Wei and Agenor Mafra-Neto

Title: Incremental Spectral Clustering With Application to Monitoring of Evolving Blog Communities
Authors: Huazhong Ning, Wei Xu, Chi Yun, Yihong Gong and Thomas Huang

Title: ROAM: Rule- and Motif-Based Anomaly Detection in Massive Moving Object Data Sets
Authors: Xiaolei Li, Jiawei Han, Sangkyum Kim and Hector Gonzalez

Title: Segmentations with rearrangements
Authors: Aristides Gionis and Evimaria Terzi

-----------------------------------------------------------------

S8: Learning (chair: Hui Xiong)

Title: Efficient Multiclass Boosting Classification with Active Learning
Authors: Jian Huang, Seyda Ertekin, Yang Song, Hongyuan Zha and C. Lee Giles

Title: Kernel-based Detection of Mislabeled Training Examples
Authors: Hamed Valizadegan and Pang-Ning Tan

Title: On Sample Selection Bias and Its Efficient Correction via Model Averaging and Unlabeled Examples
Authors: Wei Fan and Ian Davidson

Title: Probabilistic Joint Feature Selection for Multi-task Learning
Authors: Tao Xiong, Jinbo Bi, Bharat Rao and Vladimir Cherkassky

-----------------------------------------------------------------

12:00PM - 1:30PM Lunch Break on your own

1:30PM - 2:45PM Invited Keynote
The Next Algorithmic and Theoretical Challenges for Search Engines
Corinna Cortes, Google Research
Session Chair: Srinivasan Parthasarathy

2:45PM - 3:15PM Coffee Break

2:55PM-4:55PM Invited CRM Tutorial
Data Analytics for Marketing Decision Support
Presenters: Saharon Rosset (IBM) and Naoki Abe (IBM)

3:15PM - 4:45PM Two parallel sessions (S9 and S10)

-----------------------------------------------------------------

S9: Matrices and Tensors (chair: Arindam Banerjee)

Title: Fast Newton-type Methods for the Least Squares Nonnegative Matrix Approximation Problem
Authors: Dongmin Kim, Suvrit Sra and Inderjit Dhillon

Title: Higher Order Orthogonal Iteration of Tensors (HOOI) and its Relation to PCA and GLRAM
Authors: Bernard Sheehan and Yousef Saad

Title: Less is More: Compact Matrix Decomposition for Large Sparse Graphs
Authors: Jimeng Sun, Yinglian Xie, Hui Zhang and Christos Faloutsos

-----------------------------------------------------------------

S10: Dimensionality (chair: Pang-Ning Tan)

Title: Conical Dimension as an Intrinsic Dimension Estimator and its Applications
Authors: Xin Yang, Sebastien Michea and Hongyuan Zha

Title: Nonlinear Dimensionality Reduction using Approximate Nearest Neighbors
Authors: Erion Plaku and Lydia Kavraki

Title: On Point Sampling Versus Space Sampling for Dimensionality Reduction
Authors: Charu Aggarwal

-----------------------------------------------------------------

4:45PM - 5:00PM Organizational Break

5:00PM - 6:15PM Panel
Data Mining Research: Current Status and Future Opportunities
Moderator: Haym Hirsh, NSF
Panelists:    Ajay Royyuru - IBM Research
                    Jerry Friedman - Stanford University
                    Christos Faloutsos - CMU
                    Mehran Sahami - Google

6:30PM - Special reception and poster session sponsored by the Digital Technology Center (DTC) at the University of Minnesota to showcase data mining research at the University. This is not a SIAM event, but is open to all attendees of the conference.

Saturday, April 28, 2007

7:30AM – 4:00PM Registration

7:30AM – 4:00PM Internet Café

8:00AM-4:30PM Workshop on Text Mining

8:30AM-5:15PM Workshop on Biomedical Informatics

8:45AM-12:00PM Tutorial II
Mining Large Time-evolving Data Using Matrix and Tensor Tools
Presenters: Christos Faloutsos (CMU), Tamara G Kolda (Sandia National Labs), and Jimeng Sun (CMU)

8:45AM-12:00PM Tutorial III
Dimensionality Reduction for Data Mining
Presenters: Lei Yu (Binghamton U), Jieping Ye (Arizona State U), and Huan Liu (Arizona State U)

10:00AM – 10:45AM Coffee Break

12:00PM – 1:30PM Lunch

1:30PM - 3:30PM Tutorial IV
A Statistical Framework for Mining Data Streams
Presenters: Simon Urbanek (AT&T Labs) and Tamraparni Dasu (AT&T Labs)

3:00PM – 3:45PM Coffee Break

End of Conference
