2009 Data Mining Proceedings: Table of Contents

Proceedings of the Ninth SIAM International Conference on Data Mining

Each link below opens a PDF of the paper as submitted. Papers are listed in program order. Each PDF file name consists of the proceedings code (DM plus the two-digit year, 09), the paper's order of appearance (e.g. 001), and the first author's last name and first initial.
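The naming convention described above can be sketched in a few lines of Python. This is only an illustration: the function name `proceedings_pdf_name`, the underscore separator, the lowercase ASCII folding of accented names, and the `.pdf` layout are all assumptions. The description names only the three components and their order.

```python
import re
import unicodedata


def proceedings_pdf_name(order: int, last_name: str, first_name: str,
                         prefix: str = "dm09", sep: str = "_") -> str:
    """Build <prefix><sep><order><sep><last name><first initial>.pdf.

    Hypothetical helper: separator, casing, and accent handling are
    assumptions, not a documented SIAM format.
    """
    def fold(text: str) -> str:
        # Drop accents (e.g. "Böhlen" -> "Bohlen"), keep letters only, lowercase.
        ascii_text = (unicodedata.normalize("NFKD", text)
                      .encode("ascii", "ignore")
                      .decode("ascii"))
        return re.sub(r"[^A-Za-z]", "", ascii_text).lower()

    # Order of appearance is zero-padded to three digits, as in "001".
    return f"{prefix}{sep}{order:03d}{sep}{fold(last_name)}{fold(first_name)[:1]}.pdf"


if __name__ == "__main__":
    # First paper in Session S1, first author Xin Jin.
    print(proceedings_pdf_name(1, "Jin", "Xin"))  # dm09_001_jinx.pdf
```

Under these assumptions, the second paper (first author Andrej Taliun) would map to `dm09_002_taliuna.pdf`. The actual file names behind the links may differ in separator or casing.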

Preface, Message from the Conference Co-Chair, Acknowledgments

Session S1: Clustering

2          GAD: General Activity Detection for Fast Clustering on Large Data
            Xin Jin, Sangkyum Kim, Jiawei Han, Liangliang Cao, and Zhijun Yin

14        CORE: Nonparametric Clustering of Large Numeric Databases
            Andrej Taliun, Michael H. Böhlen, and Arturas Mazeika

26        Constraint-Based Subspace Clustering
            Elisa Fromont, Adriana Prado, and Céline Robardet

38        Integrated KL (K-means – Laplacian) Clustering: A New Clustering Approach by Combining Attribute Data and Pairwise Relations
            Fei Wang, Chris Ding, and Tao Li

49        Hybrid Clustering of Text Mining and Bibliometrics Applied to Journal Sets
            Xinhai Liu, Shi Yu, Yves Moreau, Bart De Moor, Wolfgang Glänzel, and Frizo Janssens

Session S2: Time Series

61        Event Discovery in Time Series
            Dan Preston, Pavlos Protopapas, and Carla Brodley

73        FuncICA for Time Series Pattern Discovery
            Nishant Mehta and Alexander Gray

85        Autocannibalistic and Anyspace Indexing Algorithms with Application to Sensor Data Mining
            Lexiang Ye, Xiaoyue Wang, Eamonn Keogh, and Agenor Mafra-Neto

97        Proximity-Based Anomaly Detection Using Sparse Structure Learning
            Tsuyoshi Idé, Aurelie C. Lozano, Naoki Abe, and Yan Liu

109      Optimal Distance Bounds on Time-Series Data
            Michail Vlachos, Suleyman S. Kozat, and Philip S. Yu

Session S3: Statistical Methods and Applications

121      Application of Bayesian Partition Models in Warranty Data Analysis
            Markus Mueller, Christoph Schlieder, and Axel Blumenstock

133      Learning Random-Walk Kernels for Protein Remote Homology Identification and Motif Discovery
            Renqiang Min, Rui Kuang, Anthony Bonner, and Zhaolei Zhang

145      Outlier Detection with Globally Optimal Exemplar-Based GMM
            Xingwei Yang, Longin Jan Latecki, and Dragoljub Pokrajac

155      Prior-Free Rare Category Detection
            Jingrui He and Jaime Carbonell

164      A Family of Large Margin Linear Classifiers and Its Application in Dynamic Environments
            Jianqiang Shen and Thomas G. Dietterich

Session S4: Unsupervised Learning and Clustering

173      DensEst: Density Estimation for Data Mining in High Dimensional Spaces
            Emmanuel Müller, Ira Assent, Ralph Krieger, Stephan Günnemann, and Thomas Seidl

185      A Framework for Exploring Categorical Data
            Varun Chandola, Shyam Boriah, and Vipin Kumar

197      Discovering Substantial Distinctions among Incremental Bi-Clusters
            Faris Alqadah and Raj Bhatnagar

209      Bayesian Cluster Ensembles
            Hongjun Wang, Hanhuai Shan, and Arindam Banerjee

221      Agglomerative Mean-Shift Clustering via Query Set Compression
            Xiao-Tong Yuan, Bao-Gang Hu, and Ran He

Session S5: Data Stream Mining

233      Adaptive Concept Drift Detection
            Anton Dries and Ulrich Rückert

245      Scalable Distributed Change Detection from Astronomy Data Streams Using Local, Asynchronous Eigen Monitoring Algorithms       
            Kamalika Das, Kanishka Bhaduri, Sugandha Arora, Wesley Griffin, Kirk Borne, Chris Giannella, and Hillol Kargupta

257      Positive Unlabeled Learning for Data Stream Classification
            Xiao-Li Li, Philip S. Yu, Bing Liu, and See-Kiong Ng

269      Time-Decayed Correlated Aggregates over Data Streams
            Graham Cormode, Srikanta Tirthapura, and Bojian Xu

281      Multi-Modal Hierarchical Dirichlet Process Model for Predicting Image Annotation and Image-Object Label Correspondence
            Oksana Yakhnenko and Vasant Honavar

Poster Spotlights

295      A Bayesian Approach to Graph Regression with Relevant Subgraph Selection
            Silvia Chiappa, Hiroto Saigo, and Koji Tsuda

305      A Hybrid Data Mining Metaheuristic for the p-Median Problem
            Alexandre Plastino, Erick R. Fonseca, Richard Fuchshuber, Simone de L. Martins, Alex A. Freitas, Martino Luis, and Said Salhi

317      A New Constraint for Mining Sets in Sequences
            Boris Cule, Bart Goethals, and Céline Robardet

329      A Re-evaluation of the Over-Searching Phenomenon in Inductive Rule Learning
            Frederik Janssen and Johannes Fürnkranz

341      A Semi-Supervised Framework for Feature Mapping and Multiclass Classification
            Bo Chen, Wai Lam, Ivor Tsang, and Tak-Lam Wong

353      Aligned Graph Classification with Regularized Logistic Regression
            Brian Quanz and Jun Huan

365      An Entity Based Model for Coreference Resolution
            Michael Wick, Aron Culotta, Khashayar Rohanimanesh, and Andrew McCallum

377      Analyses for Service Interaction Networks with Applications to Service Delivery
            S. Kameshwaran, Sameep Mehta, Vinayaka Pandit, Gyana Parija, Sudhanshu Singh, and N. Viswanadham

389      Change-Point Detection in Time-Series Data by Direct Density-Ratio Estimation
            Yoshinobu Kawahara and Masashi Sugiyama

401      Context Aware Trace Clustering: Towards Improving Process Mining Results
            R. P. Jagadeesh Chandra Bose and Wil M. P. van der Aalst

413      Detection and Characterization of Anomalies in Multivariate Time Series
            Haibin Cheng, Pang-Ning Tan, Christopher Potter, and Steven Klooster

425      Discovery of Geospatial Discriminating Patterns from Remote Sensing Datasets
            Wei Ding, Tomasz Stepinski, and Josue Salazar

437      Diversity-Based Weighting Schemes for Clustering Ensembles
            Francesco Gullo, Andrea Tagarelli, and Sergio Greco

449      Divide and Conquer Strategies for Effective Information Retrieval
            Jie Chen and Yousef Saad

461      Speeding Up Secure Computations via Embedded Caching
            K. Zhai, W. K. Ng, A. R. Herianto, and S. Han

473      Exact Discovery of Time Series Motifs
            Abdullah Mueen, Eamonn Keogh, Qiang Zhu, Sydney Cash, and Brandon Westover

485      Exploiting Semantic Constraints for Estimating Supersenses with CRFs
            Gerhard Paaß and Frank Reichartz

497      Feature Weighted SVMs Using Receiver Operating Characteristics
            Shaoyi Zhang, M. Maruf Hossain, Md. Rafiul Hassan, James Bailey, and Kotagiri Ramamohanarao

509      FEDRA: A Fast and Efficient Dimensionality Reduction Algorithm
            Panagis Magdalinos, Christos Doulkeridis, and Michalis Vazirgiannis

521      Finding Representative Association Rules from Large Rule Collections
            Warren L. Davis IV, Peter Schwarz, and Evimaria Terzi

533      FutureRank: Ranking Scientific Articles by Predicting their Future PageRank
            Hassan Sayyadi and Lise Getoor

545      Highlighting Diverse Concepts in Documents
            Kun Liu, Evimaria Terzi, and Tyrone Grandison

557      Identifying Information-Rich Subspace Trends in High-Dimensional Data
            Snehal Pokharkar and Chandan K. Reddy

569      Low-Entropy Set Selection
            Hannes Heikinheimo, Jilles Vreeken, Arno Siebes, and Heikki Mannila

581      Measuring Discrimination in Socially-Sensitive Decision Records
            Dino Pedreschi, Salvatore Ruggieri, and Franco Turini

593      Mining Cohesive Patterns from Graphs with Feature Vectors
            Flavia Moser, Recep Colak, Arash Rafiey, and Martin Ester

605      Mining Complex Spatio-Temporal Sequence Patterns
            Florian Verhein

617      Mining for Surprise Events Within Text Streams
            Paul Whitney, Dave Engel, and Nick Cramer

628      Multi-field Correlated Topic Modeling
            Konstantin Salomatin, Yiming Yang, and Abhimanyu Lad

638      Multiple Kernel Clustering
            Bin Zhao, James T. Kwok, and Changshui Zhang

650      MUSK: Uniform Sampling of k Maximal Patterns
            Mohammad Al Hasan and Mohammed Zaki

662      Noise Robust Classification Based on Spread Spectrum
            Joern David

673      Non-negative Matrix Factorization, Convexity and Isometry
            Nikolaos Vasiloglou, Alexander G. Gray, and David V. Anderson

685      Non-parametric Information-Theoretic Measures of One-Dimensional Distribution Functions from Continuous Time Series
            Paolo D’Alberto and Ali Dasdan

697      On Maximum Coverage in the Streaming Model & Application to Multi-topic Blog-Watch
            Barna Saha and Lise Getoor

709      On Randomness Measures for Social Networks
            Xiaowei Ying and Xintao Wu

721      On Segment-Based Stream Modeling and Its Applications
            Charu C. Aggarwal

733      On the Comparison of Relative Clustering Validity Criteria
            Lucas Vendramin, Ricardo J. G. B. Campello, and Eduardo R. Hruschka

745      Parallel Pairwise Clustering
            Elad Yom-Tov and Noam Slonim

756      PICC Counting: Who Needs Joins When You Can Propagate Efficiently?
            Jong Wook Kim and K. Selçuk Candan

768      Providing Privacy through Plausibly Deniable Search
            Mummoorthy Murugesan and Chris Clifton

780      Randomization Techniques for Graphs
            Sami Hanhijärvi, Gemma C. Garriga, and Kai Puolamäki

792      Semi-supervised Learning by Sparse Representation
            Shuicheng Yan and Huan Wang

802      ShatterPlots: Fast Tools for Mining Large Graphs
            Ana Paula Appel, Deepayan Chakrabarti, Christos Faloutsos, Ravi Kumar, Jure Leskovec, and Andrew Tomkins

814      Spatially Cost-Sensitive Active Learning
            Alexander Liu, Goo Jun, and Joydeep Ghosh

826      Structure and Dynamics of Research Collaboration in Computer Science
            Christian Bird, Earl Barr, and Andre Nash

838      Text Categorization with All Substring Features
            Daisuke Okanohara and Jun’ichi Tsujii

847      The Set Classification Problem and Solution Methods
            Xia Ning and George Karypis

859      Topic Evolution in a Stream of Documents
            André Gohr, Alexander Hinneburg, René Schult, and Myra Spiliopoulou

871      Tracking User Mobility to Detect Suspicious Behavior
            Gaurav Tandon and Philip K. Chan

Session S6: Supervised Learning

884      Toward Optimal Ordering of Prediction Tasks
            Abhimanyu Lad, Yiming Yang, Rayid Ghani, and Bryan Kisiel

894      Hierarchical Linear Discriminant Analysis for Beamforming
            Jaegul Choo, Barry L. Drake, and Haesun Park

906      Twin Vector Machines for Online Learning on a Budget
            Zhuang Wang and Slobodan Vucetic

918      The Metric Dilemma: Competence-Conscious Associative Classification
            Adriano Veloso, Mohammed Zaki, Wagner Meira Jr., and Marcos Gonçalves

930      AMORI: A Metric-Based One Rule Inducer
            Niklas Lavesson and Paul Davidsson

Session S7: Privacy and Social Networks

942      Identifying Unsafe Routes for Network-Based Trajectory Privacy
            Aris Gkoulalas-Divanis, Vassilios S. Verykios, and Mohamed F. Mokbel

954      Privacy Preservation in Social Networks with Sensitive Edge Weights
            Lian Liu, Jie Wang, Jinze Liu, and Jun Zhang

966      Graph Generation with Prescribed Feature Constraints
            Xiaowei Ying and Xintao Wu

978      Detecting Communities in Social Networks Using Max-Min Modularity
            Jiyang Chen, Osmar R. Zaïane, and Randy Goebel

990      A Bayesian Approach Toward Finding Communities and Their Evolutions in Dynamic Social Networks
            Tianbao Yang, Yun Chi, Shenghuo Zhu, Yihong Gong, and Rong Jin

Session S8: Relational Mining and High Performance Learning

1002    Efficient Discovery of Interesting Patterns Based on Strong Closedness
            Mario Boley, Tamás Horváth, and Stefan Wrobel

1014    Efficient Computation of Partial-Support for Mining Interesting Itemsets
            Ardian Kristanto Poernomo and Vivekanand Gopalkrishnan

1026    Grammar Mining
            Siegfried Nijssen and Luc De Raedt

1038    Top-k Correlative Graph Mining
            Yiping Ke, James Cheng, and Jeffrey Xu Yu

1050    High Performance Parallel/Distributed Biclustering Using Barycenter Heuristic
            Arifa Nisar, Waseem Ahmad, Wei-keng Liao, and Alok Choudhary

Session S9: Mining Graphs and Semi Structured Data

1063    MultiVis: Content-Based Social Network Exploration through Multi-way Visual Analysis
            Jimeng Sun, Spiros Papadimitriou, Ching-Yung Lin, Nan Cao, Shixia Liu, and Weihong Qian

1075    Near-optimal Supervised Feature Selection among Frequent Subgraphs
            Marisa Thoma, Hong Cheng, Arthur Gretton, Jiawei Han, Hans-Peter Kriegel, Alex Smola, Le Song, Philip S. Yu, Xifeng Yan, and Karsten Borgwardt

1087    Polynomial-Delay and Polynomial-Space Algorithms for Mining Closed Sequences, Graphs, and Pictures in Accessible Set Systems
            Hiroki Arimura and Takeaki Uno

1099    Link Propagation: A Fast Semi-supervised Learning Algorithm for Link Prediction
            Hisashi Kashima, Tsuyoshi Kato, Yoshihiro Yamanishi, Masashi Sugiyama, and Koji Tsuda

1111    Understanding Importance of Collaborations in Co-authorship Networks: A Supportiveness Analysis Approach
            Yi Han, Bin Zhou, Jian Pei, and Yan Jia

Session S10: Text Mining and Data Reduction

1123    Topic Cube: Topic Modeling for OLAP on Multidimensional Text Databases
            Duo Zhang, Chengxiang Zhai, and Jiawei Han

1135    Local Relevance Weighted Maximum Margin Criterion for Text Classification
            Quanquan Gu and Jie Zhou

1147    Multi-topic Based Query-Oriented Summarization
            Jie Tang, Limin Yao, and Dewei Chen

1159    Straightforward Feature Selection for Scalable Latent Semantic Indexing
            Jun Yan, Shuicheng Yan, Ning Liu, and Zheng Chen

1171    Parallel Large Scale Feature Selection for Logistic Regression
            Sameer Singh, Jeremy Kubica, Scott Larsen, and Daria Sorokina

Session S11: Mining Spatio-Temporal Data and Efficient Learning

1183    Travel-Time Prediction Using Gaussian Process Regression: A Trajectory-Based Approach
            Tsuyoshi Idé and Sei Kato

1195    Discretized Spatio-Temporal Scan Window
            Seyed H. Mohammadi, Vandana P. Janeja, and Aryya Gangopadhyay

1207    Finding Links and Initiators: A Graph-Reconstruction Problem
            Heikki Mannila and Evimaria Terzi

1218    Efficient Multiplicative Updates for Support Vector Machines
            Vamsi K. Potluru, Sergey M. Plis, Morten Mørup, Vincent D. Calhoun, and Terran Lane

1230    Efficient Active Learning with Boosting
            Zheng Wang, Yangqiu Song, and Changshui Zhang
