Proceedings of the Tenth SIAM International Conference on Data Mining
Message from the Conference Co-Chairs and Acknowledgments
Welcome from the Conference Chairs
Welcome to SDM 2010 – The Tenth SIAM International Conference on Data Mining. SDM 2010 continues a series of conferences focusing on the theory and practice of data mining as applied to data sets in science, engineering, biomedicine, and the social sciences, among others. Following the tradition of past conferences, the main technical program is accompanied by a number of specialized workshops, minisymposia, tutorials, a plenary panel, and the doctoral forum.
The keynote talks form the centerpiece of SDM 2010: Stephen Muggleton from Imperial College London, UK; Grace Wahba from the University of Wisconsin; Phillip Gibbons from Intel Research; and Vipin Kumar from the University of Minnesota will share their experience and vision with us. The conference also has a diverse program; apart from our strong technical program of contributed papers, there is a rich set of integrated tutorials on a wide range of topics: outlier detection, mining sparse representations, ranking methods in machine learning, and supervised and unsupervised ensemble methods. The main program is further complemented by a group of focused workshops on current and emerging topics such as data mining for sustainable development, high-performance analytics, text mining, and data mining for a smarter infrastructure; an invited workshop on clustering theory and applications; and a minisymposium on natural language–based data mining. Building on the success of last year's doctoral forum, SDM 2010 continues this popular event, which will be held along with the plenary panel on the second day of the conference.
We would like to thank the entire organizing committee for the excellent job they have done putting together a strong technical program. Special thanks go to the program co-chairs, Bart Goethals and Jian Pei. They recruited the area chairs and the program committee, guided the submitted papers through a rigorous and high-standard review process, worked with the reviewers to find the highest quality submissions, and mediated among them when differences of opinion arose. We also thank Amol Ghoting for serving as publicity chair; Hui Xiong, who worked with us and SIAM to create these proceedings; Ruoming Jin and Carlotta Domeniconi, who did a great job attracting corporate and government sponsorship to support the event; Jennifer Dy for putting together a strong tutorial program; Charu Aggarwal and Marina Meila for identifying and attracting a strong set of workshops to complement the main conference program; and Mikhail Belkin, who served as the local arrangements chair. We also thank the tutorial speakers and workshop organizers for their hard work to bring us these exciting programs. The steering committee members (Chid Apte, Arnold Goodman, Robert Grossman, Jiawei Han, Anil K. Jain, Vipin Kumar, David Skillicorn, Padhraic Smyth, and Jeffrey D. Ullman) made invaluable contributions in setting the general direction of the conference.
We would also like to thank our sponsors, Google, IBM, the National Science Foundation, and the Ohio State University, for their support of the conference, particularly the travel support for student authors and the doctoral forum. Our special thanks go to the SIAM staff members, specifically Nicole Erle, Nancy Griscom, and Linda Thiel, who provided critical support in overseeing all the logistics and making the smooth operation of the entire conference possible. The conference is co-sponsored by the American Statistical Association, continuing our effort to seek closer collaboration between our two communities.
Finally, we thank the authors and the participants, who are the primary reason for the success of the conference. We trust you will enjoy our time together at SDM 2010 in Columbus.
Srinivasan Parthasarathy and Bing Liu, Conference Co-Chairs
Chandrika Kamath, Steering Committee Chair
Message from the Program Committee Chairs
Welcome to the Tenth SIAM International Conference on Data Mining (SDM 2010). This year's conference continues the great success of the SDM conference series as one of the leading forums for data mining researchers, practitioners, developers, and users to exchange cutting-edge ideas, techniques, and experience.
SDM 2010 received 344 submissions contributed by authors worldwide, from countries such as Australia, Belgium, Brazil, Canada, China, Denmark, Finland, France, Germany, Hong Kong, Hungary, India, Italy, Japan, Korea, Mexico, New Zealand, Portugal, the Republic of Serbia, Russia, San Marino, Sweden, Singapore, Spain, Taiwan, Thailand, Turkey, the United Kingdom, and the United States. This truly reflected the international character of this conference. Moreover, 198 submissions were claimed as student papers, a strong indication that SDM 2010 has attracted the upcoming generation of data miners.
We did not have any target number of accepted papers in mind when we started the review process. Instead, we advocated both quality and novelty, and we strongly encouraged new ideas and new applications. This year we implemented a new review protocol that divided the review process into two phases. In the first phase, each paper was reviewed by two reviewers. Papers receiving two negative reviews were rejected, and their authors were notified early; 169 submissions fell into this category. The remaining 175 submissions were further examined in the second phase, where a senior PC member took a close look at each paper, coordinated discussion with the reviewers, and made recommendations. After this rigorous review process, we accepted 82 of the 344 submissions, a 23.84% acceptance rate. Several papers were accepted for their creative ideas despite issues that can be further improved, such as presentation and experiment design. We are confident that the authors have taken the review comments into consideration when preparing the final versions.
This year, to promote data mining applications, we encouraged authors to mark their submissions as application papers if they focused on application issues. In total, 89 papers (25.87% of the submissions) fell into this category. We urged reviewers to pay attention to the application nature of those submissions. We are happy that 14 application papers were accepted, which represents an acceptance rate of 15.73%. The lower acceptance rate for application papers reflects the challenges in developing successful data mining applications.
To facilitate the exchange of ideas and in-depth interaction among attendees, and with great help from the conference general chairs and local organizers, we allocated the same amount of oral presentation time to each accepted paper and gave each paper the opportunity to be presented as a poster in a plenary poster session. The creative work of all the authors, the dedicated efforts of our program committee members and external reviewers, and the leadership and expertise of the senior program committee members have resulted in an outstanding set of papers that will surely exert influence and promote excellence in data mining for many years to come.
We would like to take this opportunity to thank all the program committee members and external reviewers for their expert help in the challenging task of reviewing, discussing, and recommending papers. We greatly appreciate the excellent help from the senior program committee members, Charu Aggarwal, Arindam Banerjee, Ian Davidson, Inderjit Dhillon, Carlotta Domeniconi, Wei Fan, Johannes Furnkranz, Joao Gama, Gemma C. Garriga, Ruoming Jin, Eamonn Keogh, Jure Leskovec, Huan Liu, Sameep Mehta, Wagner Meira, Zoran Obradovic, Luc De Raedt, Celine Robardet, Saharon Rosset, Arno Siebes, Jimeng Sun, Evimaria Terzi, Jianyong Wang, Jieping Ye, and Jeffrey X. Yu, who handled the reviewing process with great care and insight.
We are sincerely grateful to all the SIAM staff members who have greatly contributed to this conference with exceptional support to make this process highly enjoyable. We thank editors Bing Liu, Srinivasan Parthasarathy, and Chandrika Kamath for excellent support and timely suggestions on many important issues. Finally, we thank all the authors for their submissions and participation in SDM 2010. No conference would be successful without excellent papers and inspiring presentations.
With the full gamut of activities at SDM 2010, we hope that you all enjoy the conference program, make new friends, and stumble upon new ideas for future success!
Bart Goethals and Jian Pei
SDM 2010 Organizing Committee
Steering Committee Chair
Chandrika Kamath, Lawrence Livermore National Laboratory
Bing Liu, University of Illinois – Chicago
Srinivasan Parthasarathy, The Ohio State University
Bart Goethals, University of Antwerp
Jian Pei, Simon Fraser University
Marina Meila, University of Washington
Charu Aggarwal, IBM Research
Jennifer Dy, Northeastern University
Eamonn Keogh, University of California – Riverside
Amol Ghoting, IBM Research
Carlotta Domeniconi, George Mason University
Ruoming Jin, Kent State University
Mikhail Belkin, The Ohio State University
Hui Xiong, Rutgers University
Senior Program Committee Members
Gemma C. Garriga
Luc De Raedt
Jeffrey X. Yu
Program Committee Members
Tijl De Bie
Jeroen De Knijf
Prakash Mandayam Comare
Syed Faraz Mahmood