Data Mining Evolved: Trends and Challenges

Usama Fayyad
President & CEO, digiMine, Inc.

Data Mining has received much attention as companies and organizations have started to ask how they can better utilize the huge data stores they built up over the past two decades. While some interesting progress has been achieved over the past few years, especially when it comes to techniques and scalable algorithms, very few organizations have managed to benefit from the technology. This paradoxical situation of having too much data yet not being able to utilize it or mine it arose because of both technical and business challenges. We will cover these challenges, paint a picture of where the data problems are, and then introduce data mining and its role. Key to the evolution of data mining as a practical discipline is the development of solutions that embed data mining algorithms and that overcome some of the major challenges that stand in the way of its application, such as data warehousing, data integration and fusion, and data extraction. While the academic community has focused primarily on developing new algorithms, few people have paid attention to the problem that algorithms today cannot get to data in a format they can use.

We present the challenges and our view of how the technology needs to evolve in the context of particular data mining algorithms and how they can be used with very large databases. This includes considering the challenges of fitting data mining with database systems. Of particular interest are the business challenges of how to make the technology really work in practice. In this coverage, we will span the spectrum from technical to business issues. We shall also cover applications in the business setting, using eBusinesses as an example to illustrate the challenges and contributions of data mining. Finally, there are still many unsolved deeper problems in this field, and hence we conclude by revisiting the technical challenges facing the field.


Dr. Fayyad is co-founder, President and CEO of digiMine, Inc., a privately held company of over 100 employees focused on data mining and business intelligence solutions. Prior to digiMine, Dr. Fayyad founded and led Microsoft Research's Data Mining & Exploration (DMX) Group. At Microsoft he also led the development of data mining components within Microsoft products, including SQL Server 2000. From 1989 to 1995, Dr. Fayyad was at NASA's Jet Propulsion Laboratory (JPL), California Institute of Technology, where he founded and grew a multi-million-dollar advanced research program to develop data mining systems for the analysis of large scientific databases. This work solved some significant scientific problems and earned him numerous awards, including the most distinguished excellence award from Caltech/JPL and a U.S. Government Medal from NASA. Dr. Fayyad has published over 150 technical articles and is co-editor of two books on Data Mining and Knowledge Discovery in Databases.

He is very active in the KDD community and has served as program co-chair of KDD-94 and KDD-95 (the 1st International Conference on Knowledge Discovery and Data Mining) and as general chair of KDD-96 and KDD-99. He is Editor-in-Chief of the scientific technical journal Data Mining and Knowledge Discovery. He is a Director of the ACM Special Interest Group on Knowledge Discovery and Data Mining (SIGKDD) and Editor-in-Chief of its newsletter, SIGKDD Explorations. He serves on several editorial boards, including those of the Communications of the ACM and Artificial Intelligence Magazine. He received his Ph.D. in Computer Science and Engineering in 1991 from The University of Michigan, Ann Arbor. He holds two BSEs in engineering, an M.Sc. in Mathematics, and an MSE in Computer Engineering.
