Big Data Analytics for Healthcare
Jimeng Sun, IBM T. J. Watson Research Center, USA
Chandan Reddy, Wayne State University, USA
Large amounts of heterogeneous medical data have become available in various healthcare organizations (payers, providers, pharmaceutical companies). These data could be an enabling resource for deriving insights that improve care delivery and reduce waste. The enormity and complexity of these datasets present great challenges for analysis and for subsequent application in a practical clinical environment. In this tutorial, we introduce the characteristics of big medical data and the mining challenges that arise in dealing with it. Many of these insights come from the medical informatics community, which is closely related to data mining but focuses on biomedical specifics. We survey related papers from both data mining and medical informatics venues to share with the audience key problems and trends in healthcare analytics research, with applications including clinical text mining, predictive modeling, patient similarity, genetic data analysis, medical data privacy, and medical images.
Jimeng Sun is a research staff member at IBM T. J. Watson Research Center. Dr. Sun received his PhD in Computer Science from Carnegie Mellon University in fall 2007. He studied in the Computer Science Department at Carnegie Mellon University from 2003 to 2007, where his advisor was Prof. Christos Faloutsos. His research focuses on healthcare analytics and informatics, large-scale data mining, graph mining, high-dimensional data mining of time series, matrices, and tensors (data cubes), and visual analytics. Dr. Sun received the ICDM best research paper award in 2007, the SDM best research paper award in 2007, and the KDD Dissertation runner-up award in 2008. For more details, see his personal homepage at http://www.dasfa.net/jimeng .
Chandan Reddy is an Assistant Professor in the Department of Computer Science at Wayne State University. He received his PhD from Cornell University and his MS from Michigan State University. His primary research interests are data mining and machine learning with applications to healthcare informatics, bioinformatics, and social network analysis. His research is currently funded by the National Science Foundation, the Department of Transportation, and the Susan G. Komen for the Cure Foundation. He has published over 40 peer-reviewed articles in leading conferences and journals including IEEE TPAMI, IEEE TKDE, ACM SIGKDD, IEEE ICDM, SIAM DM, and ACM CIKM. He received the Best Application Paper Award at ACM SIGKDD 2010 and was a finalist in the INFORMS Franz Edelman Award Competition in 2011. He is a member of IEEE, ACM, and SIAM.