Sampling and Summarization for Social Networks
Shou-De Lin, National Taiwan University, Taiwan
Mi-Yen Yeh, Institute of Information Science, Academia Sinica, Taiwan
Cheng-Te Li, National Taiwan University, Taiwan
In current social network services, massive amounts of network data are created and have become valuable sources for purposes such as sociological study and marketing analysis. However, the scale of real-world social networks (which usually contain millions of nodes and edges) poses great challenges for information processing and analysis. It is crucial to devise effective and efficient approaches to handle cases where only incomplete information can be obtained, or where the network is too large to be handled. Generally, there are two strategies for tackling large-scale networks: sampling and summarization. In social network sampling, it is assumed that the full network is unseen or impossible to obtain. Techniques are therefore needed to sample a sub-network that preserves specific properties of the original network. In social network summarization, the entire network is known in advance, but is usually too big for humans to visualize and for machines to process efficiently. In this sense, graph summarization aims at condensing networks as much as possible without losing too much information. In this tutorial, we will introduce state-of-the-art solutions for social network sampling and summarization, and highlight the research challenges and unsolved issues.
Shou-De Lin holds a BS in EE from National Taiwan University, an MS-EE from the University of Michigan, and an MS in Computational Linguistics and a PhD in Computer Science, both from the University of Southern California. In 2007, he joined the CSIE Department of National Taiwan University as an assistant professor. He leads the Machine Discovery and Social Network Mining Lab at NTU. Before joining NTU, he was a post-doctoral research fellow at the Los Alamos National Lab. Prof. Lin's research covers knowledge discovery and data mining, social network analysis, natural language processing, and machine learning. His international recognition includes the best paper award at the IEEE Web Intelligence conference 2003, a Google Research Award in 2007, a Microsoft Research Award in 2008, the merit paper award at TAAI 2010, the US Air Force AOARD Research Award 2011, and the best paper award at ASONAM 2011. He is one of the all-time winners in ACM KDD Cup, leading or co-leading the NTU teams to win championships in 2008 (co-champion with IBM Research), 2010 (student team champion and overall team champion), 2011 (champions in both tracks), and 2012 (champion in Track 2), and to rank 3rd in 2009. In the past 4 years, he has published more than 40 papers in the area of social network analysis and mining in top journals and conferences such as TKDE, SNAM, KDD, SIGIR, WWW, ACL, and MM.
Mi-Yen Yeh received her B.S. and Ph.D. degrees from the Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan, in 2002 and 2009, respectively. She is now an Assistant Research Fellow at the Institute of Information Science (and the Research Center for IT Innovation under joint appointment), Academia Sinica, Taipei, Taiwan. Her research interests include databases and data mining. She received the Exploration Research Award of the Pan Wen Yuan Foundation in 2011. She is a member of IEEE and ACM.
Cheng-Te Li is now a Ph.D. candidate in the Graduate Institute of Networking and Multimedia at National Taiwan University, Taipei, Taiwan. His research interests include social network mining and social media analytics. His international recognition includes the Facebook Fellowship Finalist Award 2012, the ACM KDD Cup 2012 First Prize, the ASONAM 2011 Best Paper Award, and the Microsoft Research Asia Fellowship 2010.