Stochastic Optimization for Analyzing and Mining Big Data
Tianbao Yang, NEC Laboratories America, Inc., USA
Rong Jin, Michigan State University, USA
Shenghuo Zhu, NEC Laboratories America, Inc., USA
In recent years, the size of data in machine learning applications has grown at an unprecedented rate. To tackle millions or billions of data points, stochastic optimization has emerged as a dominant approach for big data analytics. In this talk, we present recent advances in stochastic optimization for solving fundamental problems arising in big data analytics. In particular, we present efficient stochastic optimization algorithms for solving classification (e.g., support vector machines, logistic regression) and regression (e.g., ridge regression and lasso) problems. This talk consists of four parts. The first part gives an introduction to machine learning and stochastic optimization that motivates the use of stochastic optimization for big data analytics. The second part presents several state-of-the-art stochastic optimization algorithms for solving big data classification and regression problems. The third part presents general strategies of stochastic optimization, including stochastic gradient descent for a variety of objective functions, accelerated stochastic gradient descent for composite optimization, variance-reduced stochastic optimization algorithms, and parallel and distributed optimization algorithms. In the fourth part, we discuss some implementation issues and introduce a practical library of distributed stochastic optimization for solving big data classification and regression problems.
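As a rough illustration of the kind of method the abstract refers to, the following is a minimal sketch of plain stochastic gradient descent applied to L2-regularized logistic regression. It is not taken from the talk or from the library mentioned above: the function name `sgd_logistic_regression`, the parameters `lam`, `eta0`, and `epochs`, the decaying step-size schedule, and the toy data are all assumptions chosen for illustration.

```python
import math
import random


def sgd_logistic_regression(X, y, lam=0.01, epochs=50, eta0=0.5, seed=0):
    """Plain SGD sketch for
        (1/n) * sum_i log(1 + exp(-y_i * <w, x_i>)) + (lam / 2) * ||w||^2,
    with labels y_i in {-1, +1}. All names and defaults are illustrative."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    w = [0.0] * d
    t = 0
    for _ in range(epochs):
        # One pass over the data in random order (sampling without replacement).
        order = list(range(n))
        rng.shuffle(order)
        for i in order:
            t += 1
            # Decaying step size; this schedule is an assumption of the sketch.
            eta = eta0 / (1.0 + eta0 * lam * t)
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            # Clamp the margin so math.exp cannot overflow.
            margin = max(-30.0, min(30.0, margin))
            # Gradient of the logistic loss on example i is g * x_i.
            g = -y[i] / (1.0 + math.exp(margin))
            # Stochastic gradient step on loss plus L2 regularizer.
            w = [wj - eta * (g * xj + lam * wj) for wj, xj in zip(w, X[i])]
    return w


# Toy, linearly separable data; the last feature is a constant bias term.
X_toy = [[1.0, 2.0, 1.0], [2.0, 1.5, 1.0], [-1.0, -2.0, 1.0], [-2.0, -1.0, 1.0]]
y_toy = [1, 1, -1, -1]

w = sgd_logistic_regression(X_toy, y_toy)
predictions = [1 if sum(wj * xj for wj, xj in zip(w, x)) > 0 else -1 for x in X_toy]
print(predictions)
```

Each update touches a single example, so the cost per step does not depend on the number of data points, which is the property that makes this family of methods attractive at the data scales described above.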