TY - GEN
T1 - Adaptive sequential optimization with applications to machine learning
AU - Wilson, Craig
AU - Veeravalli, Venugopal V.
N1 - Funding Information:
This work was supported by the NSF under award CCF 11-11342 through the University of Illinois at Urbana-Champaign.
Publisher Copyright:
© 2016 IEEE.
PY - 2016/5/18
Y1 - 2016/5/18
N2 - A framework is introduced for solving a sequence of slowly changing optimization problems, including those arising in regression and classification applications, using optimization algorithms such as stochastic gradient descent (SGD). The optimization problems change slowly in the sense that their minimizers drift at either a fixed or a bounded rate. A method based on estimates of the change in the minimizers and on properties of the optimization algorithm is introduced for adaptively selecting the number of samples drawn from the distribution underlying each problem. The goal is to ensure that the excess risk, i.e., the expected gap between the loss achieved by the approximate minimizer produced by the optimization algorithm and the loss achieved by the exact minimizer, does not exceed a target level. Experiments with synthetic and real data confirm that this approach performs well.
AB - A framework is introduced for solving a sequence of slowly changing optimization problems, including those arising in regression and classification applications, using optimization algorithms such as stochastic gradient descent (SGD). The optimization problems change slowly in the sense that their minimizers drift at either a fixed or a bounded rate. A method based on estimates of the change in the minimizers and on properties of the optimization algorithm is introduced for adaptively selecting the number of samples drawn from the distribution underlying each problem. The goal is to ensure that the excess risk, i.e., the expected gap between the loss achieved by the approximate minimizer produced by the optimization algorithm and the loss achieved by the exact minimizer, does not exceed a target level. Experiments with synthetic and real data confirm that this approach performs well.
KW - adaptive algorithms
KW - gradient methods
KW - machine learning
KW - stochastic optimization
UR - http://www.scopus.com/inward/record.url?scp=84973364206&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84973364206&partnerID=8YFLogxK
U2 - 10.1109/ICASSP.2016.7472156
DO - 10.1109/ICASSP.2016.7472156
M3 - Conference contribution
AN - SCOPUS:84973364206
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 2642
EP - 2646
BT - 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 41st IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016
Y2 - 20 March 2016 through 25 March 2016
ER -