TY - GEN
T1 - Enhancing Parameter-Free Frank Wolfe with an Extra Subproblem
AU - Li, Bingcong
AU - Wang, Lingda
AU - Giannakis, Georgios B.
AU - Zhao, Zhizhen
N1 - Publisher Copyright:
Copyright © 2021, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
PY - 2021
Y1 - 2021
N2 - Aiming at convex optimization under structural constraints, this work introduces and analyzes a variant of the Frank Wolfe (FW) algorithm termed ExtraFW. The distinct feature of ExtraFW is the pair of gradients leveraged per iteration, thanks to which the decision variable is updated in a prediction-correction (PC) format. Relying on no problem-dependent parameters in the step sizes, the convergence rate of ExtraFW for general convex problems is shown to be O(1/k), which is optimal in the sense of matching the lower bound on the number of solved FW subproblems. However, the merit of ExtraFW is its faster rate O(1/k^2) on a class of machine learning problems. Compared with other parameter-free FW variants that have faster rates on the same problems, ExtraFW has improved rates and fine-grained analysis thanks to its PC update. Numerical tests on binary classification with different sparsity-promoting constraints demonstrate that the empirical performance of ExtraFW is significantly better than that of FW, and even faster than Nesterov's accelerated gradient on certain datasets. For matrix completion, ExtraFW enjoys a smaller optimality gap and lower rank than FW.
AB - Aiming at convex optimization under structural constraints, this work introduces and analyzes a variant of the Frank Wolfe (FW) algorithm termed ExtraFW. The distinct feature of ExtraFW is the pair of gradients leveraged per iteration, thanks to which the decision variable is updated in a prediction-correction (PC) format. Relying on no problem-dependent parameters in the step sizes, the convergence rate of ExtraFW for general convex problems is shown to be O(1/k), which is optimal in the sense of matching the lower bound on the number of solved FW subproblems. However, the merit of ExtraFW is its faster rate O(1/k^2) on a class of machine learning problems. Compared with other parameter-free FW variants that have faster rates on the same problems, ExtraFW has improved rates and fine-grained analysis thanks to its PC update. Numerical tests on binary classification with different sparsity-promoting constraints demonstrate that the empirical performance of ExtraFW is significantly better than that of FW, and even faster than Nesterov's accelerated gradient on certain datasets. For matrix completion, ExtraFW enjoys a smaller optimality gap and lower rank than FW.
UR - http://www.scopus.com/inward/record.url?scp=85112707587&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85112707587&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85112707587
T3 - 35th AAAI Conference on Artificial Intelligence, AAAI 2021
SP - 8324
EP - 8331
BT - 35th AAAI Conference on Artificial Intelligence, AAAI 2021
PB - Association for the Advancement of Artificial Intelligence
T2 - 35th AAAI Conference on Artificial Intelligence, AAAI 2021
Y2 - 2 February 2021 through 9 February 2021
ER -