Peter Stone's Selected Publications

BOME! Bilevel Optimization Made Easy: A Simple First-Order Approach

BOME! Bilevel Optimization Made Easy: A Simple First-Order Approach.
Bo Liu, Mao Ye, Stephen Wright, Peter Stone, and Qiang Liu.
In Conference on Neural Information Processing Systems (NeurIPS 2022), December 2022.

Download

[PDF] 4.2MB  [slides.pdf] 1.6MB  [poster.pdf] 885.6kB

Abstract

Bilevel optimization (BO) is useful for solving a variety of important machine learning problems including but not limited to hyperparameter optimization, meta-learning, continual learning, and reinforcement learning. Conventional BO methods need to differentiate through the low-level optimization process with implicit differentiation, which requires expensive calculations related to the Hessian matrix. There has been a recent quest for first-order methods for BO, but the methods proposed to date tend to be complicated and impractical for large-scale deep learning applications. In this work, we propose a simple first-order BO algorithm that depends only on first-order gradient information, requires no implicit differentiation, and is practical and efficient for large-scale non-convex functions in deep learning. We provide non-asymptotic convergence analysis of the proposed method to stationary points for non-convex objectives and present empirical results that show its superior practical performance.
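
To see where the Hessian cost in conventional BO comes from, recall the standard implicit-differentiation identity: for the outer objective F(x) = f(x, y*(x)) with y*(x) = argmin_y g(x, y), the implicit function theorem gives

  \nabla F(x) = \nabla_x f(x, y^*) - \nabla_{xy}^2 g(x, y^*)\,[\nabla_{yy}^2 g(x, y^*)]^{-1}\,\nabla_y f(x, y^*)

so every hypergradient evaluation involves second-order derivatives of g (in practice, Hessian-vector products and a linear solve).

The sketch below illustrates the general flavor of a first-order alternative on a toy problem; it is not the paper's exact BOME update, and the toy objectives, step sizes, and penalty weight lam are assumptions for illustration only. The lower-level solution is approximated by a few gradient steps on g, and the lower-level suboptimality g(x, y) - g(x, y_hat) is penalized, so the joint update on (x, y) needs only first-order gradients:

def grad_f(x, y):
    # Gradients of the upper-level objective f(x, y) = 0.5*(y - 1)**2 + 0.5*x**2
    return x, y - 1.0

def grad_g(x, y):
    # Gradients of the lower-level objective g(x, y) = 0.5*(y - x)**2,
    # whose exact minimizer is y*(x) = x; true bilevel solution: x = y = 0.5.
    return x - y, y - x

def first_order_bilevel(x, y, outer_steps=2000, inner_steps=10,
                        lr=0.05, lam=10.0):
    for _ in range(outer_steps):
        # Approximate y*(x) = argmin_y g(x, y) with a few gradient steps on g,
        # starting from the current y; y_hat is then treated as a constant.
        y_hat = y
        for _ in range(inner_steps):
            y_hat -= 0.5 * grad_g(x, y_hat)[1]
        # q(x, y) = g(x, y) - g(x, y_hat) >= 0 measures lower-level
        # suboptimality; its gradients need only first-order information.
        qx = grad_g(x, y)[0] - grad_g(x, y_hat)[0]
        qy = grad_g(x, y)[1]
        fx, fy = grad_f(x, y)
        # Joint penalized update on (x, y): no Hessians, no differentiation
        # through the inner loop.
        x -= lr * (fx + lam * qx)
        y -= lr * (fy + lam * qy)
    return x, y

print(first_order_bilevel(2.0, 0.0))  # -> roughly (0.48, 0.52); exact bilevel solution is (0.5, 0.5)

With a larger penalty weight the iterates approach the true bilevel solution of the toy problem more closely, and only gradients of f and g are ever evaluated.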

BibTeX Entry

@InProceedings{NeurIPS2022-Liu,
  author = {Bo Liu and Mao Ye and Stephen Wright and Peter Stone and Qiang Liu},
  title = {BOME! Bilevel Optimization Made Easy: A Simple First-Order Approach},
  booktitle = {Conference on Neural Information Processing Systems (NeurIPS)},
  location = {New Orleans, LA},
  month = {December},
  year = {2022},
  abstract = {
  Bilevel optimization (BO) is useful for solving a variety of important machine 
  learning problems including but not limited to hyperparameter optimization, 
  meta-learning, continual learning, and reinforcement learning.
  Conventional BO methods need to differentiate through the low-level optimization
  process with implicit differentiation, which requires expensive calculations
  related to the Hessian matrix. There has been a recent quest for first-order 
  methods for BO, but the methods proposed to date tend to be complicated and
  impractical for large-scale deep learning applications. In this work, we 
  propose a simple first-order BO algorithm that depends only on first-order 
  gradient information, requires no implicit differentiation, and is practical 
  and efficient for large-scale non-convex functions in deep learning. We provide
  non-asymptotic convergence analysis of the proposed method to stationary points
  for non-convex objectives and present empirical results that show its superior
  practical performance.
  },
}
