Speaker: Adam Kalai
Online convex optimization: gradient descent without a gradient
We study a general online convex optimization problem. There is a
convex set S and an unknown sequence of concave reward functions
f_1,f_2,.... In each period, we choose a feasible point x_t in S, and
receive the reward f_t(x_t) (f_t itself remains unknown). Our goal is
to have large average reward. We give an algorithm whose average
reward is guaranteed to approach the average reward of the best single
point x in S.
As an example, consider a company that has to decide how much to spend
advertising on a number of channels each week. At the end of the
week, they only find out their total profit, which is a function of
their expenditures. At the end of the year, we can guarantee that
their expected total profit is almost as large as if they had used
the best single campaign each week, for the whole year, chosen with
the benefit of hindsight. We assume only that the functions are
concave and bounded, i.e., no statistical assumptions relating one
function to the others.
Our approach is very simple: we give an approximation of the gradient
that is computed from evaluating a function at a single (random) point.
Joint work with Abie Flaxman and Brendan McMahan