21.3 ONLINE CONVEX OPTIMIZATION
In Chapter 12 we studied convex learning problems and showed learnability results
for these problems in the agnostic PAC learning framework. In this section we
show that similar learnability results hold for convex problems in the online learning
framework. In particular, we consider the following problem.
Online Convex Optimization

definitions:
  hypothesis class H ; domain Z ; loss function ℓ : H × Z → ℝ

assumptions:
  H is convex
  ∀z ∈ Z, ℓ(·, z) is a convex function

for t = 1, 2, ..., T
  learner predicts a vector w^(t) ∈ H
  environment responds with z_t ∈ Z
  learner suffers loss ℓ(w^(t), z_t)
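To make the protocol concrete, the following is a minimal sketch in Python (with NumPy), assuming a particular instance of the setting: H is taken to be all of ℝ^d and the loss is the squared loss of a linear predictor, ℓ(w, (x, y)) = (⟨w, x⟩ − y)². The loop, the random data stream, and all names are illustrative and not part of the formal definition above.

```python
import numpy as np

def squared_loss(w, z):
    """l(w, z) = (<w, x> - y)^2 for z = (x, y); a convex function of w."""
    x, y = z
    return (np.dot(w, x) - y) ** 2

rng = np.random.default_rng(0)
d, T = 5, 100
w = np.zeros(d)                                # learner's current prediction w^(t)
cumulative_loss = 0.0
for t in range(T):
    z_t = (rng.normal(size=d), rng.normal())   # environment responds with z_t = (x_t, y_t)
    cumulative_loss += squared_loss(w, z_t)    # learner suffers loss l(w^(t), z_t)
    # the learner may now update w, e.g. by Online Gradient Descent (described below)
```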
As in the online classification problem, we analyze the regret of the algorithm. Recall that the regret of an online algorithm with respect to a competing hypothesis, which here will be some vector w⋆ ∈ H, is defined as
\[
\mathrm{Regret}_A(w^\star, T) \;=\; \sum_{t=1}^{T} \ell\big(w^{(t)}, z_t\big) \;-\; \sum_{t=1}^{T} \ell\big(w^\star, z_t\big). \tag{21.5}
\]
As before, the regret of the algorithm relative to a set of competing vectors, H, is defined as
\[
\mathrm{Regret}_A(H, T) \;=\; \sup_{w^\star \in H} \mathrm{Regret}_A(w^\star, T).
\]
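As a quick illustration (not from the text), once the per-round losses have been recorded, the regret in Equation (21.5) is simply a difference of two sums. The sketch below assumes hypothetical arrays in which `algorithm_losses[t]` holds ℓ(w^(t), z_t) and `competitor_losses[t]` holds ℓ(w⋆, z_t) for a fixed competing vector w⋆.

```python
import numpy as np

def regret(algorithm_losses, competitor_losses):
    """Regret_A(w*, T) as in (21.5): sum_t l(w^(t), z_t) - sum_t l(w*, z_t)."""
    return float(np.sum(algorithm_losses) - np.sum(competitor_losses))

# Regret_A(H, T) is then the supremum of regret(...) over all competing w* in H.
```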
In Chapter 14 we showed that Stochastic Gradient Descent solves convex learning problems in the agnostic PAC model. We now show that a very similar algorithm, Online Gradient Descent, solves online convex learning problems.
Online Gradient Descent

parameter: η > 0
initialize: w^(1) = 0
for t = 1, 2, ..., T
  predict w^(t)
  receive z_t and let f_t(·) = ℓ(·, z_t)
  choose v_t ∈ ∂f_t(w^(t))
  update:
    1. w^(t+1/2) = w^(t) − η v_t
    2. w^(t+1) = argmin_{w ∈ H} ‖w − w^(t+1/2)‖
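A rough Python sketch of this algorithm is given below, again in the squared-loss setting used earlier and taking H to be a Euclidean ball of some radius, so that the projection in step 2 has a simple closed form; all function and variable names are illustrative, not from the book.

```python
import numpy as np

def project_to_ball(w, radius):
    """Step 2: Euclidean projection onto H = {w : ||w|| <= radius}."""
    norm = np.linalg.norm(w)
    return w if norm <= radius else (radius / norm) * w

def squared_loss_subgradient(w, z):
    """An element of the subdifferential of f_t(w) = (<w, x> - y)^2 (here, its gradient)."""
    x, y = z
    return 2.0 * (np.dot(w, x) - y) * x

def online_gradient_descent(stream, d, eta, radius=1.0):
    """Run OGD on an iterable of examples z_t = (x_t, y_t); return the predicted iterates w^(t)."""
    w = np.zeros(d)                                # w^(1) = 0
    iterates = []
    for z_t in stream:
        iterates.append(w.copy())                  # predict w^(t)
        v_t = squared_loss_subgradient(w, z_t)     # choose v_t in the subdifferential of f_t at w^(t)
        w = project_to_ball(w - eta * v_t, radius) # update steps 1 and 2
    return iterates
```

The projection step relies only on H being convex; the ball is used here merely because its projection is easy to write in closed form. For other convex hypothesis classes, step 2 becomes the corresponding Euclidean projection onto H.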