

              21.3 ONLINE CONVEX OPTIMIZATION
              In Chapter 12 we studied convex learning problems and showed learnability results
              for these problems in the agnostic PAC learning framework. In this section we
              show that similar learnability results hold for convex problems in the online learning
              framework. In particular, we consider the following problem.


                       Online Convex Optimization

              definitions:
                hypothesis class $\mathcal{H}$; domain $Z$; loss function $\ell : \mathcal{H} \times Z \to \mathbb{R}$
              assumptions:
                $\mathcal{H}$ is convex
                $\forall z \in Z$, $\ell(\cdot, z)$ is a convex function
              for $t = 1, 2, \ldots, T$
                learner predicts a vector $\mathbf{w}^{(t)} \in \mathcal{H}$
                environment responds with $z_t \in Z$
                learner suffers loss $\ell(\mathbf{w}^{(t)}, z_t)$
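
              For intuition, consider one concrete instance of this protocol (our own illustrative choice, not one fixed by the text): online linear regression with the squared loss,
              $$Z = \mathbb{R}^d \times \mathbb{R}, \qquad \mathcal{H} = \{\mathbf{w} \in \mathbb{R}^d : \|\mathbf{w}\| \le B\}, \qquad \ell(\mathbf{w}, (\mathbf{x}, y)) = (\langle \mathbf{w}, \mathbf{x} \rangle - y)^2.$$
              Here $\mathcal{H}$ is a convex set and, for every fixed $(\mathbf{x}, y)$, the map $\mathbf{w} \mapsto (\langle \mathbf{w}, \mathbf{x} \rangle - y)^2$ is convex, so both assumptions of the protocol hold.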

                 As in the online classification problem, we analyze the regret of the algorithm. Recall that the regret of an online algorithm with respect to a competing hypothesis, which here will be some vector $\mathbf{w}^\star \in \mathcal{H}$, is defined as
              $$\mathrm{Regret}_A(\mathbf{w}^\star, T) = \sum_{t=1}^{T} \ell(\mathbf{w}^{(t)}, z_t) - \sum_{t=1}^{T} \ell(\mathbf{w}^\star, z_t). \tag{21.5}$$
              As before, the regret of the algorithm relative to a set of competing vectors, $\mathcal{H}$, is defined as
              $$\mathrm{Regret}_A(\mathcal{H}, T) = \sup_{\mathbf{w}^\star \in \mathcal{H}} \mathrm{Regret}_A(\mathbf{w}^\star, T).$$
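
              To see these definitions in action, the following minimal NumPy sketch (our own; the finite candidate set standing in for the sup over $\mathcal{H}$ is an illustrative simplification, since in general the sup need not be attained on any finite grid) computes both quantities from recorded losses.

```python
import numpy as np

# Regret against a single competitor w, as in Equation (21.5):
# the learner's cumulative loss minus the competitor's.
def regret_vs_vector(learner_losses, competitor_losses):
    return float(np.sum(learner_losses) - np.sum(competitor_losses))

# Regret against the class H.  The sup over H is approximated by a max
# over a finite set of candidate vectors (an illustrative shortcut).
def regret_vs_class(learner_losses, loss, data, candidates):
    best_cumulative = min(sum(loss(w, z) for z in data) for w in candidates)
    return float(np.sum(learner_losses) - best_cumulative)
```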
                 In Chapter 14 we showed that Stochastic Gradient Descent solves convex learning problems in the agnostic PAC model. We now show that a very similar algorithm, Online Gradient Descent, solves online convex learning problems.

                        Online Gradient Descent

              parameter: $\eta > 0$
              initialize: $\mathbf{w}^{(1)} = \mathbf{0}$
              for $t = 1, 2, \ldots, T$
                predict $\mathbf{w}^{(t)}$
                receive $z_t$ and let $f_t(\cdot) = \ell(\cdot, z_t)$
                choose $\mathbf{v}_t \in \partial f_t(\mathbf{w}^{(t)})$
                update:
                  1. $\mathbf{w}^{(t+\frac{1}{2})} = \mathbf{w}^{(t)} - \eta \mathbf{v}_t$
                  2. $\mathbf{w}^{(t+1)} = \operatorname{argmin}_{\mathbf{w} \in \mathcal{H}} \|\mathbf{w} - \mathbf{w}^{(t+\frac{1}{2})}\|$
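
              The following Python sketch implements the two-step update above. It is a minimal illustration under one assumption of our own: we take $\mathcal{H}$ to be a Euclidean ball, for which the projection step (the argmin in step 2) has a simple closed form; the data stream and subgradient at the end are likewise our own illustrative choices.

```python
import numpy as np

def project(w, radius=1.0):
    # Step 2 of the update: Euclidean projection onto H.  Taking H to be
    # the ball {w : ||w|| <= radius} is an illustrative assumption under
    # which the argmin has this closed form.
    norm = np.linalg.norm(w)
    return w if norm <= radius else (radius / norm) * w

def online_gradient_descent(subgradient, data, dim, eta, radius=1.0):
    w = np.zeros(dim)                 # initialize: w^(1) = 0
    predictions = []
    for z in data:                    # environment responds with z_t
        predictions.append(w.copy())  # predict w^(t)
        v = subgradient(w, z)         # choose v_t in the subdifferential of f_t at w^(t)
        w_half = w - eta * v          # step 1: w^(t+1/2) = w^(t) - eta * v_t
        w = project(w_half, radius)   # step 2: project back onto H
    return predictions

# Illustrative run on online linear regression: z_t = (x_t, y_t) and
# l(w, (x, y)) = (<w, x> - y)^2, whose gradient in w is 2(<w, x> - y) x.
rng = np.random.default_rng(0)
stream = [(rng.normal(size=3), rng.normal()) for _ in range(50)]
sq_grad = lambda w, z: 2.0 * (w @ z[0] - z[1]) * z[0]
ws = online_gradient_descent(sq_grad, stream, dim=3, eta=0.1)
```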