Page 275 - Understanding Machine Learning
        P. 275
     21.3 Online Convex Optimization  257
              21.3 ONLINE CONVEX OPTIMIZATION
              In Chapter 12 we studied convex learning problems and showed learnability results
              for these problems in the agnostic PAC learning framework. In this section we
              show that similar learnability results hold for convex problems in the online learning
              framework. In particular, we consider the following problem.
                                     Online Convex Optimization
                       definitions:
                         hypothesis class H ; domain Z ; loss function   : H × Z → R
                       assumptions:
                         H is convex
                         ∀z ∈ Z,  (·,z) is a convex function
                       for t = 1,2,...,T
                         learner predicts a vector w (t)  ∈ H
                         environment responds with z t ∈ Z
                                             (t)
                         learner suffers loss  (w ,z t )
                 As in the online classification problem, we analyze the regret of the algorithm.
              Recall that the regret of an online algorithm with respect to a competing hypothesis,
              which here will be some vector w ∈ H, is defined as
                                              T             T
                                                   (t)
                              Regret (w ,T ) =   (w ,z t ) −   (w ,z t ).       (21.5)
                                    A
                                             t=1           t=1
              As before, the regret of the algorithm relative to a set of competing vectors, H,is
              defined as
                                 Regret (H,T ) = sup Regret (w ,T ).
                                       A                   A
                                                w ∈H
                 In Chapter 14 we have shown that Stochastic Gradient Descent solves convex
              learning problems in the agnostic PAC model. We now show that a very similar
              algorithm, Online Gradient Descent, solves online convex learning problems.
                                      Online Gradient Descent
                        parameter: η> 0
                       initialize: w (1)  = 0
                       for t = 1,2,...,T
                         predict w (t)
                         receive z t and let f t ( · ) =  (·,z t )
                                        (t)
                         choose v t ∈ ∂ f t (w )
                         update:
                                1
                          1. w (t+ )  (t)  − ηv t
                                2 = w
                                                      1
                          2. w (t+1)  = argmin   w − w (t+ )
                                                      2
                                          w∈H





