Page 74 - Understanding Machine Learning
P. 74

The  VC-Dimension
           56

                        Hint: Use Exercise 6.3 in Chapter 5.
                     2. Prove that for every H that is PAC learnable, VCdim(H) < ∞. (Note that this is
                        the implication 3 → 6 in Theorem 6.7.)
                 6.11 VC of union: Let H 1 ,...,H r be hypothesis classes over some fixed domain set X.
                     Let d = max i VCdim(H i ) and assume for simplicity that d ≥ 3.
                     1. Prove that

                                                 r
                                         VCdim ∪ i=1 H i ≤ 4d log(2d) + 2log(r).
                        Hint: Takeaset of k examples and assume that they are shattered by the union
                                                                  k
                        class. Therefore, the union class can produce all 2 possible labelings on these
                        examples. Use Sauer’s lemma to show that the union class cannot produce more
                                                    d
                                                  k
                              d
                        than rk labelings. Therefore, 2 rk . Now use Lemma A.2.
                     2. (*) Prove that for r = 2 it holds that
                                               VCdim(H 1 ∪ H 2) ≤ 2d + 1.
                 6.12 Dudley classes: In this question we discuss an algebraic framework for defining con-
                                     n
                     cept classes over R and show a connection between the VC dimension of such
                                                                        n
                     classes and their algebraic properties. Given a function f : R → R we define the
                     corresponding function, POS( f )(x) = 1 [ f (x)>0] .For a class F of real valued func-
                     tions we define a corresponding class of functions POS(F) ={POS( f ): f ∈ F}.We
                     say that a family, F, of real valued functions is linearly closed if for all f ,g ∈ F and
                     r ∈ R,( f +rg)∈ F (where addition and scalar multiplication of functions are defined
                                               n
                     point wise, namely, for all x ∈ R ,( f +rg)(x) = f (x) +rg(x)). Note that if a family
                     of functions is linearly closed then we can view it as a vector space over the reals.
                                                                          def
                                     n
                     For a function g : R → R and a family of functions F,let F + g ={ f + g : f ∈ F}.
                     Hypothesis classes that have a representation as POS(F + g) for some vector space
                     of functions F and some function g are called Dudley classes.
                                            n
                     1. Show that for every g : R → R and every vector space of functions F as defined
                        earlier, VCdim(POS(F + g)) = VCdim(POS(F)).
                     2. (**) For every linearly closed family of real valued functions F,the VC-
                        dimension of the corresponding class POS(F) equals the linear dimension of
                        F (as a vector space). Hint: Let f 1 ,..., f d be a basis for the vector space F. Con-
                                                                    d
                                                               n
                        sider the mapping x  è  ( f 1 (x),..., f d (x)) (from R to R ). Note that this mapping
                                                              n
                        induces a matching between functions over R of the form POS( f ) and homo-
                                                 d
                        geneous linear halfspaces in R (the VC-dimension of the class of homogeneous
                        linear halfspaces is analyzed in Chapter 9).
                     3. Show that each of the following classes can be represented as a Dudley class:
                                                       n
                        1. The class HS n of halfspaces over R (see Chapter 9).
                                                                       n
                        2. The class HH S n of all homogeneous halfspaces over R (see Chapter 9).
                                                                          d
                        3. The class B d of all functions defined by (open) balls in R . Use the Dudley
                          representation to figure out the VC-dimension of this class.
                               d
                        4. Let P denote the class of functions defined by polynomial inequalities of
                               n
                          degree ≤ d, namely,
                                d
                               P ={h p : p is a polynomial of degree ≤ d in the variables x 1 ,...,x n },
                                n
                          where for x = (x 1 ....,x n ), h p (x) = 1 [p(x)≥0] (the degree of a multivariable poly-
                          nomial is the maximal sum of variable exponents over all of its terms. For
                                                     3 2
                                                              2
                          example, the degree of p(x) = 3x x + 4x 3 x is 5).
                                                     1 2      7
   69   70   71   72   73   74   75   76   77   78   79