Page 72 - Understanding Machine Learning
P. 72

The VC-Dimension
           54

                 1997;  Bartlett,  Long  &  Williamson  1994;  Anthony  &  Bartlet  1999),  and  the
                 Natarajan dimension characterizes learnability of some multiclass learning prob-
                 lems  (Natarajan  1989).  However,  in  general,  there  is  no  equivalence  between
                 learnability and uniform convergence. See (Shalev-Shwartz, Shamir, Srebro &
                 Sridharan 2010; Daniely, Sabato, Ben-David & Shalev-Shwartz 2011).
                    Sauer’s lemma has been proved by Sauer in response to a problem of Erdos
                 (Sauer 1972). Shelah (with Perles) proved it as a useful lemma for Shelah’s theory
                                                          1
                 of stable models (Shelah 1972). Gil Kalai tells us that at some later time,  Benjy
                 Weiss asked Perles about such a result in the context of ergodic theory, and Perles,
                 who forgot that he had proved it once, proved it again. Vapnik and Chervonenkis
                 proved the lemma in the context of statistical learning theory.


                 6.8 EXERCISES

                  6.1 Show the following monotonicity property of VC-dimension: For every two hypoth-
                     esis classes if H ⊆ H then VCdim(H ) ≤ VCdim(H).


                  6.2 Given some finite domain set, X, and a number k ≤|X|, figure out the VC-dimension
                     of each of the following classes (and prove your claims):
                         X           X
                     1. H   ={h ∈{0,1} : |{x : h(x) = 1}| = k}: that is, the set of all functions that assign
                         =k
                        the value 1 to exactly k elements of X.
                     2. H at−most−k ={h ∈{0,1} : |{x : h(x) = 1}| ≤ k or |{x : h(x) = 0}| ≤ k}.
                                           X
                                                     n
                  6.3 Let X be the Boolean hypercube {0,1} .For a set I ⊆{1,2,...,n} we define a parity
                                                                            n
                     function h I as follows. On a binary vector x = (x 1 ,x 2 ,...,x n ) ∈{0,1} ,


                                             h I (x) =  x i  mod 2 .
                                                     i∈I
                     (That is, h I computes parity of bits in I.) What is the VC-dimension of the class of
                     all such parity functions, H n-parity ={h I : I ⊆{1,2,...,n}}?
                  6.4 We proved Sauer’s lemma by proving that for every class H of finite VC-dimension
                     d, and every subset A of the domain,
                                                                  d
                                                                     |A|

                                    |H A |≤ |{B ⊆ A : H shatters B}| ≤   .
                                                                      i
                                                                 i=0
                     Show that there are cases in which the previous two inequalities are strict (namely,
                     the ≤ can be replaced by <) and cases in which they can be replaced by equalities.
                     Demonstrate all four combinations of = and <.
                                                          d      d
                  6.5 VC-dimension of axis aligned rectangles in R : Let H rec  be the class of axis aligned
                                 d                               2
                     rectangles in R . We have already seen that VCdim(H rec ) = 4. Prove that in general,
                             d
                     VCdim(H rec ) = 2d.
                                                            d
                  6.6 VC-dimension of Boolean conjunctions: Let H  be the class of Boolean conjunc-
                                                            con
                     tions over the variables x 1 ,...,x d (d ≥ 2). We already know that this class is finite
                                                                                  d
                     and thus (agnostic) PAC learnable. In this question we calculate VCdim(H con ).
                                   d     d
                     1. Show that |H  |≤ 3 + 1.
                                   con
                     2. Conclude that VCdim(H) ≤ d log3.
                                  d
                     3. Show that H  shatters the set of unit vectors {e i : i ≤ d}.
                                  con
                 1  http://gilkalai.wordpress.com/2008/09/28/extremal-combinatorics-iii-some-basic-theorems
   67   68   69   70   71   72   73   74   75   76   77