Page 73 - Understanding Machine Learning
P. 73

6.8 Exercises  55

                                         d
                  4. (**) Show that VCdim(H  ) ≤ d.
                                         con
                    Hint: Assume by contradiction that there exists a set C ={c 1 ,...,c d+1 } that is
                                d                               d
                    shattered by H  .Let h 1 ,...,h d+1 be hypotheses in H  that satisfy
                                con                             con

                                                           0 i = j
                                      ∀i, j ∈ [d + 1], h i (c j ) =
                                                           1otherwise
                    For each i ∈ [d + 1], h i (or more accurately, the conjunction that corresponds to
                    h i ) contains some literal   i which is false on c i and true on c j for each j  = i.Use
                    the Pigeonhole principle to show that there must be a pair i < j ≤ d + 1such
                    that   i and   j use the same x k and use that fact to derive a contradiction to the
                    requirements from the conjunctions h i ,h j .
                                     d                                        d
                  5. Consider the class H  of monotone Boolean conjunctions over {0,1} . Mono-
                                     mcon
                                                                                   d
                    tonicity here means that the conjunctions do not contain negations. As in H  ,
                                                                                   con
                    the empty conjunction is interpreted as the all-positive hypothesis. We augment
                      d                              −                  d
                    H mcon  with the all-negative hypothesis h . Show that VCdim(H mcon ) = d.
              6.7 We have shown that for a finite hypothesis class H,VCdim(H) ≤ log(|H|) .How-
                 ever, this is just an upper bound. The VC-dimension of a class can be much lower
                 than that:
                  1. Find an example of a class H of functions over the real interval X = [0,1] such
                    that H is infinite while VCdim(H) = 1.
                  2. Give an example of a finite hypothesis class H over the domain X = [0,1], where
                    VCdim(H) = log (|H|) .
                                   2
              6.8 (*) It is often the case that the VC-dimension of a hypothesis class equals (or can
                 be bounded above by) the number of parameters one needs to set in order to define
                 each hypothesis in the class. For instance, if H is the class of axis aligned rectangles in
                  d
                 R ,then VCdim(H) = 2d, which is equal to the number of parameters used to define
                               d
                 a rectangle in R . Here is an example that shows that this is not always the case.
                 We will see that a hypothesis class might be very complex and even not learnable,
                 although it has a small number of parameters.
                   Consider the domain X = R, and the hypothesis class
                                        H ={x  → sin(θx)  : θ ∈ R}

                 (here, we take  −1 = 0). Prove that VCdim(H) =∞.
                 Hint: There is more than one way to prove the required result. One option is by
                 applying the following lemma: If 0.x 1 x 2 x 3 ..., is the binary expansion of x ∈ (0,1),
                                                 m
                 then for any natural number m,  sin(2 πx) = (1 − x m ), provided that ∃k ≥ m s.t.
                 x k = 1.
              6.9 Let H be the class of signed intervals, that is,
                 H ={h a,b,s : a ≤ b,s ∈{−1,1}} where

                                                  s   if x ∈ [a,b]
                                       h a,b,s (x) =
                                                  −s  if x /∈ [a,b]
                 Calculate VCdim(H).
             6.10 Let H be a class of functions from X to {0,1}.
                  1. Prove that if VCdim(H) ≥ d,for any d, then for some probability distribution D
                    over X ×{0,1}, for every sample size, m,
                                                                 d − m
                                       E [L D (A(S))] ≥ min L D (h) +
                                      S∼D m           h∈H         2d
   68   69   70   71   72   73   74   75   76   77   78