Page 72 - Understanding Machine Learning
P. 72
The VC-Dimension
54
1997; Bartlett, Long & Williamson 1994; Anthony & Bartlet 1999), and the
Natarajan dimension characterizes learnability of some multiclass learning prob-
lems (Natarajan 1989). However, in general, there is no equivalence between
learnability and uniform convergence. See (Shalev-Shwartz, Shamir, Srebro &
Sridharan 2010; Daniely, Sabato, Ben-David & Shalev-Shwartz 2011).
Sauer’s lemma has been proved by Sauer in response to a problem of Erdos
(Sauer 1972). Shelah (with Perles) proved it as a useful lemma for Shelah’s theory
1
of stable models (Shelah 1972). Gil Kalai tells us that at some later time, Benjy
Weiss asked Perles about such a result in the context of ergodic theory, and Perles,
who forgot that he had proved it once, proved it again. Vapnik and Chervonenkis
proved the lemma in the context of statistical learning theory.
6.8 EXERCISES
6.1 Show the following monotonicity property of VC-dimension: For every two hypoth-
esis classes if H ⊆ H then VCdim(H ) ≤ VCdim(H).
6.2 Given some finite domain set, X, and a number k ≤|X|, figure out the VC-dimension
of each of the following classes (and prove your claims):
X X
1. H ={h ∈{0,1} : |{x : h(x) = 1}| = k}: that is, the set of all functions that assign
=k
the value 1 to exactly k elements of X.
2. H at−most−k ={h ∈{0,1} : |{x : h(x) = 1}| ≤ k or |{x : h(x) = 0}| ≤ k}.
X
n
6.3 Let X be the Boolean hypercube {0,1} .For a set I ⊆{1,2,...,n} we define a parity
n
function h I as follows. On a binary vector x = (x 1 ,x 2 ,...,x n ) ∈{0,1} ,
h I (x) = x i mod 2 .
i∈I
(That is, h I computes parity of bits in I.) What is the VC-dimension of the class of
all such parity functions, H n-parity ={h I : I ⊆{1,2,...,n}}?
6.4 We proved Sauer’s lemma by proving that for every class H of finite VC-dimension
d, and every subset A of the domain,
d
|A|
|H A |≤ |{B ⊆ A : H shatters B}| ≤ .
i
i=0
Show that there are cases in which the previous two inequalities are strict (namely,
the ≤ can be replaced by <) and cases in which they can be replaced by equalities.
Demonstrate all four combinations of = and <.
d d
6.5 VC-dimension of axis aligned rectangles in R : Let H rec be the class of axis aligned
d 2
rectangles in R . We have already seen that VCdim(H rec ) = 4. Prove that in general,
d
VCdim(H rec ) = 2d.
d
6.6 VC-dimension of Boolean conjunctions: Let H be the class of Boolean conjunc-
con
tions over the variables x 1 ,...,x d (d ≥ 2). We already know that this class is finite
d
and thus (agnostic) PAC learnable. In this question we calculate VCdim(H con ).
d d
1. Show that |H |≤ 3 + 1.
con
2. Conclude that VCdim(H) ≤ d log3.
d
3. Show that H shatters the set of unit vectors {e i : i ≤ d}.
con
1 http://gilkalai.wordpress.com/2008/09/28/extremal-combinatorics-iii-some-basic-theorems