Page 370 - Understanding Machine Learning
P. 370
Multiclass Learna bility
352
To define the Natarajan dimension, we first generalize the definition of
shattering.
Definition 29.1 (Shattering (Multiclass Version)). We say that a set C ⊂ X is
shattered by H if there exist two functions f 0 , f 1 : C → [ k] such that
For every x ∈ C, f 0 ( x) = f 1 ( x).
For every B ⊂ C, there exists a function h ∈ H such that
∀x ∈ B,h( x) = f 0 ( x) and ∀x ∈ C \ B,h( x) = f 1 ( x).
Definition 29.2 (Natarajan Dimension). The Natarajan dimension of H, denoted
Ndim(H), is the maximal size of a shattered set C ⊂ X.
It is not hard to see that in the case that there are exactly two classes,
Ndim(H) = VCdim(H). Therefore, the Natarajan dimension generalizes the VC
dimension. We next show that the Natarajan dimension allows us to generalize the
fundamental theorem of statistical learning from binary classification to multiclass
classification.
29.2 THE MULTICLASS FUNDAMENTAL THEOREM
Theorem 29.3 (The Multiclass Fundamental Theorem). There exist absolute con-
stants C 1 ,C 2 > 0 such that the following holds. For every hypothesis class H of
functions from X to [ k], such that the Natarajan dimension of H is d, we have
1. H has the uniform convergence property with sample complexity
d + log(1/δ) UC d log(k) + log(1/δ)
C 1 2 ≤ m H (
,δ) ≤ C 2 2 .
2. H is agnostic PAC learnable with sample complexity
d + log(1/δ) d log(k) + log(1/δ)
C 1 2 ≤ m H (
,δ) ≤ C 2 2 .
3. H is PAC learnable (assuming realizability) with sample complexity
kd
d + log(1/δ) d log
+ log(1/δ)
C 1 ≤ m H (
,δ) ≤ C 2 .
29.2.1 On the Proof of Theorem 29.3
The lower bounds in Theorem 29.3 can be deduced by a reduction from the binary
fundamental theorem (see Exercise 29.5).
The upper bounds in Theorem 29.3 can be proved along the same lines of the
proof of the fundamental theorem for binary classification, given in Chapter 28
(see Exercise 29.4). The sole ingredient of that proof that should be modified in
a nonstraightforward manner is Sauer’s lemma. It applies only to binary classes and
therefore must be replaced. An appropriate substitute is Natarajan’s lemma:
Ndim(H) 2Ndim(H)
Lemma 29.4 (Natarajan). |H|≤ |X| · k .