
                    To  define  the  Natarajan  dimension,  we  first  generalize  the  definition  of
                 shattering.

Definition 29.1 (Shattering (Multiclass Version)). We say that a set C ⊂ X is shattered by H if there exist two functions f₀, f₁ : C → [k] such that

• For every x ∈ C, f₀(x) ≠ f₁(x).
• For every B ⊂ C, there exists a function h ∈ H such that

      ∀x ∈ B, h(x) = f₀(x)   and   ∀x ∈ C \ B, h(x) = f₁(x).




                 Definition 29.2  (Natarajan Dimension).  The Natarajan dimension of H,  denoted
                 Ndim(H), is the maximal size of a shattered set C ⊂ X.

                    It  is  not  hard  to  see  that  in  the  case  that  there  are  exactly  two  classes,
                 Ndim(H) = VCdim(H).  Therefore,  the Natarajan dimension  generalizes the VC
                 dimension. We next show that the Natarajan dimension allows us to generalize the
                 fundamental theorem of statistical learning from binary classification to multiclass
                 classification.
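To make Definitions 29.1 and 29.2 concrete, here is a minimal Python sketch (illustrative only; the helper names shatters and natarajan_dim are not from the text) that checks multiclass shattering by exhaustive search and computes the Natarajan dimension of a finite hypothesis class over a finite domain. Hypotheses are assumed to be given as dictionaries mapping domain points to labels in {0, ..., k−1}; the search is exponential, so it is only meant for tiny examples.

from itertools import combinations, product

def shatters(H, C, k):
    # Definition 29.1: C is shattered by H if there are f0, f1 : C -> [k] with
    # f0(x) != f1(x) for every x in C, such that for every B subset of C some
    # h in H satisfies h(x) = f0(x) on B and h(x) = f1(x) on C \ B.
    n = len(C)
    for f0 in product(range(k), repeat=n):
        for f1 in product(range(k), repeat=n):
            if any(a == b for a, b in zip(f0, f1)):
                continue  # f0 and f1 must disagree on every point of C
            witnessed = True
            for r in range(n + 1):
                for B in combinations(range(n), r):
                    inside = set(B)
                    target = [f0[i] if i in inside else f1[i] for i in range(n)]
                    if not any(all(h[C[i]] == target[i] for i in range(n)) for h in H):
                        witnessed = False
                        break
                if not witnessed:
                    break
            if witnessed:
                return True
    return False

def natarajan_dim(H, domain, k):
    # Brute force: the largest size of a subset of the domain shattered by H.
    best = 0
    for r in range(1, len(domain) + 1):
        if any(shatters(H, list(C), k) for C in combinations(domain, r)):
            best = r
    return best

# Sanity check for k = 2, where Ndim should coincide with VCdim: threshold
# functions on {0, 1, 2} (label 1 iff x >= t) have VC dimension 1.
domain = [0, 1, 2]
thresholds = [{x: int(x >= t) for x in domain} for t in range(4)]
print(natarajan_dim(thresholds, domain, 2))  # prints 1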
                 29.2  THE  MULTICLASS  FUNDAMENTAL  THEOREM
Theorem 29.3 (The Multiclass Fundamental Theorem). There exist absolute constants C₁, C₂ > 0 such that the following holds. For every hypothesis class H of functions from X to [k], such that the Natarajan dimension of H is d, we have

   1. H has the uniform convergence property with sample complexity

          C₁ (d + log(1/δ))/ε² ≤ m_H^UC(ε, δ) ≤ C₂ (d log(k) + log(1/δ))/ε².

   2. H is agnostic PAC learnable with sample complexity

          C₁ (d + log(1/δ))/ε² ≤ m_H(ε, δ) ≤ C₂ (d log(k) + log(1/δ))/ε².

   3. H is PAC learnable (assuming realizability) with sample complexity

          C₁ (d + log(1/δ))/ε ≤ m_H(ε, δ) ≤ C₂ (d log(kd/ε) + log(1/δ))/ε.
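For instance, specializing to k = 2 (where Ndim(H) = VCdim(H) = d, as noted above), the agnostic bound becomes

          C₁ (d + log(1/δ))/ε² ≤ m_H(ε, δ) ≤ C₂ (d log 2 + log(1/δ))/ε²,

which recovers the binary fundamental theorem of Chapter 28 up to the absolute constants.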

                 29.2.1  On  the  Proof  of  Theorem  29.3
                 The lower bounds in Theorem 29.3 can be deduced by a reduction from the binary
                 fundamental theorem (see Exercise 29.5).
The upper bounds in Theorem 29.3 can be proved along the same lines as the proof of the fundamental theorem for binary classification, given in Chapter 28
                 (see Exercise 29.4).  The sole ingredient of that proof that should be modified in
                 a nonstraightforward manner is Sauer’s lemma. It applies only to binary classes and
                 therefore must be replaced. An appropriate substitute is Natarajan’s lemma:
Lemma 29.4 (Natarajan). |H| ≤ |X|^Ndim(H) · k^(2·Ndim(H)).
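The key point, as in the binary case, is that when the lemma is applied to the restriction of H to a sample C of size m (whose Natarajan dimension is at most d = Ndim(H)), it bounds the number of distinct labelings of C induced by H by m^d · k^(2d), a quantity polynomial in m; this is exactly the growth property that Sauer's lemma supplies in the binary proof.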