Understanding Machine Learning

2.4 Exercises
Figure 2.2 Axis aligned rectangles. [Figure: the true rectangle $R^*$ contains the rectangle $R(S)$ returned by the algorithm; positive examples (+) lie inside $R(S)$, negative examples (−) lie outside $R^*$, and $R_1$ is a strip of $R^*$.]
Formally, given real numbers $a_1 \le b_1$ and $a_2 \le b_2$, define the classifier $h_{(a_1,b_1,a_2,b_2)}$ by

$$h_{(a_1,b_1,a_2,b_2)}(x_1,x_2) = \begin{cases} 1 & \text{if } a_1 \le x_1 \le b_1 \text{ and } a_2 \le x_2 \le b_2 \\ 0 & \text{otherwise} \end{cases} \tag{2.10}$$

The class of all axis aligned rectangles in the plane is defined as

$$\mathcal{H}^2_{\mathrm{rec}} = \{ h_{(a_1,b_1,a_2,b_2)} : a_1 \le b_1 \text{ and } a_2 \le b_2 \}.$$
Note that this is an infinite size hypothesis class. Throughout this exercise we rely
on the realizability assumption.
1. Let A be the algorithm that returns the smallest rectangle enclosing all positive
examples in the training set. Show that A is an ERM.
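For concreteness, the algorithm $A$ can be sketched as follows: it takes per-coordinate minima and maxima over the positive examples. Any rectangle consistent with $S$ must contain every positive example, so under the realizability assumption the smallest enclosing rectangle misclassifies no training point and has zero empirical risk. (Function and variable names below are illustrative, not from the book.)

```python
# Sketch of algorithm A in the plane: return the smallest axis-aligned
# rectangle enclosing all positively labeled training points.

def tightest_rectangle(sample):
    """sample: list of ((x1, x2), y) pairs with y in {0, 1}.
    Returns (a1, b1, a2, b2), the smallest rectangle enclosing the
    positive points, or None if there are no positives (then the
    empty rectangle is consistent with the sample)."""
    positives = [x for x, y in sample if y == 1]
    if not positives:
        return None
    xs1 = [p[0] for p in positives]
    xs2 = [p[1] for p in positives]
    return (min(xs1), max(xs1), min(xs2), max(xs2))

def predict(rect, point):
    """The classifier h_(a1,b1,a2,b2) from Equation (2.10)."""
    if rect is None:
        return 0
    a1, b1, a2, b2 = rect
    x1, x2 = point
    return 1 if (a1 <= x1 <= b1 and a2 <= x2 <= b2) else 0

# A realizable sample: the smallest enclosing rectangle of the
# positives classifies every training point correctly.
S = [((0.2, 0.3), 1), ((0.6, 0.8), 1), ((0.9, 0.1), 0)]
rect = tightest_rectangle(S)
assert rect == (0.2, 0.6, 0.3, 0.8)
assert all(predict(rect, x) == y for x, y in S)
```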
2. Show that if $A$ receives a training set of size $\ge \frac{4\log(4/\delta)}{\epsilon}$, then with probability of at least $1-\delta$ it returns a hypothesis with error of at most $\epsilon$.
∗
Hint: Fix some distribution $\mathcal{D}$ over $\mathcal{X}$, let $R^* = R(a_1^*, b_1^*, a_2^*, b_2^*)$ be the rectangle that generates the labels, and let $f$ be the corresponding hypothesis. Let $a_1 \ge a_1^*$ be a number such that the probability mass (with respect to $\mathcal{D}$) of the rectangle $R_1 = R(a_1^*, a_1, a_2^*, b_2^*)$ is exactly $\epsilon/4$. Similarly, let $b_1, a_2, b_2$ be numbers such that the probability masses of the rectangles $R_2 = R(b_1, b_1^*, a_2^*, b_2^*)$, $R_3 = R(a_1^*, b_1^*, a_2^*, a_2)$, $R_4 = R(a_1^*, b_1^*, b_2, b_2^*)$ are all exactly $\epsilon/4$. Let $R(S)$ be the rectangle returned by $A$. See illustration in Figure 2.2.
- Show that $R(S) \subseteq R^*$.
- Show that if $S$ contains (positive) examples in all of the rectangles $R_1, R_2, R_3, R_4$, then the hypothesis returned by $A$ has error of at most $\epsilon$.
- For each $i \in \{1,\ldots,4\}$, upper bound the probability that $S$ does not contain an example from $R_i$.
- Use the union bound to conclude the argument.
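One way to assemble these steps into the claimed bound, sketched with the standard inequality $1-x \le e^{-x}$:

```latex
% Probability that an i.i.d. sample S of size m misses the strip R_i,
% whose probability mass is exactly \epsilon/4:
\Pr[S \cap R_i = \emptyset] = \left(1 - \frac{\epsilon}{4}\right)^m
  \le e^{-\epsilon m / 4}.
% By the union bound over the four strips,
\Pr[\exists\, i : S \cap R_i = \emptyset] \le 4\, e^{-\epsilon m / 4},
% and this is at most \delta whenever
m \ge \frac{4 \log(4/\delta)}{\epsilon}.
```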
3. Repeat the previous question for the class of axis aligned rectangles in $\mathbb{R}^d$.
4. Show that the runtime of applying the algorithm $A$ mentioned earlier is polynomial in $d$, $1/\epsilon$, and in $\log(1/\delta)$.
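In $\mathbb{R}^d$ the same algorithm reduces to one per-coordinate pass over the sample, i.e. $O(m \cdot d)$ time for $m$ examples; since a sample size polynomial in $d$, $1/\epsilon$, and $\log(1/\delta)$ suffices, the total runtime is polynomial as claimed. A minimal sketch of the $d$-dimensional version (names are illustrative):

```python
# Algorithm A in R^d: the smallest axis-aligned box enclosing the
# positive examples, computed coordinate-wise in a single O(m*d) pass.

def tightest_box(sample):
    """sample: list of (point, label) pairs, point a length-d tuple,
    label in {0, 1}. Returns a list of d intervals (low_i, high_i),
    or None if there are no positive examples."""
    positives = [x for x, y in sample if y == 1]
    if not positives:
        return None
    d = len(positives[0])
    return [(min(p[i] for p in positives), max(p[i] for p in positives))
            for i in range(d)]

def predict(box, point):
    """Label 1 iff the point lies inside the box in every coordinate."""
    if box is None:
        return 0
    return 1 if all(lo <= x <= hi
                    for (lo, hi), x in zip(box, point)) else 0

# Example in R^3: the box is tight around the two positive points.
S = [((0.1, 0.5, 0.4), 1), ((0.3, 0.2, 0.9), 1), ((0.8, 0.8, 0.8), 0)]
box = tightest_box(S)
assert box == [(0.1, 0.3), (0.2, 0.5), (0.4, 0.9)]
assert all(predict(box, x) == y for x, y in S)
```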