Data Science Algorithms in a Week
Table of Contents
Preface 1
Chapter 1: Classification Using K Nearest Neighbors 6
Mary and her temperature preferences 6
Implementation of k-nearest neighbors algorithm 10
Map of Italy example - choosing the value of k 15
House ownership - data rescaling 18
Text classification - using non-Euclidean distances 20
Text classification - k-NN in higher dimensions 23
Summary 25
Problems 25
Chapter 2: Naive Bayes 29
Medical test - basic application of Bayes' theorem 30
Proof of Bayes' theorem and its extension 31
Extended Bayes' theorem 32
Playing chess - independent events 33
Implementation of naive Bayes classifier 34
Playing chess - dependent events 37
Gender classification - Bayes for continuous random variables 40
Summary 42
Problems 43
Chapter 3: Decision Trees 51
Swim preference - representing data with decision tree 52
Information theory 53
Information entropy 53
Coin flipping 54
Definition of information entropy 54
Information gain 55
Swim preference - information gain calculation 55
ID3 algorithm - decision tree construction 57
Swim preference - decision tree construction by ID3 algorithm 57
Implementation 58
Classifying with a decision tree 64
Classifying a data sample with the swimming preference decision tree 65