Page 18 - Data Science Algorithms in a Week
P. 18

1









                 Classification Using K Nearest



                                                                       Neighbors





            The nearest neighbor algorithm classifies a data instance based on its neighbors. The class of
            a data instance determined by the k-nearest neighbor algorithm is the class with the highest
            representation among the k-closest neighbors.

            In this chapter, we will cover the basics of the k-NN algorithm - understanding it and its
            implementation with a simple example: Mary and her temperature preferences. On the
            example map of Italy, you will learn how to choose a correct value k so that the algorithm
            can perform correctly and with the highest accuracy. You will learn how to rescale the
            values and prepare them for the k-NN algorithm with the example of house preferences. In
            the example of text classification, you will learn how to choose a good metric to measure the
            distances between the data points, and also how to eliminate the irrelevant dimensions in
            higher-dimensional space to ensure that the algorithm performs accurately.



            Mary and her temperature preferences

            As an example, if we know that our friend Mary feels cold when it is 10 degrees Celsius, but
            warm when it is 25 degrees Celsius, then in a room where it is 22 degrees Celsius, the
            nearest neighbor algorithm would guess that our friend would feel warm, because 22 is
            closer to 25 than to 10.
   13   14   15   16   17   18   19   20   21   22   23