Page 94 - Multicloud Workshop - Prework

Big to Little















The basic concept is that we start off with a lot of data. We distribute
the data across a distributed file system running on many nodes. On each
node we run a MapReduce algorithm over the data held locally, and the
partial results are combined to answer a simple question across the whole
of the large dataset. We run MapReduce passes sequentially, each one
working from the output of the last, until we have answered our query.
The result of each MapReduce pass is typically far smaller than its input
and can often be stored on a single server.
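To make the idea concrete, here is a minimal sketch in plain Python rather than a real MapReduce framework. The `partitions` list is a hypothetical stand-in for blocks of log data held on three separate nodes, and the function names are illustrative only. Each partition is mapped to a small partial count, the partial counts are reduced into one result, and a follow-up step then runs over that small result on a single machine.

```python
from collections import Counter
from functools import reduce

# Hypothetical data: each inner list stands in for the block of log
# lines stored on one node of the distributed file system.
partitions = [
    ["error disk full", "info job started"],
    ["error network down", "info job finished"],
    ["warn disk slow", "error disk full"],
]


def map_partition(lines):
    """Map step, run once per node: count log levels in the local block."""
    return Counter(line.split()[0] for line in lines)


def merge_counts(left, right):
    """Reduce step: combine two partial counts into one."""
    return left + right


def run_pass(parts):
    """One map and reduce pass over every partition."""
    partials = [map_partition(part) for part in parts]
    return reduce(merge_counts, partials, Counter())


# Pass 1: big input, small output (one count per log level).
level_counts = run_pass(partitions)

# Follow-up step: the result is small enough to process on one server.
most_common_level = max(level_counts, key=level_counts.get)

print(dict(level_counts))
print(most_common_level)
```

However large the input partitions grow, the output of the pass here is only one entry per log level, which is why the later steps can run on a single server.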

















       © 2016 Engage ESM All Rights Reserved