Page 94 - Multicloud Workshop

Page 94 - Multicloud Workshop - Prework


Big to Little

The basic concept is that we start off with a lot of data. We
distribute the data across a distributed file system running on
many nodes. On each node we run a MapReduce algorithm over the
data held locally, and the partial results are combined to answer
a simple question across the whole of the large dataset. We run
MapReduce passes sequentially, each one building on the output of
the previous pass, until we have answered our query. The result of
each MapReduce pass is typically quite small and can be stored on
a single server.
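The "big to little" idea can be sketched in a few lines of Python. This is only an illustration on a single machine, not a real cluster job: the `partitions` list stands in for blocks of data held on separate nodes, and the names `map_reduce`, `count_mapper`, `count_reducer`, `top_mapper`, and `top_reducer` are hypothetical helpers invented for this example. The first pass reduces the full dataset to a small table of counts; the second pass runs over that small table to answer the query.

```python
from collections import Counter
from functools import reduce

# Hypothetical data: each inner list stands in for the records stored on one node.
partitions = [
    ["error", "ok", "ok"],
    ["ok", "error", "timeout"],
    ["ok", "ok", "error"],
]

def map_reduce(parts, mapper, reducer):
    """Apply mapper to every partition, then fold the partial results together."""
    partials = [mapper(part) for part in parts]  # on a cluster these run in parallel
    return reduce(reducer, partials)

def count_mapper(records):
    """Map step: count each status within one node's local data."""
    return Counter(records)

def count_reducer(left, right):
    """Reduce step: merge two partial counts."""
    return left + right

def top_mapper(pairs):
    """Map step: pick the most frequent (status, count) pair in one partition."""
    return max(pairs, key=lambda pair: pair[1])

def top_reducer(left, right):
    """Reduce step: keep whichever pair has the higher count."""
    return left if left[1] >= right[1] else right

# Pass 1: large input spread over many nodes, small output.
status_counts = map_reduce(partitions, count_mapper, count_reducer)

# Pass 2: the small output of pass 1 fits in a single partition.
top_status = map_reduce([list(status_counts.items())], top_mapper, top_reducer)

print(status_counts)
print(top_status)
```

Note how the second pass needs only one partition: the output of the first pass is already small enough to sit on a single server.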
