Page 94 - Multicloud Workshop - Prework
Big to Little
The basic concept is that we start with a large volume of data
and distribute it across a distributed file system running on
many nodes. On each node we run a MapReduce job against the
portion of the data stored there; combining the per-node results
answers one simple question across the whole of the large
dataset. We run MapReduce passes sequentially until we have
answered our query. The result of each MapReduce pass is
typically quite small and can be stored on a single server.
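
The flow described above can be sketched on a single machine. This is a minimal illustration only, not how any particular MapReduce framework is implemented: the `partitions` list is made-up data standing in for the blocks held on separate nodes, and the names `map_partition`, `merge_counts`, and `map_reduce` are hypothetical helpers introduced for this example.

```python
from collections import Counter
from functools import reduce

# Hypothetical data: each inner list stands in for the block of
# records stored on one node of the distributed file system.
partitions = [
    ["error", "ok", "ok"],
    ["ok", "error", "timeout"],
    ["ok", "ok"],
]

def map_partition(records):
    """Map step: runs against one node's block and counts each status."""
    return Counter(records)

def merge_counts(left, right):
    """Reduce step: combines two partial results into one."""
    return left + right

def map_reduce(parts):
    """Run the map step per partition, then fold the partial results."""
    partials = [map_partition(p) for p in parts]  # one small result per node
    return reduce(merge_counts, partials, Counter())

# Pass 1: how often does each status occur across the whole dataset?
status_counts = map_reduce(partitions)

# Follow-up step: the pass 1 result is small, so the next question
# can be answered from it on a single machine.
most_common_status, occurrences = status_counts.most_common(1)[0]

print(status_counts)
print(most_common_status, occurrences)
```

The point of the sketch is the shrinking data size: the input is spread over several partitions, while the output of the pass is a handful of counts that fits comfortably on one server and can feed the next step.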
© 2016 Engage ESM All Rights Reserved