cs314 p. 309

Reduce Worker

A Reduce Worker receives from the Master a set of M file addresses, one for each Map worker. The Reduce worker reads these files; these reads must go across the network, and therefore may take some time and cause network congestion.

The Reduce Worker first sorts its input data by key and groups together all the data values for each key. It then runs the Reduce program on each data set.
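The sort, group, and reduce steps can be sketched in a few lines of Python. This is only an illustration, not the course's implementation: the names reduce_worker and count_reduce are hypothetical, the input is assumed to already be in memory as (key, value) pairs rather than read from Map worker files, and the Reduce program here is a simple word-count sum.

```python
from itertools import groupby
from operator import itemgetter

def reduce_worker(pairs, reduce_fn):
    """Sort (key, value) pairs by key, group the values for each key,
    and run reduce_fn on each group. Hypothetical helper for illustration."""
    # Sort on the key only, so the values never need to be comparable.
    ordered = sorted(pairs, key=itemgetter(0))
    output = []
    # groupby only groups adjacent items, which is why the sort comes first.
    for key, group in groupby(ordered, key=itemgetter(0)):
        values = [value for _, value in group]
        output.append((key, reduce_fn(key, values)))
    return output

def count_reduce(key, values):
    # Assumed Reduce program: add up the counts for one key.
    return [sum(values)]

pairs = [("the", 1), ("cat", 1), ("the", 1), ("sat", 1), ("cat", 1)]
result = reduce_worker(pairs, count_reduce)
print(result)  # [('cat', [2]), ('sat', [1]), ('the', [2])]
```

Because the pairs are sorted before grouping, the output list comes out ordered by key without any extra work.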

The result is a list of (key, list(value)) entries, which are written to the output buffer of the Reduce worker; because the input was sorted, the output is also sorted by key. When done, the Reduce worker sends the file address of its output file to the Master.

Finally, the Master can combine the output files from all the Reduce workers into a single sorted result by doing a Merge.
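A minimal Python sketch of the Merge step follows. It assumes that each Reduce worker's output is already a list sorted by key and that each key was handled by exactly one Reduce worker, so no key appears in two outputs. The name merge_outputs and the sample data are hypothetical; heapq.merge from the standard library does a k-way merge of sorted inputs.

```python
import heapq
from operator import itemgetter

def merge_outputs(worker_outputs):
    """Merge several key-sorted lists of (key, values) into one sorted list.
    Hypothetical helper; assumes every input list is already sorted by key."""
    # heapq.merge reads the inputs lazily and repeatedly yields the smallest
    # remaining head element, so no full re-sort of the data is needed.
    return list(heapq.merge(*worker_outputs, key=itemgetter(0)))

# Assumed sample outputs from two Reduce workers, each sorted by key.
outputs = [
    [("ant", [3]), ("dog", [1])],
    [("bee", [2]), ("cat", [5])],
]
merged = merge_outputs(outputs)
print(merged)  # [('ant', [3]), ('bee', [2]), ('cat', [5]), ('dog', [1])]
```

Merging already-sorted files takes one pass over the data, which is cheaper than concatenating the outputs and sorting them again.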