Atomic Commit

If multiple worker machines are working on the same data, it is necessary to ensure that only one set of result data is actually used.

An atomic commit is an operation, provided by the operating system (and, ultimately, by CPU hardware), that allows exactly one result to be committed or accepted for use. If other workers produce the same result, their copies are discarded.

In MapReduce, atomicity is provided by the file system. A worker writes its output to a temporary file and, when it finishes, renames that file to the final name. The rename is atomic: if a file by that name already exists, the rename fails and the duplicate output is discarded.
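The commit-by-rename idea can be sketched on a local file system. This is only an illustration under stated assumptions, not MapReduce's actual code: the names commit_result, run_demo, and part-00000 are hypothetical, and two workers are simulated one after the other in a single process. One caveat shapes the sketch: on POSIX systems, Python's os.rename silently replaces an existing destination, so os.link is used instead to get the "fails if the name already exists" behavior described above.

```python
import os
import tempfile


def commit_result(tmp_path, final_path):
    """Try to publish tmp_path under final_path.

    Returns True if this worker's result was committed, False if another
    result was already committed under that name.
    """
    try:
        # os.link raises FileExistsError if final_path already exists,
        # so at most one worker can ever create the final name.
        os.link(tmp_path, final_path)
        committed = True
    except FileExistsError:
        committed = False
    # The temporary name is no longer needed in either case.
    os.unlink(tmp_path)
    return committed


def run_demo():
    """Simulate two workers (hypothetical names A and B) computing the same task."""
    with tempfile.TemporaryDirectory() as workdir:
        final_path = os.path.join(workdir, "part-00000")
        outcomes = []
        for worker in ("A", "B"):
            tmp_path = os.path.join(workdir, f"tmp-{worker}")
            with open(tmp_path, "w") as f:
                f.write(f"result from {worker}\n")
            outcomes.append(commit_result(tmp_path, final_path))
        with open(final_path) as f:
            content = f.read()
    return outcomes, content
```

Calling run_demo() returns the two commit outcomes together with the content of the final file. The first worker's commit succeeds and the second is rejected, so only one set of result data is ever visible under the final name.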
