PageRank
The PageRank algorithm used by Google
expresses the ranking of a web page in terms of two components:
- a base value, (1 - d), usually 0.15
- $d \cdot \sum_{i \in \mathrm{links}} PR_i / n_i$, where $PR_i$ is
the PageRank of a page $i$ that links to this page, $n_i$
is the number of links from that page, and $d$ is the damping
factor, usually 0.85.
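Putting the two components together, the rank of a single page p can be written as one equation. This is only a restatement of the two items above; the symbol L(p), for the set of pages that link to p, is notation introduced here for illustration, and d = 0.85 follows from the base value 1 - d = 0.15.

```latex
PR(p) = (1 - d) + d \sum_{i \in L(p)} \frac{PR_i}{n_i},
\qquad d = 0.85 \;\Rightarrow\; PR(p) = 0.15 + 0.85 \sum_{i \in L(p)} \frac{PR_i}{n_i}
```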
The PageRank values can be approximated by relaxation: apply
this formula repeatedly within MapReduce until the values settle.
Each page is initially given a PageRank of 1.0; as long as every
page has at least one outgoing link, the sum of all values
always equals the number of pages.
- Map: Share the love: each page distributes its PageRank
equally across the pages it links to.
- Reduce: Each page sums the incoming values, multiplies
by 0.85 (d), and adds 0.15 (1 - d).
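The Map and Reduce steps above can be sketched in plain Python, with a dictionary standing in for the MapReduce shuffle. This is a minimal single-machine sketch, not how a real MapReduce job is written: the function names (`map_page`, `reduce_page`, `pagerank_iteration`, `pagerank`), the three-page graph, and the fixed iteration count are all assumptions made for illustration. It also assumes every page has at least one outgoing link.

```python
from collections import defaultdict

D = 0.85  # damping factor d; the base value is 1 - d = 0.15


def map_page(page, rank, out_links):
    """Map: emit an equal share of this page's rank to each page it links to."""
    share = rank / len(out_links)  # assumes out_links is non-empty
    for target in out_links:
        yield target, share


def reduce_page(shares):
    """Reduce: sum the incoming shares, multiply by d, and add 1 - d."""
    return (1 - D) + D * sum(shares)


def pagerank_iteration(graph, ranks):
    """Run one map / shuffle / reduce pass over the whole graph."""
    incoming = defaultdict(list)  # stands in for the shuffle phase
    for page, out_links in graph.items():
        for target, share in map_page(page, ranks[page], out_links):
            incoming[target].append(share)
    return {page: reduce_page(incoming[page]) for page in graph}


def pagerank(graph, iterations=20):
    """Start every page at 1.0 and apply the formula repeatedly."""
    ranks = {page: 1.0 for page in graph}
    for _ in range(iterations):
        ranks = pagerank_iteration(graph, ranks)
    return ranks


# Hypothetical three-page web: A links to B and C, B links to C, C links to A.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(graph)
```

On this graph, a single pass from the initial ranks gives A = 1.0, B = 0.575, and C = 1.425, which still sum to 3, the number of pages.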
