mapreduce-java

Here are 89 public repositories matching this topic...

The goal of this programming assignment is to compute the PageRank scores of an input set of hyperlinked Wikipedia documents using Hadoop MapReduce. The PageRank score of a web page serves as an indicator of the page's importance. Many web search engines (e.g., Google) use PageRank scores in some form to rank the results of user-submitted queries. The goals of this assignment are to:

  1. Understand the PageRank algorithm and how it works in MapReduce.
  2. Implement PageRank and execute it on a large corpus of data.
  3. Examine the output from running PageRank on Simple English Wikipedia to measure the relative importance of pages in the corpus.

To run your program on the full Simple English Wikipedia archive, you will need to run it on the dsba-hadoop cluster to which you have access.
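To make the map and reduce roles in PageRank concrete, here is a minimal in-memory sketch in plain Java. It is not the repository's code and does not use the Hadoop API: the class name `PageRankSketch`, the three-page sample graph, the damping factor of 0.85, and the fixed iteration count are all assumptions chosen for illustration. The map step has each page send an equal share of its rank to the pages it links to, and the reduce step sums the shares arriving at each page and applies damping.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PageRankSketch {
    // Assumed damping factor: 0.85 is a conventional choice, not a value from the assignment.
    static final double DAMPING = 0.85;

    // Map step: every page emits rank / outDegree to each page it links to.
    static Map<String, List<Double>> mapPhase(Map<String, List<String>> links,
                                              Map<String, Double> ranks) {
        Map<String, List<Double>> contributions = new TreeMap<>();
        // Emit an empty list per page so pages without inbound links still reach the reduce step.
        for (String page : links.keySet()) {
            contributions.put(page, new ArrayList<>());
        }
        for (Map.Entry<String, List<String>> entry : links.entrySet()) {
            List<String> targets = entry.getValue();
            if (targets.isEmpty()) {
                continue; // Simplification: rank held by dangling pages is dropped in this sketch.
            }
            double share = ranks.get(entry.getKey()) / targets.size();
            for (String target : targets) {
                contributions.computeIfAbsent(target, k -> new ArrayList<>()).add(share);
            }
        }
        return contributions;
    }

    // Reduce step: sum the shares received by each page, then apply damping.
    static Map<String, Double> reducePhase(Map<String, List<Double>> contributions,
                                           int pageCount) {
        Map<String, Double> next = new TreeMap<>();
        for (Map.Entry<String, List<Double>> entry : contributions.entrySet()) {
            double sum = 0.0;
            for (double share : entry.getValue()) {
                sum += share;
            }
            next.put(entry.getKey(), (1.0 - DAMPING) / pageCount + DAMPING * sum);
        }
        return next;
    }

    // Driver: start from a uniform distribution and repeat map + reduce.
    static Map<String, Double> run(Map<String, List<String>> links, int iterations) {
        int pageCount = links.size();
        Map<String, Double> ranks = new TreeMap<>();
        for (String page : links.keySet()) {
            ranks.put(page, 1.0 / pageCount);
        }
        for (int i = 0; i < iterations; i++) {
            ranks = reducePhase(mapPhase(links, ranks), pageCount);
        }
        return ranks;
    }

    // Hypothetical three-page graph: A links to B and C, B links to C, C links to A.
    static Map<String, List<String>> sampleGraph() {
        Map<String, List<String>> links = new TreeMap<>();
        links.put("A", Arrays.asList("B", "C"));
        links.put("B", Arrays.asList("C"));
        links.put("C", Arrays.asList("A"));
        return links;
    }

    public static void main(String[] args) {
        run(sampleGraph(), 20)
            .forEach((page, rank) -> System.out.printf("%s %.4f%n", page, rank));
    }
}
```

In an actual Hadoop job the two steps would live in separate mapper and reducer classes, the link structure would have to be passed along with each page between iterations, and a driver would chain one job per iteration. Dangling pages and a convergence test are left out here to keep the sketch short.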

  • Updated Apr 22, 2018
  • Java
