In this class we will look at the basics of the Map-Reduce process, its different components, and how they all work together.

The lab exercises are aimed at you building and running your first Map-Reduce process.
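As a rough illustration of how the pieces of a Map-Reduce process fit together, here is a minimal in-memory sketch in plain Python. It is not Hadoop code and does not use the Hadoop API. The function names (map_phase, shuffle_phase, reduce_phase, word_count) and the word-counting task are assumptions chosen for illustration only. In a real cluster the framework would handle the grouping step and distribute the work across machines.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical two-line input, just to show the data flowing through.
    print(word_count(["to be or not to be", "that is the question"]))
```

Each mapper only ever sees one line and each reducer only ever sees one key, which is what allows a framework to run many of them in parallel.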

Lab Exercises

Follow the notes very carefully. There are some configuration settings. If you miss one of these or set one incorrectly, the code will not compile and/or will not run. See the sample code.

Complete Exercise 1 and Exercise 2 before the next class.

Exercise 1 is the same as, or very similar to, the tutorial on the Hadoop website. Check out this webpage for more details; it is also an alternative location for the sample code.

WARNING: Be careful about which version of Hadoop each tutorial or example uses. Some package and function names, and their functionality, differ slightly between versions of Hadoop. For example, the older MapReduce API lives in the org.apache.hadoop.mapred package, while the newer one lives in org.apache.hadoop.mapreduce. RTFM! (Read The Fabulous Manuals), i.e. the documentation!

Other tutorials – Alternative Lab Exercises

Additional Reading and Materials

Sample code
Shakespeare data set – gdrive link
Shakespeare data set – dropbox link
Shakespeare data set – website link
Hadoop in Action – Some Case Studies
10 Hadoop Tutorials