saved by 2 people, first by evangineer on 2007-09-06, last by Ken Wei on 2008-05-08

  • Hadoop actually comes with a library of stock mappers and reducers, and in this case we could have used LongSumReducer, which does the same as our reducer.
  • We didn't set the input types, since the defaults (org.apache.hadoop.io.LongWritable for the beginning-of-line character offsets, and org.apache.hadoop.io.Text for the lines) are what we need. Also, the input format and output format (how the input files are turned into key-value pairs, and how the output key-value pairs are turned into output files) are not specified, since the default is to use text files rather than a more compact binary format.
  • We also set the Combiner class. A Combiner is just a Reduce task that runs in the same process as the Map task after the Map task has finished.
  • When you run the main method of the job, it will use a local job runner that runs Hadoop in the same JVM, which allows you to run a debugger, should you need to.
  • they may be input to a further MapReduce job.
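
The Combiner note above can be illustrated with a minimal plain-Java sketch. It does not use the Hadoop API: the class CombinerSketch, its methods, and the word-count style map step are all hypothetical stand-ins chosen for illustration, on the assumption that the job sums long values per key as LongSumReducer does. The point is only that the same summing logic can run once per map task (the combine step) and again on the merged output (the reduce step) without changing the result.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class CombinerSketch {

    // Stand-in for a summing reducer: adds up every value seen for each key.
    // (Hypothetical helper, not Hadoop's LongSumReducer itself.)
    public static Map<String, Long> sumByKey(List<Map.Entry<String, Long>> pairs) {
        Map<String, Long> totals = new TreeMap<>();
        for (Map.Entry<String, Long> pair : pairs) {
            totals.merge(pair.getKey(), pair.getValue(), Long::sum);
        }
        return totals;
    }

    // Assumed map step: emits (token, 1) for each whitespace-separated token.
    public static List<Map.Entry<String, Long>> map(String line) {
        List<Map.Entry<String, Long>> out = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                out.add(new AbstractMap.SimpleEntry<>(token, 1L));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Each string plays the role of the input seen by one map task.
        String[] splits = {"a b a", "b a"};

        List<Map.Entry<String, Long>> shuffled = new ArrayList<>();
        for (String split : splits) {
            // Combine: apply the reduce logic to one map task's output only,
            // so fewer pairs have to be passed on to the reduce step.
            shuffled.addAll(sumByKey(map(split)).entrySet());
        }

        // Reduce: apply the same logic to the combined outputs of all map tasks.
        System.out.println(sumByKey(shuffled)); // prints {a=3, b=2}
    }
}
```

Reusing the reduce logic as the combine step works here because addition is associative and commutative; a reducer without that property (one that computes an average, for example) could not be reused this way unchanged.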