Rick Wagner's Blog: Hadoop Solution for the Hoppity problem

The latest in the Code Snippet series.

A highly contrived solution to the Hoppity problem, using Hadoop.

Why in the world did I do this? After all, Hadoop is a batch framework for working with large amounts of data in parallel, hardly the right tool to solve a trivial small-data problem on a single laptop.

I did it primarily to exercise my Hadoop skills, though even here it's a light workout. But I think Hadoop is well worth learning and practicing, so here goes!
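For anyone who hasn't met the problem, here is a minimal plain-Java sketch of the Hoppity rules as the mapper further down implements them: for each integer from 1 to N, print "Hoppity" if it is divisible by 3, "Hophop" if divisible by 5, "Hop" if divisible by both, and nothing otherwise. The class and method names (HoppityRules, hopsFor) are made up for this illustration and are not part of the Hadoop job.

```java
import java.util.ArrayList;
import java.util.List;

public class HoppityRules {

    // Returns the words for 1..n in order; numbers matching no rule contribute nothing.
    static List<String> hopsFor(int n) {
        List<String> hops = new ArrayList<>();
        for (int i = 1; i <= n; i++) {
            if (i % 3 == 0 && i % 5 == 0) {
                hops.add("Hop");        // divisible by both 3 and 5
            } else if (i % 3 == 0) {
                hops.add("Hoppity");    // divisible by 3 only
            } else if (i % 5 == 0) {
                hops.add("Hophop");     // divisible by 5 only
            }
        }
        return hops;
    }

    public static void main(String[] args) {
        // Prints one word per line for N = 15.
        for (String hop : hopsFor(15)) {
            System.out.println(hop);
        }
    }
}
```

This is the whole problem on a single machine; everything Hadoop adds after this point is practice, not necessity.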

The Setup
- You have to have Hadoop installed. For that task, I used Michael Noll's excellent Hadoop-on-Ubuntu blog entry.
- If the job won't run after you follow Michael's instructions and the error is a host naming problem, make sure your hosts file maps 127.0.0.1 to 'localhost'.

Run scripts
You'll probably want to use utility scripts for the following tasks:

# Compile
javac -cp /usr/local/hadoop/hadoop-0.20.1/hadoop-0.20.1-core.jar:/usr/local/hadoop/hadoop-0.20.1/lib/commons-cli-1.2.jar Hoppity.java
jar -cfv hoppity.jar *.class

# Make a directory
bin/hadoop dfs -mkdir /user/hadoop/HoppityInput

# Copy the file containing Hoppity input into the directory
bin/hadoop dfs -copyFromLocal numHops.txt /user/hadoop/HoppityInput

# Run
bin/hadoop jar hoppity.jar Hoppity /user/hadoop/HoppityInput /user/hadoop/HoppityOutput

# View your output
bin/hadoop dfs -ls /user/hadoop/HoppityOutput
bin/hadoop dfs -cat /user/hadoop/HoppityOutput/part-r-00000

# Destroy the output directory for the inevitable re-runs as you learn
bin/hadoop dfs -rmr /user/hadoop/HoppityOutput
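
The contents of numHops.txt are not shown in this post. Judging by the mapper, which calls Integer.parseInt on each input line, a reasonable assumption is a text file with one integer per line. The sketch below writes such a file; the class name MakeHoppityInput, the helper writeInput, and the default value 15 are illustrative choices, not something taken from the original setup.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.List;

public class MakeHoppityInput {

    // Writes a single line holding n to the given file (assumed input format).
    static Path writeInput(Path file, int n) throws IOException {
        List<String> lines = Collections.singletonList(Integer.toString(n));
        return Files.write(file, lines, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Optional first argument overrides the illustrative default of 15.
        int n = args.length > 0 ? Integer.parseInt(args[0]) : 15;
        Path written = writeInput(Paths.get("numHops.txt"), n);
        System.out.println("Wrote " + written.toAbsolutePath());
    }
}
```

Run it in the directory you call -copyFromLocal from, or simply create the file by hand with a text editor.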

Finally, the code

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class Hoppity {

  // Mapper: reads one integer N per input line and emits a single record whose
  // value is the "~"-delimited Hoppity output for 1..N.
  public static class HoppityMapper 
       extends Mapper<Object, Text, Text, Text>{

    @Override
    public void map(Object key, Text value, Context context
                    ) throws IOException, InterruptedException {
        String line = value.toString().trim();
        if (line.isEmpty()){
            return; // skip blank lines rather than fail in parseInt
        }
        int theValue = Integer.parseInt(line);
        StringBuilder result = new StringBuilder();
        for (int idx = 1; idx <= theValue; idx++){
            if (isModN(idx, 3) && isModN(idx, 5)){
                result.append("Hop~");       // divisible by both 3 and 5
            } else if (isModN(idx, 3)){
                result.append("Hoppity~");   // divisible by 3 only
            } else if (isModN(idx, 5)){
                result.append("Hophop~");    // divisible by 5 only
            }
        }
        // Every record shares one key, so all output reaches the same reduce call.
        Text resultKey = new Text("ResultKey");
        Text resultValue = new Text(result.toString());
        context.write(resultKey, resultValue);
    }
    
    private boolean isModN(int num, int mod){
        return (num % mod) == 0;
    }
  }
  
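  // Reducer: splits each "~"-delimited value back into one output record per word.
  public static class HoppityReducer 
       extends Reducer<Text,Text,Text,Text> {

    @Override
    public void reduce(Text key, Iterable<Text> values, 
                       Context context
                       ) throws IOException, InterruptedException {
      Text blankText = new Text(); // empty key; only the value carries information
      for (Text val : values) {
          String[] hops = val.toString().split("~");
          for (String hop : hops){
              if (!hop.isEmpty()){ // an empty value splits to one empty string
                  context.write(blankText, new Text(hop));
              }
          }
      }      
    }
  }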

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    if (otherArgs.length != 2) {
      System.err.println("Usage: Hoppity <in> <out>");
      System.exit(2);
    }
    Job job = new Job(conf, "hoppity");
    job.setJarByClass(Hoppity.class);
    job.setMapperClass(HoppityMapper.class);
    // No combiner is set: HoppityReducer replaces the key and splits each value,
    // so running it as a combiner as well is unnecessary here.
    job.setReducerClass(HoppityReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);
    FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
    FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
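
To see the data flow without a cluster, the sketch below mimics the two stages in plain Java: the map step packs all the words for one input line into a single "~"-delimited string, and the reduce step splits that string back into one word per record. The names HoppityFlowSketch, mapLine, and reduceValue are invented for this illustration, and unlike the real job nothing here touches Hadoop's Context, keys, or HDFS.

```java
import java.util.ArrayList;
import java.util.List;

public class HoppityFlowSketch {

    // Mirrors the mapper: one input line in, one "~"-delimited string out.
    static String mapLine(String line) {
        int n = Integer.parseInt(line.trim());
        StringBuilder result = new StringBuilder();
        for (int i = 1; i <= n; i++) {
            if (i % 3 == 0 && i % 5 == 0) {
                result.append("Hop~");
            } else if (i % 3 == 0) {
                result.append("Hoppity~");
            } else if (i % 5 == 0) {
                result.append("Hophop~");
            }
        }
        return result.toString();
    }

    // Mirrors the reducer: split on "~" and keep one entry per word.
    // Empty pieces are skipped, which is a choice made in this sketch.
    static List<String> reduceValue(String value) {
        List<String> out = new ArrayList<>();
        for (String hop : value.split("~")) {
            if (!hop.isEmpty()) {
                out.add(hop);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String packed = mapLine("15");
        System.out.println("map output value: " + packed);
        for (String hop : reduceValue(packed)) {
            System.out.println(hop);
        }
    }
}
```

The "~" delimiter works only because none of the three words contains a tilde; with arbitrary payloads a safer encoding would be needed.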

Happy Coding!

2 comments:

Anonymous said...: some parts of the code is not visible; December 19, 2009 at 7:14 AM


Friday, December 18, 2009
