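Often you may want to process input data with a map function only. To do this, set mapred.reduce.tasks to zero. The Map/Reduce framework then creates no reducer tasks, and the outputs of the mapper tasks become the final output of the job. On Hadoop 2.x the same property is also available under its newer name, mapreduce.job.reduces; mapred.reduce.tasks still works as a deprecated alias.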
Hadoop Streaming supports map-only jobs through the generic option "-D mapred.reduce.tasks=0". Generic options such as -D must come before the streaming options (-input, -output, -mapper). To run a map-only job, use:
hadoop jar hadoop-streaming-2.7.1.2.4.0.0-169.jar \
    -D mapred.reduce.tasks=0 \
    -input /user/root/wordcount \
    -output /user/root/out \
    -mapper /bin/cat
We can also run a map-only job by passing the streaming option "-numReduceTasks 0":
hadoop jar hadoop-streaming-2.7.1.2.4.0.0-169.jar \
    -input /user/root/wordcount \
    -output /user/root/out \
    -mapper /bin/cat \
    -numReduceTasks 0
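The examples above use /bin/cat as the mapper, which copies the input unchanged. As a rough sketch of what a custom mapper for a map-only job might look like, the Python script below reads lines from standard input, drops empty ones, and writes the rest in upper case. The file name mapper.py and the function names map_line and run are hypothetical, chosen only for illustration. Because no reducers run, whatever the script prints ends up as-is in the job's output files.

```python
#!/usr/bin/env python3
"""Hypothetical mapper for a map-only streaming job (illustrative sketch)."""
import sys


def map_line(line):
    """Return the transformed record, or None if the line should be dropped."""
    text = line.strip()
    if not text:
        return None  # skip blank lines
    return text.upper()


def run(stdin=sys.stdin, stdout=sys.stdout):
    """Apply map_line to every input line; the output is final, not shuffled."""
    for line in stdin:
        record = map_line(line)
        if record is not None:
            stdout.write(record + "\n")


if __name__ == "__main__":
    run()
```

Assuming the script is saved as mapper.py and made executable, it could replace /bin/cat in either command above (-mapper mapper.py), with the streaming -file option used to ship it to the cluster nodes.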