I tried running the following streaming command, but I get an error. Please post an answer.



[cloudera@localhost hadoop]$ bin/hadoop jar /usr/lib/hadoop-0.20-mapreduce/contrib/streaming/hadoop-streaming-2.0.0-mr1-cdh4.4.0.jar  -file /home/cloudera/class3/streaming/mapper.py    -mapper "python /home/cloudera/class3/streaming/mapper.py" -file /home/cloudera/class3/streaming/reducer.py  -reducer "python /home/cloudera/class3/streaming/reducer.py"  -input /user/cloudera/class3/wcinput  -output /user/cloudera/class3/wcstreamingoutput
packageJobJar: [/home/cloudera/class3/streaming/mapper.py, /home/cloudera/class3/streaming/reducer.py, /tmp/hadoop-cloudera/hadoop-unjar1059483172389320074/] [] /tmp/streamjob231646892283581378.jar tmpDir=null
17/03/27 11:07:58 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
17/03/27 11:07:59 INFO mapred.FileInputFormat: Total input paths to process : 2
17/03/27 11:07:59 INFO streaming.StreamJob: getLocalDirs(): [/tmp/hadoop-cloudera/mapred/local]
17/03/27 11:07:59 INFO streaming.StreamJob: Running job: job_201703270921_0004
17/03/27 11:07:59 INFO streaming.StreamJob: To kill this job, run:
17/03/27 11:07:59 INFO streaming.StreamJob: UNDEF/bin/hadoop job  -Dmapred.job.tracker=localhost.localdomain:8021 -kill job_201703270921_0004
17/03/27 11:07:59 INFO streaming.StreamJob: Tracking URL: http://0.0.0.0:50030/jobdetails.jsp?jobid=job_201703270921_0004
17/03/27 11:08:00 INFO streaming.StreamJob:  map 0%  reduce 0%
17/03/27 11:08:58 INFO streaming.StreamJob:  map 100%  reduce 100%
17/03/27 11:08:58 INFO streaming.StreamJob: To kill this job, run:
17/03/27 11:08:58 INFO streaming.StreamJob: UNDEF/bin/hadoop job  -Dmapred.job.tracker=localhost.localdomain:8021 -kill job_201703270921_0004
17/03/27 11:08:58 INFO streaming.StreamJob: Tracking URL: http://0.0.0.0:50030/jobdetails.jsp?jobid=job_201703270921_0004
17/03/27 11:08:58 ERROR streaming.StreamJob: Job not successful. Error: NA
17/03/27 11:08:58 INFO streaming.StreamJob: killJob...
Streaming Command Failed!

 


3 Answer(s)



Hi Abhishek,

The Hadoop framework does not know how to run your mapper and reducer. There are two possible fixes:

FIX 1: explicitly call python.

-mapper "python mapper.py" -reducer "python reducer.py"
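To make FIX 1 concrete, here is a rough sketch of how the full argument list could be assembled. It assumes the scripts are still shipped with -file from their local paths (files shipped this way are made available in the task's working directory, which is why the base name is enough for -mapper and -reducer). The helper name build_streaming_args is made up for illustration; the paths are the ones from the question.

```python
import os


def build_streaming_args(jar, mapper_path, reducer_path, input_dir, output_dir):
    """Return the argument list for a Hadoop Streaming job (hypothetical helper).

    -file ships each script from its full local path; -mapper and -reducer
    then refer to the shipped script by base name only.
    """
    return [
        "bin/hadoop", "jar", jar,
        "-file", mapper_path,
        "-mapper", "python " + os.path.basename(mapper_path),
        "-file", reducer_path,
        "-reducer", "python " + os.path.basename(reducer_path),
        "-input", input_dir,
        "-output", output_dir,
    ]


if __name__ == "__main__":
    # Paths copied from the command in the question.
    args = build_streaming_args(
        "/usr/lib/hadoop-0.20-mapreduce/contrib/streaming/hadoop-streaming-2.0.0-mr1-cdh4.4.0.jar",
        "/home/cloudera/class3/streaming/mapper.py",
        "/home/cloudera/class3/streaming/reducer.py",
        "/user/cloudera/class3/wcinput",
        "/user/cloudera/class3/wcstreamingoutput",
    )
    # Print one argument per line; the list could also be passed to subprocess.call.
    for arg in args:
        print(arg)
```

The only difference from the original command is that the -mapper and -reducer values use mapper.py and reducer.py instead of the full /home/cloudera/... paths.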

FIX 2: tell Hadoop where to find the Python interpreter by putting a shebang on the first line of your *.py files. For example:

#!/usr/bin/env python
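For context, this is roughly where that line sits in a streaming script. The mapper below is only a hypothetical word-count sketch (your actual mapper.py was not posted, so the function name map_lines and the output format are assumptions); the point is that the shebang must be the very first line.

```python
#!/usr/bin/env python
"""Hypothetical word-count mapper for Hadoop Streaming (not the asker's mapper.py)."""
import sys


def map_lines(lines):
    """Yield one tab-separated 'word<TAB>1' record per whitespace-delimited word."""
    for line in lines:
        for word in line.split():
            yield "%s\t1" % word


if __name__ == "__main__":
    # Hadoop Streaming feeds input records on stdin and reads
    # key/value pairs back from stdout, one per line.
    for record in map_lines(sys.stdin):
        print(record)
```

With the shebang in place and the script marked executable (chmod +x mapper.py), it should be possible to pass the script name on its own, as in -mapper mapper.py, without the explicit python prefix.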

Hope this helps.

Thanks.



Thanks, Abhijit, for the answer. Fix 2 was already in place in my .py files. But changing -mapper "python /home/cloudera/class3/streaming/mapper.py" to -mapper "python mapper.py", and doing the same for the reducer, worked for me. Thanks for the help.



@Abhishek, thanks for the confirmation.
