Hadoop streaming not working when trying to run MapReduce through Python

I am trying to run mapper.py and reducer.py using Hadoop streaming, but it keeps telling me that the files don't exist. The files are definitely there, and I made sure they are readable. I tried on both AWS and a VM, and the result is the same.

What might I be doing wrong?

2 Answer(s)


Hello Sree

I tried running the same job and got the same error, but when I changed the script paths from the Hadoop filesystem to the local filesystem, the error went away. The command now runs successfully, but the output files it creates are 0 KB.

Please try replacing the file paths in the same way, and let me know whether you get the output files.
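For reference, here is a rough sketch of one common way to point a streaming job at scripts on the local filesystem: ship them with the generic `-files` option and refer to them by base name in `-mapper` and `-reducer`. The helper name `build_streaming_command`, the jar location, and the HDFS input/output paths are all placeholders I made up for illustration; they are not taken from this thread, so adjust them to your setup.

```python
# Sketch: assemble a Hadoop streaming command that ships local scripts with -files.
# The jar location and HDFS paths used below are placeholders, not values from this thread.
import os


def build_streaming_command(streaming_jar, mapper, reducer, hdfs_input, hdfs_output):
    """Return the argv list for a streaming job whose scripts live on the local filesystem."""
    return [
        "hadoop", "jar", streaming_jar,
        # Generic options such as -files must come before the streaming options.
        "-files", f"{mapper},{reducer}",
        # Inside the task, the shipped files are referred to by their base names.
        "-mapper", os.path.basename(mapper),
        "-reducer", os.path.basename(reducer),
        "-input", hdfs_input,
        "-output", hdfs_output,
    ]


if __name__ == "__main__":
    cmd = build_streaming_command(
        "/path/to/hadoop-streaming.jar",      # placeholder jar location
        "./mapper.py",
        "./reducer.py",
        "/user/example/input",                # placeholder HDFS input directory
        "/user/example/output",               # placeholder HDFS output directory
    )
    print(" ".join(cmd))
```

Referring to the scripts by bare name only works if they are executable and start with a shebang line; otherwise a form such as `-mapper "python mapper.py"` is commonly used instead.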


Hi Sree,

Let us know whether the Python scripts are on the local filesystem or on HDFS. If they are on HDFS, please check their execute permissions, which might be the problem. It would also help if you could paste a listing of the local or HDFS directory where the Python files are located, along with the streaming command you are running.
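If the scripts turn out to be on the local filesystem, a quick check along these lines might help narrow things down. It is only a sketch: the function name `check_script` is hypothetical, and it looks at local copies only (for files on HDFS, the permission column of `hdfs dfs -ls` shows the same information).

```python
# Sketch: sanity-check a local streaming script before submitting the job.
import os


def check_script(path):
    """Return a list of likely problems with a local script (empty if none were found)."""
    if not os.path.isfile(path):
        return [f"{path}: not found on the local filesystem"]

    problems = []
    if not os.access(path, os.X_OK):
        problems.append(f"{path}: not executable (try chmod +x)")

    if not os.access(path, os.R_OK):
        problems.append(f"{path}: not readable")
    else:
        # A script invoked by bare name needs a shebang as its first line.
        with open(path, "rb") as handle:
            first_line = handle.readline()
        if not first_line.startswith(b"#!"):
            problems.append(f"{path}: no shebang line such as #!/usr/bin/env python")

    return problems


if __name__ == "__main__":
    for script in ("mapper.py", "reducer.py"):
        for problem in check_script(script):
            print(problem)
```

An empty result does not guarantee the job will work; it only rules out the most common local causes of a "file does not exist" or "cannot run program" failure.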