Word count input files



Has anyone successfully run the word count program from Module 4? Where are the input files located? All I see in the input folder is a Help file with "grep" commands.

3 Answer(s)



Hi Elizabeth

Did you try watching the WordCount video in Module 3?

Run your first MapReduce program - WordCount

It's a short 15-minute video that explains all the steps involved.

Try watching it and following the steps. If you are using CDH4, create the files and folders under /user/cloudera.
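For anyone who wants the steps in text form, here is a rough sketch of creating an input file and putting it under /user/cloudera in HDFS. The directory name wordcount/input and the sample file contents are only assumptions for illustration, and the -p flag on -mkdir assumes a Hadoop 2-style shell such as the one in CDH4; the video may use different names.

```shell
# Hypothetical HDFS input directory; adjust to match your own setup.
INPUT_DIR=/user/cloudera/wordcount/input

# Create a small local input file (sample contents, not from the course).
printf 'hadoop rocks hadoop\ncute elephant\n' > words.txt

# Only touch HDFS if the hadoop client is actually available.
if command -v hadoop >/dev/null 2>&1; then
  # Create the input directory in HDFS (-p also creates missing parents).
  hadoop fs -mkdir -p "$INPUT_DIR"
  # Copy the local file into HDFS and list the directory to confirm.
  hadoop fs -copyFromLocal words.txt "$INPUT_DIR/words.txt"
  hadoop fs -ls "$INPUT_DIR"
else
  echo "hadoop not found on PATH; skipping the HDFS steps"
fi
```

If the listing shows words.txt, the input is in place and the job can be pointed at that directory.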

Thanks


Hi Elizabeth,

I have done it successfully. I followed the WordCount video: I created a words.txt file, copied it to the wordcount input directory, and then ran copyFromLocal to move it to HDFS.

Below are the commands from my console:
cloudera@cloudera-vm:~/wordcount$ ls
input
cloudera@cloudera-vm:~/wordcount$ cd input
cloudera@cloudera-vm:~/wordcount/input$ ls
words.txt
cloudera@cloudera-vm:~/wordcount/input$ hadoop fs -copyFromLocal words.txt /wordcount/input/words.txt
cloudera@cloudera-vm:~/wordcount/input$ hadoop fs -ls
Found 3 items
drwxr-xr-x - cloudera supergroup 0 2014-07-09 18:43 /user/cloudera/c2
drwxr-xr-x - cloudera supergroup 0 2014-07-10 16:13 /user/cloudera/class2
drwxr-xr-x - cloudera supergroup 0 2014-07-10 16:23 /user/cloudera/review
cloudera@cloudera-vm:~/wordcount/input$ hadoop fs -ls /wordcount/input
Found 1 items
-rw-r--r-- 1 cloudera supergroup 136 2014-07-12 15:29 /wordcount/input/words.txt
cloudera@cloudera-vm:~/wordcount/input$ hadoop fs -cat /wordcount/input/words.txt
hadoop rocks hadoop bigelephant
cute elephant
ramana is learning hadoop
texas
planet of the apes
xmen rocks
hadoop again
bigdata
cloudera@cloudera-vm:~/wordcount/input$ cd ..
cloudera@cloudera-vm:~/wordcount$ ls
input wordcount.jar
cloudera@cloudera-vm:~/wordcount$ hadoop jar wordcount.jar WordCount /wordcount/input/words.txt /wordcount/output
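
To check what the job wrote, one option is to compute the same counts locally and compare them with the files in the output directory. This is only a sketch: it assumes the WordCount class splits on whitespace and is case-sensitive (the usual tutorial version does, but the course jar may differ) and that the output files follow the usual part-* naming.

```shell
# Recreate the sample words.txt shown in the transcript above.
cat > words.txt <<'EOF'
hadoop rocks hadoop bigelephant
cute elephant
ramana is learning hadoop
texas
planet of the apes
xmen rocks
hadoop again
bigdata
EOF

# Count whitespace-separated words locally and print "word<TAB>count",
# sorted by word, to mimic the layout of typical WordCount output.
local_counts=$(awk '{ for (i = 1; i <= NF; i++) c[$i]++ }
                    END { for (w in c) print w "\t" c[w] }' words.txt | sort)
echo "$local_counts"

# If hadoop is available, print the job output for comparison.
# The glob is quoted so that HDFS expands it, not the local shell.
if command -v hadoop >/dev/null 2>&1; then
  hadoop fs -cat '/wordcount/output/part-*'
fi
```

For the sample file above, the local count gives hadoop 4 and rocks 2, with every other word appearing once. Note that the job will fail if /wordcount/output already exists, so remove that directory before re-running.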

Good Luck!!

Ramana


Thank you Dezyre and Ramana!