Error loading a file with Pig


7 Answer(s)


Hi Patrice,
Here is the main error:
Input(s):
Failed to read data from "hdfs://ip-172-31-7-128.us-west-1.compute.internal:8020/user/ec2-user/NASDAQ_daily_prices_A_sample1.csv"

Reason: You are accessing the data from Amazon data storage and using the wrong IP.

Hope this helps.
Thanks

I am running the command "A = LOAD './NASDAQ_daily_prices_A_sample1.csv USINGPigStorage(,',);", so how come I am passing the wrong IP? I am not passing any IP, and I am already on the Amazon server through PuTTY. Please check again.

Hi Patrice,
Sorry for the confusion.
Your job fails because Pig is not able to find the input file at ./NASDAQ_daily_prices_A_sample1.csv
Please use the "ls" command to check whether the file is available at the input location.
Please share the output of this command.
- hadoop fs -ls

Thanks.

The file is in the directory and in HDFS, as you can see in the attached screenshot.


Hi Patrice,
Please copy and paste these commands and let me know the output.
A = LOAD 'NASDAQ_daily_prices_A_sample1.csv' USING PigStorage(,',);
DESCRIBE A;
Thanks

I just saw my error while I was correcting yours. The command is incorrect at "USING PigStorage (,',)", which should be "USING PigStorage (',')". The file is comma-separated, so the comma goes inside the single quotes and not the other way around.
Thank you anyway.
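
For anyone who lands here with the same error, here is a minimal sketch of the corrected statement. It assumes the file sits in the HDFS home directory of the current user, which is what the path in the error message above suggests; the relation names A and B and the LIMIT of 5 are only illustrative.

```pig
-- Optional: confirm the file is visible in HDFS from the Grunt shell.
fs -ls NASDAQ_daily_prices_A_sample1.csv

-- The delimiter is a comma enclosed in single quotes: ','
-- A relative path is resolved against the HDFS home directory.
A = LOAD 'NASDAQ_daily_prices_A_sample1.csv' USING PigStorage(',');

-- No schema is declared here, so fields are addressed by position ($0, $1, ...).
DESCRIBE A;

-- Print a few rows to check that the fields are split on the comma.
B = LIMIT A 5;
DUMP B;
```

If the rows come back as a single field each, the delimiter passed to PigStorage is most likely still wrong.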

Hi Patrice,
Yes, I overlooked the same point as well. Thanks for the update.
Thanks