
I tried to load a file through Pig and am getting the error below.



grunt> nydailyprices = LOAD '/home/cloudera/keerthi/pig_sample_data/nasdaq-sample/NASDAQ/NASDAQ_daily_prices_A.csv' using PigStorage(',') AS (exchange:chararray,stock_symbol:chararray,date:chararray,stock_price_open:float,stock_price_high:float,stock_price_low:float,stock_price_close:float,stock_volume:float,stock_price_adj_close:float);


grunt> dump nydailyprices
2014-06-26 23:09:18,780 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
2014-06-26 23:09:18,780 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - pig.usenewlogicalplan is set to true. New logical plan will be used.
2014-06-26 23:09:18,934 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - (Name: nydailyprices: Store(hdfs://localhost/tmp/temp490928225/tmp-869449159:org.apache.pig.impl.io.InterStorage) - scope-29 Operator Key: scope-29)
2014-06-26 23:09:18,944 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2014-06-26 23:09:18,977 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2014-06-26 23:09:18,977 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2014-06-26 23:09:19,041 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2014-06-26 23:09:19,054 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2014-06-26 23:09:21,505 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2014-06-26 23:09:21,547 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2014-06-26 23:09:21,825 [Thread-6] INFO org.apache.hadoop.mapred.JobClient - Cleaning up the staging area hdfs://localhost/var/lib/hadoop-0.20/cache/mapred/mapred/staging/cloudera/.staging/job_201406142301_0005
2014-06-26 23:09:22,048 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2014-06-26 23:09:22,049 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job null has failed! Stop running all dependent jobs
2014-06-26 23:09:22,054 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2014-06-26 23:09:22,063 [main] ERROR org.apache.pig.tools.pigstats.PigStats - ERROR 2997: Unable to recreate exception from backend error: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Input path does not exist: hdfs://localhost/home/cloudera/keerthi/pig_sample_data/nasdaq-sample/NASDAQ/NASDAQ_daily_prices_A.csv
2014-06-26 23:09:22,063 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
2014-06-26 23:09:22,064 [main] INFO org.apache.pig.tools.pigstats.PigStats - Script Statistics:

HadoopVersion PigVersion UserId StartedAt FinishedAt Features
0.20.2-cdh3u0 0.8.0-cdh3u0 cloudera 2014-06-26 23:09:19 2014-06-26 23:09:22 UNKNOWN

Failed!

Failed Jobs:
JobId Alias Feature Message Outputs
N/A nydailyprices MAP_ONLY Message: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Input path does not exist: hdfs://localhost/home/cloudera/keerthi/pig_sample_data/nasdaq-sample/NASDAQ/NASDAQ_daily_prices_A.csv
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:280)
at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:944)
at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:961)
at org.apache.hadoop.mapred.JobClient.access$500(JobClient.java:170)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:880)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:833)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:833)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:807)
at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)
at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://localhost/home/cloudera/keerthi/pig_sample_data/nasdaq-sample/NASDAQ/NASDAQ_daily_prices_A.csv
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:231)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigTextInputFormat.listStatus(PigTextInputFormat.java:36)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:248)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:268)
... 14 more
hdfs://localhost/tmp/temp490928225/tmp-869449159,

Input(s):
Failed to read data from "/home/cloudera/keerthi/pig_sample_data/nasdaq-sample/NASDAQ/NASDAQ_daily_prices_A.csv"

Output(s):
Failed to produce result in "hdfs://localhost/tmp/temp490928225/tmp-869449159"

Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
null


2014-06-26 23:09:22,065 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
2014-06-26 23:09:22,071 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias nydailyprices
Details at logfile: /home/cloudera/keerthi/pig_sample_data/nasdaq-sample/NASDAQ/pig_1403849340606.log
grunt>


May I know why this is happening?

4 Answer(s)



Use the commands below.
First, make a directory in HDFS:

hadoop fs -mkdir pig;

Then move your input file into HDFS under "/user/cloudera/pig" and try loading it from there:

grunt> nydailyprices = LOAD '/user/cloudera/pig/NASDAQ_daily_prices_A.csv' using PigStorage(',') AS (exchange:chararray,stock_symbol:chararray,date:chararray,stock_price_open:float,stock_price_high:float,stock_price_low:float,stock_price_close:float,stock_volume:float,stock_price_adj_close:float);

Display the loaded data:

grunt> dump nydailyprices;
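The answer above gives the mkdir step but not the copy command itself; here is a minimal sketch of the full sequence, assuming the sample file sits at the local path from the question and you are logged in as the cloudera user (so relative HDFS paths resolve to /user/cloudera):

```shell
# 1. Create a target directory in HDFS (relative path resolves to /user/cloudera/pig).
hadoop fs -mkdir pig

# 2. Copy the CSV from the local filesystem into that HDFS directory.
hadoop fs -put /home/cloudera/keerthi/pig_sample_data/nasdaq-sample/NASDAQ/NASDAQ_daily_prices_A.csv /user/cloudera/pig/

# 3. Confirm the file landed in HDFS before running the LOAD.
hadoop fs -ls /user/cloudera/pig
```

These commands need a running Hadoop cluster (e.g. the CDH VM from the question), so run them on that machine rather than locally.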


Hi Keerthi,

If you look at the error message, it states that the input path does not exist: "ERROR 2118: Input path does not exist: hdfs://localhost/home/cloudera/keerthi/pig_sample_data/nasdaq-sample/NASDAQ/NASDAQ_daily_prices_A.csv"

It looks like the file exists at this path on the local filesystem, but NOT in HDFS.
To resolve this error, execute the following commands:
1. hadoop dfs -copyFromLocal /home/cloudera/keerthi/pig_sample_data/nasdaq-sample/NASDAQ/NASDAQ_daily_prices_A.csv /tmp/NASDAQ_daily_prices_A.csv
2. grunt> nydailyprices = LOAD '/tmp/NASDAQ_daily_prices_A.csv' using PigStorage(',') AS (exchange:chararray,stock_symbol:chararray,date:chararray,stock_price_open:float,stock_price_high:float,stock_price_low:float,stock_price_close:float,stock_volume:float,stock_price_adj_close:float);
3. grunt> dump nydailyprices;

This should work.
Please post again if you need further help.
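As a sanity check before re-running the LOAD, you can confirm which filesystem the file is actually on (a sketch, using the same paths as the steps above):

```shell
# On the local filesystem -- this is where the file currently lives:
ls -l /home/cloudera/keerthi/pig_sample_data/nasdaq-sample/NASDAQ/NASDAQ_daily_prices_A.csv

# In HDFS -- after step 1 above, this should list the copied file:
hadoop dfs -ls /tmp/NASDAQ_daily_prices_A.csv
```

If the first command succeeds and the second fails, the copy into HDFS has not happened yet.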




I am facing an error with the DUMP command. I got the same error even though I followed your instructions and copied the local file into the Hadoop directories :(


Facing the same issue:

[cloudera@quickstart ~]$ hdfs dfs -mkdir /user/cloudera/pigin
[cloudera@quickstart ~]$ hdfs dfs -put /home/cloudera/testfile /user/cloudera/pigin/testfile

--starting pig shell

grunt>
grunt> wordfile = LOAD '/user/cloudera/pigin/testfile' USING PigStorage('\n') as (linesin:chararray);
grunt> describe wordfile
wordfile: {linesin: chararray}

grunt> tempfile = LIMIT wordfile 10;
grunt> DUMP tempfile

Input(s):
Failed to read data from "/user/cloudera/pigin/testfile"

Output(s):

Job DAG:
job_local1468722327_0001 -> null,
null
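One thing worth noting about the transcript above (an observation, not a confirmed diagnosis): the job id `job_local1468722327_0001` carries the `job_local` prefix, which indicates Pig ran the job in local mode. In local mode, `/user/cloudera/pigin/testfile` is resolved against the local filesystem, where it does not exist, even though the file was correctly put into HDFS. Restarting grunt in mapreduce mode should make the LOAD read from HDFS:

```shell
# Start grunt in mapreduce mode so LOAD paths resolve against HDFS.
pig -x mapreduce

# Alternatively, stay in local mode and point LOAD at the local copy instead:
pig -x local
# grunt> wordfile = LOAD '/home/cloudera/testfile' USING PigStorage('\n') AS (linesin:chararray);
```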
