Loading data using Pig

Now that I have switched to the hduser user on the AWS machine, files can still be moved to /home/ec2-user.
I moved the file emp.txt to /home/ec2-user using WinSCP and then ran the commands below to load the data:
A = load 'emp.txt' using PigStorage(',') as (name:chararray,age:int,salary:int,dept_id:int);
A = load '/home/ec2-user/emp.txt' using PigStorage(',') as (name:chararray,age:int,salary:int,dept_id:int);

The data load fails. Please suggest the path I need to provide to pick up the file emp.txt.
I am also unable to move this file to /user/hduser, as I do not have the permissions.

6 Answer(s)


Hi Sugandha,
Please change the user to hduser: su - hduser
You can then save your file under /user/hduser/.
Hope this helps.
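A minimal sketch of those steps on the cluster node (assuming a standard HDFS layout where /user/hduser is hduser's HDFS home directory; the -mkdir -p step is only needed if the directory does not already exist):

```shell
# Switch to the hduser account, which owns /user/hduser in HDFS
su - hduser

# Create the target directory in HDFS (no-op if it already exists)
hadoop fs -mkdir -p /user/hduser

# Verify the directory and its permissions
hadoop fs -ls /user
```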


Hi Abhijit
In WinSCP I am unable to log in as the user hduser (su - hduser): access denied.
In AWS I am already logged in as hduser (via su - hduser).


Please check whether the hostname also needs to be modified when logging in to WinSCP.


Hi Sugandha,
In the case of WinSCP, you don't have to switch to hduser.
You can upload your file anywhere you want.



Hi Abhijit
I loaded the data with the command below:
A = load 'emp.txt' using PigStorage(',') as (name:chararray,age:int,salary:int,dept_id:int);
And then dump A produces the error below; it looks like a backend error.

Failed to produce result in "hdfs://ip-10-0-0-28.ec2.internal:8020/tmp/temp-291835293/tmp915702883"

Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:

2016-02-11 03:28:22,802 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
2016-02-11 03:28:22,805 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias A. Backend error : java.lang.IllegalStateException: Job in state DEFINE instead of RUNNING
Details at logfile: /home/hduser/pig_1455179139165.log


Hi Sugandha,
One thing I would like to mention is that emp.txt must be in HDFS before it can be loaded with PigStorage().
If I understand correctly, emp.txt is in /home/ec2-user/.
Please upload the file to HDFS. To do that:
- su - hduser // switch user, since hduser has permission to access /user/hduser/
- hadoop fs -put emp.txt /user/hduser/sugandha/emp.txt // upload the file

Hope this helps. Please correct me if I have misunderstood.
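Putting the whole sequence together, here is a minimal sketch (the paths, hostname, and the sugandha directory come from this thread; the grunt> prompt lines are illustrative and are typed inside Pig's Grunt shell, not in bash):

```shell
# 1. On the cluster node, switch to hduser and copy the file
#    (uploaded to /home/ec2-user via WinSCP) into HDFS
su - hduser
hadoop fs -mkdir -p /user/hduser/sugandha
hadoop fs -put /home/ec2-user/emp.txt /user/hduser/sugandha/emp.txt

# 2. Start Pig and load from the HDFS path, not the local path
pig
# grunt> A = load '/user/hduser/sugandha/emp.txt' using PigStorage(',')
# grunt>     as (name:chararray, age:int, salary:int, dept_id:int);
# grunt> dump A;
```

Loading from the HDFS path should avoid the ERROR 1066 backend failure seen above, which occurs when the file is only on the local filesystem.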