Karan =>PIG Clarifications needed
3 Answer(s)
ramanayerramsetty
Hi Ashley,
Probably Karan could answer better, but I can try answering these questions.
1. From what I know, it works like the MapReduce NASDAQ assignment, where we processed a record only if the first column was not 'exchange'. Similarly in Pig we can use something like
FILTER X BY exchange != 'exchange' (where exchange is a field you define in the schema)
so that the header record is skipped.
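A minimal sketch of that header filter, assuming a comma-separated NASDAQ file whose path and field names are hypothetical:

```pig
-- load with an explicit schema; file path and field names are assumptions
stocks = LOAD 'NASDAQ_daily.csv' USING PigStorage(',')
         AS (exchange:chararray, symbol:chararray, date:chararray, close:float);
-- the header row has the literal string 'exchange' in its first field, so drop it
no_header = FILTER stocks BY exchange != 'exchange';
```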
2. For the top 5 records, we can ORDER the relation BY a field (ASC/DESC) and then apply LIMIT 5 (note the syntax is LIMIT alias 5, not LIMIT BY 5).
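A sketch of that top-5 pattern, reusing the same assumed NASDAQ schema (path and field names are hypothetical):

```pig
stocks = LOAD 'NASDAQ_daily.csv' USING PigStorage(',')
         AS (exchange:chararray, symbol:chararray, date:chararray, close:float);
sorted = ORDER stocks BY close DESC;  -- highest closing price first
top5   = LIMIT sorted 5;              -- keep only the first five rows
DUMP top5;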
3. I tried chaining multiple commands together but couldn't get it to work. I'm not sure this applies everywhere, but for the word-count program we did use FLATTEN(TOKENIZE($0)).
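For reference, the word-count pattern built around FLATTEN(TOKENIZE(...)) looks roughly like this (the input path is hypothetical):

```pig
lines  = LOAD 'input.txt' AS (line:chararray);
-- split each line into words and flatten the resulting bag into rows
words  = FOREACH lines GENERATE FLATTEN(TOKENIZE(line)) AS word;
grpd   = GROUP words BY word;
counts = FOREACH grpd GENERATE group AS word, COUNT(words) AS total;
DUMP counts;
```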
FYI - one thing I realised is that the aggregate functions are case sensitive: use SUM, COUNT, AVG in upper case.
Jul 17 2014 06:15 AM
ramanayerramsetty
1. Or, if we know the first row is the header, we can number the rows and FILTER the relation for row number > 1.
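That idea can be sketched with Pig's RANK operator (available from Pig 0.11; the file path and field names are assumptions):

```pig
raw = LOAD 'NASDAQ_daily.csv' USING PigStorage(',')
      AS (exchange:chararray, symbol:chararray, date:chararray, close:float);
ranked    = RANK raw;                       -- prepends a 1-based row number named rank_raw
no_header = FILTER ranked BY rank_raw > 1;  -- drop the first (header) row
```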
Jul 17 2014 06:20 AM
DeZyre Support
hi Ashley,
By default Pig does not skip the first line when it loads data with PigStorage, so you normally filter the header record out in the script as described above (or use a loader that supports skipping headers, such as piggybank's CSVExcelStorage).
You can start Pig in either MapReduce (HDFS) or local mode as shown below, and from the grunt shell you can run HDFS-style commands such as mkdir, ls, etc.
pig -x mapreduce (runs against the Hadoop cluster)
pig -x local (runs against the local filesystem)
Both commands open the grunt shell.
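For reference, a quick grunt session might look like this (the directory path is hypothetical):

```pig
-- started with: pig -x local
grunt> mkdir /tmp/pig_demo   -- HDFS-style filesystem command
grunt> ls /tmp
grunt> quit
```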
Thanks
Jul 28 2014 06:38 AM