Issue in Twitter project: Hive query fails with "Error in configuring object"



Hi All,

I am getting the error below when I run the final query in the Twitter project. Please help.

Query:
--------------

hive> select t.retweeted_screen_name,
             sum(retweets) as total_retweets,
             count(*) as tweet_count
      from (
        select retweeted_status.user.screen_name as retweeted_screen_name,
               retweeted_status.text,
               max(retweeted_status.retweet_count) as retweets
        from tweets_partioned
        group by retweeted_status.user.screen_name, retweeted_status.text
      ) t
      group by t.retweeted_screen_name
      order by total_retweets DESC, tweet_count ASC
      limit 10;

Output:
---------------

Total MapReduce jobs = 3
Launching Job 1 out of 3
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_201611211027_0004, Tracking URL = http://0.0.0.0:50030/jobdetails.jsp?jobid=job_201611211027_0004
Kill Command = /usr/lib/hadoop/bin/hadoop job  -kill job_201611211027_0004
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2016-11-22 12:12:27,520 Stage-1 map = 0%,  reduce = 0%
2016-11-22 12:12:55,767 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_201611211027_0004 with errors
Error during job, obtaining debugging information...
Job Tracking URL: http://0.0.0.0:50030/jobdetails.jsp?jobid=job_201611211027_0004
Examining task ID: task_201611211027_0004_m_000002 (and more) from job job_201611211027_0004

Task with the most failures(4): 
-----
Task ID:
  task_201611211027_0004_m_000000

URL:
  http://localhost.localdomain:50030/taskdetails.jsp?jobid=job_201611211027_0004&tipid=task_201611211027_0004_m_000000
-----
Diagnostic Messages for this Task:
java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:413)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
    at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.ja

FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
MapReduce Jobs Launched: 

Job 0: Map: 1  Reduce: 1   HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec

 

 

Thanks,

Swetha.
 


1 Answer(s)


Hi Swetha,

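Your Twitter data is in JSON format. You created the table on top of that data, so every time you query it, Hive needs a deserializer (SerDe) to convert the JSON records into a format Hive supports. If the SerDe class is not available in the session, the map tasks can fail while they are being set up, which is consistent with the "Error in configuring object" in your output. So before running the query, add the JAR: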

ADD JAR /usr/lib/hive/lib/hive-serdes-1.0-SNAPSHOT.jar;

This JAR contains JSONSerDe.class, which deserializes the JSON objects.
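
As a rough sketch, the session could look like the following. The JAR path is the one given above, and tweets_partioned is the table name from your query. The LIST JARS and DESCRIBE FORMATTED checks are optional additions of mine; they assume the table was created with the JSON SerDe shipped in that JAR.

```sql
-- Register the SerDe JAR for this Hive session (path as given above;
-- adjust it if the JAR lives somewhere else on your machine).
ADD JAR /usr/lib/hive/lib/hive-serdes-1.0-SNAPSHOT.jar;

-- Optional check: the JAR should now be listed among the session resources.
LIST JARS;

-- Optional check: the "SerDe Library" row shows which SerDe class
-- the table expects, so you can confirm it is the JSON SerDe.
DESCRIBE FORMATTED tweets_partioned;
```

ADD JAR only applies to the current session, so it has to be run again in each new Hive session (or put in a .hiverc file) before you rerun the query.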

Hope this helps.

Thanks.
