hive> add jar /usr/lib/hive/lib/hive-contrib-0.10.0-cdh4.4.0.jar;
Added /usr/lib/hive/lib/hive-contrib-0.10.0-cdh4.4.0.jar to class path
Added resource: /usr/lib/hive/lib/hive-contrib-0.10.0-cdh4.4.0.jar
hive> add jar /usr/lib/hive/lib/hive-serdes-1.0-SNAPSHOT.jar;
Added /usr/lib/hive/lib/hive-serdes-1.0-SNAPSHOT.jar to class path
Added resource: /usr/lib/hive/lib/hive-serdes-1.0-SNAPSHOT.jar
hive> add jar /usr/lib/hive/lib/hive-serde-0.10.0-cdh4.4.0.jar;
Added /usr/lib/hive/lib/hive-serde-0.10.0-cdh4.4.0.jar to class path
Added resource: /usr/lib/hive/lib/hive-serde-0.10.0-cdh4.4.0.jar
hive> add jar /usr/lib/hive/lib/hive-json-serde-0.2.jar;
Added /usr/lib/hive/lib/hive-json-serde-0.2.jar to class path
Added resource: /usr/lib/hive/lib/hive-json-serde-0.2.jar
hive> add jar /usr/lib/hive/lib/hive-json-serde-0.1.jar;
Added /usr/lib/hive/lib/hive-json-serde-0.1.jar to class path
Added resource: /usr/lib/hive/lib/hive-json-serde-0.1.jar
hive> set hive.vectorized.execution.enabled=false;
hive> set hive.vectorized.execution.reduce.enabled=false;
hive> select retweeted_status.user.screen_name as retweeted_screen_name, retweeted_status.text, max(retweeted_status.retweet_count) as retweets from twitter_tweets group by retweeted_status.user.screen_name,retweeted_status.text limit 10;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapred.reduce.tasks=<number>
Starting Job = job_201705030718_0005, Tracking URL = http://0.0.0.0:50030/jobdetails.jsp?jobid=job_201705030718_0005
Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_201705030718_0005
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2017-05-03 08:34:48,474 Stage-1 map = 0%, reduce = 0%
2017-05-03 08:35:02,675 Stage-1 map = 28%, reduce = 0%
2017-05-03 08:35:06,739 Stage-1 map = 66%, reduce = 0%
2017-05-03 08:35:10,755 Stage-1 map = 0%, reduce = 0%
2017-05-03 08:35:20,878 Stage-1 map = 47%, reduce = 0%
2017-05-03 08:35:26,906 Stage-1 map = 0%, reduce = 0%
2017-05-03 08:35:38,972 Stage-1 map = 43%, reduce = 0%
2017-05-03 08:35:45,004 Stage-1 map = 0%, reduce = 0%
2017-05-03 08:35:56,057 Stage-1 map = 47%, reduce = 0%
2017-05-03 08:36:02,087 Stage-1 map = 0%, reduce = 0%
2017-05-03 08:36:05,166 Stage-1 map = 100%, reduce = 100%
Ended Job = job_201705030718_0005 with errors
Error during job, obtaining debugging information...
Job Tracking URL: http://0.0.0.0:50030/jobdetails.jsp?jobid=job_201705030718_0005
Examining task ID: task_201705030718_0005_m_000002 (and more) from job job_201705030718_0005
Task with the most failures(4):
-----
Task ID:
task_201705030718_0005_m_000000
URL:
http://localhost.localdomain:50030/taskdetails.jsp?jobid=job_201705030718_0005&tipid=task_201705030718_0005_m_000000
-----
Diagnostic Messages for this Task:
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable {"filter_level":"low","retweeted":false,"in_reply_to_screen_name":null,"truncated":false,"lang":"en","in_reply_to_status_id_str":null,"id":856937051454930947,"in_reply_to_user_id_str":null,"timestamp_ms":"1493144688844","in_reply_to_status_id":null,"created_at":"Tue Apr 25 18:24:48 +0000 2017","favorite_count":0,"place":null,"coordinates":null,"text":"RT @asthajain22: Well written #IoT #internetofthings #cybersecurity #cyberexpert #technology #tech #technews #bigdata #CTO\u2026 ","contributors":null,"retweeted_status":{"filter_level":"low","retweeted":false,"in_reply_to_screen_name":null,"possibly_sensitive":false,"truncated":true,"lang":"en","in_reply_to_status_id_str":null,"id":856203174876659712,"in_reply_to_user_id_str":null,"in_reply_to_status_id":null,"created_at":"Sun Apr 23 17:48:39 +0000 2017","favorite_count":5,"display_text_range":[0,140],"place":null,"coordinates":null,"text":
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
MapReduce Jobs Launched:
Job 0: Map: 1 Reduce: 1 HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
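
(For context: the twitter_tweets table queried above is assumed to be defined with the Cloudera Twitter SerDe, com.cloudera.hive.serde.JSONSerDe, which also shows up in the stack trace below. A minimal sketch of such a definition, covering only the columns the query touches, might look like the following; the LOCATION is a placeholder and the real table very likely declares more fields.)

CREATE EXTERNAL TABLE twitter_tweets (
  id BIGINT,
  created_at STRING,
  text STRING,
  retweeted_status STRUCT<
    text:STRING,
    user:STRUCT<screen_name:STRING>,
    retweet_count:INT>
)
ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe'
LOCATION '/user/flume/tweets';   -- placeholder path
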
Hi Abhishek,
Check these lines from the error output; they show the offending record in your data:
{"text":"cybersecurity","indices":[53,67]},{"text":"cyberexpert","indices":[68,80]},{"text":"technology","indices":[81,92]},{"text":"tech","indices":[93,98]},{"text":"technews","indices":[99,108]},{"text":"bigdata","indices":[109,117]},{"text":"CTO","indices":[118,122]}],"user_mentions":[{"id":136975435,"name":"Astha Jain Podcast","ind
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:647)
at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:141)
... 8 more
Caused by: org.apache.hadoop.hive.serde2.SerDeException: org.codehaus.jackson.JsonParseException: Unexpected end-of-input: was expecting closing '"' for name
at [Source: java.io.StringReader@20ec6bb1; line: 1, column: 9161]
at com.cloudera.hive.serde.JSONSerDe.deserialize(JSONSerDe.java:128)
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:632)
... 9 more
Caused by: org.codehaus.jackson.JsonParseException: Unexpected end-of-input: was expecting closing '"' for name
at [Source: java.io.StringReader@20ec6bb1; line: 1, column: 9161]
If you look at the error, it says "Unexpected end-of-input: was expecting closing '"' for name".
If you check the record in your data, you will see that it ends abruptly with "user_mentions":[{"id":136975435,"name":"Astha Jain Podcast","ind".
The record is truncated, so please correct the structure (or drop such records) so the JSON parser can read the data.
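
One way to do that, as a rough sketch (the paths below are placeholders), is to stage the raw tweets as plain text lines and keep only the rows that Hive's built-in get_json_object can actually parse, then load the SerDe-backed table from the cleaned output:

-- Staging table: one raw JSON document per line, no JSON SerDe involved.
CREATE EXTERNAL TABLE twitter_tweets_raw (json_line STRING)
LOCATION '/user/flume/tweets_raw';                        -- placeholder path

-- get_json_object returns NULL for malformed JSON, so truncated tweets
-- (and any records without an "id" field) are dropped by this filter.
INSERT OVERWRITE DIRECTORY '/user/flume/tweets_clean'     -- placeholder path
SELECT json_line
FROM twitter_tweets_raw
WHERE get_json_object(json_line, '$.id') IS NOT NULL;

Alternatively, some JSON SerDes can skip bad records on their own; for example, the openx JsonSerDe (org.openx.data.jsonserde.JsonSerDe) supports WITH SERDEPROPERTIES ("ignore.malformed.json" = "true"). The Cloudera SerDe used here does not necessarily offer an equivalent switch.
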
Hope this helps.
Thanks.
Thanks, Abhijit, for the reply. Yes, I saw the log file, but where do I check the data, and where do I correct the structure? Please explain every step in detail so that I can complete my project. Thanks in advance; please do reply, your help is needed.
Hi Abhishek,
The issue seems to be caused by the Twitter data itself. Could you please share the Twitter data with me, along with the Hive queries?
Thanks.
Hi Abhijit, thanks for the reply. I did the whole thing again: I collected new Twitter data and ran the queries again, and this time it worked. Thanks for your help.