What is the difference between Hive and HBase (NoSQL)?


1 Answer(s)


Hi Aditya,
HBase is a NoSQL database.
Apache Hive is a de facto standard for SQL-in-Hadoop. Hive is a front end that parses SQL statements, generates logical plans, optimizes them, and translates them into physical plans that are executed as MapReduce jobs. Apache Hive is designed as a data warehouse system that eases ad hoc queries and data aggregation on massive data sets stored in HDFS.
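
To make the "SQL becomes MapReduce" idea more concrete, here is a rough sketch in plain Python of how a query like SELECT dept, COUNT(*) ... GROUP BY dept could be broken into a map step and a reduce step. This is only an illustration of the idea, not what Hive actually generates; the sample lines and the function names (map_phase, shuffle, reduce_phase, run_group_by_count) are made up for the example.

```python
from collections import defaultdict

# Hypothetical input: lines of a comma-separated file (name,dept).
lines = ["alice,sales", "bob,engineering", "carol,sales"]

def map_phase(line):
    # Emit (key, 1) for the GROUP BY column, here the second field.
    _name, dept = line.split(",")
    yield dept, 1

def shuffle(pairs):
    # Group all emitted values by key, as happens between map and reduce.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    # COUNT(*) for one group is the sum of the emitted ones.
    return key, sum(values)

def run_group_by_count(input_lines):
    pairs = [pair for line in input_lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

counts = run_group_by_count(lines)
```

With Hive you would only write the SQL-like query; a decomposition along these lines is done for you.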

HBase is a real-time, open-source, column-oriented, distributed NoSQL database written in Java. It is modelled after Google's BigTable and is a key-value store organized into column families. It is built on top of Apache Hadoop and stores its data in HDFS.
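
As a rough picture of what "key-value column family store" means, the sketch below models it in plain Python as row key -> column family -> column qualifier -> value, with rows scanned in sorted key order. It is a toy model and not the HBase client API: the class ToyColumnFamilyStore, its methods, and the sample rows are all made up for illustration, and real HBase features such as cell versions and timestamps are left out.

```python
from collections import defaultdict

class ToyColumnFamilyStore:
    """Toy in-memory model of a column family store (not the HBase API)."""

    def __init__(self):
        # row key -> column family -> column qualifier -> value
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, family, qualifier, value):
        self._rows[row_key][family][qualifier] = value

    def get(self, row_key):
        # Return a plain-dict copy of the row, or {} if the row is absent.
        if row_key not in self._rows:
            return {}
        return {fam: dict(cols) for fam, cols in self._rows[row_key].items()}

    def scan(self, start_row, stop_row):
        # Rows come back in sorted key order; start inclusive, stop exclusive.
        for key in sorted(self._rows):
            if start_row <= key < stop_row:
                yield key, self.get(key)

# Hypothetical sample data.
store = ToyColumnFamilyStore()
store.put("user#001", "info", "name", "alice")
store.put("user#001", "info", "dept", "sales")
store.put("user#002", "info", "name", "bob")
store.put("item#001", "info", "name", "widget")

user_keys = [key for key, _row in store.scan("user#", "user$")]
```

Note that everything is looked up by row key or by a range of row keys. There is no SQL-style WHERE clause on arbitrary columns, which is the main practical difference from Hive.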

Hive is used to query files stored in HDFS by defining a "virtual" table over them and running SQL-like queries on that table.
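
The "virtual" table idea can be sketched in plain Python: the file stays as it is, and a separate table definition (column names and types) is applied only when the data is read. This is just an illustration of the concept, not Hive code; the file contents, the schema, and the function read_virtual_table are made up for the example.

```python
import csv
import io

# Hypothetical contents of a comma-delimited file.
raw_file = "1,alice,sales\n2,bob,engineering\n3,carol,sales\n"

# Hypothetical "table definition": column names and types, kept apart from the data.
schema = [("id", int), ("name", str), ("dept", str)]

def read_virtual_table(text, table_schema):
    # Apply the schema at read time; the underlying file is never modified.
    for fields in csv.reader(io.StringIO(text)):
        yield {name: cast(value) for (name, cast), value in zip(table_schema, fields)}

# Roughly: SELECT name FROM employees WHERE dept = 'sales'
sales_names = [
    row["name"]
    for row in read_virtual_table(raw_file, schema)
    if row["dept"] == "sales"
]
```

The file can still be read by other tools, and dropping the table definition does not have to touch the data.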
HBase is a full-fledged NoSQL database. The difference between HBase and Hive is that Hive is not a database. It is a layer that maps your files onto a table-like structure so that you can run SQL-like queries; Hive converts these queries into MapReduce jobs, so you don't have to write MapReduce jobs yourself. HBase, by contrast, is a database, but it is not queried with SQL, so an end user or analyst has to put in a lot of work to learn how to extract data from it. For that reason it is not recommended for end users.

Hive does not fall into either category: it is neither a relational database nor a NoSQL database, although its query syntax is similar to SQL.

Hope this helps.
Thanks.