Hive — How to install in 5 Steps in Windows 10

An easy to go guide for installing Hive in Windows 10.

Image taken from Google images

1. Prerequisites

Hardware Requirement
* RAM — Min. 8GB, if you have SSD in your system then 4GB RAM would also work.
* CPU — Min. Quad-core, with at least 1.80GHz
JRE 1.8 — Offline installer for JRE
Java Development Kit — 1.8
A Software for Un-Zipping like 7Zip or Win Rar
* I will be using 64-bit windows for the process, please check and download the version supported by your system x86 or x64 for all the software.
Hadoop
* I am using Hadoop-2.9.2, you can also use any other STABLE version for Hadoop.
* If you don’t have Hadoop, you can refer to installing it from Hadoop: How to install in 5 Steps in Windows 10.
MySQL Query Browser
Download Hive zip
* I am using Hive-3.1.2, you can also use any other STABLE version for Hive.

Fig 1:- Download Hive-3.1.2

2. Unzip and Install Hive

After Downloading the Hive, we need to Unzip the apache-hive-3.1.2-bin.tar.gz file.

Fig 2:- Extracting Hive Step-1

Once extracted, we would get a new file apache-hive-3.1.2-bin.tar
Now, once again we need to extract this tar file.

Fig 3:- Extracting Hive Step-2

Now we can organize our Hive installation, we can create a folder and move the final extracted file in it. For Eg. :-

Fig 4:- Hive Directory

Please note while creating folders, DO NOT ADD SPACES IN BETWEEN THE FOLDER NAME.(it can cause issues later)
I have placed my Hive in D: drive you can use C: or any other drive also.

3. Setting Up Environment Variables

Another important step in setting up a work environment is to set your Systems environment variable.

To edit environment variables, go to Control Panel > System > click on the “Advanced system settings” link
Alternatively, We can Right-click on This PC icon and click on Properties and click on the “Advanced system settings” link
Or, the easiest way is to search for Environment Variable in the search bar and there you GO…😉

Fig. 5:- Path for Environment Variable

Fig. 6:- Advanced System Settings Screen

3.1 Setting HIVE_HOME

Open environment Variable and click on “New” in “User Variable”

Fig. 7:- Adding Environment Variable

On clicking “New”, we get the below screen.

Fig. 8:- Adding HIVE_HOME

Now as shown, add HIVE_HOME in variable name and path of Hive in Variable Value.
Click OK and we are half done with setting HIVE_HOME.

3.2 Setting Path Variable

The last step in setting the Environment variable is setting Path in System Variable.

Fig. 9:- Setting Path Variable

Select Path variable in the system variables and click on “Edit”.

Fig. 10:- Adding Path

Now we need to add these paths to Path Variable:-
* %HIVE_HOME%\bin
Click OK and OK. & we are done with Setting Environment Variables.

3.3 Verify the Paths

Now we need to verify that what we have done is correct and reflecting.
Open a NEW Command Window
Run following commands.

echo %HIVE_HOME%

4. Editing Hive

Once we have configured the environment variables next step is to configure Hive.

4.1 Replacing bins

The first step in configuring the hive is to download and replace the bin folder.

Go to this GitHub Repo and download the bin folder as a zip.
Extract the zip and replace all the files present under the bin folder to %HIVE_HOME%\bin

Note:- If you are using different version of HIVE then please search for its respective bin folder and download it.

4.2 Creating File Hive-site.xml

Now we need to create the Hive-site.xml file in the hive for configuring it:-

(We can find these files in Hive -> conf -> hive-default.xml.template)

We need to copy the hive-default.xml.template file and paste it in the same location and rename it to hive-site.xml. This will act as our main Config file for Hive.

Fig. 11:- Creating Hive-site.xml

4.3 Editing Configuration Files

4.3.1 Editing the Properties

Now Open the newly created Hive-site.xml and we need to edit the following properties

<name>hive.metastore.uris</name>

<value>thrift://<Your IP Address>:9083</value>

<name>hive.downloaded.resources.dir</name>

<value><Your drive Folder>/${hive.session.id}_resources</value>

<name>hive.exec.scratchdir</name>

<value>/tmp/mydir</value>

Replace the value for <Your IP Address> with the IP Address of your System and replace <Your drive Folder> with the Hive folder Path.

4.3.2 Removing Special Characters

This is a short step and we need to remove all the character present in the hive-site.xml file.

4.3.3 Adding few More Properties

Now we need to add the following properties as it is in the hive-site.xml File.

<name>hive.querylog.location</name>

<value>$HIVE_HOME/iotmp</value>

<description>Location of Hive run time structured log file</description>

</property>

<name>hive.exec.local.scratchdir</name>

<value>$HIVE_HOME/iotmp</value>

<description>Local scratch space for Hive jobs</description>

</property>

<name>hive.downloaded.resources.dir</name>

<value>$HIVE_HOME/iotmp</value>

<description>Temporary local directory for added resources in the remote file system.</description>

</property>

Great..!!! We are almost done with the Hive part, for configuring MySQL database as Metastore for Hive, we need to follow the below steps:-

4.4 Creating Hive User in MySQL

The next important step in configuring Hive is to create users for MySQL.
These Users are used for connecting Hive to MySQL Database for reading and writing data from it.

Note:- You can skip this step if you have created the hive user while SQOOP installation.

Firstly, we need to open the MySQL Workbench and open the workspace(default or any specific, if you want). We will be using the default workspace only for now.

Fig 12:- Open MySQL Workbench

Now Open the Administration option in the Workspace and select Users and privileges option under Management.

Fig 13:- Opening Users and Privileges

Now select Add Account option and Create an new user with Login Name as hive and Limit to Host Mapping as the localhost and Password of your choice.

Fig 14:- Creating Hive User

Now we have to define the roles for this user under Administrative Roles and select DBManager ,DBDesigner and BackupAdmin Roles.

Fig 15:- Assigning Roles

Now we need to grant schema privileges for the user by using Add Entry option and selecting the schemas we need access to.

Fig 16:- Schema Privileges

I am using schema matching pattern as %_bigdata% for all my bigdata related schemas. You can use other 2 options also.

After clicking OK we need to select All the privileges for this schema.

Fig 17:- Select All privileges in the schema

Click Apply and we are done with the creating Hive user.

4.5 Granting permission to Users

Once we have created the user hive the next step is to Grant All privileges to this user for all the Tables in the previously selected Schema.

Open the MySQL cmd Window. We can open it by using the Window’s Search bar.

Fig 18:- MySQL cmd

Upon opening it will ask for your root user password(created while setting up MySQL).
Now we need to run the below command in the cmd window.

grant all privileges on test_bigdata.* to 'hive'@'localhost';

where test_bigdata will be your schema name and hive@localhost will be the user name @ Hostname.

4.6 Creating Metastore

Now we need to create our own metastore for Hive in MySQL...

Firstly, we need to create a database for metastore in MySQL OR we can use the one which used in the previous step test_bigdata in my case.

Now Navigate to the below path

hive -> scripts -> metastore -> upgrade -> mysql and execute the file hive-schema-3.1.0.mysql in MySQL in your database.

Note:- If you are using a different Database, select the folder for the same in upgrade folder and execute the hive-schema file.

4.7 Adding Few More Properties(Metastore related Properties)

Finally, we need to open our hive-site.xml file once again and make some changes their, these are related to Hive metastore that’s why did not add them in starting to distinguish between the different set of properties.

<name>javax.jdo.option.ConnectionUserName</name>

<description>Username to use against metastore database</description>

</property>

<name>javax.jdo.option.ConnectionURL</name>

<value>jdbc:mysql://localhost:3306/<Your Database>?createDatabaseIfNotExist=true</value>

JDBC connect string for a JDBC metastore.

To use SSL to encrypt/authenticate the connection, provide database-specific SSL flag in the connection URL.

For example, jdbc:postgresql://myhost/db?ssl=true for postgres database.

</description>

</property>

<name>hive.metastore.warehouse.dir</name>

<value>hdfs://localhost:9000/user/hive/warehouse</value>

<description>location of default database for the warehouse</description>

</property>

<name>javax.jdo.option.ConnectionPassword</name>

<description>password to use against metastore database</description>

</property>

<name>datanucleus.schema.autoCreateSchema</name>

</property>

<name>datanucleus.schema.autoCreateTables</name>

</property>

<name>datanucleus.schema.validateTables</name>

<description>validates existing schema against code. turn this on if you want to verify existing schema</description>

</property>

Replace the value for <Hive Password> with the hive user password that we created in MySQL user creation. And <Your Database> with the database that we used for metastore in MySQL.

5. Starting Hive

5.1 Starting Hadoop

Now we need to start a new Command Prompt remember to run it as administrator to avoid permission issues and execute the below commands.

start-all.cmd

Fig. 19:- start-all.cmd

All the 4 daemons should be UP and running.

5.2 Starting Hive Metastore

Open a cmd window, run the below command to start the Hive metastore.

hive --service metastore

Fig 20:- Starting Hive Metastore

5.3 Starting Hive

Now open a new cmd window and run the below command to start Hive.

hive

6. Common Issues

6.1 Unable to export or import data in hive

The 1st common issue that we face after starting Hive is that we are unable to import Or Export

Sol:- We need to edit the below property and make it false.

<property>                      <name>hive.metastore.event.db.notification.api.auth</name>     <value>false</value>    

<description>

      Should metastore do authorization against database notification related APIs such as get_next_notification.

      If set to true, then only the superusers in proxy settings have the permission

    </description>

  </property>

6.2 Join not Working

We need to run the below commands before running the join query if we face an issue while running a join query:-

set hive.auto.convert.join=false;

set hive.auto.convert.join.noconditionaltask=false;

Because without these hive tries a map-side join which fails, for normal join set these param as false.

7. Congratulations..!!!!🎉

Congratulation! We have successfully installed Hive.
There are chances that some of us might have faced some issues… Don’t worry it's most likely due to some small miss or incompatible software. If you face any such issue please visit all the steps once again carefully and verify for the right software versions.
If you still can’t get Hive up and running, Don’t hesitate to describe your problem below in the comment section.

Learn Hadoop Installation in 5 steps

Learn SQOOP Installation in 5 steps

Learn Pig Installation in 5 steps

Happy Learning… !!! 🙂

Comments

dfJuly 5, 2021 at 10:04 PM
Hi,

I installed Hive but while opening hive shell I am getting error as : Connection timed out.(Cannot connect to metastore)
Can you please help here as to how I should correct this.
dfJuly 6, 2021 at 4:34 PM
Great! It worked. I can open hive shell now.
Thank you so much.
dfJuly 24, 2021 at 2:56 PM
Hi Team,

I have created one table in Hive with table definition as below:

Create external table student( name string, age int, gender string) row format delimited fields terminated by '|' lines terminated by '\n' Location '/usr/shash';

And thereafter I have loaded data in above table as below:

LOAD DATA LOCAL INPATH 'F:/TestingData/External_legato.txt' INTO table student;

And data in my file is as below:
|VSP|26|M|
|PS|26|M|
|TG|26|F|

After running the command select * from student I am getting result as below:

NULL 26
NULL 26
NULL 26
Can you please help here as why am I getting result like this.
dfAugust 9, 2021 at 2:34 PM
Thank you for this great explanation .I tried with other datasets as well. It's working the way you explained.
Thanikaivel BMarch 1, 2022 at 12:01 PM
Hi Shashank, I installed hive when use command start-all.cmd, it works but when i start command "hive --service metastore" and "hive". It show message( 'hive' is not recognized as an internal or external command, operable program or batch file).

QuickDataScience | Quick & Easy Data Science

Search This Blog

Hive — How to install in 5 Steps in Windows 10

1. Prerequisites

2. Unzip and Install Hive

3. Setting Up Environment Variables

3.1 Setting HIVE_HOME

3.2 Setting Path Variable

3.3 Verify the Paths

4. Editing Hive

4.1 Replacing bins

4.2 Creating File Hive-site.xml

4.3 Editing Configuration Files

4.3.1 Editing the Properties

4.3.2 Removing Special Characters

4.3.3 Adding few More Properties

4.4 Creating Hive User in MySQL

4.5 Granting permission to Users

4.6 Creating Metastore

4.7 Adding Few More Properties(Metastore related Properties)

5. Starting Hive

5.1 Starting Hadoop

5.2 Starting Hive Metastore

5.3 Starting Hive

6. Common Issues

6.1 Unable to export or import data in hive

6.2 Join not Working

7. Congratulations..!!!!🎉

Labels

Comments

Post a Comment