Posts

Showing posts with the label Apache Pig

Partitioning in HIVE - Learning by Doing

Previous article: Partitioning in Hive

We covered the theory behind partitioning in Hive in our previous article. Time to get our hands dirty now.

We will follow this pattern for the coding part:

1. Hadoop installation.
2. Hive installation.
3. Static partitioning. (The theory is covered in the previous article.)
4. Dynamic partitioning. (The theory is covered in the previous article.)

Hopefully you have installed Hadoop and Hive and have both running.

Partitioning in Hive

What is Partitioning?

In simple words, partitioning is the process of dividing something into sections or parts so that it becomes easier to understand and manage. We use this concept in our everyday routine as well, to simplify tasks and save time, but we do it so instinctively that we hardly notice.

Let's look at an example to get familiar with the concept. Suppose we have a deck of cards and need to fetch the "Jack of Spades" from it. There are two ways to accomplish this task:

1. We can turn over every card one by one, starting from the top or bottom, until we reach our card.
2. We can group the deck by suit, i.e. clubs, hearts, spades, diamonds. Now, as soon as we hear "Spades", we know which group to look in, which cuts the search to about a quarter of the deck.

Grouping our data according to a specific category reduced our work and saved energy, time and effort.

Defining in Technical Terms
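As a rough illustration of the card example above, here is a minimal Python sketch (plain Python, not Hive code). The helper names `build_deck`, `full_scan`, `partition_by_suit` and `partitioned_scan` are hypothetical and introduced only for this illustration, and the fixed ordering of the deck is an assumption made so that the numbers are repeatable.

```python
from collections import defaultdict

# Assumed fixed ordering, chosen only to keep the example deterministic.
SUITS = ["clubs", "hearts", "spades", "diamonds"]
RANKS = ["Ace", "2", "3", "4", "5", "6", "7", "8", "9", "10",
         "Jack", "Queen", "King"]


def build_deck():
    """Return an unshuffled 52-card deck as (rank, suit) tuples."""
    return [(rank, suit) for suit in SUITS for rank in RANKS]


def full_scan(cards, target):
    """Turn over cards one by one; return how many were inspected."""
    for inspected, card in enumerate(cards, start=1):
        if card == target:
            return inspected
    return None  # target is not in this pile


def partition_by_suit(deck):
    """Group the deck into one pile per suit (the 'partitions')."""
    piles = defaultdict(list)
    for rank, suit in deck:
        piles[suit].append((rank, suit))
    return piles


def partitioned_scan(piles, target):
    """Look only inside the pile that matches the target's suit."""
    _, suit = target
    return full_scan(piles[suit], target)


if __name__ == "__main__":
    deck = build_deck()
    target = ("Jack", "spades")
    piles = partition_by_suit(deck)
    print("cards inspected without grouping:", full_scan(deck, target))
    print("cards inspected with grouping:   ", partitioned_scan(piles, target))
```

With grouping, the search never has to look at more than 13 cards, whereas the ungrouped search may have to look at all 52. Partitioning a table on a column follows the same idea: a query that filters on that column only needs to read the matching group.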

Spark — How to install in 5 Steps in Windows 10

An easy-to-follow guide for installing Spark on Windows 10.

Image taken from Google Images

1. Prerequisites

Hardware requirements
* RAM: min. 8GB; if you have an SSD in your system, 4GB RAM would also work.
* CPU: min. quad-core, with at least 1.80GHz

JRE 1.8: offline installer for JRE

Java Development Kit: 1.8

Unzipping software such as 7-Zip or WinRAR
* I will be using 64-bit Windows for the process; please check and download the version supported by your system (x86 or x64) for all the software.

Hadoop
* I am using Hadoop-2.9.2; you can also use any other STABLE version of Hadoop.
* If you don't have Hadoop, you can install it by following Hadoop: How to install in 5 Steps in Windows 10.

MySQL Query Browser

Download Spark zip
* I am using Spark 3.1.1; you can also use any other STABLE version of Spark.
* The latest release of Spark is 3.1.2 (shown in the image below), released in June '21.

Fig 1:- Download Spark-3.1.2

SQOOP — How to install in 5 Steps in Windows 10

An easy-to-follow guide for installing SQOOP on Windows 10.

Image taken from Google Images

1. Prerequisites

Hardware requirements
* RAM: min. 8GB; if you have an SSD in your system, 4GB RAM would also work.
* CPU: min. quad-core, with at least 1.80GHz

JRE 1.8: offline installer for JRE

Java Development Kit: 1.8

Unzipping software such as 7-Zip or WinRAR
* I will be using 64-bit Windows for the process; please check and download the version supported by your system (x86 or x64) for all the software.

Hadoop
* I am using Hadoop-2.9.2; you can also use any other STABLE version of Hadoop.
* If you don't have Hadoop, you can install it by following Hadoop: How to install in 5 Steps in Windows 10.

MySQL Query Browser

Download SQOOP zip
* I am using SQOOP-1.4.7; you can also use any other STABLE version of SQOOP.

Fig 1:- Download Sqoop 1.4.7

PIG: How to install in 5 Steps in Windows 10

An easy-to-follow guide for installing PIG on Windows 10.

Image taken from Google Images

1. Prerequisites

Hardware requirements
* RAM: min. 8GB; if you have an SSD in your system, 4GB RAM would also work.
* CPU: min. quad-core, with at least 1.80GHz

JRE 1.8: offline installer for JRE

Java Development Kit: 1.8

Unzipping software such as 7-Zip or WinRAR
* I will be using 64-bit Windows for the process; please check and download the version supported by your system (x86 or x64) for all the software.

Hadoop
* I am using Hadoop-2.9.2; you can also use any other STABLE version of Hadoop.
* If you don't have Hadoop, you can install it by following Hadoop: How to install in 5 Steps in Windows 10.

MySQL Query Browser

Download PIG zip
* I am using PIG-0.17.0; you can also use any other STABLE version of Apache Pig.

Fig 1:- Download PIG-0.17.0