

Internal VS External

Introduction

Hive may not be the first term that pops into our minds when we talk about Hadoop and Big Data, but it is definitely a term, a tool, a technology that everyone discusses as they proceed in their Big Data journey and never parts from.

In simple words, if we want to explain what Hive is to a new data science enthusiast in a single line, we can say "Hive is the SQL for big data". Why? Because it sits on top of Hadoop and is used to manage, query and analyse huge volumes of structured data.

Hive is a data warehouse infrastructure used to process, query and analyse structured data in Hadoop. Structured data is data with a definite structure, i.e. a table format. Hive is designed to resemble SQL: a similar interface and similar queries. The difference is that, just as other technologies in the Hadoop ecosystem use MapReduce to perform their tasks, SQL-style queries in Hive, written in HQL (Hive Query Language), are run as MapReduce jobs.
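To make the idea of a query running as a MapReduce job more concrete, here is a minimal conceptual sketch in plain Python. It imitates how a query such as SELECT dept, COUNT(*) FROM employees GROUP BY dept could be broken into a map step, a shuffle step and a reduce step. The table, its rows and all function names are hypothetical and chosen only for illustration; Hive's real query planner is far more involved, and this sketch does not use Hive or Hadoop at all.

```python
from collections import defaultdict

# Hypothetical rows of a structured table (dept, name); not taken from the post.
employees = [
    ("sales", "asha"),
    ("sales", "ravi"),
    ("hr", "meera"),
]


def map_phase(rows):
    """Emit one (key, 1) pair per row, keyed by the GROUP BY column."""
    for dept, _name in rows:
        yield dept, 1


def shuffle_phase(pairs):
    """Collect emitted values under their key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values for each key, which plays the role of COUNT(*)."""
    return {key: sum(values) for key, values in grouped.items()}


def count_by_dept(rows):
    """Chain the three phases, mimicking the GROUP BY query end to end."""
    return reduce_phase(shuffle_phase(map_phase(rows)))


print(count_by_dept(employees))  # prints {'sales': 2, 'hr': 1}
```

In this toy version everything runs in one process, whereas on a cluster the map and reduce steps would be spread across many machines. The point is only that a declarative, SQL-like statement can be expressed as these phases without the user writing them by hand.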