Skip to main content

Posts

Showing posts with the label hive default partition

Hive Default Partition

  Introduction We have already studied two different partitioning techniques in Hive, Dynamic & Static Partitioning. If we look at these techniques, in Static Partitioning we define the partitions manually whereas in Dynamic Partitioning the values are assigned dynamically based on some column values.  But if we look and analyse the data, we know data is never pure, it always has some impurities in it like some of the data will be missing, some are not formatted properly, some are random, etc etc.  Thus, in such cases when we create partitions a problem arises where to assign the NULL values (I am supposing that we created partitions for wrong values also), do we need to create a separate partition for it, then what would be the condition for NULL values, in case of dynamic partition, when we do not have control over the partitions how are we going to deal with NULL data.