
Ordinal Encoding

 



Introduction

When we talk about encoding, a natural first question is: why can't we simply list all the values of a variable and number them 1, 2, 3, 4 and so on, just as we did in our childhood games?

The answer is yes, we can, and that is exactly what we are going to do here.

Ordinal Encoding replaces each category of a categorical variable with an ordinal number such as 1, 2, 3, 4 and so on. The numbers can either be assigned arbitrarily or follow an ordering derived from the data, such as the mean of the target for each category.

Arbitrary Ordinal Encoding:- Here the ordinal numbers are assigned to the categories in an arbitrary order, so the order itself carries no meaning.

Mean Ordinal Encoding:- Here the ordinal numbers are assigned to the categories in order of their target mean (the same per-category mean used in Mean/Target Encoding).
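To make the difference between the two variants concrete, here is a minimal sketch in plain Python. The toy data and the function names (`arbitrary_ordinal_mapping`, `mean_ordinal_mapping`) are made up for illustration, and the numbering starts at 0 rather than 1, which is an assumption of this sketch and not a requirement of the technique.

```python
from collections import defaultdict

# Hypothetical toy data: one categorical variable and a binary target.
colours = ["red", "blue", "green", "blue", "red", "green", "blue"]
target = [1, 0, 1, 0, 0, 1, 1]


def arbitrary_ordinal_mapping(values):
    """Number the categories in order of first appearance (order has no meaning)."""
    mapping = {}
    for value in values:
        if value not in mapping:
            mapping[value] = len(mapping)
    return mapping


def mean_ordinal_mapping(values, targets):
    """Number the categories by ascending mean of the target."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for value, t in zip(values, targets):
        totals[value] += t
        counts[value] += 1
    means = {value: totals[value] / counts[value] for value in totals}
    # Sort by target mean; ties are broken alphabetically to stay deterministic.
    ordered = sorted(means, key=lambda v: (means[v], v))
    return {value: rank for rank, value in enumerate(ordered)}


arbitrary_map = arbitrary_ordinal_mapping(colours)
mean_map = mean_ordinal_mapping(colours, target)

arbitrary_encoded = [arbitrary_map[c] for c in colours]
mean_encoded = [mean_map[c] for c in colours]

print(arbitrary_map)  # mapping by first appearance
print(mean_map)       # mapping by ascending target mean
```

In the mean-based mapping, a larger number always corresponds to a category with a higher (or equal) target mean, which is where the monotonic relationship with the target comes from.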

Some Important Points

When performing Categorical Variable Encoding using Ordinal Encoding, there are a few points to keep in mind:-

Before using this technique, we need to divide the dataset into train and test sets.
  • Fit (train) the encoder only on the train set.
  • Use this fitted encoder to encode the values in both the train and test sets.
  • If a category is absent from the train set when the encoder is fitted but appears in the test set, the encoder has no number for it. Depending on the library and its settings, this results in an error or a missing value.
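The points above can be sketched in plain Python as follows. The helpers `fit_mapping` and `transform` and the city data are hypothetical, and raising an error on unseen categories is just one possible policy chosen for this sketch.

```python
def fit_mapping(train_values):
    """Learn the category -> integer mapping from the train set only."""
    categories = sorted(set(train_values))
    return {category: code for code, category in enumerate(categories)}


def transform(values, mapping):
    """Encode values with a fitted mapping; fail loudly on unseen categories."""
    unseen = sorted(set(values) - set(mapping))
    if unseen:
        raise ValueError(f"Categories not seen during fit: {unseen}")
    return [mapping[value] for value in values]


# Hypothetical data: "Chennai" never appears in the train set.
train_city = ["Delhi", "Mumbai", "Pune", "Mumbai"]
test_city = ["Pune", "Delhi", "Chennai"]

mapping = fit_mapping(train_city)                # fit on train only
train_encoded = transform(train_city, mapping)   # encode train

try:
    test_encoded = transform(test_city, mapping)  # encode test with the same mapping
except ValueError as err:
    test_encoded = None
    error_message = str(err)
    print(error_message)
```

A practical alternative to raising an error is to map unseen categories to a reserved value such as -1, at the cost of hiding the problem from whoever runs the pipeline.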

Advantages

  • This technique is quite simple to implement.
  • It does not expand the feature space.
  • With mean ordinal encoding, it creates a monotonic relationship between the encoded variable and the target.
  • Because of this monotonic relationship, it is well suited to Linear Models.

Disadvantages

  • Prone to over-fitting, particularly when the numbers are derived from the target.
  • Difficult to cross-validate with current libraries.

Practical

We will use the feature-engine library for Python for demo purposes.

1. Importing the Libraries


Importing Ordinal Encoder, libraries and data


2. Viewing the Data


Dataset Preview

3. Initializing the Ordinal Encoder

Initializing Ordinal Encoder


Here, while fitting (training) the encoder, we also need to pass the target variable, since a target-based ordering is computed from it.

4. Transforming using Ordinal Encoder 


Ordinal Encoder transform

Here Male is encoded as '0' whereas Female is encoded as '1'.

5. Verifying Data

Ordinal Encoded 'Sex' variable



Resources


Please comment below to get the complete dataset and libraries. 

Learn to install Anaconda Here.


Summary 


In this Quick Reads article, we studied Ordinal Encoding, a technique commonly used for Categorical Variable Encoding. We had a quick overview of the technique, looked at its advantages and disadvantages, and covered some important points to remember.

We also performed a practical demo of this technique using the popular Python library "feature-engine".

Last but not least: "Practice Makes One Perfect". So what are you waiting for? Practice this technique and comment below with your views, doubts or anything else. We are here to help you.
