
Feature Scaling



Let's begin with a famous saying: "Five fingers are never equal."

We have all heard it, and it holds true everywhere, even in Data Science and Machine Learning. The journey of Data Science begins with data collection, and this is where we knowingly or unknowingly gather features that differ in magnitude, units, and range, which leaves the data inconsistent.

If we collect vehicle data, for example, we might have the top speed in MPH, the distance covered in KM, the dimensions of the vehicle in CM or inches, the model number with no unit at all, and so on.

Sample data for Feature Scaling


Thus, when we take this kind of raw data and pass it directly to our Machine Learning algorithms, we get inconsistent results: the machine understands numbers only, not units. It might therefore give more weight to the length of the car (1300mm) than to the mileage of the car (30kmpl).

What is Feature Scaling?

We have had a quick look at the problem statement; now let us understand what feature scaling is and why we need it.

As the name suggests, we are talking about "scaling", but what do we scale, and how?

Wikipedia defines Feature Scaling as:

Feature scaling is a method used to normalize the range of independent variables or features of data. In data processing, it is also known as data normalization and is generally performed during the data preprocessing step.

In simple terms, feature scaling is the process of converting variables that have different scales to a similar scale.

Scaled Sample Data

In the above image, we have scaled our sample data to give a clear picture of scaling. The three features (Length, Width, and Mileage), which previously had very different ranges, have now been reduced to a similar scale, i.e. the range 0-1.

* The methods used will be discussed in later posts.
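As a quick preview of how a 0-1 range like the one in the image can be produced, here is a minimal sketch of min-max scaling, one common method (the function and sample values below are illustrative, not from the post):

```python
def min_max_scale(values):
    """Rescale a list of numbers linearly into the 0-1 range."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical car lengths in mm, loosely based on the sample data.
lengths_mm = [4150, 3990, 3995]
scaled = min_max_scale(lengths_mm)
print(scaled)  # every value now lies between 0 and 1
```

The smallest value maps to 0, the largest to 1, and everything else falls proportionally in between, so all features end up on a comparable footing.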

Why Feature Scaling?

We already have a rough idea of why feature scaling is needed. Let's dive deeper and understand it better.

Let's continue with our Tata cars example and see the impact of scaling.

Note: We will use simple linear regression to show the maths, just to build a deeper understanding in quick and easy terms.

Sample data for Feature Scaling Math.


We are all familiar with linear regression (the first and most basic algebraic equation):

y = m*x + c

In the case of multiple variables, the equation can be rewritten as:

y = m1*x1 + m2*x2 + m3*x3 + m4*x4 + ... + c

where m1, m2, m3, m4, ... are constants (coefficients) for the respective variables,
x1, x2, x3, x4, ... are the variables/features in our dataset,
and c is a constant (the intercept).
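The multiple-variable equation above can be sketched as a tiny Python function (a hypothetical helper for illustration, not code from the post):

```python
def predict(ms, xs, c):
    """Compute y = m1*x1 + m2*x2 + ... + c for one data row."""
    return sum(m * x for m, x in zip(ms, xs)) + c

# With every coefficient set to 1 and c = 0, y is simply the sum of
# the raw feature values -- handy for seeing their relative sizes.
y = predict([1, 1, 1], [4150, 1700, 16], 0)
print(y)  # 5866
```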

Now, putting values in the equation.

y1(Indigo) = m1*4150 + m2*1700 + m3*16 + m4*750000 + m5*2010 + c

y2(Altroz) = m1*3990 + m2*1755 + m3*25 + m4*20000 + m5*2020 + c

and so on. 

In the above equations, the 4th variable (distance travelled) has an exceptionally high value, which will bias the equation, even though distance travelled is arguably less important than the age of the car (year of manufacture). Similarly, Mileage, which is usually a two-digit number, will lose its importance in an ML algorithm when put alongside the other features. In real-world scenarios it should be the complete opposite: a buyer typically focuses on mileage more than on the other features.
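We can make this dominance concrete with a small illustrative calculation (equal coefficients assumed purely for demonstration, using the Indigo's sample values):

```python
# Raw feature values for the Indigo row from the sample data.
features = {
    "length_mm": 4150,
    "width_mm": 1700,
    "mileage_kmpl": 16,
    "distance_km": 750000,
    "year": 2010,
}

# With all coefficients equal, each feature's share of the sum shows
# how much it would sway the prediction before any scaling is done.
total = sum(features.values())
share = features["distance_km"] / total
print(f"{share:.1%}")  # prints "99.0%" -- distance alone swamps the rest
```

After scaling all five features to the 0-1 range, no single feature could dominate like this; the learned coefficients, not the raw units, would determine each feature's influence.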

Enough of the overview... Let's move ahead and see the various ways in which we can achieve feature scaling. 


Methods of Feature Scaling:




