
Feature Scaling -- Maximum Absolute Scaling

 


In previous articles, we read about Feature Scaling and two of the most important techniques used for feature scaling, i.e. Standardization & MinMaxScaling.

Here we will see another feature scaling technique that can be used to scale the variables and is somewhat similar to the MinMaxScaling technique. This technique is popularly known as MaxAbsScaling or Maximum Absolute Scaling.

What is MaxAbsScaling?

Maximum Absolute Scaling is the technique of scaling the data to its absolute maximum value.

The logic used here is simple: divide each value by the absolute maximum value of its variable/column.
Doing so scales all the values to the range -1 to 1.

It can be implemented easily in a few lines of code, as shown below in the practical section. 

Note:- Scikit-learn recommends using this transformer on data that is centred at zero or on sparse data.

Formula Used:- 

X_scaled = X / max(|X|)

where max(|X|) is the absolute maximum value of the variable/column X.
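The formula can be sketched in a few lines of NumPy (the sample values here are made up for illustration):

```python
import numpy as np

# One feature column containing both negative and positive values
x = np.array([-50.0, -10.0, 0.0, 25.0, 100.0])

# Divide every value by the column's absolute maximum (here, 100)
x_scaled = x / np.max(np.abs(x))

print(x_scaled)  # all values now lie in [-1, 1]
```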


Features of MaxAbsScaling:- 

1.  Minimum and Maximum values are scaled between [-1,1]:- 

Since every value is divided by the absolute maximum value of its variable, the scaled values all fall between -1 and +1. 

2. Mean is not centred at 0:-

This method does not use or consider the mean for scaling, so the mean is not centred at any particular value (e.g. 0). It may still end up near 0 for a few of the variables, depending on their distributions. 

3. Variance varies across the variables:- 

MaxAbsScaling focuses on scaling the variables only based on the Absolute Maximum value. Thus, the variances can vary for each variable.

4. Sensitive to Outliers:-

Since the method uses the absolute maximum value for scaling, an outlier can itself become that maximum value, squashing the remaining values towards 0 and disturbing the original distribution on scaling.

5. May not preserve the Original shape of Distribution.

As we have already discussed, if a randomly high outlier becomes the maximum value, the shape of the original distribution gets distorted. The same can be seen in the graph presented in the Practical section.
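The outlier sensitivity is easy to see with a small sketch: the same five values are scaled twice, once as-is and once with a single large outlier replacing the last value (the numbers are invented for illustration):

```python
import numpy as np

# A feature with and without a single large outlier
x     = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x_out = np.array([1.0, 2.0, 3.0, 4.0, 500.0])

scaled     = x / np.abs(x).max()          # values spread across [0.2, 1.0]
scaled_out = x_out / np.abs(x_out).max()  # non-outlier values squashed near 0

print(scaled)
print(scaled_out)
```

With the outlier present, the value 4.0 shrinks from 0.8 to 0.008 — the bulk of the data collapses into a tiny band near zero.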

Practical:- 

Let's implement it ourselves to understand it better.

1.  Importing the necessities:-

Importing libraries for MaxAbs Scaling
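The screenshots in this article use the Boston housing dataset, which was removed from scikit-learn in version 1.2. As a stand-in, the imports plus a small hypothetical DataFrame (made-up values, column names borrowed from the Boston data) might look like:

```python
import pandas as pd
from sklearn.preprocessing import MaxAbsScaler

# Hypothetical stand-in for the Boston data: a few columns
# with deliberately different magnitudes.
df = pd.DataFrame({
    "CRIM": [0.006, 0.027, 0.027, 0.032, 0.069],
    "RM":   [6.575, 6.421, 7.185, 6.998, 7.147],
    "TAX":  [296.0, 242.0, 242.0, 222.0, 222.0],
})
```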

2. Getting Data Insights

To get any meaningful insights from the data, we first need to be familiar with it: the number of rows/columns, the type of data, what each variable represents, their magnitudes, etc.

To get a rough idea of our data we use the .head() method.

Boston Data Overview

To know the dataset in detail, i.e. what each variable represents, we can use the .DESCR attribute (note that it is an attribute of the loaded dataset object, not a method).

Description of Boston House Data

To further get the mathematical details from the data, we can use the .describe() method. 

Mathematical Description of Boston House Data
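With the hypothetical stand-in frame from above, the same inspection steps look like this (.DESCR is skipped because it only exists on scikit-learn's dataset objects):

```python
import pandas as pd

# Same hypothetical stand-in frame as in the import step
df = pd.DataFrame({
    "CRIM": [0.006, 0.027, 0.027, 0.032, 0.069],
    "RM":   [6.575, 6.421, 7.185, 6.998, 7.147],
    "TAX":  [296.0, 242.0, 242.0, 222.0, 222.0],
})

print(df.head())      # first rows: column names, types and magnitudes
print(df.describe())  # count, mean, std, min, quartiles and max per column
```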

3. Scaling the Data

We will be using the MaxAbsScaler class from scikit-learn for our data. 

Implementing MaxAbs Scaling

That's it... Simple isn't it...!!! 

Now we can check the absolute maximum values for each variable. 

Maximum Absolute Value for each variable
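A minimal sketch of this step, again on the hypothetical stand-in frame: fit_transform does the scaling, and the fitted scaler exposes the per-column absolute maxima in its max_abs_ attribute:

```python
import pandas as pd
from sklearn.preprocessing import MaxAbsScaler

df = pd.DataFrame({
    "CRIM": [0.006, 0.027, 0.027, 0.032, 0.069],
    "RM":   [6.575, 6.421, 7.185, 6.998, 7.147],
    "TAX":  [296.0, 242.0, 242.0, 222.0, 222.0],
})

scaler = MaxAbsScaler()
scaled_data = scaler.fit_transform(df)  # NumPy array, same shape as df

# Absolute maximum learned for each column
print(scaler.max_abs_)
```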

4. Verifying the Scaling

To verify the end result, we first need to convert scaled_data to a pandas DataFrame.

Converting scaled data to dataframe

Next, we need to verify whether the data has been scaled, for which we use the .describe() method again. 

Scaled Data:- 

Describing scaled data

We can notice here that the maximum value of each variable is now exactly 1.0, while the minimum value lies close to 0 for each variable (the Boston features take no negative values, so nothing is scaled below 0).

Original Data:- 

Describing Original data
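The verification step on the stand-in frame, for comparison: convert the scaled array back to a DataFrame and describe both versions side by side:

```python
import pandas as pd
from sklearn.preprocessing import MaxAbsScaler

df = pd.DataFrame({
    "CRIM": [0.006, 0.027, 0.027, 0.032, 0.069],
    "RM":   [6.575, 6.421, 7.185, 6.998, 7.147],
    "TAX":  [296.0, 242.0, 242.0, 222.0, 222.0],
})

# Convert the scaled NumPy array back to a DataFrame, keeping column names
scaled_df = pd.DataFrame(MaxAbsScaler().fit_transform(df), columns=df.columns)

print(scaled_df.describe())  # the max row is exactly 1.0 for every column
print(df.describe())         # original magnitudes, for comparison
```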


Let's have a look at the distribution graph before and after scaling.

MaxAbs Data Distribution
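A comparable before/after plot can be sketched with matplotlib (histograms here rather than the article's exact chart, and still on the hypothetical stand-in data):

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; no display needed
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.preprocessing import MaxAbsScaler

df = pd.DataFrame({
    "CRIM": [0.006, 0.027, 0.027, 0.032, 0.069],
    "RM":   [6.575, 6.421, 7.185, 6.998, 7.147],
    "TAX":  [296.0, 242.0, 242.0, 222.0, 222.0],
})
scaled_df = pd.DataFrame(MaxAbsScaler().fit_transform(df), columns=df.columns)

# Side-by-side histograms: original magnitudes vs. values in [0, 1]
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
df.plot(kind="hist", ax=ax1, alpha=0.5, title="Original")
scaled_df.plot(kind="hist", ax=ax2, alpha=0.5, title="After MaxAbsScaling")
fig.savefig("maxabs_distribution.png")
```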


Summary

We have studied Maximum Absolute Scaling, a commonly used Feature Scaling technique. Here is a Quick Note on the technique:- 

1. Minimum and Maximum values are scaled between [-1,1]

2. Mean is not centred at 0.

3. Variance varies across the variables.

4. Sensitive to Outliers.

5. May not preserve the Original shape of Distribution.

6. Can be used with other techniques to centre the mean at 0.


Happy Learning... !!
