Introduction
ExploriPy is yet another Python library for Exploratory Data Analysis (EDA). It caught our attention because it is quick and easy to set up, and its basics are simple to grasp. Moreover, the visuals it produces are largely self-explanatory, even for a new user.
The feature we can't resist mentioning is the way it groups the variables into separate sections, which makes the data more straightforward to understand and analyze.
The four major sections presented are:-
- Null Values
- Categorical VS Target
- Continuous VS Target
- Continuous VS Continuous
Installation
We will be using Jupyter Notebook throughout, but you may use any other IDE of your choice.
## conda installation
conda install exploripy
## pip installation
pip install exploripy
## Jupyter Notebook installation
pip install exploripy
Let's grab a coffee while it installs.
Installing ExploriPy
Once the library is installed, we might be asked to restart the kernel (in the case of Jupyter Notebook) for the changes to take effect.
Great, we have installed ExploriPy and are good to go with some examples that will help us understand it better.
Getting Started
Loading the data
*Please note:- We use the Titanic dataset as our first dataset for analysis.
Loading Data
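The loading step can be sketched with pandas as follows. This is a minimal sketch, not the exact cell from the screenshot: the file name `titanic.csv` is an assumption, and the inline rows are made up (they only mimic Titanic-style column names) so that the snippet runs without downloading anything.

```python
import io

import pandas as pd

# Hypothetical stand-in for a local file such as "titanic.csv".
# The rows are made up and only mimic Titanic-style column names.
SAMPLE_CSV = """PassengerId,Survived,Pclass,Sex,Age,Fare,Embarked
1,0,3,male,22,7.25,S
2,1,1,female,38,71.28,C
3,1,3,female,26,7.92,S
4,0,1,female,35,53.10,
5,0,3,male,,8.05,S
6,1,3,male,,8.46,Q
"""

# With a real file this would be: df = pd.read_csv("titanic.csv")
df = pd.read_csv(io.StringIO(SAMPLE_CSV))

print(df.shape)   # (rows, columns)
print(df.head())  # first few rows, to check the load worked
```

Checking the shape and the first rows right after loading is a cheap way to catch a wrong separator or a missing header before any analysis starts.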
Visualizing the data
ExploriPy opens a new browser tab with visuals divided into sections. The first part is the list of variables.
Visualizing Data
A basic list of variables in the dataset.
List of Variables
Apart from just a list of variables, ExploriPy also groups the variables into Categorical & Continuous Variables.
Variable Grouping
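To make the idea of grouping concrete, here is a rough pandas sketch. It is not ExploriPy's own logic: the helper `group_variables`, its `max_levels` threshold, and the made-up rows are all assumptions chosen for illustration.

```python
import pandas as pd

# Made-up rows that only mimic Titanic-style column names.
df = pd.DataFrame({
    "PassengerId": [1, 2, 3, 4, 5, 6],
    "Survived": [0, 1, 1, 0, 0, 1],
    "Pclass": [3, 1, 3, 1, 3, 3],
    "Sex": ["male", "female", "female", "female", "male", "male"],
    "Age": [22.0, 38.0, 26.0, 35.0, None, None],
    "Fare": [7.25, 71.28, 7.92, 53.10, 8.05, 8.46],
})


def group_variables(frame, target, others=(), max_levels=3):
    """Hypothetical helper: split columns into categorical and continuous.

    A column counts as categorical when it is non-numeric or has at most
    `max_levels` distinct non-null values; everything else is continuous.
    """
    categorical, continuous = [], []
    for col in frame.columns:
        if col == target or col in others:
            continue  # the target and "Others" columns are not grouped
        numeric = pd.api.types.is_numeric_dtype(frame[col])
        if not numeric or frame[col].nunique() <= max_levels:
            categorical.append(col)
        else:
            continuous.append(col)
    return categorical, continuous


categorical, continuous = group_variables(
    df, target="Survived", others=["PassengerId"]
)
print("Categorical:", categorical)
print("Continuous:", continuous)
```

The threshold is a judgment call: a numeric column such as `Pclass` has only a few levels and behaves like a category, so a simple count of distinct values is used here to catch it.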
Another small section is dedicated to the target variable and provides insights into it. This matters because the same information is used in the coming sections.
Target Variable Analysis
We will now look at each section of the ExploriPy report separately.
- NULL Values:-
Null values have a large effect on machine learning models. They are values that were not recorded, and if left unhandled they can lead to faulty analysis or a wrong model. Checking for null values in the dataset is therefore a vital part of modelling.
Null Value Analysis
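The same kind of check can be done by hand with pandas. The snippet below is a minimal sketch on made-up rows (the column names mimic the Titanic dataset); it counts missing values per column and expresses them as a percentage of all rows.

```python
import pandas as pd

# Made-up rows that only mimic Titanic-style column names.
df = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 0, 1],
    "Age": [22.0, 38.0, 26.0, 35.0, None, None],
    "Embarked": ["S", "C", "S", None, "S", "Q"],
})

# Number of missing entries in each column.
null_counts = df.isnull().sum()

# Share of rows that are missing, as a percentage.
null_percent = (null_counts / len(df) * 100).round(1)

summary = pd.DataFrame({"missing": null_counts, "percent": null_percent})
print(summary)
```

Columns with a high percentage of missing values are the ones to inspect first, since they are the most likely to distort later analysis.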
- Categorical Vs Target:-
This section analyzes the categorical variables of the dataset with respect to the target variable. This is important because we need a clear picture of how, and how strongly, the target is related to each categorical variable.
There is a separate subsection for each variable; we show one (Sex) for demonstration.
Categorical Vs Target Analysis (Sex)
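A comparable table can be produced directly with pandas. This is a minimal sketch on made-up rows, not ExploriPy's own computation: it cross-tabulates `Sex` against `Survived`, first as raw counts and then as row-wise proportions.

```python
import pandas as pd

# Made-up rows that only mimic Titanic-style column names.
df = pd.DataFrame({
    "Sex": ["male", "female", "female", "female", "male", "male"],
    "Survived": [0, 1, 1, 0, 0, 1],
})

# Raw counts of each target value within each category.
counts = pd.crosstab(df["Sex"], df["Survived"])

# Row-wise proportions: each row sums to 1, so the column for
# Survived == 1 reads as the survival rate of that category.
rates = pd.crosstab(df["Sex"], df["Survived"], normalize="index")

print(counts)
print(rates.round(2))
```

Normalizing by row makes categories of different sizes comparable, which raw counts alone do not.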
- Continuous Vs Target:-
This section analyzes the continuous variables of the dataset with respect to the target variable. This is important because we need a clear picture of how, and how strongly, the target is related to each continuous variable.
The analysis of a continuous variable reports summary statistics such as quantiles, mean, median, mode, variance, standard deviation, skewness, and kurtosis.
Continuous Vs Target Analysis
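The statistics listed above are all available on a pandas Series. The following is a minimal sketch on made-up `Age` values; it does not reproduce ExploriPy's report, it only shows how each figure can be computed and how the mean can be split by the target.

```python
import pandas as pd

# Made-up rows that only mimic Titanic-style column names.
df = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 0, 1],
    "Age": [22.0, 38.0, 26.0, 34.0, 30.0, 30.0],
})

age = df["Age"]

summary = {
    "mean": age.mean(),
    "median": age.median(),
    "mode": age.mode().tolist(),       # may hold several values on ties
    "variance": age.var(),             # sample variance (ddof=1)
    "std": age.std(),                  # sample standard deviation
    "skewness": age.skew(),
    "kurtosis": age.kurt(),            # excess kurtosis
    "quantiles": age.quantile([0.25, 0.5, 0.75]).to_dict(),
}

for name, value in summary.items():
    print(f"{name}: {value}")

# Continuous vs target: compare the variable across target classes.
mean_by_target = df.groupby("Survived")["Age"].mean()
print(mean_by_target)
```

Comparing a statistic across the target classes, as in the last step, is what connects the continuous variable back to the target.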
- Continuous Vs Continuous:-
Another important section is the Continuous VS Continuous analysis. This section presents a heat map, which helps us identify the correlation between pairs of continuous variables.
Identifying correlation is important because it helps us judge how strongly the variables are related to each other.
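The numbers behind such a heat map are a correlation matrix, which pandas can compute directly. This is a minimal sketch on made-up values; the choice of columns is an assumption, and the plotting step is left out so the snippet needs nothing beyond pandas.

```python
import pandas as pd

# Made-up rows that only mimic Titanic-style column names.
df = pd.DataFrame({
    "Age": [22.0, 38.0, 26.0, 34.0, 30.0, 30.0],
    "Fare": [7.25, 71.28, 7.92, 53.10, 8.05, 8.46],
    "SibSp": [1, 1, 0, 1, 0, 0],
})

# Pairwise Pearson correlation (the pandas default) between columns.
corr = df.corr()

# Each cell lies between -1 and 1; the diagonal is always 1
# because every variable is perfectly correlated with itself.
print(corr.round(2))
```

A heat map simply colors each cell of this matrix by its value, so strongly related pairs stand out at a glance.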
We can see that the library divides the variables into 3 categories, namely:-
- Categorical
- Continuous
- Others
Presently this library does not provide any analysis for the variables in the "Others" category. But the analysis provided for the other two categories is in-depth and enough to prepare the first step of data analysis.