
ExploriPy -- A Newer Way to Do Exploratory Data Analysis


Introduction 

ExploriPy is yet another Python library for Exploratory Data Analysis. It caught our attention because it is quick and easy to set up, and the basics are simple to grasp. Moreover, the visuals it produces are self-explanatory, so even a new user can follow them.

The part we can't resist mentioning is how it groups the variables into different sections. This makes it more straightforward to understand and analyze our data.

The four major sections presented are:-

  • Null Values
  • Categorical VS Target
  • Continuous VS Target
  • Continuous VS Continuous

 

Installation 

We will be using Jupyter Notebook throughout; you may use any other IDE of your choice.

## conda installation

conda install exploripy

## pip installation 

pip install exploripy

## Jupyter Notebook installation

pip install exploripy


Let's grab a coffee while it installs.

Installing ExploriPy


Once the library is installed, we might be asked to restart the kernel (in the case of Jupyter Notebook) for the changes to take effect.
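If we are not sure whether the restart was needed, a quick check like the one below tells us whether the current kernel can see the package. This is only a sketch using the standard library; the import name "ExploriPy" is our assumption and may differ for your install.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if the current interpreter can locate the module."""
    return importlib.util.find_spec(module_name) is not None


# "ExploriPy" is the assumed import name; adjust it if your install differs.
print("ExploriPy importable:", is_installed("ExploriPy"))
```

If this prints False right after installing, restarting the kernel is the first thing to try.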


Great, we have installed the ExploriPy library. We are now good to go with some examples that will help us understand it better.

Getting Started

Loading the data

*Please Note:- We are using the Titanic dataset as our first dataset for analysis.

Loading Data


Visualizing the data

ExploriPy opens a new tab with a report of visuals divided into sections. The first part is the list of variables.

Visualizing Data
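For readers who cannot see the screenshots, the two steps above (loading the data and opening the report) might look roughly like the sketch below. It assumes the interface we recall from the project's README: an `EDA` class taking the DataFrame, a list of categorical columns, an `OtherFeatures` list and a `title`, plus a `TargetAnalysis` method. Please verify these names against your installed version. The file name and the column names (taken from the Kaggle Titanic CSV) are also assumptions.

```python
def run_exploripy_report(csv_path: str, target: str = "Survived") -> None:
    """Load a Titanic-style CSV and open the ExploriPy report for `target`."""
    # Imports live inside the function so this snippet can be loaded
    # even when pandas or ExploriPy are not installed.
    import pandas as pd
    from ExploriPy import EDA  # assumed import path, as recalled from the README

    df = pd.read_csv(csv_path)

    # Column names follow the Kaggle Titanic CSV; adjust them for your file.
    categorical = ["Survived", "Pclass", "Sex", "SibSp", "Parch", "Embarked"]
    others = ["PassengerId", "Name", "Ticket"]  # columns left out of the analysis

    eda = EDA(df, categorical, OtherFeatures=others,
              title="Exploratory Data Analysis on Titanic Dataset")
    eda.TargetAnalysis(target)  # assumed method name; builds the report


# Example call (hypothetical file name):
# run_exploripy_report("titanic.csv")
```

Listing the categorical columns ourselves, rather than letting them be inferred, keeps numeric codes such as Pclass from being treated as continuous.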

A basic list of variables in the dataset.  

List of Variables

Apart from just a list of variables, ExploriPy also groups the variables into Categorical & Continuous Variables.


Variable Grouping



Another small section is dedicated to the Target Variable and gives some insights into it. This is important because the same information is used in the coming sections. 


Target Variable Analysis

 

We will now look at each section of the ExploriPy report separately.

  1. NULL Values:-

    Null values matter a great deal in Machine Learning models. They are values that were not recorded, and they can lead to faulty analysis or a wrong model. Thus, checking for null values in the dataset is a vital part of modelling. 



    Null Value Analysis

  2. Categorical Vs Target:-

    This section analyzes the categorical variables of the dataset with respect to the Target variable. This is important because we need a clear picture of how, and how strongly, our target value is related to these categorical variables.  

    There is a separate subsection for each variable. We have shown one for demo purposes. 


    Categorical Vs Target Analysis(Sex)



  3. Continuous Vs Target:-

    This section analyzes the continuous variables of the dataset with respect to the Target variable. This is important because we need a clear picture of how, and how strongly, our target value is related to these continuous variables.  

    From the analysis of a continuous variable we can read off various statistics such as Quantiles, Kurtosis, Mean, Median, Mode, Variance, Skewness, Standard Deviation, etc.




    Continuous Vs Target Analysis



  4. Continuous Vs Continuous:- 

    Another important section is the Continuous VS Continuous variables analysis. This section presents a Heat Map, which helps us in identifying the correlation between these variables. 

    Identifying correlation is important because it helps us judge how strongly the variables depend on each other. 

Continuous VS Continuous variables analysis
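To make the four sections a little more concrete, here is a rough standard-library sketch of the kind of numbers each one summarizes. It is not how ExploriPy computes its report, only a hand-rolled illustration. The rows are made-up toy records in the shape of the Titanic data, and all function names are our own.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, median

# Made-up toy rows in the shape of the Titanic data (not real records).
rows = [
    {"Sex": "male",   "Age": 22.0, "Fare": 7.25,  "Survived": 0},
    {"Sex": "female", "Age": 38.0, "Fare": 71.28, "Survived": 1},
    {"Sex": "female", "Age": None, "Fare": 7.92,  "Survived": 1},
    {"Sex": "male",   "Age": 35.0, "Fare": 8.05,  "Survived": 0},
    {"Sex": "male",   "Age": None, "Fare": 8.46,  "Survived": 0},
    {"Sex": "female", "Age": 27.0, "Fare": 11.13, "Survived": 1},
]


def null_counts(rows):
    """1. Null values: count missing (None) entries per column."""
    counts = defaultdict(int)
    for row in rows:
        for col, value in row.items():
            counts[col] += value is None
    return dict(counts)


def target_rate_by_category(rows, cat_col, target_col):
    """2. Categorical vs target: mean of the target within each category."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[cat_col]].append(row[target_col])
    return {cat: mean(vals) for cat, vals in groups.items()}


def continuous_by_target(rows, num_col, target_col):
    """3. Continuous vs target: mean and median of a column per target class."""
    groups = defaultdict(list)
    for row in rows:
        if row[num_col] is not None:  # skip missing values
            groups[row[target_col]].append(row[num_col])
    return {t: {"mean": mean(v), "median": median(v)} for t, v in groups.items()}


def pearson(rows, col_x, col_y):
    """4. Continuous vs continuous: Pearson correlation over complete pairs."""
    pairs = [(r[col_x], r[col_y]) for r in rows
             if r[col_x] is not None and r[col_y] is not None]
    xs, ys = zip(*pairs)
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


print(null_counts(rows))
print(target_rate_by_category(rows, "Sex", "Survived"))
print(continuous_by_target(rows, "Age", "Survived"))
print(round(pearson(rows, "Age", "Fare"), 3))
```

A heat map like the one in the report is simply this last correlation value computed for every pair of continuous variables and drawn as a colored grid.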



We can see that the library divides the variables into 3 categories, namely:- 

Categorical 
Continuous
Others

Presently this library does not provide any analysis for the variables in the "Others" category. But the analysis provided for the other two categories is in-depth and enough for a first pass at Data Analysis.

Summary

We talked about the ExploriPy library, which is used for Exploratory Data Analysis. We covered how to install it and how to use it. We also discussed the report and its sections in brief. 

So what are you waiting for? Go ahead, download it and start playing with it, and share your views, issues, and suggestions. 

That's all from here. Until next time, this is the Quick DataScience Team providing a quick and easy guide to another Data Science topic. 



