What is Data Visualization? 
Data visualization, as the name suggests, is the practice of creating clear, attractive and informative visuals from our data, which helps us draw more insights from it. It helps both us and anyone who reads our analysis or report to follow it more easily. Creating a good visualization helps us understand the data better and supports our machine learning journey.
The data visualization process uses various graphs, graphics and plots to explain the data and surface insights. Data visualization (DV) is important because it simplifies complex data, making it more accessible, understandable, and usable for its end users.
If you want to know more about data visualization, you can read it here.
Why is it important?
So, the next question arises: why is it so important? And should we really care about adding visualizations to our analysis?
So, here is the answer... it's a big YES... Even if we are working with a very small and well-known dataset, we should try to include a few visuals that describe the dataset, its insights and what we are trying to predict from it.
It's very important to have a few visuals in our analysis/report. With the right visuals attached, the report becomes not only more attractive but also more informative.
It has been rightly said:- 
Fig 1:- An image speaks a thousand words
Also, these visuals have an important and interesting effect: raw data is hard to remember, but when the same data is compressed and expressed as an image, it stays in our subconscious mind for a longer time...
A few of the traditional Python libraries most commonly used for visualization are:-
Need for One-Click Visualization
A question may arise in our mind: if we have so many data visualization libraries available, why do we need to opt for One-Click Data Visualization?
We need it because of some problems with the traditional libraries...
Yes, you read that right... PROBLEMS... You may wonder how the thing we praised so much in the previous sections can have problems, and if it does, why we praised it..!!!
So, here is my short answer...
Fig 2:- Every coin has two sides
It depends on which side we are currently looking at and what end result we are aiming for... It doesn't mean that the other side is less important or erroneous.
A few issues that I would like to highlight are:-
1. Time Consuming:- 
Yes, the process of visualization is quite time-consuming and lengthy, as we need to figure out which variables and graphs to use, and this can easily eat up our time without us even noticing.
Also, sometimes we may pass the wrong values, which completely changes the meaning of the graph, and by the time we realise it we may have already invested a lot of time.
2. Too Many Libraries:- 
Looking for the perfect graph to explain your data is quite a tedious job, and when we are given too many options to choose from, we often get confused about which one to pick, or we become biased towards one particular option without considering the others.
3. We are not clear what to expect from the dataset:-
This happens when we are so confused by our dataset that we are not sure where to start or how to analyse it. In that case, we just start trying things at random in the hope of hitting the target. This again consumes our time, and having too many options makes it even more difficult.

Note:- The more experience we have, the lower the chance of getting stuck here.
So, to avoid getting stuck and wasting our time, it's better to have a tool that performs some basic analysis on its own and provides a starting point for us. For this purpose, we are going to use the "One-Click Visualization" technique.
How to achieve One-Click DV
Enough talking and background-building... Let's jump straight to our topic...
To achieve One-Click Data Visualization, we are going to use the "lux" API of Python.
Here is a quick definition of the Lux API:-
Lux is a Python library that makes data science easier by automating certain aspects of the data exploration process. Lux is designed to facilitate faster experimentation with data, even when the user does not have a clear idea of what they are looking for. Lux is integrated with an interactive Jupyter widget that allows users to quickly browse through large collections of data directly within their Jupyter notebooks.
Let's have a look at what exactly I am talking about:- 
Fig:- Lux Visuals
Interesting..!!! Yes, indeed it is... Creating and analysing these graphs by hand would easily have cost us around an hour, but with this API it was just a matter of a click right after importing the dataset, and we are ready with some good graphs to begin our journey.
Let's get friendly with this great Lux-API.
Installation
Installing lux-API is quick and easy, just like us... 😉
Step 1:- 
Open Jupyter Notebook. 
If you do not have Jupyter Notebook yet, don't worry: it comes FREE with Anaconda and you can... Learn how to get it Here
Great... Now open the Jupyter notebook and jump to the next step.
Step 2:-  
Install lux-API
Now, in a new notebook tab, run the command below to begin the installation:
%pip install lux-api
It will take some time, and then you will get a success message.
Fig:- lux-API install
Note:- It asks us to restart the kernel in order to use the updated package. So, let's do that and move to the next step.
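After restarting the kernel, a quick check like the one below can confirm that the package is visible to the notebook. This is only a minimal sketch using the Python standard library; the helper names `is_importable` and `installed_version` are made up for this example, and it assumes the package is published as `lux-api` and imported as `lux`.

```python
import importlib.util
from importlib import metadata


def is_importable(module_name):
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(module_name) is not None


def installed_version(distribution_name):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


# Assumption: the distribution is named "lux-api" and the module "lux".
print("lux importable:", is_importable("lux"))
print("lux-api version:", installed_version("lux-api"))
```

If the first line prints False, the install probably went into a different environment than the one the notebook kernel is using.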
Step 3:- 
The next step is to install and enable the lux-API extension for Jupyter.
So, we need to run the commands below in our Jupyter notebook.
!jupyter nbextension install --py luxwidget
!jupyter nbextension enable --py luxwidget
This will take a few minutes.
Fig:- Enable Jupyter extension
Since it was already installed on my system, it validated my pre-installed extensions and didn't install them again.
Step 4:-
Importing lux-API
The next step is to import the lux API. We can use the command below for that.
import lux 
Step 5:- 
Start using it. 
Yes, now we are ready to use our lux-API. All we need to do is import our dataset and view it.
import pandas as pd
df = pd.read_csv('..train.csv')
df
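If you do not have a dataset at hand, a tiny CSV file is enough to try the step above. The sketch below writes one with the standard library only. The file name `sample.csv`, the helper `write_sample_csv` and every value in `SAMPLE_ROWS` are made up for illustration; only the column names echo the Name / Age / Fare attributes mentioned in this tutorial.

```python
import csv
from pathlib import Path

# Made-up rows for illustration only; these are not taken from any real dataset.
SAMPLE_ROWS = [
    {"Name": "Passenger A", "Age": 22, "Fare": 7.25},
    {"Name": "Passenger B", "Age": 38, "Fare": 71.28},
    {"Name": "Passenger C", "Age": 26, "Fare": 7.93},
    {"Name": "Passenger D", "Age": 35, "Fare": 53.10},
]


def write_sample_csv(path):
    """Write SAMPLE_ROWS to `path` and return the number of data rows written."""
    path = Path(path)
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["Name", "Age", "Fare"])
        writer.writeheader()
        writer.writerows(SAMPLE_ROWS)
    return len(SAMPLE_ROWS)
```

Calling `write_sample_csv("sample.csv")` in a notebook cell creates the file, which can then be loaded with `pd.read_csv("sample.csv")` in the same way as the dataset above.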
Fig:- Using lux
Now, when we execute the cell that displays the dataset, we get a new button: "Toggle Pandas/Lux".
Just click on it... and we get our awaited One-Click visuals.
Fig:- lux Visuals
Lux has analysed our dataset and presented the data in 3 different sections:- Correlation, Distribution, Occurrence
Correlation:-  Correlation searches through all pairwise relationships between two quantitative attributes (e.g., Fare, Age). The visualizations are ranked from most to least linearly correlated based on their Pearson’s correlation score.
Distribution:- Distribution displays univariate histogram distributions of all quantitative attributes (e.g., Age). Visualizations are ranked from most to least skewed.
Occurrence:- Occurrence displays bar charts of counts for all categorical attributes (e.g., Name). Visualizations are ranked from most to least uneven across the bars.
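To make the ranking idea behind the first two sections more concrete, here is a rough sketch in plain Python. It is not Lux's actual implementation: the function names are made up for this example, and ranking by the absolute value of the score is an assumption on my part.

```python
import math


def pearson(xs, ys):
    """Pearson's correlation coefficient of two equally long numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # convention chosen here: a constant column counts as uncorrelated
    return cov / math.sqrt(var_x * var_y)


def skewness(xs):
    """Moment-based skewness (third central moment over variance ** 1.5)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    if m2 == 0:
        return 0.0
    return m3 / m2 ** 1.5


def rank_pairs_by_correlation(columns):
    """Return (name_a, name_b, r) tuples, strongest linear relationship first."""
    names = list(columns)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            pairs.append((a, b, pearson(columns[a], columns[b])))
    # Assumption: strength is measured by |r|, so strong negative pairs rank high too.
    return sorted(pairs, key=lambda pair: abs(pair[2]), reverse=True)


def rank_columns_by_skew(columns):
    """Return (name, skew) tuples, most skewed column first."""
    scored = [(name, skewness(values)) for name, values in columns.items()]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)
```

Given a dictionary that maps column names to lists of numbers, `rank_pairs_by_correlation` orders the column pairs the way the Correlation section is described, and `rank_columns_by_skew` does the same for the Distribution section.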
Every section has a different story to tell, and we can use these sections and their graphs to begin our analysis. These sections are quite helpful, as they not only group the visuals and make the analysis easier but also provide details about the dataset.
That's all from the lux-API tutorial.
Summary
We have learned about a new Python library, "Lux-API", a very useful API that helps in generating beautiful graphs and doing an easy pre-analysis of our dataset.
Hope you have learned something new today... Don't forget to try it and share your experience in the comment section below, along with the interesting graphs and analysis you were able to get from it.
Happy Learning.... 😊
 