Posts

Showing posts with the label data visualization

Data Visualization — IPL Data Set (Part 2)

Welcome to the 3rd post in the Data Visualization series, covering one of the most loved and followed topics in India: the IPL (Indian Premier League), 2008–2020 (Part 2). In Part 1 we did an analysis based on the teams; here we will analyze all the other fields and try to cover some interesting and unique analyses. Overview of the Data Set. Description of columns of IPL Dataset 1. Description of columns of IPL Dataset 2. Let's begin by checking the data in these columns... IPL Data Set 1 Overview. IPL Data Set 2 Overview. Let's begin with some visualization and find the top 10 players by the number of MoM (Man of the Match) awards won. MoM Awards. The above graph shows the top 10 IPL players with the most Man of the Match awards… and guess what… it's none other than our Mr. 360 (ABD) with 23 awards, followed by The Universe Boss (Gayle 333) with 22 awards, the roHIT MAN of India with 18, and Warner and Captain Cool (MSD) with 17 each. Just a random thought of checkin
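The top-10 ranking described above comes down to counting how often each player appears as Man of the Match. Here is a minimal sketch in plain Python, assuming one award entry per match. The player names and the helper `top_players` are hypothetical and are not taken from the IPL dataset or the original post.

```python
from collections import Counter

# Hypothetical sample: one entry per match, naming that match's Man of the Match.
# These values are made up for illustration, not taken from the IPL dataset.
player_of_match = [
    "Player A", "Player B", "Player A",
    "Player C", "Player A", "Player B",
]


def top_players(awards, n=10):
    """Return the n players with the most awards as (player, count) pairs."""
    return Counter(awards).most_common(n)


top = top_players(player_of_match, n=2)
print(top)
```

With a pandas DataFrame, the same count is usually done with `value_counts()` on the award column, and the result can be passed straight to a bar chart.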

Data Visualization — IPL Data Set (Part 1)

Welcome to the 2nd post in the Data Visualization series, covering one of the most loved and followed topics in India: the IPL (Indian Premier League) (Part 1). In this post, we will focus on various analyses based on the teams. Overview of the Data Set. Description of columns of IPL Dataset. Let's begin by checking the data in these columns. IPL Data Set 1 Overview. IPL Data Set 2 Overview. Moving on to the most interesting part: visualizing the dataset and its relations. Let's begin by looking at the total wins by each team since 2008... Team VS No. of Match Wins. The above bar chart shows the top 5 teams with the most match wins across all the seasons. Surprisingly, RCB is among the top 5 yet still hasn't won an IPL season. Now let's have a closer look at these wins. Team VS Wins based on runs/wickets. The above bar chart is a detailed version of the previous graph, showing the top 5 teams with the most wins, split into wins achieved batting first and wins achieved batting second. Blue Bars represent the wins
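The split between wins batting first and wins batting second follows from how a cricket result is recorded: a win by runs means the winner batted first, and a win by wickets means the winner batted second. Below is a small sketch of that tally in plain Python. The rows, the field order (winner, win_by_runs, win_by_wickets), and the helper `split_wins` are assumptions for illustration, not the post's actual code or data.

```python
from collections import defaultdict

# Hypothetical rows: (winner, win_by_runs, win_by_wickets).
# Made-up values for illustration, not taken from the IPL dataset.
matches = [
    ("Team A", 20, 0),
    ("Team A", 0, 5),
    ("Team B", 0, 3),
    ("Team A", 7, 0),
    ("Team B", 12, 0),
]


def split_wins(rows):
    """Count each team's wins batting first (by runs) and second (by wickets)."""
    tally = defaultdict(lambda: {"batting_first": 0, "batting_second": 0})
    for winner, by_runs, by_wickets in rows:
        if by_runs > 0:
            tally[winner]["batting_first"] += 1
        elif by_wickets > 0:
            tally[winner]["batting_second"] += 1
    return dict(tally)


wins = split_wins(matches)

# Rank teams by total wins, most successful first.
ranking = sorted(wins, key=lambda team: sum(wins[team].values()), reverse=True)
print(wins)
print(ranking)
```

Each team's two counts map directly onto the two bars per team in a grouped bar chart.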

Data Visualization — Netflix Data Set

Welcome to the first post in the Data Visualization series, covering one of the most popular pastimes and sources of entertainment for people around the globe: Netflix. We will go through the dataset and get an overview of the content available on Netflix. Let's have an overview of the dataset. Description of columns of Netflix Dataset. Moving on, let's look at the data present in these columns. Netflix Data Overview. Let's move forward and start visualizing the data to get some insights. Firstly, let's see the number of shows of each type. Number of shows based on types. From the above graph, we can see that we have data on around 5,400 movies and 2,400 TV shows. This indicates that the number of movies released on Netflix is higher than the number of TV shows, and we can say Netflix is seen more as an alternative to cinema halls than to TV sets. Now let's have a look at the countries producing the most shows for Netflix. Top 20 co

ExploriPy -- Newer ways to Exploratory Data Analysis

Introduction. ExploriPy is yet another Python library used for Exploratory Data Analysis. This library caught our attention because it is quick and easy to implement, and its basics are simple to grasp. Moreover, the visuals it provides are self-explanatory and can be understood by any new user. The most interesting part, which we can't resist mentioning, is the easy grouping of the variables into different sections. This makes it more straightforward to understand and analyze our data. The four major sections presented are: Null Values, Categorical VS Target, Continuous VS Target, and Continuous VS Continuous.
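To make two of those sections concrete, here is a rough sketch in plain Python of what a "Null Values" summary and a "Categorical VS Target" summary compute. This is not ExploriPy's API. The sample records, column names, and both helper functions are hypothetical and exist only to illustrate the idea.

```python
from collections import defaultdict

# Hypothetical records; None marks a missing value.
rows = [
    {"city": "Pune", "age": 31, "target": 1},
    {"city": "Delhi", "age": None, "target": 0},
    {"city": "Pune", "age": 45, "target": 0},
    {"city": None, "age": 28, "target": 1},
]


def null_counts(records):
    """Count the missing (None) values in each column."""
    counts = defaultdict(int)
    for rec in records:
        for col, val in rec.items():
            counts[col] += 1 if val is None else 0
    return dict(counts)


def target_rate_by_category(records, column, target="target"):
    """Average the target within each category, skipping missing categories."""
    groups = defaultdict(list)
    for rec in records:
        if rec[column] is not None:
            groups[rec[column]].append(rec[target])
    return {cat: sum(vals) / len(vals) for cat, vals in groups.items()}


nulls = null_counts(rows)
rates = target_rate_by_category(rows, "city")
print(nulls)
print(rates)
```

A library report automates these summaries across every column and adds the matching plots, but the underlying quantities are of this kind.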

Automatic Visualization with AutoViz

We have discussed Exploratory Data Analysis, known as EDA, and have also seen a few powerful libraries that we can use extensively for it. EDA is a key step in Machine Learning, as it provides the starting point for our Machine Learning task. However, traditional data analysis techniques come with a lot of issues, and many new libraries are emerging to address them. One such library is AutoViz, which provides quick and easy visualization along with some insights about the data.

A Sweet way to Exploratory Data Analysis --- Sweetviz

Another day, another beautiful library for Exploratory Data Analysis (EDA). Having studied some great EDA libraries like Lux, D-tale, and pandas profiling, we are back with another great one, Sweetviz, which you can use for your Data Science project. Introduction. It is an open-source Python library and is still in the development phase. It already has some great features to offer, which is why we chose to bring it to you. Its sole purpose is to visualise and analyse data quickly. The best feature of this library is that it provides an option to compare two datasets, i.e. we can compare and analyse the test and training data together. That's not all; it's just the start. Let's dive deeper and see what more it has to offer us.
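The idea behind comparing training and test data can be illustrated without any library. The sketch below is plain Python and is not Sweetviz's API. It puts a couple of summary statistics for each shared column side by side, which is the kind of check a comparison report makes visible. The datasets, column names, and the helper `compare_datasets` are hypothetical.

```python
from statistics import mean

# Hypothetical column-oriented datasets, made up for illustration.
train = {"age": [22, 35, 41, 29], "fare": [7.5, 53.0, 8.0, 13.0]}
test = {"age": [30, 27, 50], "fare": [8.0, 26.0, 7.0]}


def compare_datasets(train_cols, test_cols):
    """Summarise each column present in both datasets, side by side."""
    report = {}
    for col in sorted(train_cols.keys() & test_cols.keys()):
        report[col] = {
            "train_rows": len(train_cols[col]),
            "test_rows": len(test_cols[col]),
            "train_mean": mean(train_cols[col]),
            "test_mean": mean(test_cols[col]),
        }
    return report


report = compare_datasets(train, test)
for col, stats in report.items():
    print(col, stats)
```

A large gap between the training and test summaries for a column is a hint that the two sets are distributed differently and deserves a closer look.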

EDA Techniques

We looked at the basics of EDA in our previous article, EDA - Exploratory Data Analysis. So now let's move ahead and look at how we can automate the process and at the various libraries used for it. We will focus on 7 major libraries. These are our personal favourites, and we prefer to use them most of the time. We will look into each library and cover the install, load, and analyse steps separately: D-tale, Pandas-Profiling, Lux, Sweetviz, Autoviz, ExploriPy, and Dora.

EDA ---- Exploratory Data Analysis

EDA - Exploratory Data Analysis is the technique of defining, analyzing, and investigating a dataset. This technique is used by most data scientists and engineers, and by everyone who works with data or wants to analyze it. That includes the vast majority of us, since at any point in time we are dealing with data and unknowingly doing an initial analysis of it, which in technical terms is referred to as "Exploratory Data Analysis". Here is a formal definition of EDA: In statistics, exploratory data analysis is an approach to analyzing data sets to summarize their main characteristics, often using statistical graphics and other data visualization methods. Still confused about how every one of us uses this process? Let me explain it with a simple example... Suppose you and your group plan for lunch in a restaurant... as soon as we hear "lunch" and "restaurant" our mind starts creating a list of all the known places, next as someon

One Click Data Visualization

What is Data Visualization? Data visualization, as the name suggests, is the creation of clear, attractive, and informative visuals from our data, which help us get more insights from it. It helps both us and anyone else who sees our analysis or report to read it better. Creating a good visualization helps us understand the data better and supports our machine learning journey. The data visualization process uses various graphs, graphics, and plots to explain the data and draw insights. DV is important for simplifying complex data by making it more accessible, understandable, and usable for its end users. If you want to know more about data visualization, you can read it here.