Showing posts with the label dataset

D-Tale -- One Stop Solution for EDA

D-Tale is a recently launched (Feb 2020) tool for Exploratory Data Analysis. It is built with Flask (for the back end) and React (for the front end) and provides a powerful tool for analysing and visualizing data.  D-Tale is a graphical user interface that is not only quick and easy to understand but also great fun to use. It comes packed with enough features to reduce the manual work Data Engineers/Scientists do when analysing and understanding data, and it removes the burden of hunting for multiple different EDA libraries.  Let's have a look at some of the features that make it so amazing: 1. Seamless Integration -- D-Tale integrates seamlessly with Python/IPython notebooks and terminals, so we can use it with almost any IDE of our choice. 2. Friendly UI -- The graphical user interface provided by D-Tale is simple and easy to understand, so anybody can get comfortable with it and start working right away.  3. Support...

SQL --- Structured Query Language

  What is SQL? Structured Query Language, also known as SQL, is the standard database language and one of the best-known and most in-demand technologies.  The language was developed specifically for database management, i.e. creating databases, inserting and updating records in them, managing access, and retrieving data. SQL is mostly used with Relational Database Management Systems.  Its demand is increasing every single day: as the volume of data grows, so does the need for SQL. It is used by web developers, data analysts, data engineers, and in every other field where we need to store and retrieve data.  One of the main reasons SQL is gaining popularity is that it is simple, easy, quick, and powerful. Another reason is that MySQL, one of the most commonly used SQL database systems, is open source (free). Another great feature of SQL is that it is a non-procedural language (explained in the next section).
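To make the four everyday tasks above concrete (creating a table, inserting records, updating them, and retrieving data), here is a minimal sketch. It assumes SQLite through Python's built-in `sqlite3` module rather than MySQL, purely so that it runs without installing a server; the `employees` table, its columns, and the sample rows are made up for illustration. Access management (for example `GRANT`) is left out because SQLite does not support it.

```python
import sqlite3


def run_demo():
    """Create, insert, update, and query a small table in an in-memory database."""
    conn = sqlite3.connect(":memory:")  # throwaway database, nothing is written to disk
    cur = conn.cursor()

    # Create: define the table structure (hypothetical example schema).
    cur.execute(
        "CREATE TABLE employees ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, salary REAL)"
    )

    # Insert: add a few made-up records.
    cur.executemany(
        "INSERT INTO employees (name, salary) VALUES (?, ?)",
        [("Asha", 50000.0), ("Ben", 42000.0), ("Chen", 61000.0)],
    )

    # Update: change one existing record.
    cur.execute("UPDATE employees SET salary = ? WHERE name = ?", (45000.0, "Ben"))

    # Retrieve: we only state WHAT we want (salary above 44000, highest first),
    # not HOW to loop over the rows. That is the non-procedural part.
    cur.execute(
        "SELECT name, salary FROM employees WHERE salary > ? ORDER BY salary DESC",
        (44000,),
    )
    rows = cur.fetchall()
    conn.close()
    return rows


print(run_demo())
```

Notice that the `SELECT` statement never says how to scan or sort the table; the database engine works that out on its own.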

EDA ---- Exploratory Data Analysis

EDA (Exploratory Data Analysis) is the technique of defining, analyzing, and investigating a dataset. It is used by most data scientists and engineers, and by everyone who works with data or wants to analyze it. That covers the vast majority of us: at any point in time we are dealing with data, and we unknowingly perform an initial analysis of it, which in technical terms is referred to as "Exploratory Data Analysis". Here is a formal definition of EDA:  In statistics, exploratory data analysis is an approach to analyzing data sets to summarize their main characteristics, often using statistical graphics and other data visualization methods.  Still confused about how every one of us uses this process? Let me explain it with a simple example. Suppose you and your group plan for lunch in a restaurant. As soon as we hear "lunch" and "restaurant", our mind starts creating a list of all the known places, next as someon...
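The formal definition talks about summarizing the "main characteristics" of a dataset. As a rough sketch of what that first pass can look like, the snippet below computes a few summary statistics using only Python's standard library. The `summarize` helper and the restaurant ratings are hypothetical and exist only for this illustration; a real EDA would usually lean on a dedicated library and add plots.

```python
import statistics


def summarize(values):
    """Return basic summary statistics for a list that may contain None (missing)."""
    present = [v for v in values if v is not None]
    return {
        "count": len(present),                  # values we can actually use
        "missing": len(values) - len(present),  # entries recorded as None
        "min": min(present),
        "max": max(present),
        "mean": statistics.mean(present),
        "median": statistics.median(present),
    }


# Hypothetical ratings for six restaurants; one rating is unknown.
ratings = [4.5, 3.0, None, 4.0, 5.0, 3.5]
summary = summarize(ratings)
print(summary)
```

Even this tiny summary already answers the first questions we tend to ask about a dataset: how many values there are, how many are missing, and where the typical value sits.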

One Click Data Visualization

What is Data Visualization?  Data visualization, as the name suggests, is creating clear, beautiful, and informative visuals from our data, which help us get more insights from it. It helps us, and any third person who sees our analysis or report, read it better. Creating a good visualization helps us understand the data better and helps in our machine learning journey.  The data visualization process uses various graphs, graphics, and plots to explain the data and surface insights. DV is important because it simplifies complex data by making it more accessible, understandable, and usable for its end users. If you want to know more about data visualization, you can Read It Here.

Defining, Analyzing, and Implementing Imputation Techniques

  What is Imputation? Imputation is a technique for replacing missing data with a substitute value so as to retain most of the data/information in the dataset. These techniques are used because removing incomplete data from the dataset every time is not feasible: it can reduce the size of the dataset to a large extent, which not only raises concerns about biasing the dataset but can also lead to incorrect analysis. Fig 1:- Imputation. Not sure what missing data is, how it occurs, and what its types are? Have a look  HERE  to know more about it. Let’s understand the concept of imputation from the above figure (Fig 1). In the image, I have tried to represent the missing data in the left table (marked in red); using imputation techniques, we have filled in the missing data in the right table (marked in yellow) without reducing the actual size of the dataset. If we notice here we have increased the column size, which is possible in Imputation(Adding “Missing” catego...
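As a small sketch of the idea, the snippet below fills missing values without dropping any rows. It shows two common approaches, which are assumptions chosen for illustration and not necessarily the exact techniques in Fig 1: replacing missing numbers with the column mean, and replacing missing categories with a "Missing" label. It also builds an extra indicator column, which is one way the column count can grow during imputation. The helper names and sample values are made up.

```python
import statistics


def impute_mean(values):
    """Replace None in a numeric column with the mean of the observed values."""
    present = [v for v in values if v is not None]
    fill = statistics.mean(present)
    return [fill if v is None else v for v in values]


def impute_missing_category(values, label="Missing"):
    """Replace None in a categorical column with an explicit label."""
    return [label if v is None else v for v in values]


def missing_indicator(values):
    """Extra column that records which entries were originally missing."""
    return [v is None for v in values]


# Hypothetical columns with gaps (None marks a missing entry).
ages = [25, None, 35, 30, None]
cities = ["Pune", None, "Delhi", "Pune", None]

ages_filled = impute_mean(ages)                  # gaps become the mean, 30
cities_filled = impute_missing_category(cities)  # gaps become "Missing"
ages_was_missing = missing_indicator(ages)       # new column of True/False flags

print(ages_filled, cities_filled, ages_was_missing)
```

All three output columns have the same five rows as the input, so the dataset keeps its original size.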

Missing Data -- Understanding The Concepts

  Introduction Machine Learning is a big, fascinating term that attracts a lot of people, and knowing all that we can achieve through it makes our sci-fi imagination jump to another level. No doubt, it is a great field, and we can achieve everything from an automated reply system to a house-cleaning robot, from recommending a movie or a product to helping detect disease. Many of the things that we see today have already started using ML to better themselves. Though building a model is quite easy, the most challenging task is preprocessing the data and filtering out the data of use. So, here I am going to address one of the biggest and most common issues that we face at the start of the journey of making a good ML model, which is  Missing Data . Missing data can cause many issues and can lead to wrong predictions from our model, which makes it look like the model has failed and forces us to start over again. If I have to explain in simple terms, data is like ...