Posts

Showing posts with the label quickdatascience

ExploriPy -- Newer ways to Exploratory Data Analysis

Introduction: ExploriPy is yet another Python library for Exploratory Data Analysis. It caught our attention because it is quick and easy to implement, and its basics are simple to grasp. Moreover, the visuals it produces are self-explanatory and easy for any new user to follow. The most interesting part, which we can't resist mentioning, is the easy grouping of variables into different sections. This makes it more straightforward to understand and analyze our data. The four major sections presented are: Null Values, Categorical vs Target, Continuous vs Target, and Continuous vs Continuous.
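To make the four-section grouping concrete, here is a minimal sketch in plain Python. It is not ExploriPy's API; the toy records, the column names and the helper functions `null_counts` and `split_by_type` are all hypothetical and only illustrate the idea of counting nulls and splitting features into categorical and continuous groups before comparing them with a target.

```python
# Hypothetical toy records; None marks a missing value.
rows = [
    {"age": 34, "city": "Pune", "income": 52000.0, "churn": 0},
    {"age": None, "city": "Delhi", "income": 61000.0, "churn": 1},
    {"age": 29, "city": None, "income": None, "churn": 0},
    {"age": 41, "city": "Pune", "income": 58000.0, "churn": 1},
]


def null_counts(records):
    """Count missing (None) entries per column."""
    counts = {}
    for record in records:
        for column, value in record.items():
            counts[column] = counts.get(column, 0) + (value is None)
    return counts


def split_by_type(records, target):
    """Group non-target columns into categorical (text) and continuous (numeric)."""
    categorical, continuous = [], []
    for column in records[0]:
        if column == target:
            continue
        values = [r[column] for r in records if r[column] is not None]
        # Assumption: a column whose present values are all strings is categorical.
        if all(isinstance(v, str) for v in values):
            categorical.append(column)
        else:
            continuous.append(column)
    return categorical, continuous


nulls = null_counts(rows)
categorical, continuous = split_by_type(rows, target="churn")
print("Null values:", nulls)
print("Categorical vs target:", categorical)
print("Continuous vs target:", continuous)
```

A real EDA library does far more (plots, statistical tests), but the grouping step it starts from looks roughly like this.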

The Explorer of Data Sets -- Dora

Exploring the dataset is both fun and tedious, but it is an inevitable step in the Machine Learning journey. The challenge is always the correctness, completeness and timely analysis of the data. To overcome these issues, a lot of libraries are available, each with its own advantages and disadvantages. We have already discussed a few of them (Pandas profiling, dtale, autoviz, lux, sweetviz) in previous articles. Today, we would like to present a new library for Exploratory Data Analysis: Dora. Calling it only an EDA library would not do it justice, as it not only helps explore the dataset but also helps adjust the data for modelling.

Automatic Visualization with AutoViz

We have discussed Exploratory Data Analysis, known as EDA, and have also seen a few powerful libraries that we can use extensively for EDA. EDA is a key step in Machine Learning, as it provides the starting point for our Machine Learning task. However, traditional Data Analysis techniques come with a lot of issues, and many new libraries are appearing to address them. One such API is AutoViz, which provides quick and easy visualization along with some insights about the data.

A Sweet Way to Exploratory Data Analysis --- Sweetviz

Another day, another beautiful library for Exploratory Data Analysis (EDA). Having studied some great EDA libraries like Lux, D-tale and pandas profiling, we are back with another great API, 'SWEETVIZ', which you can use for your Data Science project. Introduction: It is an open-source Python library and is still in the development phase. It already has some great features to offer, which is why we chose to bring it to you. Its sole purpose is to visualise and analyse data quickly. The best feature of this API is that it provides an option to compare two datasets, i.e. we can compare and analyse the test and training data together. That's not all; it's just the start. Let's dive deeper and see what more it has to offer.
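The idea of comparing two datasets side by side can be sketched with the standard library alone. This is not Sweetviz's API; the `summarize` and `compare` helpers and the toy `train`/`test` columns below are hypothetical and only show what kind of per-column summary such a comparison is built on.

```python
from statistics import mean, stdev


def summarize(values):
    """Summarize one column, ignoring missing (None) entries."""
    present = [v for v in values if v is not None]
    return {
        "count": len(present),
        "missing": len(values) - len(present),
        "mean": mean(present),
        "stdev": stdev(present),  # needs at least two present values
    }


def compare(train, test):
    """Return train/test summaries for every column found in both datasets."""
    report = {}
    for column in train:
        if column in test:
            report[column] = {
                "train": summarize(train[column]),
                "test": summarize(test[column]),
            }
    return report


# Hypothetical toy data: one numeric column per dataset.
train = {"age": [34, 29, 41, None, 38]}
test = {"age": [30, 45, None]}

report = compare(train, test)
for column, parts in report.items():
    print(column, "train:", parts["train"])
    print(column, "test: ", parts["test"])
```

A large gap between the train and test summaries of the same column is the kind of signal a comparison report is meant to surface.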

One Click Data Visualization

What is Data Visualization? Data Visualization, as the name suggests, is the creation of clear, attractive and informative visuals from our data, which helps us get more insights from it. It helps both us and anyone else who reads our analysis or report to understand it better. Creating a good visualization helps us understand the data better and supports our machine learning journey. The data visualization process uses various graphs, graphics and plots to explain the data and draw insights. Data visualization is important because it simplifies complex data by making it more accessible, understandable, and usable for its end users. If you want to know more about data visualization, you can read it here.

Anaconda -- How to install in 5 steps in Windows

An easy-to-follow guide for installing Anaconda on Windows 10. (Image taken from Google Images.)

1. Prerequisites

Hardware Requirement
* RAM: minimum 8 GB; if your system has an SSD, 4 GB of RAM would also work.
* CPU: minimum quad-core, at least 1.80 GHz

Operating System
* Windows 8 or later

System Architecture
* Windows 64-bit x86 or 32-bit x86

Space
* Minimum 5 GB of disk space to download and install

Anaconda

We need to download Anaconda from HERE. Opening the link brings up the Anaconda web page. Click "Get Started" to continue. The next step is to click "Download Installer" to proceed. Select the correct version for your system's architecture; I will be using the 64-bit installer (477 MB). Your download should now start; it will take some time.

Let's catch up in the 2nd section (Unzip and Install).
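Before downloading, the architecture and disk-space prerequisites listed above can be checked with a few lines of Python, if an interpreter is already available. This is only a sketch: the function name `check_prerequisites` is hypothetical, and treating a machine name ending in "64" as 64-bit is an assumption that holds for common values such as `AMD64` and `x86_64`.

```python
import platform
import shutil

MIN_FREE_GB = 5  # minimum disk space quoted in the prerequisites


def check_prerequisites(path="."):
    """Report OS, architecture and free disk space for the given path."""
    free_gb = shutil.disk_usage(path).free / 1024**3
    machine = platform.machine()
    return {
        "system": platform.system(),         # e.g. "Windows"
        "machine": machine,                  # e.g. "AMD64"
        "is_64bit": machine.endswith("64"),  # assumption, see lead-in
        "free_gb": round(free_gb, 1),
        "enough_disk": free_gb >= MIN_FREE_GB,
    }


result = check_prerequisites()
for key, value in result.items():
    print(f"{key}: {value}")
```

If `is_64bit` is true, the 64-bit installer is the one to pick; otherwise choose the 32-bit installer.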

Missing Data -- Understanding The Concepts

Introduction: Machine Learning is a big, fascinating term that attracts a lot of people, and knowing what we can achieve through it takes our sci-fi imagination to another level. No doubt it is a great field, and we can achieve everything from an automated reply system to a house-cleaning robot, from recommending a movie or a product to helping detect disease. Most of the things we see today have already started using ML to better themselves. Though building a model is quite easy, the most challenging task is preprocessing the data and filtering out the Data of Use. So, here I am going to address one of the biggest and most common issues that we face at the start of the journey of making a Good ML Model: The Missing Data. Missing Data can cause many issues and can lead our model to wrong predictions, making it look as though the model has failed and sending us back to the start. If I have to explain in simple terms, data is like Fuel of our Mo

Spark — How to install in 5 Steps in Windows 10

An easy-to-follow guide for installing Spark on Windows 10. (Image taken from Google Images.)

1. Prerequisites

Hardware Requirement
* RAM: minimum 8 GB; if your system has an SSD, 4 GB of RAM would also work.
* CPU: minimum quad-core, at least 1.80 GHz

JRE 1.8: offline installer for JRE

Java Development Kit: 1.8

Software for unzipping, like 7Zip or WinRAR
* I will be using 64-bit Windows for the process; please check and download the version supported by your system (x86 or x64) for all the software.

Hadoop
* I am using Hadoop-2.9.2; you can also use any other STABLE version of Hadoop.
* If you don't have Hadoop, you can refer to Hadoop: How to install in 5 Steps in Windows 10 for installing it.

MySQL Query Browser

Download Spark Zip
* I am using Spark 3.1.1; you can also use any other STABLE version of Spark.
* At the time of writing, the latest release of Spark is 3.1.2 (shown in the image below), released in June '21.

Fig 1:- Download Spark-3.1.2
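A quick way to see whether the Java and Hadoop prerequisites above are already in place is to look for the `java` command and the usual environment variables. This is a minimal sketch: the helper name `check_tools` is hypothetical, and the assumption that `JAVA_HOME` and `HADOOP_HOME` are set is a common convention rather than something stated in this guide.

```python
import os
import shutil

REQUIRED_COMMANDS = ["java"]                      # should be reachable on PATH
EXPECTED_ENV_VARS = ["JAVA_HOME", "HADOOP_HOME"]  # assumed convention


def check_tools(commands=REQUIRED_COMMANDS, env_vars=EXPECTED_ENV_VARS):
    """Report which commands are on PATH and which variables are set."""
    return {
        "commands": {name: shutil.which(name) is not None for name in commands},
        "env_vars": {name: bool(os.environ.get(name)) for name in env_vars},
    }


status = check_tools()
for kind, results in status.items():
    for name, ok in results.items():
        print(f"{kind[:-1]} {name}: {'found' if ok else 'missing'}")
```

Anything reported as missing should be installed or configured before moving on to the Spark download.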

SQOOP — How to install in 5 Steps in Windows 10

An easy-to-follow guide for installing SQOOP on Windows 10. (Image taken from Google Images.)

1. Prerequisites

Hardware Requirement
* RAM: minimum 8 GB; if your system has an SSD, 4 GB of RAM would also work.
* CPU: minimum quad-core, at least 1.80 GHz

JRE 1.8: offline installer for JRE

Java Development Kit: 1.8

Software for unzipping, like 7Zip or WinRAR
* I will be using 64-bit Windows for the process; please check and download the version supported by your system (x86 or x64) for all the software.

Hadoop
* I am using Hadoop-2.9.2; you can also use any other STABLE version of Hadoop.
* If you don't have Hadoop, you can refer to Hadoop: How to install in 5 Steps in Windows 10 for installing it.

MySQL Query Browser

Download SQOOP Zip
* I am using SQOOP-1.4.7; you can also use any other STABLE version of SQOOP.

Fig 1:- Download Sqoop 1.4.7