Here we will cover six popular statistical plots, along with their use cases in statistics, data science, and machine learning.

A data scientist spends around 70% of their time on EDA (Exploratory Data Analysis) and data pre-processing. During EDA, we rely heavily on visualization to build intuition about the data.

We will work with some basic, popular plots and try to draw conclusions about the data from them. We will also see where these plots are useful during data pre-processing.

Bar Plot

What kind of data is shown in a bar plot?

  • Categorical data: Both…
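As a rough illustration of categorical data in a bar plot, the sketch below counts how often each category occurs and prints one text bar per category. The `fruits` list is made-up example data, and the text bars are only a stand-in for a real plotting library.

```python
from collections import Counter

# Hypothetical categorical data: one label per observation.
fruits = ["apple", "banana", "apple", "cherry", "banana", "apple"]

# Count how many times each category appears.
counts = Counter(fruits)

# One bar per category; the bar length is the category's count.
for category, count in counts.most_common():
    print(f"{category:<8}{'#' * count} ({count})")
```

Each bar's height corresponds to a category's frequency, which is the basic idea a bar plot conveys.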


In this blog, we will go through some basic concepts: population, sample, and sampling. :)

Population vs. Sample

At the beginning of every analysis there is data. Sometimes we get the data from a reliable source, and sometimes we need to collect it ourselves. Whichever the case, we need to know a few things about the data we collect. Population and sample are among the most important of these: for example, do we need a sample or the whole population?

Population

A population includes all members of a specified group: all possible outcomes or measurements that are of interest.

EX…
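To make the distinction between a population and a sample concrete, here is a minimal sketch using only Python's standard library. The population of 10,000 simulated heights, the sample size of 100, and the fixed seed are all assumptions chosen for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: heights (cm) of all 10,000 members of some group.
population = [random.gauss(170, 10) for _ in range(10_000)]

# A sample is a subset of the population (here, 100 members drawn without replacement).
sample = random.sample(population, 100)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"population mean: {population_mean:.2f}")
print(f"sample mean:     {sample_mean:.2f}")
```

The sample mean will usually land close to the population mean without matching it exactly, which is why a well-drawn sample can often stand in for the whole population.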


Wine is the only artwork you can drink (quoted by Luis Fernando Olliveri)

This quote reflects the standing of wine in our society. It is not only a drink but a lifestyle in itself. Therefore, the quality of a wine really matters.

Generally, wine quality is scored based on sensory data. But if we have physicochemical data about the wine, such as acidity, density, and sugar content, then we can also predict the quality score.

As some of you might have guessed, here we can use a machine learning or neural network model to do so. We…


Data storytelling is an important part of data analysis. While the first step is understanding the data, during our storytelling we have to present what we have actually understood. Before jumping into code and building ML models, it is really important that we are clear about the data, its structure, and the problem statement.

During the presentation of our work and findings, there may be people who are not closely involved with data science or machine learning and who do not understand much of your code. They will only be interested in the problem statement, the intuition you are able…


PyTorch is an open source machine learning library based on the Torch library, used for various deep learning applications such as computer vision and natural language processing, primarily developed by Facebook’s AI Research lab (FAIR).

TensorFlow, Keras, MXNet, and Caffe2 are some of the alternatives to PyTorch.

PyTorch tensors

A tensor is an n-dimensional data container. It is quite similar to NumPy’s ndarray. For example, a 1-d tensor is a vector, a 2-d tensor is a matrix, a 3-d tensor is a cube, and a 4-d tensor is a vector of cubes.

These tensors are used as basic building blocks of a deep learning network.
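Since tensors are described above as being similar to NumPy's ndarray, the dimension examples can be sketched with NumPy arrays as a stand-in. This is only an illustration: the shapes are arbitrary choices, and the code uses NumPy rather than PyTorch (PyTorch offers analogous calls such as `torch.zeros`).

```python
import numpy as np

# NumPy arrays stand in for tensors here; the shapes are arbitrary illustrative choices.
vector = np.zeros(3)             # 1-d: a vector
matrix = np.zeros((2, 3))        # 2-d: a matrix
cube = np.zeros((2, 3, 4))       # 3-d: a cube
cubes = np.zeros((5, 2, 3, 4))   # 4-d: a vector of 5 cubes

for name, arr in [("vector", vector), ("matrix", matrix), ("cube", cube), ("cubes", cubes)]:
    print(name, arr.ndim, arr.shape)
```

Indexing the 4-d array along its first axis, for example `cubes[0]`, gives back a single 3-d cube, which is what "a vector of cubes" means.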

Creating a tensor

We can create tensors using…

Subham Kumar Sahoo

Learning, Sharing and Enjoying Data Science!! 😀 LinkedIn: https://www.linkedin.com/in/subham-kumar-sahoo-55563a136/
