Decision Tree is one of the most powerful and popular algorithms. Decision-tree algorithms fall under the category of supervised learning algorithms. It works for both continuous as well as categorical output variables.

Import modules and Data:

In order to prepare data, train, evaluate, and visualize a decision tree, we will…

Docker is a tool designed to make it easier to create, deploy, and run applications by using containers. Containers allow a developer to package up an application with all of the parts it needs, such as libraries and other dependencies, and deploy it as one package. …

One popular interpretation of big data refers to extremely large data sets. A National Institute of Standards and Technology defined big data as consisting of “extensive datasets — primarily in the characteristics of volume, velocity, and/or variability — that require a scalable architecture for efficient storage, manipulation, and analysis.” …

Generally, while doing programming in any programming language, you need to use various variables to store various information. Variables are nothing but reserved memory locations to store values. This means that, when you create a variable you reserve some space in memory.

In contrast to other programming languages like C…

In the world of working with data there are a couple different ways to handle the data and perform the necessary analysis on them. The two most common coding languages seen used are Python and R, but what is the difference? …

Data visualization is the discipline of trying to understand data by placing it in a visual context so that patterns, trends and correlations that might not otherwise be detected can be exposed.

Python offers multiple great graphing libraries that come packed with lots of different features. No matter if you…

With the advancement of technology people have started using voice commands and automated text responses to interact through the internet. The most common examples of this is seen with Amazon Alexa, Apple’s Siri, or through automated computer chat boxes. All this is founded upon Natural Language Processing or NLP for…

Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is most often used by scientists to test specific predictions, called hypotheses, that arise from theories.

There are 5 main steps in hypothesis testing:

  1. State your research hypothesis as a null (Ho) and alternate…

As a Data Scientist, when you work with data it is often you will encounter different statistical distributions. Part of the work as a Data Scientist is selecting which distribution best represents a given set of data. …

Blake Tolman

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store