Machine Learning Algorithms: Which, When and Where

After successfully completing the Introduction to Machine Learning course on Udacity and gaining a basic understanding of machine learning algorithms, I am using this blog to consolidate the algorithms along with their specializations. There are several algorithms that make machine learning easier, and most of them are well supported in Python through scikit-learn (sklearn). Based on what I learned in the course, I have grouped the topics into a few classifications. There are four ideas behind every machine learning process:

(i) Dataset/Question

(ii) Features

(iii) Algorithms

(iv) Evaluation

(i) Dataset/Question

Before diving into the actual work of a machine learning application, collecting enough relevant data is important, as this is what gets the project started. The more we study the data, the better we can train our systems. So collecting relevant data is arguably the most essential part of starting an application.
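For example, here is a minimal sketch of pulling a ready-made dataset into Python with scikit-learn and holding some of it back for later evaluation (the dataset and split sizes are just illustrative, not from the course):

```python
# A minimal, illustrative sketch: load a bundled dataset with scikit-learn
# and split it so some data is held back for evaluating what we train.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

data = load_iris()                      # features (X) and labels (y)
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.3, random_state=42)

print(X_train.shape, X_test.shape)      # how much data we train and test on
```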

TensorFlow Introduction

I started learning TensorFlow last weekend, and this is my first blog post on it. I went through a YouTube video that gave a high-level overview of TensorFlow, using a neural network to classify images of handwritten digits. Having already come across several algorithms that Python and scikit-learn handle efficiently, reducing complex processes to much easier steps, I hope TensorFlow's neural network support will likewise help turn complex classification problems into simpler ones. Continue reading “TensorFlow Introduction”
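As a rough sketch (my own illustration, not the video's code) of what such a handwritten-digit classifier looks like, assuming TensorFlow 2.x with its bundled MNIST data:

```python
# A sketch of a small neural network for handwritten-digit images,
# using tf.keras (assumes TensorFlow 2.x is installed).
import tensorflow as tf

# Load the MNIST handwritten-digit images bundled with Keras.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0   # scale pixels to [0, 1]

# Flatten the 28x28 image, one hidden layer, 10-way softmax output (digits 0-9).
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

model.fit(x_train, y_train, epochs=5)
print(model.evaluate(x_test, y_test))   # loss and accuracy on unseen images
```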

Evaluation Metrics

The last part of the Machine Learning course covers validation and evaluation. This is where we check how trustworthy our results really are, so it is very important for making sure that our algorithm/machine learning process is doing what we planned. The validation step can also reveal critical errors in the process. Continue reading “Evaluation Metrics”
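To illustrate the kind of checks this involves, here is a hedged sketch of evaluating a classifier on a held-out test set with scikit-learn (the dataset and classifier are placeholders, not the course's):

```python
# A sketch of validating a classifier on held-out data with scikit-learn.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, confusion_matrix)

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

clf = GaussianNB().fit(X_train, y_train)
pred = clf.predict(X_test)

# Compare predictions against the true labels we held back.
print("accuracy :", accuracy_score(y_test, pred))
print("precision:", precision_score(y_test, pred, average="macro"))
print("recall   :", recall_score(y_test, pred, average="macro"))
print(confusion_matrix(y_test, pred))
```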

Principal Component Analysis in Machine Learning

Principal Component Analysis (PCA) is a systematized way to transform input features into principal components, which are then used as the new features. Principal components are directions in the data that maximize variance (and thereby minimize information loss) when we project/compress the data down onto them. The more variance the data shows along a PC, the higher that PC is ranked. Continue reading “Principal Component Analysis in Machine Learning”
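A minimal sketch of this with scikit-learn's PCA (the data is just a placeholder); the explained variance ratio is what ranks the components:

```python
# A sketch of PCA with scikit-learn: project features onto the directions
# of maximum variance and use the projections as new features.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)

pca = PCA(n_components=2)          # keep the top two principal components
X_new = pca.fit_transform(X)       # the new, compressed features

# Higher variance along a component means it is ranked higher.
print(pca.explained_variance_ratio_)
print(X_new.shape)
```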

Clustering and Feature Scaling

Lessons 9 and 10 in the course cover Clustering and Feature Scaling.

Clustering:

Clustering comes under unsupervised learning. Unsupervised learning matters because most of the data we encounter in the real world doesn't have labels attached to it; when that is the case, we turn to unsupervised learning techniques. Continue reading “Clustering and Feature Scaling”
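Here is a small sketch tying the two lessons together, assuming scikit-learn: rescale the features first (k-means is distance-based, so scaling matters), then cluster the unlabeled data. The toy numbers are my own illustration:

```python
# A sketch combining feature scaling and clustering with scikit-learn.
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.cluster import KMeans

# Two features on very different scales (e.g. salary vs. years of service).
X = np.array([[20000, 1], [25000, 2], [120000, 15], [110000, 12]], dtype=float)

# MinMaxScaler rescales each feature to [0, 1] so no feature dominates the distance.
X_scaled = MinMaxScaler().fit_transform(X)

kmeans = KMeans(n_clusters=2, n_init=10, random_state=42).fit(X_scaled)
print(kmeans.labels_)           # which cluster each point was assigned to
print(kmeans.cluster_centers_)
```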

Outliers in Regression

As we have seen, regression is one of the popular machine learning algorithms, and I covered the errors and performance of regression in my previous blog. Outliers affect regression just as they do algorithms like Support Vector Machines or the Naive Bayes classifier. An outlier is a data point that lies far away from the regression line. But I had a question: is it necessary to remove the outliers and fit again? The answer is here. Continue reading “Outliers in Regression”
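One common way to handle outliers, sketched below (my own illustration, not the course code), is to fit once, drop the points with the largest residual errors, and fit again:

```python
# A sketch of one outlier-rejection strategy for regression:
# fit, remove the ~10% of points with the largest residuals, refit.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(100, 1))
y = 3 * X.ravel() + rng.normal(0, 1, 100)
y[:5] += 30                               # inject a few obvious outliers

reg = LinearRegression().fit(X, y)
residuals = np.abs(y - reg.predict(X))

# Keep the 90% of points that the first fit explains best.
keep = residuals.argsort()[: int(0.9 * len(y))]
reg_clean = LinearRegression().fit(X[keep], y[keep])

print("slope before cleaning:", reg.coef_[0])
print("slope after cleaning :", reg_clean.coef_[0])
```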

Regression in Machine Learning

I have just started Lesson 7 of the Introduction to Machine Learning Udacity course: Regression. Linear regression is one of the continuous supervised learning methods in machine learning. In other words, it is used to predict a dependent variable (y) from the values of independent variables (X). It can be used whenever we want to predict a continuous quantity, such as the traffic in a mall or dwell time (the time spent in the same position, area, or stage of a process). Continue reading “Regression in Machine Learning”
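A minimal sketch of continuous prediction with scikit-learn's LinearRegression (the dwell-time numbers below are made up for illustration):

```python
# A sketch of linear regression: predict a continuous quantity (y)
# from an independent variable (X). The numbers are made up.
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1], [2], [3], [4], [5]])      # e.g. stage of a process
y = np.array([4.1, 8.3, 11.9, 16.2, 20.1])   # e.g. dwell time in minutes

reg = LinearRegression().fit(X, y)

print("slope     :", reg.coef_[0])
print("intercept :", reg.intercept_)
print("r-squared :", reg.score(X, y))        # how well the line fits
print("prediction for stage 6:", reg.predict([[6]])[0])
```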

Enron Scam: Analysing Emails

At the start of Lesson 6, I was quite eager to take on the mini-project analysing the dataset of emails sent by employees before the fraud happened, as the instructors explain how machine learning can help in situations like Enron, an energy trading company scrutinised by the U.S. government as a fraudulent enterprise worth multi-millions. Continue reading “Enron Scam: Analysing Emails”