Saturday, March 31, 2018

MACHINE LEARNING | LINEAR REGRESSION

MACHINE LEARNING - DAY 2



LINEAR REGRESSION

Notations:

m: number of training examples

x: input variable / feature

y: output variable / target variable


(x(i), y(i)): ith training example

Linear regression is a supervised learning technique: given a data set, our aim is to learn a function, or hypothesis, h: x -> y, so that h(x) is a “good” predictor for the corresponding value of y.





Hypothesis for linear regression:

hΘ(x) = Θ0 + Θ1x
Or

h(x) = Θ0 + Θ1x

Where,

h(x): hypothesis for the problem

Θ0: constant or the intercept

Θ1: the slope of the line
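As a quick illustration, here is a minimal Python sketch of this hypothesis (the function name and sample numbers are my own):

def hypothesis(theta0, theta1, x):
    # h(x) = Θ0 + Θ1*x: the intercept plus the slope times the input
    return theta0 + theta1 * x

print(hypothesis(1.0, 2.0, 3.0))  # intercept 1, slope 2, input 3 -> prints 7.0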

COST FUNCTION

The accuracy of the prepared hypothesis can be measured using a cost function, which takes the average of the squared differences between the hypothesis's predictions for the inputs x and the actual outputs y.

Given below is the required cost function:

J(Θ0, Θ1) = (1/2m) * Σ (hΘ(x(i)) - y(i))²,  summed over i = 1, …, m

This function is called the Squared Error Function.

Now, hΘ(x(i)) - y(i) is the difference between the predicted result for the input x(i) and the real output y(i). Squaring each difference and summing over all m training examples gives the total squared error between the predicted outputs and the real outputs.

The factor of 1/2 is included to simplify the calculations: it cancels the 2 that appears when the squared term is differentiated, as we will see in gradient descent.
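Here is a minimal Python sketch of this cost function (the variable names are my own):

def cost(theta0, theta1, xs, ys):
    # J(Θ0, Θ1): half the average of the squared prediction errors
    m = len(xs)
    total = 0.0
    for x, y in zip(xs, ys):
        error = (theta0 + theta1 * x) - y  # hΘ(x(i)) - y(i)
        total += error ** 2
    return total / (2 * m)

print(cost(0.0, 1.0, [1, 2, 3], [1, 2, 3]))  # a perfect fit, so it prints 0.0

Notice that a perfect fit gives a cost of 0, which is exactly the situation described under AIM below.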


AIM: To minimize the cost function J(Θ0, Θ1)

Hypothesis : hΘ(x) = Θ0 + Θ1x

Cost function:

J(Θ0, Θ1) = (1/2m) * Σ (hΘ(x(i)) - y(i))²,  summed over i = 1, …, m

To minimize the cost function, we want the line that best fits the data. In the ideal case the line passes through every point in the training set, and then

J(Θ0, Θ1) = 0

as the distance between each predicted value and the actual value is zero.

Note: since J(Θ0, Θ1) is a function of the two parameters Θ0 and Θ1, plotting it requires a 3D surface, so contour plots are used to show all three values in two dimensions. The smallest ellipse in the contour plot marks the global minimum, which corresponds to the best fit for the given hypothesis.
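Such a contour plot can be drawn with a short matplotlib sketch like the one below (the toy data set is my own invention):

import numpy as np
import matplotlib.pyplot as plt

# Toy training set, roughly y = 1 + 2x
xs = np.array([1.0, 2.0, 3.0, 4.0])
ys = np.array([3.1, 4.9, 7.2, 8.8])
m = len(xs)

# Evaluate J(Θ0, Θ1) on a grid of parameter values
T0, T1 = np.meshgrid(np.linspace(-2, 4, 100), np.linspace(0, 4, 100))
J = sum((T0 + T1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

plt.contour(T0, T1, J, levels=30)  # nested ellipses around the global minimum
plt.xlabel("theta0")
plt.ylabel("theta1")
plt.show()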

Now the question arises: how do we find accurate values of (Θ0, Θ1)?

The answer is a technique called Gradient Descent, which is our next topic.

GRADIENT DESCENT

Gradient Descent is used to find the Θi values, for i = 0, 1, …, n, that minimize the cost function J(Θ0, …, Θn).

Formula of Gradient Descent:

repeat until convergence {
    Θj := Θj - α * (∂/∂Θj) J(Θ0, Θ1)    (for j = 0 and j = 1)
}

where α is the learning rate.


Functioning of Gradient Descent: starting from some initial (Θ0, Θ1), each update takes a small step downhill on the surface of J(Θ0, Θ1), against the direction of the slope, until it settles at a minimum.

Note: If α is too small, gradient descent will take a very long time to converge to the global minimum.

Note: If α is too large, then instead of converging to the global minimum it can overshoot and start diverging.

So the choice of the learning rate α is very important.
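To see why, take the toy one-parameter cost J(Θ) = Θ² (my own example): its derivative is 2Θ, so the update becomes Θ := Θ - α*2Θ = (1 - 2α)Θ.

theta_small, theta_large = 1.0, 1.0
for _ in range(5):
    theta_small -= 0.1 * 2 * theta_small  # α = 0.1: Θ is multiplied by 0.8 each step
    theta_large -= 1.5 * 2 * theta_large  # α = 1.5: Θ flips sign and doubles each step
print(theta_small)  # about 0.33, creeping toward the minimum at 0
print(theta_large)  # -32.0, diverging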

Since gradient descent re-evaluates the partial derivatives of J at the current (Θ0, Θ1) on every iteration, both parameters must be updated simultaneously, i.e., the new Θ1 is computed from the old Θ0 rather than the freshly updated one:

temp0 := Θ0 - α * (∂/∂Θ0) J(Θ0, Θ1)
temp1 := Θ1 - α * (∂/∂Θ1) J(Θ0, Θ1)
Θ0 := temp0
Θ1 := temp1
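In Python, one such simultaneous update step might look like the following minimal sketch (it uses the standard partial derivatives of the squared error cost; the names are my own):

def gradient_step(theta0, theta1, xs, ys, alpha):
    m = len(xs)
    # ∂J/∂Θ0 = (1/m) Σ (h(x) - y)   and   ∂J/∂Θ1 = (1/m) Σ (h(x) - y) * x
    grad0 = sum((theta0 + theta1 * x) - y for x, y in zip(xs, ys)) / m
    grad1 = sum(((theta0 + theta1 * x) - y) * x for x, y in zip(xs, ys)) / m
    # Both new values are computed from the old parameters, then returned together
    return theta0 - alpha * grad0, theta1 - alpha * grad1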
Now let's see how the parameter values (Θ0, Θ1) are found.

(Picture a plot of J against a parameter, with the current position at a point A on the curve and the global minimum at a point B.)

In the first case, point A lies where the slope of the curve, which gives the value of the partial derivative of J(Θ0, Θ1), is positive, so the gradient descent formula decreases the parameter values:

Θ0 - (+ve) = decrease in the value of Θ0

Θ1 - (+ve) = decrease in the value of Θ1

and the point eventually reaches point B, the global minimum.

Similarly, in the second case, point A lies where the slope, and hence the partial derivative of J(Θ0, Θ1), is negative, so the gradient descent formula increases the parameter values:

Θ0 - (-ve) = increase in the value of Θ0

Θ1 - (-ve) = increase in the value of Θ1

and the point eventually reaches point B, the global minimum.



As the point reaches point B, the derivative becomes 0, because at a minimum the tangent to the curve is flat. The update term α * (∂/∂Θj) J(Θ0, Θ1) is then zero, so the values of (Θ0, Θ1) stop changing, and those are the required values for our parameters.

In this way gradient descent figures out which values of (Θ0, Θ1) give the hypothesis the best fit.

This variant is also called Batch Gradient Descent, since every single update of (Θ0, Θ1) looks at the entire batch of training data, and the updates are repeated until the global minimum is reached.
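Putting everything together, a minimal batch gradient descent loop might look like this sketch, reusing the gradient_step function from above (the toy data, starting point, learning rate, and iteration count are all assumptions for illustration):

xs = [1.0, 2.0, 3.0, 4.0]  # toy inputs, roughly y = 1 + 2x
ys = [3.1, 4.9, 7.2, 8.8]

theta0, theta1 = 0.0, 0.0  # initial guess
alpha = 0.05               # learning rate
for _ in range(2000):      # fixed iteration budget instead of a convergence test
    theta0, theta1 = gradient_step(theta0, theta1, xs, ys, alpha)

print(theta0, theta1)  # approaches the best-fit line, about Θ0 = 1.15 and Θ1 = 1.94 here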



That's all for day 2. Next we will learn about linear regression with multiple variables in DAY 3.
      
If you feel this article helped you in any way, do not forget to share it, and if you have any thoughts or doubts, write them in the comments section.

Till then, Happy Learning!
