Machine Learning- A Student Guide: MACHINE LEARNING

MACHINE LEARNING - DAY 7

DECISION BOUNDARY FOR LOGISTIC REGRESSION

For the basics, you can check the earlier articles.

Terms used in this article can be understood from:

DECISION BOUNDARY

The decision boundary is the curve that divides the data into two regions: one where we predict y = 1 and the other where we predict y = 0.

h_Θ(x) = g(Θ^TX) = P(y = 1 | x; Θ )

g(z) = 1/(1 + e^(-z)) = 1/(1 + e^(-Θ^TX)), where z = Θ^TX
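The two formulas above can be sketched in a few lines of Python. This is only a minimal illustration, not a full implementation: the function names `sigmoid` and `hypothesis` are made up for this example, and the feature list `x` is assumed to start with a 1 so that it lines up with Θ₀.

```python
import math

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def hypothesis(theta, x):
    """h_theta(x) = g(theta^T x).

    Assumption for this sketch: x already contains a leading 1
    for the intercept term theta_0.
    """
    z = sum(t * xi for t, xi in zip(theta, x))
    return sigmoid(z)

# Estimated probability that y = 1 for one input (illustrative values only)
p = hypothesis([-3.0, 1.0, 1.0], [1.0, 1.0, 2.0])
print(p)
```

Here z = -3 + 1 + 2 = 0, so the estimated probability sits exactly at the 0.5 threshold.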

We predict y = 1 when

h_Θ(x) ≥ 0.5

and y = 0 when

h_Θ(x) < 0.5

Now, let’s see when each of these conditions holds.

1. y=1 when,

g(z) ≥ 0.5

When z ≥ 0

h_Θ(x) = g(Θ^TX) ≥ 0.5

y = 1, when Θ^TX ≥ 0.

2. y = 0 when,

g(z) < 0.5

When z < 0

h_Θ(x) = g(Θ^TX) < 0.5

y = 0, when Θ^TX < 0.
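The two cases above say that thresholding the probability at 0.5 is the same as checking the sign of z. A small sketch can confirm this on a handful of values; the helper names and the sample values of z are assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_from_probability(z):
    """Predict 1 when g(z) >= 0.5, otherwise 0."""
    return 1 if sigmoid(z) >= 0.5 else 0

def predict_from_sign(z):
    """Predict 1 when z >= 0, otherwise 0 (no exponential needed)."""
    return 1 if z >= 0 else 0

# Sample values on both sides of the boundary z = 0 (illustrative only)
zs = [-5.0, -1.0, -0.001, 0.0, 0.001, 1.0, 5.0]
agree = all(predict_from_probability(z) == predict_from_sign(z) for z in zs)
print(agree)
```

In practice this means a prediction only needs the sign of Θ^TX; the sigmoid is needed only when the probability itself is of interest.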

Now, let’s discuss decision boundaries with the help of some examples.

• Example 1:

h_Θ(x) = g(Θ₀+ Θ₁x₁+ Θ₂x₂)

Let Θ₀= -3, Θ₁= 1, Θ₂= 1

Θ = [-3;1;1]

Θ is a 3×1 column vector.

Predict y = 1, if

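-3 + x₁ + x₂ ≥ 0 (that is, z ≥ 0, which is equivalent to g(z) ≥ 0.5), or

x₁ + x₂ ≥ 3.

And for y = 0,

x₁ + x₂ < 3.

The decision boundary is therefore the straight line x₁ + x₂ = 3.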
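Example 1 can be tried out directly. The sketch below uses the Θ values from the example; the function name `predict` and the test points are assumptions added for illustration.

```python
# Parameters from Example 1: theta_0 = -3, theta_1 = 1, theta_2 = 1
theta = [-3.0, 1.0, 1.0]

def predict(theta, x1, x2):
    """Predict y = 1 when theta_0 + theta_1*x1 + theta_2*x2 >= 0, else 0."""
    z = theta[0] + theta[1] * x1 + theta[2] * x2
    return 1 if z >= 0 else 0

# Illustrative points: below the line, on the line, above the line, origin
points = [(1.0, 1.0), (1.5, 1.5), (3.0, 3.0), (0.0, 0.0)]
labels = [predict(theta, x1, x2) for x1, x2 in points]
print(labels)  # [0, 1, 1, 0]
```

The point (1.5, 1.5) lies exactly on the line x₁ + x₂ = 3, so it is assigned y = 1 because the rule uses ≥.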

NON-LINEAR DECISION BOUNDARIES

Sometimes the data points are arranged in such a way that the curve separating them has a more complex shape than a straight line.

• Example:

Hypothesis: h_Θ(x) = g(Θ₀+ Θ₁x₁+ Θ₂x₂+ Θ₃x₁²+ Θ₄x₂²)

Let Θ₀ = -1, Θ₁ = 0, Θ₂ = 0, Θ₃ = 1, Θ₄ = 1 (we take these values as given for now; we’ll see how to find the parameters automatically in upcoming lessons).

Θ = [-1;0;0;1;1]

Θ is a 5×1 column vector.

To predict:

y = 1 if,

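-1 + x₁² + x₂² ≥ 0, that is, x₁² + x₂² ≥ 1. The decision boundary x₁² + x₂² = 1 is a circle of radius 1 centered at the origin; we predict y = 1 on and outside the circle and y = 0 inside it.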
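The circular boundary can be sketched the same way, by adding the squared features before taking the weighted sum. The Θ values are those of the example; the helper names `features` and `predict` and the test points are assumptions for illustration.

```python
# Parameters from the example: theta = [-1, 0, 0, 1, 1]
theta = [-1.0, 0.0, 0.0, 1.0, 1.0]

def features(x1, x2):
    """Map (x1, x2) to the feature list [1, x1, x2, x1^2, x2^2]."""
    return [1.0, x1, x2, x1 ** 2, x2 ** 2]

def predict(theta, x1, x2):
    """Predict y = 1 when theta^T features(x1, x2) >= 0, else 0."""
    z = sum(t * f for t, f in zip(theta, features(x1, x2)))
    return 1 if z >= 0 else 0

# Illustrative points: center, inside the circle, on the circle, outside
points = [(0.0, 0.0), (0.5, 0.5), (1.0, 0.0), (2.0, 2.0)]
labels = [predict(theta, x1, x2) for x1, x2 in points]
print(labels)  # [0, 0, 1, 1]
```

The model is still linear in Θ; the boundary becomes a circle only because of the extra features x₁² and x₂².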

NOTE: The decision boundary depends on the parameters, i.e., the Θ values.

The shape of the decision boundary also depends on the form of the hypothesis: with only linear terms the boundary is a straight line, while adding higher-order terms (and their parameters) lets it take more complex shapes.

Points to remember:

• g(z) ≥ 0.5 ⇔ z ≥ 0

• z = 0: e⁰ = 1, so g(z) = 1/2

• z → ∞: e^(-z) → 0, so g(z) → 1

• z → -∞: e^(-z) → ∞, so g(z) → 0
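These limits are easy to check numerically. The short sketch below prints g(z) for a few values of z; the chosen values are arbitrary and only meant to show the trend.

```python
import math

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

# g(z) approaches 0 for large negative z, equals 0.5 at z = 0,
# and approaches 1 for large positive z.
for z in [-50.0, -5.0, 0.0, 5.0, 50.0]:
    print(f"z = {z:6.1f}  g(z) = {sigmoid(z):.6f}")
```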

That’s all for day 7. Today we learned about the decision boundaries in classification problems, especially in logistic regression.

On day 8, we will learn about the cost function of logistic regression, which will help us find the parameters (the Θ values) that give the best fit automatically. We will also learn about multi-class classification with logistic regression.

If you think this article helped you learn something new or could help someone else, please share it with your peers.

Till then Happy Learning!!!


Sunday, May 6, 2018
