Sunday, May 6, 2018

MACHINE LEARNING | DECISION BOUNDARY

MACHINE LEARNING - DAY 7

DECISION BOUNDARY FOR LOGISTIC REGRESSION


For the basics, you can check the earlier articles.

Terms used in this article can be understood from:


DECISION BOUNDARY

The decision boundary is the curve that divides the input space into two regions: one where the model predicts y = 1 and one where it predicts y = 0.

hΘ(x) = g(Θ^T x) = P(y = 1 | x; Θ)


g(z) = 1/(1 + e^(-z)), so g(Θ^T x) = 1/(1 + e^(-Θ^T x))
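To make the two formulas above concrete, here is a minimal Python sketch. The function names `sigmoid` and `hypothesis`, and the sample values of Θ and x, are my own choices for illustration and are not part of the original lesson.

```python
import math

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def hypothesis(theta, x):
    """h_theta(x) = g(theta^T x).

    x must already contain the leading 1 that multiplies theta_0.
    """
    z = sum(t * xi for t, xi in zip(theta, x))
    return sigmoid(z)

theta = [-3.0, 1.0, 1.0]   # illustrative parameter values (assumed)
x = [1.0, 2.0, 2.0]        # x0 = 1 (intercept term), x1 = 2, x2 = 2
p = hypothesis(theta, x)   # estimated P(y = 1 | x; theta)
print(round(p, 4))
```

Here z = -3 + 2 + 2 = 1, so the hypothesis returns g(1), the model's estimated probability that y = 1 for this input.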


Suppose we predict y = 1 when

hΘ(x) ≥ 0.5

and y = 0 when

hΘ(x) < 0.5

Now, let’s see when each of these predictions occurs.

1. y = 1 when,

g(z) ≥ 0.5

which holds when z ≥ 0, so

hΘ(x) = g(Θ^T x) ≥ 0.5

y = 1, when Θ^T x ≥ 0.

2. y = 0 when,

g(z) < 0.5

When z < 0

hΘ(x) = g(Θ^T x) < 0.5

y = 0, when Θ^T x < 0.
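The two cases above say that thresholding the probability at 0.5 is the same as checking the sign of z. A small sketch that checks this on a few sample values of z (the helper names and the sample values are assumptions for illustration only):

```python
import math

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_from_probability(z):
    """Predict 1 when g(z) >= 0.5, otherwise 0."""
    return 1 if sigmoid(z) >= 0.5 else 0

def predict_from_sign(z):
    """Predict 1 when z >= 0, otherwise 0 (no exponential needed)."""
    return 1 if z >= 0 else 0

# Sample values of z = theta^T x on both sides of the boundary.
zs = [-5.0, -0.1, 0.0, 0.1, 5.0]
agree = all(predict_from_probability(z) == predict_from_sign(z) for z in zs)
print(agree)
```

In practice this means the predicted class can be read off from Θ^T x alone, without evaluating the sigmoid.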

Now, let’s look at decision boundaries with the help of some examples.

Example 1:

hΘ(x) = g(Θ0 + Θ1x1 + Θ2x2)



Let Θ0 = -3, Θ1 = 1, Θ2 = 1

Θ = [-3;1;1]

Θ is a 3×1 column vector.

Predict y = 1, if

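-3 + x1 + x2 ≥ 0, that is, z ≥ 0 and therefore g(z) ≥ 0.5. Rearranging:

x1 + x2 ≥ 3.

And for y = 0,

x1 + x2 < 3

The decision boundary is therefore the straight line x1 + x2 = 3.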
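Example 1 can be tried out with a few points. This is only a sketch: the `predict` helper and the test points are made up for illustration, while the Θ values are the ones given in the example.

```python
# Parameters from Example 1: theta0 = -3, theta1 = 1, theta2 = 1
THETA = [-3.0, 1.0, 1.0]

def predict(x1, x2, theta=THETA):
    """Return 1 if theta0 + theta1*x1 + theta2*x2 >= 0, else 0."""
    z = theta[0] + theta[1] * x1 + theta[2] * x2
    return 1 if z >= 0 else 0

# Hypothetical points: below the line, on the line, above it, below it.
points = [(1, 1), (2, 1), (3, 3), (0, 2.5)]
labels = [predict(x1, x2) for x1, x2 in points]
print(labels)
```

The point (2, 1) lies exactly on the line x1 + x2 = 3, so with the ≥ convention used here it is classified as y = 1.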


NON - LINEAR DECISION BOUNDARIES

Sometimes the data points are arranged in such a way that the curve separating them has a more complex shape than a straight line.

Example 2:

Hypothesis: hΘ(x) = g(Θ0 + Θ1x1 + Θ2x2 + Θ3x1^2 + Θ4x2^2)




Let Θ0 = -1, Θ1 = 0, Θ2 = 0, Θ3 = 1, Θ4 = 1 (assume these values for now; we’ll see how to find the parameters automatically in upcoming lessons).

Θ = [-1;0;0;1;1]

Θ is a 5×1 column vector.

To predict:

y = 1 if,

-1 + x1^2 + x2^2 ≥ 0, that is, x1^2 + x2^2 ≥ 1. The decision boundary x1^2 + x2^2 = 1 is the equation of a circle of radius 1 centered at the origin.
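The circular boundary can be checked the same way. In this sketch the `features` and `predict` helpers and the three test points are assumptions for illustration; the Θ values come from the example.

```python
# Parameters from the example: theta = [-1, 0, 0, 1, 1]
THETA = [-1.0, 0.0, 0.0, 1.0, 1.0]

def features(x1, x2):
    """Map (x1, x2) to [1, x1, x2, x1^2, x2^2]."""
    return [1.0, x1, x2, x1 ** 2, x2 ** 2]

def predict(x1, x2, theta=THETA):
    """Return 1 if theta^T features(x1, x2) >= 0, else 0."""
    z = sum(t * f for t, f in zip(theta, features(x1, x2)))
    return 1 if z >= 0 else 0

inside = predict(0.5, 0.5)     # 0.25 + 0.25 = 0.5, less than 1
on_circle = predict(1.0, 0.0)  # 1 + 0 = 1, exactly on the boundary
outside = predict(1.0, 1.0)    # 1 + 1 = 2, greater than 1
print(inside, on_circle, outside)
```

Points inside the unit circle get y = 0, and points on or outside it get y = 1. Note that the boundary is still linear in the features x1^2 and x2^2; it only looks non-linear when drawn in the (x1, x2) plane.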

NOTE: The decision boundary depends on the parameters, i.e., the Θ values.

The shape of the decision boundary also depends on the form of the hypothesis. Adding higher-order terms (and their parameters) allows more complex boundaries, while fewer terms give simpler ones such as a straight line.

Points to remember:

- g(z) ≥ 0.5 ⇔ z ≥ 0

- z = 0: e^0 = 1, so g(z) = 1/2

- z → ∞: e^(-z) → 0, so g(z) → 1

- z → -∞: e^(-z) → ∞, so g(z) → 0
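These limiting values are easy to see numerically. A tiny sketch, where the choice of z = -20, 0, and 20 is arbitrary and only meant to stand in for "very negative", "zero", and "very positive":

```python
import math

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

# Evaluate g(z) far to the left, at zero, and far to the right.
values = {z: sigmoid(z) for z in (-20, 0, 20)}
for z, g in values.items():
    print(z, g)
```

g(z) is exactly 1/2 at z = 0 and gets very close to 0 and 1 at the two extremes, without ever reaching them.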


That’s all for day 7. Today we learned about the decision boundaries in classification problems, especially in logistic regression.

On day 8, we will learn about the cost function of logistic regression, which will help us find the parameters (the Θ values) automatically for the best fit. We will also cover multi-class classification with logistic regression.

If you think this article helped you learn something new, or could help someone else, please share it with your peers.

Till then Happy Learning!!!




