Machine Learning: Classification and Representation

1. Logistic Regression (regression used for classification)


Linear regression is not well suited to some classification problems, such as deciding whether an email is spam or not, or judging a tumor's condition from its size.

So there is another algorithm, logistic regression, which takes several features xi and produces an output y that has only two possible values: zero or one.

Hypothesis Representation


In linear regression, the hypothesis is θ'x, which can be larger than 1 or smaller than 0. Logistic regression therefore passes θ'x through the sigmoid function g(z) = 1/(1 + e^(-z)), so that the hypothesis h(x) = g(θ'x) always lies between 0 and 1.
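The hypothesis described above can be sketched in a few lines of Python. This is only an illustration: the function names `sigmoid` and `hypothesis` are made up for this note, and x is assumed to already contain the bias term x0 = 1.

```python
import math

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z)); the result lies between 0 and 1."""
    # Two branches so that math.exp never receives a large positive argument.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def hypothesis(theta, x):
    """h(x) = g(theta' x), where x is assumed to start with x0 = 1."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return sigmoid(z)
```

The output of `hypothesis` can be read as the estimated probability that y = 1 for the given x.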

Decision Boundary



The decision boundary is the line that separates the area where y = 0 and where y = 1. It is created by our hypothesis function.

The decision boundary can be linear or nonlinear, and sometimes it is a complicated curve.

The sigmoid g(z) is at least 0.5 exactly when z ≥ 0. So if we define:

h(x) ≥ 0.5 → y = 1;

h(x) < 0.5 → y = 0;

then z = 0 is the boundary: we predict y = 1 when z ≥ 0 and y = 0 when z < 0.

So, if z = θ'x, then θ'x = 0 is the boundary that divides the plane into two regions, y = 0 and y = 1. For example, with θ'x = θ0*x0 + θ1*x1 + θ2*x2 the boundary is a straight line (a linear boundary).
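As a small sketch of the rule above, the prediction only needs the sign of θ'x. The parameter values below are hypothetical, chosen so that the boundary is the line x1 + x2 = 3, and points exactly on the boundary are assigned to y = 1 as an assumption.

```python
def predict(theta, x):
    """Return 1 when theta' x >= 0 (the same as h(x) >= 0.5), otherwise 0."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return 1 if z >= 0 else 0

# Hypothetical parameters: theta' x = -3 + x1 + x2, so the boundary is x1 + x2 = 3.
theta = [-3.0, 1.0, 1.0]

print(predict(theta, [1.0, 1.0, 1.0]))  # x1 + x2 = 2, below the line
print(predict(theta, [1.0, 2.5, 2.5]))  # x1 + x2 = 5, above the line
```

The first point falls on the y = 0 side of the line and the second on the y = 1 side.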

Cost Function


We cannot use the same cost function that we use for linear regression, because the logistic function makes the resulting cost surface wavy, with many local optima. In other words, it will not be a convex function.

So we define the cost function of logistic regression as follows:

cost(h(x),y) = -log(h(x))      if y = 1

cost(h(x),y) = -log(1-h(x))    if y = 0

Because y is always 0 or 1, we can combine the two cases into a single expression:

cost(h(x),y) = -y*log(h(x)) - (1-y)*log(1-h(x))
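The combined cost expression translates directly into code. The sketch below assumes h is strictly between 0 and 1 (otherwise the logarithm is undefined); `average_cost` is a hypothetical helper that averages the per-example cost over a list of predictions.

```python
import math

def cost(h, y):
    """Per-example cost for a prediction h in (0, 1) and a label y in {0, 1}."""
    return -y * math.log(h) - (1 - y) * math.log(1 - h)

def average_cost(predictions, labels):
    """Mean of the per-example costs over the whole training set."""
    total = sum(cost(h, y) for h, y in zip(predictions, labels))
    return total / len(labels)

# A confident correct prediction is cheap; a confident wrong one is expensive.
print(cost(0.9, 1), cost(0.1, 1))
```

When y = 1 only the first term is active, and when y = 0 only the second, which is why the single expression reproduces both cases.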

Gradient Descent


The update rule has the same form as in gradient descent for linear regression; the only difference is that h(x) is now the sigmoid of θ'x.

A vectorized implementation is:

θ := θ - (α/m) * X' * (g(Xθ) - y)
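The update can be sketched as follows. To keep the example free of dependencies it uses plain Python lists rather than a matrix library, so the matrix product is written out as sums. The training data is made up, and the learning rate and iteration count are arbitrary choices, not recommended values.

```python
import math

def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def gradient_step(theta, X, y, alpha):
    """One simultaneous update of every theta_j."""
    m = len(X)
    # errors[i] = h(x_i) - y_i, i.e. the vector g(X theta) - y
    errors = [sigmoid(sum(t * xi for t, xi in zip(theta, row))) - yi
              for row, yi in zip(X, y)]
    # theta_j := theta_j - (alpha / m) * sum_i errors[i] * x_ij
    return [t - alpha / m * sum(e * row[j] for e, row in zip(errors, X))
            for j, t in enumerate(theta)]

# Made-up data: bias x0 = 1 plus one feature; the label is 1 for large values.
X = [[1.0, 0.5], [1.0, 1.0], [1.0, 3.5], [1.0, 4.0]]
y = [0, 0, 1, 1]

theta = [0.0, 0.0]
for _ in range(5000):
    theta = gradient_step(theta, X, y, alpha=0.1)

predictions = [1 if sigmoid(sum(t * xi for t, xi in zip(theta, row))) >= 0.5 else 0
               for row in X]
print(theta, predictions)
```

All errors are computed before any parameter changes, which is what makes the update simultaneous.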

Advanced Optimization


"Conjugate gradient", "BFGS", and "L-BFGS" are more sophisticated, faster ways to optimize θ that can be used instead of gradient descent. We suggest that you should not write these more sophisticated algorithms yourself (unless you are an expert in numerical computing) but use the libraries instead, as they're already tested and highly optimized. Octave provides them.

2. Multi-class Classification: One-vs-all


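If we have more than two categories, then instead of y = {0,1} we expand our definition so that y = {0,1,...,n}. We divide the problem into n+1 binary classification problems (+1 because the index starts at 0). In each one, a single class is treated as positive and all the other classes together as negative.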

[Figure: one-vs-all classification]

To summarize:

Train a logistic regression classifier hθ(x) for each class i to predict the probability that y = i.

To make a prediction on a new x, pick the class i whose classifier gives the largest hθ(x).
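The prediction step can be sketched like this, assuming the classifiers have already been trained. The three parameter vectors are hypothetical values picked by hand for illustration, not the result of training.

```python
import math

def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_one_vs_all(thetas, x):
    """thetas[i] holds the parameters of the classifier for class i."""
    scores = [sigmoid(sum(t * xi for t, xi in zip(theta, x))) for theta in thetas]
    # Pick the class whose classifier reports the highest probability.
    return max(range(len(scores)), key=lambda i: scores[i])

# Hypothetical classifiers for three classes, with x = [1, x1, x2].
thetas = [
    [1.0, -1.0, -1.0],   # class 0: both features small
    [-1.0, 1.0, -1.0],   # class 1: x1 large
    [-1.0, -1.0, 1.0],   # class 2: x2 large
]

print(predict_one_vs_all(thetas, [1.0, 3.0, 0.0]))
print(predict_one_vs_all(thetas, [1.0, 0.0, 3.0]))
```

Because the sigmoid is increasing, comparing the probabilities gives the same winner as comparing the raw values of θ'x.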

3. Problem: Over-fitting


The hypothesis function may predict the examples in the training set very well but fail to predict unseen data well.

[Figure: three fits with different numbers of features]

As shown in the picture above, the first curve uses too few features, so it does not fit the data well; this is called "under-fitting" or "high bias". The second curve fits about right. The last curve passes through every example in the training set, but it is an unreasonably complicated shape that may not predict unseen data well; this is called "over-fitting" or "high variance".

What are the causes of over-fitting?

1) Too many features

2) An overly complicated hypothesis function

How can we solve it?

1) Reduce the number of features

2) Regularization

- Keep all the features, but reduce the magnitude of the parameters θj.

- Regularization works well when we have a lot of slightly useful features.

Cost Function


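To shrink the parameters, we add a penalty on their size to the cost function. The general regularized form is:

J(θ) = (1/(2m)) * [ Σ_{i=1..m} (h(x(i)) - y(i))^2 + λ * Σ_{j=1..n} θj^2 ]

Here λ is the regularization parameter, which controls how strongly large parameters are penalized. By convention θ0 is not penalized.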
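A minimal sketch of a regularized cost for linear regression follows. It assumes the usual form, a squared-error term plus λ times the sum of θj² for j ≥ 1 (θ0 is left out of the penalty), all divided by 2m. The function name `regularized_cost` and the tiny data set are made up.

```python
def regularized_cost(theta, X, y, lam):
    """Squared-error cost plus an L2 penalty on theta_1..theta_n (not theta_0)."""
    m = len(X)
    squared_error = sum((sum(t * xi for t, xi in zip(theta, row)) - yi) ** 2
                        for row, yi in zip(X, y))
    penalty = lam * sum(t ** 2 for t in theta[1:])
    return (squared_error + penalty) / (2 * m)

# Made-up data that theta = [0, 1] fits exactly, so only the penalty remains.
X = [[1.0, 1.0], [1.0, 2.0]]
y = [1.0, 2.0]
print(regularized_cost([0.0, 1.0], X, y, lam=0.0))
print(regularized_cost([0.0, 1.0], X, y, lam=4.0))
```

With λ = 0 the penalty disappears and the plain squared-error cost is recovered. A very large λ pushes every θj toward zero, which leads to under-fitting.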

Regularized Linear Regression


Regularization changes the form of both gradient descent and the normal equation.

Gradient Descent

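θ0 := θ0 - α * (1/m) * Σ_{i=1..m} (h(x(i)) - y(i)) * x0(i)

θj := θj - α * [ (1/m) * Σ_{i=1..m} (h(x(i)) - y(i)) * xj(i) + (λ/m) * θj ]    for j = 1,...,n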
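One way to read the regularized update is that each θj (for j ≥ 1) is first shrunk by the factor (1 - αλ/m) and then moved by the usual gradient term. The sketch below writes the step in that form for linear regression, assuming θ0 is not regularized; the function name and the data are made up.

```python
def regularized_linear_step(theta, X, y, alpha, lam):
    """One gradient descent step for regularized linear regression."""
    m = len(X)
    errors = [sum(t * xi for t, xi in zip(theta, row)) - yi
              for row, yi in zip(X, y)]
    new_theta = []
    for j, t in enumerate(theta):
        gradient = sum(e * row[j] for e, row in zip(errors, X)) / m
        # theta_0 keeps shrink = 1; every other parameter is scaled down first.
        shrink = 1.0 if j == 0 else 1.0 - alpha * lam / m
        new_theta.append(t * shrink - alpha * gradient)
    return new_theta

# Made-up data that theta = [0, 1] fits exactly: only the shrinkage acts.
X = [[1.0, 1.0], [1.0, 2.0]]
y = [1.0, 2.0]
print(regularized_linear_step([0.0, 1.0], X, y, alpha=0.1, lam=2.0))
```

As long as αλ/m is a small positive number, the factor is slightly below 1, so every step pulls θj a little toward zero in addition to following the data.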

Normal Equation

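θ = (X'X + λ·L)^(-1) * X'y

where L is the (n+1)×(n+1) identity matrix with its top-left entry set to 0, so that θ0 is not regularized.

Recall that if m < n, then X'X is non-invertible. However, when we add the term λ·L with λ > 0, X'X + λ·L becomes invertible.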

Regularized Logistic Regression


We can regularize logistic regression in much the same way as linear regression.

J(θ) = -(1/m) * Σ_{i=1..m} [ y(i)*log(h(x(i))) + (1-y(i))*log(1-h(x(i))) ] + (λ/(2m)) * Σ_{j=1..n} θj^2

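So the gradient descent update changes as follows:

θ0 := θ0 - α * (1/m) * Σ_{i=1..m} (h(x(i)) - y(i)) * x0(i)

θj := θj - α * [ (1/m) * Σ_{i=1..m} (h(x(i)) - y(i)) * xj(i) + (λ/m) * θj ]    for j = 1,...,n

The update has the same form as for regularized linear regression, but here h(x) is the sigmoid of θ'x.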
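To see the effect of the penalty, the sketch below trains the same logistic model twice on made-up, linearly separable data, once with λ = 0 and once with λ = 1. The helper names, learning rate, and iteration count are assumptions for illustration only.

```python
import math

def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def regularized_logistic_step(theta, X, y, alpha, lam):
    """One gradient descent step; theta_0 is not regularized."""
    m = len(X)
    errors = [sigmoid(sum(t * xi for t, xi in zip(theta, row))) - yi
              for row, yi in zip(X, y)]
    new_theta = []
    for j, t in enumerate(theta):
        gradient = sum(e * row[j] for e, row in zip(errors, X)) / m
        if j > 0:
            gradient += lam / m * t  # extra term from the penalty
        new_theta.append(t - alpha * gradient)
    return new_theta

def train(X, y, lam, alpha=0.1, iterations=5000):
    theta = [0.0] * len(X[0])
    for _ in range(iterations):
        theta = regularized_logistic_step(theta, X, y, alpha, lam)
    return theta

# Made-up separable data: bias x0 = 1 plus one feature.
X = [[1.0, 0.5], [1.0, 1.0], [1.0, 3.5], [1.0, 4.0]]
y = [0, 0, 1, 1]

theta_plain = train(X, y, lam=0.0)
theta_reg = train(X, y, lam=1.0)
print(theta_plain, theta_reg)
```

On separable data the unregularized weight keeps growing as training continues, while the regularized weight settles at a smaller value.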