[Machine Learning] Week 5.1: Cost Function (Neural Networks)

Cost Function

Let's first define a few variables that we will need to use:

• L = total number of layers in the network

• s_l = number of units (not counting the bias unit) in layer l

• K = number of output units/classes

Recall that in neural networks, we may have many output nodes. We denote h_\Theta(x)_k as a hypothesis that results in the k^{th} output. Our cost function for neural networks is going to be a generalization of the one we used for logistic regression. Recall that the cost function for regularized logistic regression was:
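J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[ y^{(i)}\log h_\theta(x^{(i)}) + (1-y^{(i)})\log\left(1-h_\theta(x^{(i)})\right)\right] + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2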

For neural networks, it is going to be slightly more complicated:
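J(\Theta) = -\frac{1}{m}\sum_{i=1}^{m}\sum_{k=1}^{K}\left[ y_k^{(i)}\log\left((h_\Theta(x^{(i)}))_k\right) + (1-y_k^{(i)})\log\left(1-(h_\Theta(x^{(i)}))_k\right)\right] + \frac{\lambda}{2m}\sum_{l=1}^{L-1}\sum_{i=1}^{s_l}\sum_{j=1}^{s_{l+1}}\left(\Theta_{j,i}^{(l)}\right)^2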

We have added a few nested summations to account for our multiple output nodes. In the first part of the equation, before the square brackets, we have an additional nested summation that loops through the number of output nodes.

In the regularization part, after the square brackets, we must account for multiple theta matrices. The number of columns in our current theta matrix is equal to the number of nodes in our current layer (including the bias unit). The number of rows in our current theta matrix is equal to the number of nodes in the next layer (excluding the bias unit). As before with logistic regression, we square every term.
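To make both pieces concrete, here is a minimal NumPy sketch that evaluates J(\Theta), assuming sigmoid activations at every layer and one-hot labels Y; the function name nn_cost and the argument layout are illustrative, not taken from the course materials:

```python
import numpy as np

def nn_cost(thetas, X, Y, lam):
    """Regularized neural-network cost J(Theta).

    thetas : list of weight matrices; Theta^(l) has shape
             (s_{l+1}, s_l + 1) -- rows index next-layer units,
             columns index current-layer units plus the bias.
    X      : (m, n) training inputs.
    Y      : (m, K) one-hot training labels.
    lam    : regularization strength lambda.
    """
    m = X.shape[0]

    # Forward propagation: prepend the bias unit at each layer and
    # apply the sigmoid; the final activations are the K hypotheses.
    a = X
    for theta in thetas:
        a = np.hstack([np.ones((m, 1)), a])        # add bias column
        a = 1.0 / (1.0 + np.exp(-(a @ theta.T)))   # sigmoid activation
    h = a                                          # (m, K) outputs

    # Double sum: logistic regression cost accumulated over every
    # output unit k of every training example i.
    cost = -np.sum(Y * np.log(h) + (1 - Y) * np.log(1 - h)) / m

    # Triple sum: squares of every individual Theta entry, skipping
    # the bias column (the first column) of each matrix.
    reg = sum(np.sum(theta[:, 1:] ** 2) for theta in thetas)

    return cost + lam / (2 * m) * reg
```

For a 400-25-10 network, for instance, thetas would hold matrices of shapes (25, 401) and (10, 26), and only the 25·400 + 10·25 non-bias weights enter the regularization term.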

Note:

• the double sum simply adds up the logistic regression costs calculated for each cell in the output layer

• the triple sum simply adds up the squares of all the individual \Theta values in the entire network

• the i in the triple sum does not refer to training example i

Source: Coursera, Stanford University, Machine Learning by Andrew Ng

