Preface

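I have recently been taking an NLP course. Most of the code below comes from my assignments for that course, and much of it is modeled on the instructor's code. It is written in Python; if you are interested, you can find it on my GitHub: https://github.com/LiuPineapple/Learning-NLP/tree/master/Assignments/lesson-02
My knowledge is limited, so if you find mistakes in this article, corrections are welcome. If anything here infringes on your rights, please contact me and I will remove it.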

Machine Learning--Gradient Descent

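Different people define machine learning differently. My understanding is that it means using algorithms to let a machine learn from data, so that it arrives at a better model than one designed by hand, and then using that model for tasks such as classification or prediction.
Here we take the Boston house price prediction problem as a simple hands-on exercise in machine learning.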

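# Note: load_boston was deprecated in scikit-learn 1.0 and removed in 1.2,
# so this import only works with older versions of scikit-learn.
from sklearn.datasets import load_boston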
data = load_boston()
X, y = data['data'], data['target']
X[1]
array([2.7310e-02, 0.0000e+00, 7.0700e+00, 0.0000e+00, 4.6900e-01,
       6.4210e+00, 7.8900e+01, 4.9671e+00, 2.0000e+00, 2.4200e+02,
       1.7800e+01, 3.9690e+02, 9.1400e+00])
len(y)
506
len(X[:, 0])
506
X_rm = X[:, 5]

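Points to note in the code above:

y holds the price of each house, and X holds the features of each house, such as the number of rooms and the crime rate. As you can see, the data covers 506 houses in total.
For simplicity, we study only the relationship between the sixth feature of X (RM, the average number of rooms per dwelling) and the house price, so we pull out the values of that sixth feature for every house as X_rm.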
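The row and column indexing used above can be sketched on a tiny array. The matrix `X_toy` and its values are made up for illustration and are not taken from the Boston data.

```python
import numpy as np

# A tiny made-up feature matrix: 3 houses (rows) x 4 features (columns).
X_toy = np.array([
    [0.1, 18.0, 2.3, 6.5],
    [0.2,  0.0, 7.1, 6.4],
    [0.3,  0.0, 7.1, 7.2],
])

row = X_toy[1]      # every feature of the second house
col = X_toy[:, 3]   # the fourth feature across all houses

print(row.shape)
print(col.shape)
print(col)
```

`X[1]` selects one house, while `X[:, 5]` selects one feature for all houses, which is why `len(X[:, 0])` equals the number of houses.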

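We assume a linear relationship between the independent variable and the dependent variable, that is, $y = kx+b$ , where $k,b$ are unknown parameters. We define a price() function that computes y for a given input and given parameter values. Our task is to find suitable values of $k,b$ so that, for a given $x$ , the gap between the value predicted by this formula and the true value is as small as possible. If we can find good values of $k,b$ , we have a chance of getting fairly accurate predictions.
How, then, do we define the gap between the predicted values and the true values? We use the mean squared error, where $\hat{y}_i$ is the predicted value for the $i$-th house and $n$ is the number of houses:

$loss = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2$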

def price(rm, k, b):
    """f(x) = k * x + b"""
    return k * rm + b

def loss(y, y_hat): # to evaluate the performance 
    return sum((y_i - y_hat_i)**2 for y_i, y_hat_i in zip(list(y), list(y_hat))) / len(list(y))

# The loss function can also be defined more concisely with numpy
import numpy as np
def loss(y,y_hat):
    e = np.array(y)-np.array(y_hat)
    return (e@e.T)/len(y)

Points to note in the code above:

The Python 3 zip() function: https://www.runoob.com/python3/python3-func-zip.html
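As a quick sanity check of the model and the loss, here is a minimal sketch on made-up numbers. The arrays `rm_toy` and `y_toy` are hypothetical and chosen so that the result can be verified by hand.

```python
import numpy as np

def price(rm, k, b):
    """Linear model f(x) = k * x + b."""
    return k * rm + b

def loss(y, y_hat):
    """Mean squared error between true and predicted values."""
    e = np.array(y) - np.array(y_hat)
    return (e @ e) / len(y)

# Made-up data that lies exactly on the line y = 5 * x - 10.
rm_toy = np.array([5.0, 6.0, 7.0])
y_toy = np.array([15.0, 20.0, 25.0])

perfect = loss(y_toy, price(rm_toy, 5.0, -10.0))    # the true parameters
off_by_one = loss(y_toy, price(rm_toy, 5.0, -9.0))  # every prediction is 1 too high

print(perfect)
print(off_by_one)
```

With the true parameters every error is zero, so the loss is 0. Shifting b by 1 makes every error equal to -1, so the loss is (1 + 1 + 1) / 3 = 1.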

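Our task is to find values of $k,b$ that make the loss as small as possible. Following the machine learning approach, we first generate $k,b$ at random and then let the program adjust them automatically using the data, until a set number of iterations has been reached or the loss falls below some threshold.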

Gradient Descent

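Since $x,y$ are fixed values, the loss is really a function of the variables $k,b$ . With $\hat{y}_i = kx_i + b$ , the partial derivatives of the loss with respect to $k,b$ are given below, followed by the corresponding code:

$\frac{\partial loss}{\partial k} = -\frac{2}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)x_i$
$\frac{\partial loss}{\partial b} = -\frac{2}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)$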

def partial_k(x, y, y_hat):
    n = len(y)

    gradient = 0
    
    for x_i, y_i, y_hat_i in zip(list(x), list(y), list(y_hat)):
        gradient += (y_i - y_hat_i) * x_i
    
    return -2 / n * gradient


def partial_b(x, y, y_hat):
    n = len(y)

    gradient = 0
    
    for y_i, y_hat_i in zip(list(y), list(y_hat)):
        gradient += (y_i - y_hat_i)
    
    return -2 / n * gradient
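One way to gain confidence in these partial derivatives is to compare them with a finite-difference approximation. The sketch below assumes made-up data, a hypothetical helper `loss_kb`, and numpy versions of `partial_k` and `partial_b` that compute the same sums as the loops above.

```python
import numpy as np

def loss_kb(k, b, x, y):
    """Mean squared error written as a function of k and b (hypothetical helper)."""
    return np.mean((y - (k * x + b)) ** 2)

def partial_k(x, y, y_hat):
    return -2 / len(y) * np.sum((y - y_hat) * x)

def partial_b(x, y, y_hat):
    return -2 / len(y) * np.sum(y - y_hat)

# Made-up data and an arbitrary point (k, b) at which to check the gradient.
x = np.array([4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([10.0, 17.0, 21.0, 30.0, 33.0])
k, b = 2.0, 1.0
y_hat = k * x + b

# Central differences: nudge one parameter at a time and measure the change in loss.
eps = 1e-6
numeric_k = (loss_kb(k + eps, b, x, y) - loss_kb(k - eps, b, x, y)) / (2 * eps)
numeric_b = (loss_kb(k, b + eps, x, y) - loss_kb(k, b - eps, x, y)) / (2 * eps)

analytic_k = partial_k(x, y, y_hat)
analytic_b = partial_b(x, y, y_hat)

print(numeric_k, analytic_k)
print(numeric_b, analytic_b)
```

If the analytic and numeric values agree closely, the derivative formulas and their code are very likely consistent with the loss.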

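After generating $k,b$ at random, we compute the loss and its partial derivatives with respect to $k,b$ . Randomly chosen $k,b$ usually give a fairly large loss, so how should we change $k,b$ to keep reducing it? The partial derivatives tell us which direction to move in. We define a positive learning rate $\alpha$ and, after computing the partial derivatives, update $k,b$ as follows:
$k = k -\alpha\times \frac{\partial loss}{\partial k}$
$b = b -\alpha\times \frac{\partial loss}{\partial b}$
With the new $k,b$ we compute the loss again. If the new loss is smaller than the smallest loss seen so far, it becomes the new minimum, and the $k,b$ that produced it are better values than the previous ones. We then repeat this process until a set number of iterations has been reached or the loss falls below some threshold. Note that $k,b$ must be updated simultaneously: do not update $k$ first and then use the updated $k$ to compute the partial derivative with respect to $b$ . The code is as follows: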

import random

trying_times = 2000
min_loss = float('inf')
# random initial values in [-100, 100)
current_k = random.random() * 200 - 100
current_b = random.random() * 200 - 100
best_k, best_b = current_k, current_b
learning_rate = 1e-04

for i in range(trying_times):
    price_by_k_and_b = [price(r, current_k, current_b) for r in X_rm]
    current_loss = loss(y, price_by_k_and_b)

    if current_loss < min_loss: # performance became better
        min_loss = current_loss
        # remember the parameters that produced this loss
        best_k, best_b = current_k, current_b

        if i % 50 == 0:
            print('When time is : {}, get best_k: {} best_b: {}, and the loss is: {}'.format(i, best_k, best_b, min_loss))

    # both gradients come from the same predictions, so k and b are updated simultaneously
    k_gradient = partial_k(X_rm, y, price_by_k_and_b)
    b_gradient = partial_b(X_rm, y, price_by_k_and_b)

    current_k = current_k + (-1 * k_gradient) * learning_rate
    current_b = current_b + (-1 * b_gradient) * learning_rate

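Points to note in the code above:

In Python, positive and negative infinity can be written as float("inf") and float("-inf"). Adding a finite number to inf, or multiplying inf by a positive number, still gives inf. A finite number divided by inf gives 0.
The Python random() function: https://www.runoob.com/python/func-number-random.html . Be careful to distinguish random in the random module from random in the numpy module.
The Python format() string formatting function: https://www.runoob.com/python/att-string-format.html
1e-04 stands for $1\times10^{-4}$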
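Because load_boston is not available in recent scikit-learn releases, the same update rule can be tried on synthetic data. The following is a minimal vectorized sketch, not the author's code: the data, the "true" slope and intercept, and the starting point (0, 0) are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(4.0, 9.0, size=200)             # made-up "rooms" values
y = 9.0 * x - 34.0 + rng.normal(0.0, 2.0, 200)  # made-up linear relation plus noise

k, b = 0.0, 0.0          # fixed starting point, chosen for reproducibility
learning_rate = 1e-4
losses = []

for step in range(2000):
    y_hat = k * x + b
    residual = y - y_hat
    losses.append(np.mean(residual ** 2))

    # Both gradients use the same residual, so the update is simultaneous.
    k_gradient = -2.0 * np.mean(residual * x)
    b_gradient = -2.0 * np.mean(residual)
    k, b = k - learning_rate * k_gradient, b - learning_rate * b_gradient

print(losses[0], losses[-1])
```

The tuple assignment on the last line of the loop evaluates both right-hand sides before assigning, which is one simple way to guarantee that $k$ and $b$ are updated together.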

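The training loop on the Boston data printed the output below. In this log the printed best_k and best_b never change, which indicates that they were not updated during the run that produced it; only the loss column reflects the progress of training.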

When time is : 0, get best_k: 11.431551629413757 best_b: -49.52403584539048, and the loss is: 575.5349822522099
When time is : 50, get best_k: 11.431551629413757 best_b: -49.52403584539048, and the loss is: 277.9378161169662
When time is : 100, get best_k: 11.431551629413757 best_b: -49.52403584539048, and the loss is: 147.24895628021088
When time is : 150, get best_k: 11.431551629413757 best_b: -49.52403584539048, and the loss is: 89.8572545975801
When time is : 200, get best_k: 11.431551629413757 best_b: -49.52403584539048, and the loss is: 64.65372567052019
When time is : 250, get best_k: 11.431551629413757 best_b: -49.52403584539048, and the loss is: 53.58551239815359
When time is : 300, get best_k: 11.431551629413757 best_b: -49.52403584539048, and the loss is: 48.72477014152337
When time is : 350, get best_k: 11.431551629413757 best_b: -49.52403584539048, and the loss is: 46.59001559478237
When time is : 400, get best_k: 11.431551629413757 best_b: -49.52403584539048, and the loss is: 45.65236839246802
When time is : 450, get best_k: 11.431551629413757 best_b: -49.52403584539048, and the loss is: 45.24042644341104
When time is : 500, get best_k: 11.431551629413757 best_b: -49.52403584539048, and the loss is: 45.059346031766644
When time is : 550, get best_k: 11.431551629413757 best_b: -49.52403584539048, and the loss is: 44.97964764306714
When time is : 600, get best_k: 11.431551629413757 best_b: -49.52403584539048, and the loss is: 44.94447083305862
When time is : 650, get best_k: 11.431551629413757 best_b: -49.52403584539048, and the loss is: 44.928845550418174
When time is : 700, get best_k: 11.431551629413757 best_b: -49.52403584539048, and the loss is: 44.921806290539294
When time is : 750, get best_k: 11.431551629413757 best_b: -49.52403584539048, and the loss is: 44.918537593098634
When time is : 800, get best_k: 11.431551629413757 best_b: -49.52403584539048, and the loss is: 44.91692476670531
When time is : 850, get best_k: 11.431551629413757 best_b: -49.52403584539048, and the loss is: 44.91603915253814
When time is : 900, get best_k: 11.431551629413757 best_b: -49.52403584539048, and the loss is: 44.91547293354079
When time is : 950, get best_k: 11.431551629413757 best_b: -49.52403584539048, and the loss is: 44.91504701836891
When time is : 1000, get best_k: 11.431551629413757 best_b: -49.52403584539048, and the loss is: 44.914682759718445
When time is : 1050, get best_k: 11.431551629413757 best_b: -49.52403584539048, and the loss is: 44.91434561990997
When time is : 1100, get best_k: 11.431551629413757 best_b: -49.52403584539048, and the loss is: 44.9140204318406
When time is : 1150, get best_k: 11.431551629413757 best_b: -49.52403584539048, and the loss is: 44.91370053492356
When time is : 1200, get best_k: 11.431551629413757 best_b: -49.52403584539048, and the loss is: 44.91338300417686
When time is : 1250, get best_k: 11.431551629413757 best_b: -49.52403584539048, and the loss is: 44.91306655509527
When time is : 1300, get best_k: 11.431551629413757 best_b: -49.52403584539048, and the loss is: 44.912750623583214
When time is : 1350, get best_k: 11.431551629413757 best_b: -49.52403584539048, and the loss is: 44.912434961909526
When time is : 1400, get best_k: 11.431551629413757 best_b: -49.52403584539048, and the loss is: 44.91211946127419
When time is : 1450, get best_k: 11.431551629413757 best_b: -49.52403584539048, and the loss is: 44.91180407388745
When time is : 1500, get best_k: 11.431551629413757 best_b: -49.52403584539048, and the loss is: 44.9114887787528
When time is : 1550, get best_k: 11.431551629413757 best_b: -49.52403584539048, and the loss is: 44.9111735666393
When time is : 1600, get best_k: 11.431551629413757 best_b: -49.52403584539048, and the loss is: 44.91085843348287
When time is : 1650, get best_k: 11.431551629413757 best_b: -49.52403584539048, and the loss is: 44.91054337748873
When time is : 1700, get best_k: 11.431551629413757 best_b: -49.52403584539048, and the loss is: 44.910228397858496
When time is : 1750, get best_k: 11.431551629413757 best_b: -49.52403584539048, and the loss is: 44.9099134942312
When time is : 1800, get best_k: 11.431551629413757 best_b: -49.52403584539048, and the loss is: 44.909598666438264
When time is : 1850, get best_k: 11.431551629413757 best_b: -49.52403584539048, and the loss is: 44.90928391439542
When time is : 1900, get best_k: 11.431551629413757 best_b: -49.52403584539048, and the loss is: 44.90896923805536
When time is : 1950, get best_k: 11.431551629413757 best_b: -49.52403584539048, and the loss is: 44.908654637387244

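That completes a simple machine learning model trained with gradient descent. Of course, many questions remain, such as how to choose the initial values and the learning rate; these are topics for later discussion.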
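To give a feel for why the learning rate matters, here is a small sketch that runs the same update rule with three learning rates. Everything in it is an assumption made for illustration: the noise-free data, the hypothetical helper `final_loss`, the starting point (0, 0), and the particular rates compared.

```python
import numpy as np

# Made-up, noise-free data on the line y = 9 * x - 34.
x = np.linspace(4.0, 9.0, 50)
y = 9.0 * x - 34.0

def final_loss(learning_rate, steps=50):
    """Run gradient descent from (0, 0) and return the loss after `steps` updates."""
    k, b = 0.0, 0.0
    for _ in range(steps):
        residual = y - (k * x + b)
        k_gradient = -2.0 * np.mean(residual * x)
        b_gradient = -2.0 * np.mean(residual)
        k, b = k - learning_rate * k_gradient, b - learning_rate * b_gradient
    return np.mean((y - (k * x + b)) ** 2)

small = final_loss(1e-4)   # small steps: stable but slow
medium = final_loss(1e-2)  # larger steps: faster progress on this data
large = final_loss(1e-1)   # steps too large for this data: the loss blows up

print(small, medium, large)
```

On this data the largest rate overshoots on every step and the loss grows instead of shrinking, while the smallest rate is stable but makes slow progress. Which rates are stable depends on the scale of the data, so these particular numbers should not be read as general recommendations.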

Finally, you are welcome to visit my GitHub for more code: https://github.com/LiuPineapple
You are also welcome to visit my Jianshu homepage for more articles: http://www.itdecent.cn/u/31e8349bd083

NLP Notes (3) -- Machine-Learning-Based Models: A Simple Gradient Descent Exercise
