The Perceptron Algorithm
This is the second installment of my notes on 《統(tǒng)計(jì)學(xué)習(xí)方法》 (Statistical Learning Methods), covering Chapter 2 of the book. It quotes the original text extensively and mixes in my own understanding. The algorithms from the book are implemented in Python and animated with Matplotlib, which I'd say is fairly hardcore. A post this dense is a lot of work; I hope I can keep it up.
Concepts
The perceptron is a binary classification model: it takes an instance's feature vector as input and outputs the instance's class, +1 or -1.
The Perceptron Model
Definition
Suppose the input space is $\mathcal{X} \subseteq \mathbb{R}^n$ and the output space is $\mathcal{Y} = \{+1, -1\}$.

x and y belong to these two spaces respectively. Then the following function from the input space to the output space:

$$f(x) = \mathrm{sign}(w \cdot x + b)$$

is called a perceptron. Here $w \in \mathbb{R}^n$ is called the weight vector and $b \in \mathbb{R}$ is called the bias; w·x denotes the inner product of the vectors w and x. sign is the function:

$$\mathrm{sign}(x) = \begin{cases} +1, & x \geq 0 \\ -1, & x < 0 \end{cases}$$
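To make the definition concrete, here is a minimal sketch of the model as code (the helper names sign and predict are my own, not from the book):

def sign(z):
    return 1 if z >= 0 else -1


def predict(w, b, x):
    # f(x) = sign(w . x + b)
    return sign(sum(wi * xi for wi, xi in zip(w, x)) + b)


print(predict([1, 1], -3, [3, 3]))  # +1, so (3, 3) falls on the positive side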
The geometric interpretation of the perceptron: the linear equation

$$w \cdot x + b = 0$$

divides the feature space into a positive and a negative part:

[figure omitted: the feature space divided by the hyperplane, Figure 2.1 in the book]

This plane (which degenerates to a line in 2 dimensions) is called the separating hyperplane.
Perceptron Learning Strategy
Linear Separability of a Dataset
Definition
Given a dataset

$$T = \{(x_1, y_1), (x_2, y_2), \dots, (x_N, y_N)\}$$

where $x_i \in \mathcal{X} = \mathbb{R}^n$, $y_i \in \mathcal{Y} = \{+1, -1\}$, $i = 1, 2, \dots, N$, if there exists some hyperplane S

$$w \cdot x + b = 0$$

that separates all the positive and negative instance points completely and correctly, then T is called linearly separable; otherwise T is linearly inseparable.
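The definition can be checked mechanically; a small sketch (the helper name separates is mine):

def separates(w, b, data):
    # True iff the hyperplane w.x + b = 0 classifies every (x, y) in data correctly
    return all(y * (sum(wi * xi for wi, xi in zip(w, x)) + b) > 0 for x, y in data)


data = [((3, 3), 1), ((4, 3), 1), ((1, 1), -1)]
print(separates([1, 1], -3, data))  # True, so this dataset is linearly separable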
Perceptron Learning Strategy
Assume the dataset is linearly separable. We want to find a reasonable loss function.
A naive idea is to use the total number of misclassified points, but such a loss function is not a continuous, differentiable function of the parameters w and b. Without differentiability we cannot track how the function changes, so it is hard to optimize (we would not know when to stop training, or would stop at a suboptimal moment).
Another idea is to use the total distance from all misclassified points to the hyperplane S. To that end, first define the distance from a point $x_0$ to the plane S:

$$\frac{1}{\|w\|} \left| w \cdot x_0 + b \right|$$

where $\|w\|$ is the L2 norm of w, that is, the square root of the sum of the squares of the vector's elements (its length). The formula is easy to understand if you recall the point-to-plane distance from school:

$$d = \frac{|Ax_0 + By_0 + Cz_0 + D|}{\sqrt{A^2 + B^2 + C^2}}$$

The distance from a point to the hyperplane S here is just that distance generalized to higher-dimensional space.
又因?yàn)?,如果點(diǎn)i被誤分類,一定有

成立,所以我們?nèi)サ袅私^對(duì)值符號(hào),得到誤分類點(diǎn)到超平面S的距離公式:

Let M denote the set of all misclassified points. The total distance from all misclassified points to the hyperplane S is then

$$-\frac{1}{\|w\|} \sum_{x_i \in M} y_i (w \cdot x_i + b)$$

The denominator contributes little (it is always positive anyway), so we drop it and obtain the loss function of perceptron learning:

$$L(w, b) = -\sum_{x_i \in M} y_i (w \cdot x_i + b)$$
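As a sanity check, this loss is only a few lines of Python (perceptron_loss is my own helper name, not code from the book):

def perceptron_loss(w, b, data):
    # L(w, b) = -sum over misclassified (x, y) of y * (w . x + b)
    loss = 0
    for x, y in data:
        margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
        if margin <= 0:  # x is misclassified, i.e. x belongs to M
            loss += -margin
    return loss


data = [((3, 3), 1), ((4, 3), 1), ((1, 1), -1)]
print(perceptron_loss([1, 1], 0, data))  # 2, contributed solely by the misclassified (1, 1)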
The Perceptron Learning Algorithm
Original Form
The perceptron learning algorithm solves the following optimization problem:

$$\min_{w, b} L(w, b) = -\sum_{x_i \in M} y_i (w \cdot x_i + b)$$

The algorithm is misclassification-driven: first pick an initial hyperplane at random, then repeatedly minimize the loss above with gradient descent. The gradient of the loss function is given by:

$$\nabla_w L(w, b) = -\sum_{x_i \in M} y_i x_i, \qquad \nabla_b L(w, b) = -\sum_{x_i \in M} y_i$$

A gradient is a vector that points in the direction in which a scalar field grows fastest; its length is the maximum rate of change. A scalar field is a field in which the attribute of every point in space can be described by a single scalar (as I understand it, that scalar is the output of the function).

Randomly pick a misclassified point $(x_i, y_i)$ and update the parameters:

$$w \leftarrow w + \eta y_i x_i, \qquad b \leftarrow b + \eta y_i$$

where $\eta$ $(0 < \eta \leq 1)$ is the learning rate. Adding the opposite of the gradient-ascent direction to the parameters is exactly gradient descent, so the iteration keeps decreasing the loss function until it reaches 0. This yields the original form of the perceptron learning algorithm:

Algorithm 2.1 (perceptron learning algorithm, original form)
Input: training set T, learning rate η
Output: w, b; the perceptron model f(x) = sign(w · x + b)
(1) Choose initial values w0, b0
(2) Pick a point (x_i, y_i) from the training set
(3) If y_i(w · x_i + b) ≤ 0, update w ← w + η y_i x_i and b ← b + η y_i
(4) Go back to (2) until no point in the training set is misclassified
For this algorithm, we use the following example (Example 2.1 in the book) as test data: positive instances x1 = (3, 3), x2 = (4, 3) and a negative instance x3 = (1, 1).

Now for the most exciting part: with the knowledge above, we can implement this simple algorithm in Python and visualize it perfectly:
import copy
from matplotlib import pyplot as plt
from matplotlib import animation

training_set = [[(3, 3), 1], [(4, 3), 1], [(1, 1), -1]]
w = [0, 0]
b = 0
history = []


def update(item):
    """
    update parameters using stochastic gradient descent
    :param item: an item which is classified into wrong class
    :return: nothing
    """
    global w, b, history
    w[0] += 1 * item[1] * item[0][0]
    w[1] += 1 * item[1] * item[0][1]
    b += 1 * item[1]
    history.append([copy.copy(w), b])
    # print(w, b)  # you can uncomment this line to check the process of stochastic gradient descent


def cal(item):
    """
    calculate the functional distance between 'item' and the decision surface, i.e. yi(w*xi+b)
    :param item: a training example (x, y)
    :return: yi(w*xi+b)
    """
    res = 0
    for i in range(len(item[0])):
        res += item[0][i] * w[i]
    res += b
    res *= item[1]
    return res


def check():
    """
    check if the hyperplane can classify the examples correctly
    :return: True if an update was made, False once everything is classified correctly
    """
    flag = False
    for item in training_set:
        if cal(item) <= 0:
            flag = True
            update(item)
    if not flag:
        print("RESULT: w: " + str(w) + " b: " + str(b))
    return flag


if __name__ == "__main__":
    for i in range(1000):
        if not check():
            break

    # first set up the figure, the axis, and the plot element we want to animate
    fig = plt.figure()
    ax = plt.axes(xlim=(0, 2), ylim=(-2, 2))
    line, = ax.plot([], [], 'g', lw=2)
    label = ax.text(0, 0, '')

    # initialization function: plot the background of each frame
    def init():
        line.set_data([], [])
        x, y, x_, y_ = [], [], [], []
        for p in training_set:
            if p[1] > 0:
                x.append(p[0][0])
                y.append(p[0][1])
            else:
                x_.append(p[0][0])
                y_.append(p[0][1])
        plt.plot(x, y, 'bo', x_, y_, 'rx')
        plt.axis([-6, 6, -6, 6])
        plt.grid(True)
        plt.xlabel('x')
        plt.ylabel('y')
        plt.title('Perceptron Algorithm (www.hankcs.com)')
        return line, label

    # animation function. this is called sequentially to draw the hyperplane after each update
    def animate(i):
        global history, ax, line, label
        w = history[i][0]
        b = history[i][1]
        if w[1] == 0:
            return line, label
        # two points at x = -7 and x = 7 determine the current separating line
        x1 = -7
        y1 = -(b + w[0] * x1) / w[1]
        x2 = 7
        y2 = -(b + w[0] * x2) / w[1]
        line.set_data([x1, x2], [y1, y2])
        # place a label showing [w, b] on the line at x = 0
        x1 = 0
        y1 = -(b + w[0] * x1) / w[1]
        label.set_text(str(history[i]))
        label.set_position([x1, y1])
        return line, label

    # call the animator. blit=True means only re-draw the parts that have changed.
    print(history)
    anim = animation.FuncAnimation(fig, animate, init_func=init, frames=len(history),
                                   interval=1000, repeat=True, blit=True)
    plt.show()
    anim.save('perceptron.gif', fps=2, writer='imagemagick')
Visualization

[animation omitted: perceptron.gif, produced by the code above]

You can see that the hyperplane is attracted by the misclassified point and moves toward it, so that the distance between the two shrinks step by step until the point is classified correctly. Doesn't this animation give a more intuitive feel for the perceptron's gradient descent?
Convergence of the Algorithm
Write the input vector in augmented form, with a constant 1 appended:

$$\hat{x} = (x^T, 1)^T \in \mathbb{R}^{n+1}, \qquad \hat{w} = (w^T, b)^T$$

so that $\hat{w} \cdot \hat{x} = w \cdot x + b$. Since the dataset is linearly separable, there exists some $\hat{w}_{\mathrm{opt}}$ with $\|\hat{w}_{\mathrm{opt}}\| = 1$ whose hyperplane $\hat{w}_{\mathrm{opt}} \cdot \hat{x} = 0$ can classify the dataset completely correctly. Define the minimum gamma:

$$\gamma = \min_i \, y_i (\hat{w}_{\mathrm{opt}} \cdot \hat{x}_i)$$

and let $R = \max_i \|\hat{x}_i\|$. Then (Theorem 2.1, Novikoff) the number of misclassifications k of the original-form algorithm satisfies:

$$k \leq \left( \frac{R}{\gamma} \right)^2$$

so on a linearly separable dataset the algorithm converges after finitely many updates. For the proof, see 《統(tǒng)計(jì)學(xué)習(xí)方法》 p. 31.
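As a quick numeric check of the bound, here is a hedged sketch for the example dataset above. It uses the separating hyperplane w = (1, 1), b = -3 that the original-form code converges to; any separating hyperplane normalized to unit length yields a valid gamma, and picking this one is my own shortcut, not part of the theorem:

import numpy as np

x_hat = np.array([[3, 3, 1], [4, 3, 1], [1, 1, 1]], dtype=float)  # augmented instances (x, 1)
y = np.array([1, 1, -1], dtype=float)

w_hat = np.array([1, 1, -3], dtype=float)  # the (w, b) found by the code above
w_hat /= np.linalg.norm(w_hat)             # the theorem requires ||w_hat|| = 1

gamma = np.min(y * (x_hat @ w_hat))        # gamma = min_i yi * (w_hat . x_hat_i)
R = np.max(np.linalg.norm(x_hat, axis=1))  # R = max_i ||x_hat_i||
print((R / gamma) ** 2)                    # about 286

The bound is quite loose here: the algorithm actually makes only 7 updates on this dataset, far fewer than 286.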
The Dual Form of the Perceptron Learning Algorithm
Dual here means that w and b are expressed as linear combinations of the training instances $x_i$, and we solve for the coefficients instead. Concretely, if the parameters are updated $n_i$ times on the misclassified point $(x_i, y_i)$, then the increments of w and b contributed by that point are $\alpha_i y_i x_i$ and $\alpha_i y_i$ respectively, where $\alpha_i = n_i \eta$. The parameters finally learned can therefore be written as:

$$w = \sum_{i=1}^{N} \alpha_i y_i x_i, \qquad b = \sum_{i=1}^{N} \alpha_i y_i$$

This gives Algorithm 2.2:

Algorithm 2.2 (perceptron learning algorithm, dual form)
Input: training set T, learning rate η
Output: α, b; the model f(x) = sign(∑_j α_j y_j (x_j · x) + b)
(1) α ← 0, b ← 0
(2) Pick a point (x_i, y_i) from the training set
(3) If y_i(∑_j α_j y_j (x_j · x_i) + b) ≤ 0, update α_i ← α_i + η and b ← b + η y_i
(4) Go back to (2) until no point is misclassified

Since the training instances enter only through inner products, the Gram matrix G = [x_i · x_j] of the training set can be precomputed up front.
Code for the Dual-Form Perceptron
This involves a fair amount of matrix computation, so NumPy is used more heavily:
# -*- coding:utf-8 -*-
# Filename: train2.2.py
# Author: hankcs
# Date: 2015/1/31 15:15
import numpy as np
from matplotlib import pyplot as plt
from matplotlib import animation

# An example in that book, the training set and parameters' sizes are fixed
training_set = [[[3, 3], 1], [[4, 3], 1], [[1, 1], -1]]
a = np.zeros(len(training_set), dtype=float)
b = 0.0
Gram = None
y = np.array([item[1] for item in training_set])
x = np.array([item[0] for item in training_set], dtype=float)
history = []


def cal_gram():
    """
    calculate the Gram matrix of the training instances
    :return: the Gram matrix
    """
    g = np.empty((len(training_set), len(training_set)), dtype=int)
    for i in range(len(training_set)):
        for j in range(len(training_set)):
            g[i][j] = np.dot(training_set[i][0], training_set[j][0])
    return g


def update(i):
    """
    update parameters using stochastic gradient descent
    :param i: index of a misclassified example
    :return: nothing
    """
    global a, b
    a[i] += 1
    b = b + y[i]
    history.append([np.dot(a * y, x), b])
    # print(a, b)  # you can uncomment this line to check the process of stochastic gradient descent


def cal(i):
    """
    calculate the judge condition yi(sum_j aj*yj*(xj.xi) + b)
    """
    global a, b, x, y
    res = np.dot(a * y, Gram[i])
    res = (res + b) * y[i]
    return res


def check():
    """
    check if the hyperplane can classify the examples correctly
    :return: False once everything is classified correctly
    """
    global a, b, x, y
    flag = False
    for i in range(len(training_set)):
        if cal(i) <= 0:
            flag = True
            update(i)
    if not flag:
        w = np.dot(a * y, x)
        print("RESULT: w: " + str(w) + " b: " + str(b))
        return False
    return True


if __name__ == "__main__":
    Gram = cal_gram()  # initialize the Gram matrix
    for i in range(1000):
        if not check():
            break

    # draw an animation to show how it works, the data comes from history
    # first set up the figure, the axis, and the plot element we want to animate
    fig = plt.figure()
    ax = plt.axes(xlim=(0, 2), ylim=(-2, 2))
    line, = ax.plot([], [], 'g', lw=2)
    label = ax.text(0, 0, '')

    # initialization function: plot the background of each frame
    def init():
        line.set_data([], [])
        x, y, x_, y_ = [], [], [], []
        for p in training_set:
            if p[1] > 0:
                x.append(p[0][0])
                y.append(p[0][1])
            else:
                x_.append(p[0][0])
                y_.append(p[0][1])
        plt.plot(x, y, 'bo', x_, y_, 'rx')
        plt.axis([-6, 6, -6, 6])
        plt.grid(True)
        plt.xlabel('x')
        plt.ylabel('y')
        plt.title('Perceptron Algorithm 2 (www.hankcs.com)')
        return line, label

    # animation function. this is called sequentially
    def animate(i):
        global history, ax, line, label
        w = history[i][0]
        b = history[i][1]
        if w[1] == 0:
            return line, label
        x1 = -7.0
        y1 = -(b + w[0] * x1) / w[1]
        x2 = 7.0
        y2 = -(b + w[0] * x2) / w[1]
        line.set_data([x1, x2], [y1, y2])
        x1 = 0.0
        y1 = -(b + w[0] * x1) / w[1]
        label.set_text(str(history[i][0]) + ' ' + str(b))
        label.set_position([x1, y1])
        return line, label

    # call the animator. blit=True means only re-draw the parts that have changed.
    anim = animation.FuncAnimation(fig, animate, init_func=init, frames=len(history),
                                   interval=1000, repeat=True, blit=True)
    plt.show()
    # anim.save('perceptron2.gif', fps=2, writer='imagemagick')
Visualization

[animation omitted: the dual-form run, perceptron2.gif]

The result is the same as Algorithm 1's. We can also change the dataset a bit:
training_set = [[[3, 3], 1], [[4, 3], 1], [[1, 1], -1], [[5, 2], -1]]
and get a somewhat more complex result:

[animation omitted: the run on the four-point dataset]
Afterthoughts
Through the simplest of models, we get to learn the common concepts and the typical workflow of machine learning.