亚洲欧美内射,吊av在线

SVM簡介

支持向量機（support vector machines, SVM）是一種二分類模型，它的基本模型是定義在特征空間上的間隔最大的線性分類器，間隔最大使它有別于感知機；SVM還包括核技巧，這使它成為實質(zhì)上的非線性分類器。SVM的的學(xué)習(xí)策略就是間隔最大化，可形式化為一個求解凸二次規(guī)劃的問題，也等價于正則化的合頁損失函數(shù)的最小化問題。SVM的的學(xué)習(xí)算法就是求解凸二次規(guī)劃的最優(yōu)化算法。

SVM 基本概念

將實例的特征向量（以二維為例）映射為空間中的一些點，就是如下圖的紅色點和黃色點，它們屬于不同的兩類。

那么 SVM 的目的就是想要畫出一條線，以“最好地”區(qū)分這兩類點，以至如果以后有了新的點，這條線也能做出很好的分類。

能夠畫出多少條線對樣本點進行區(qū)分？
線是有無數(shù)條可以畫的,區(qū)別在于泛化能力。比如綠線就比其他兩條線區(qū)分效果好，我們希望找到的這條效果最好的線叫作劃分超平面；

為什么要叫作“超平面”呢？
因為樣本的特征很可能是高維的，此時樣本空間的劃分就需要“超平面”；

什么才叫這條線的效果好？
SVM 將會尋找可以區(qū)分兩個類別并且能使邊際（margin）最大的超平面，如下圖的兩條虛線直接的距離叫做邊際（margin），虛線是由距離中央實線最近的點所確定出來的；

為什么要讓 margin 盡量大？
因為大 margin 泛化能力強；
如下圖所示即為分離超平面，對于線性可分的數(shù)據(jù)集來說，這樣的超平面有無窮多個（即感知機），但是幾何間隔最大的分離超平面卻是唯一的。

以上圖為例，超平面為 $\omega x + b = 0$ ,超平面以上的數(shù)據(jù)Label為+1，超平面以下的數(shù)據(jù)Label為-1，則分類正確需要滿足:
$\left\{\begin{matrix} & \frac{\omega x + b}{\left \| \omega \right \|}\geqslant d \quad \forall y^{i} = 1 & \\ & \frac{\omega x + b}{\left \| \omega \right \|}\leqslant -d \quad \forall y^{i} = -1 & \end{matrix}\right.$
不等式兩邊同時除一個d，又因為 $\left \| \omega \right \|*d$ 為一個大于0的數(shù)據(jù)，可以將式子化簡為：
$\left\{\begin{matrix} & \omega x + b\geqslant 1 \quad \forall y^{i} = 1 & \\ & \omega x + b\leqslant -1 \quad \forall y^{i} = -1 & \end{matrix}\right.$
注意：此時的 $\omega \$ 和 b 與原有的 $\omega \$ 和 b放大了 $\left \| \omega \right \|*d$ 倍簡寫成 $\omega \$ 和 b，這項距離（書上一般寫的是幾何間隔）是不會變的，我們做出了一個強硬的化簡： $y^{i}(\omega x + b) \geqslant 1$
為什么可以這樣做呢？實際上我們求距離是用幾何間隔，縮放函數(shù)間隔不影響幾何間隔，舉個例子,樣本點??1是離超平面最近的點 $y^{i}(\omega x_1+ b) = 5$ ,我們把 $\omega \$ 和 b同時放縮5倍， $y^{i}(\omega x + b) =1$ ，幾何間隔和以前沒變的一樣，因為上下是同時縮放的，我們再來看看超平面 $\omega x + b = 0$ ，也沒什么變化，5??+5=0和??+1=0還是代表同一個超平面。這樣縮放后， $magin(\omega , b) = \frac{1}{\left \| \omega \right \|}$ ,參數(shù)b不見了，可以去掉b好了，最小函數(shù)間隔為1，所以約束條件也發(fā)生了變化。這樣簡化后，我們再來看看模型變成啥樣了

$max\frac{1}{\left \| \omega \right \|}$
$s.t. \quad y^{i}(\omega x + b) \geqslant 1$
等價于最大化 $\frac{1}{\left \| \omega \right \|}$ ，也就等價于最小化 $\frac{1}{2}\left \| \omega \right \|^{2}$ ,是為了后面求導(dǎo)以后形式簡潔，不影響結(jié)果），因此SVM模型的求解最大分割超平面問題又可以表示為以下約束最優(yōu)化問題:
$min\frac{1}{2}\left \| \omega \right \|^{2}$
$s.t. \quad y^{i}(\omega x + b) \geqslant 1$

支持向量
支持向量（support vector）就是剛好貼在邊際所在的平面上的點，它們是用來定義邊際的，是距離劃分超平面最近的點。訓(xùn)練完成后，大部分的訓(xùn)練樣本都不需要保留，最終模型僅與支持向量有關(guān)。只要支持向量沒變，其他的數(shù)據(jù)不影響。

左邊是60個點的結(jié)果，右邊的是120個點的結(jié)果

#Author:Sunshine丶天
from sklearn.datasets.samples_generator import make_blobs
from sklearn.svm import SVC
import matplotlib.pyplot as plt
import numpy as np

# 繪圖函數(shù)
def plot_svc_decision_function(model, ax=None, plot_support=True):
    """Plot the decision function for a 2D SVC"""
    if ax is None:
        ax = plt.gca()
    xlim = ax.get_xlim()
    ylim = ax.get_ylim()

    # create grid to evaluate model
    x = np.linspace(xlim[0], xlim[1], 30)
    y = np.linspace(ylim[0], ylim[1], 30)
    Y, X = np.meshgrid(y, x)
    xy = np.vstack([X.ravel(), Y.ravel()]).T
    P = model.decision_function(xy).reshape(X.shape)

    # plot decision boundary and margins
    ax.contour(X, Y, P, colors='k',
               levels=[-1, 0, 1], alpha=0.5,
               linestyles=['--', '-', '--'])

    # plot support vectors
    if plot_support:
        ax.scatter(model.support_vectors_[:, 0],
                   model.support_vectors_[:, 1],
                   s=300, linewidth=1, facecolors='none')
    ax.set_xlim(xlim)
    ax.set_ylim(ylim)

def plot_svm(N=10, ax=None):
    X, y = make_blobs(n_samples=200, centers=2,
                      random_state=0, cluster_std=0.60)
    X = X[:N]
    y = y[:N]
    model = SVC(kernel='linear', C=1E10)
    model.fit(X, y)

    ax = ax or plt.gca()
    ax.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='autumn')
    ax.set_xlim(-1, 4)
    ax.set_ylim(-1, 6)
    plot_svc_decision_function(model, ax)


fig, ax = plt.subplots(1, 2, figsize=(16, 6))
fig.subplots_adjust(left=0.0625, right=0.95, wspace=0.1)
for axi, N in zip(ax, [60, 120]):
    plot_svm(N, axi)
    axi.set_title('N = {0}'.format(N))

plt.show()

軟間隔
到這里都是基于訓(xùn)練集數(shù)據(jù)線性可分的假設(shè)下進行的，但是實際情況下幾乎不存在完全線性可分的數(shù)據(jù)，為了解決這個問題，引入了“軟間隔”的概念，即允許某些點不滿足約束
$s.t. \quad y^{i}(\omega x + b) \geqslant 1$
采用hinge損失，將原優(yōu)化問題改寫為
$min\frac{1}{2}\left \| \omega \right \|^{2} + C\sum \varepsilon _{i}$
$s.t. \quad y^{i}(\omega x + b) \geqslant 1 - \varepsilon _{i}$
$\varepsilon _{i} > 0$
其中 $\varepsilon _{i}$ 為“松弛變量” $\varepsilon _{i} = max(0, 1-y^{i}(\omega x + b))$ ，即一個hinge損失函數(shù)。每一個樣本都有一個對應(yīng)的松弛變量，表征該樣本不滿足約束的程度。C > 0稱為懲罰參數(shù)，C值越大，對分類的懲罰越大。跟線性可分求解的思路一致。

左邊是C=10的結(jié)果，右邊的是C=0.1的結(jié)果

#Author:Sunshine丶天
from sklearn.datasets.samples_generator import make_blobs
from sklearn.svm import SVC
import matplotlib.pyplot as plt
import numpy as np

# 繪圖函數(shù)
def plot_svc_decision_function(model, ax=None, plot_support=True):
    """Plot the decision function for a 2D SVC"""
    if ax is None:
        ax = plt.gca()
    xlim = ax.get_xlim()
    ylim = ax.get_ylim()

    # create grid to evaluate model
    x = np.linspace(xlim[0], xlim[1], 30)
    y = np.linspace(ylim[0], ylim[1], 30)
    Y, X = np.meshgrid(y, x)
    xy = np.vstack([X.ravel(), Y.ravel()]).T
    P = model.decision_function(xy).reshape(X.shape)

    # plot decision boundary and margins
    ax.contour(X, Y, P, colors='k',
               levels=[-1, 0, 1], alpha=0.5,
               linestyles=['--', '-', '--'])

    # plot support vectors
    if plot_support:
        ax.scatter(model.support_vectors_[:, 0],
                   model.support_vectors_[:, 1],
                   s=300, linewidth=1, facecolors='none')
    ax.set_xlim(xlim)
    ax.set_ylim(ylim)

X, y = make_blobs(n_samples=100, centers=2,
                  random_state=0, cluster_std=0.8)

fig, ax = plt.subplots(1, 2, figsize=(16, 6))
fig.subplots_adjust(left=0.0625, right=0.95, wspace=0.1)

for axi, C in zip(ax, [10.0, 0.1]):
    model = SVC(kernel='linear', C=C).fit(X, y)
    axi.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='autumn')
    plot_svc_decision_function(model, axi)
    axi.scatter(model.support_vectors_[:, 0],
                model.support_vectors_[:, 1],
                s=300, lw=1, facecolors='none')
    axi.set_title('C = {0:.1f}'.format(C), size=14)

plt.show()

非線性SVM算法原理

對于輸入空間中的非線性分類問題，可以通過非線性變換將它轉(zhuǎn)化為某個維特征空間中的線性分類問題，在高維特征空間中學(xué)習(xí)線性支持向量機。由于在線性支持向量機學(xué)習(xí)的對偶問題里，目標(biāo)函數(shù)和分類決策函數(shù)都只涉及實例和實例之間的內(nèi)積，所以不需要顯式地指定非線性變換，而是用核函數(shù)替換當(dāng)中的內(nèi)積。核函數(shù)表示，通過一個非線性轉(zhuǎn)換后的兩個實例間的內(nèi)積。具體地， $K(x,z)$ 是一個函數(shù)，或正定核，意味著存在一個從輸入空間到特征空間的映射 $\phi (x)$ ，對任意輸入空間中的 $x,z$ 有 $K(x,z) = \phi (x)\phi (z)$

#Author:Sunshine丶天
import numpy as np
import matplotlib.pyplot as plt
from sklearn  import datasets
from sklearn.preprocessing import StandardScaler, PolynomialFeatures
from sklearn.svm import LinearSVC
from sklearn.pipeline import Pipeline

X, y = datasets.make_moons(noise=0.15)

def plot_decision_boundary(model, axis):
    x0, x1 = np.meshgrid(
        np.linspace(axis[0], axis[1], int(axis[1] - axis[0]*100)).reshape(-1),
        np.linspace(axis[2], axis[3], int(axis[3] - axis[2] * 100)).reshape(-1)
    )
    x_new = np.c_[x0.ravel(), x1.ravel()]

    y_predict = model.predict(x_new)
    zz = y_predict.reshape(x0.shape)

    from matplotlib.colors import ListedColormap
    custom_cmap = ListedColormap(['#EF9A9A', '#FFF59D', '#90CAF9'])
    plt.contourf(x0, x1, zz, linewidth=5, cmap=custom_cmap)

# 使用多項式特征svm
def PolynomialSVC(degree, C=1.0):
    return Pipeline([
        ('poly', PolynomialFeatures(degree=degree)),
        ('std_scaler', StandardScaler()),
        ('LinearSVC', LinearSVC(C=C))
    ])

# ploy_svc = PolynomialSVC(degree=3)
# ploy_svc.fit(X, y)
#
# plot_decision_boundary(ploy_svc, axis=[-1.5, 2.5, -1.0, 1.5])
# plt.scatter(X[y==0, 0], X[y==0, 1],color='red')
# plt.scatter(X[y==1, 0], X[y==1, 1],color='blue')
# plt.show()

# =====================多項式核
# from sklearn.svm import SVC

# def PolynomialKernelSVC(degree, C=1.0):
#     return Pipeline([
#         ('std_scaler', StandardScaler()),
#         ('kernelSVC',SVC(kernel='poly', degree=degree, C=C)),
#     ])
#
# ploy_kernel_svc = PolynomialKernelSVC(degree=3, C=1000)
# ploy_kernel_svc.fit(X, y)
#
# plot_decision_boundary(ploy_kernel_svc, axis=[-1.5, 2.5, -1.0, 1.5])
# plt.scatter(X[y==0, 0], X[y==0, 1],color='red')
# plt.scatter(X[y==1, 0], X[y==1, 1],color='blue')
# plt.show()


# =====================高斯核
from sklearn.svm import SVC

def RBFKernelSVC(gamma=1.0):
    return Pipeline([
        ('std_scaler', StandardScaler()),
        ('kernelSVC',SVC(kernel='rbf', gamma=gamma)),
    ])
rbfSVC = RBFKernelSVC(gamma=1.0)
rbfSVC.fit(X, y)

plot_decision_boundary(rbfSVC, axis=[-1.5, 2.5, -1.0, 1.5])
plt.scatter(X[y==0, 0], X[y==0, 1],color='red')
plt.scatter(X[y==1, 0], X[y==1, 1],color='blue')
plt.show()

參考文章：
https://zhuanlan.zhihu.com/p/31886934
https://www.cnblogs.com/fydeblog/p/9440474.html

色偷偷精品伊人,欧洲久久精品,欧美综合婷婷骚逼,国产AV主播,国产最新探花在线,九色在线视频一区,伊人大交九欧美,1769亚洲,黄色成人av

機器學(xué)習(xí)之SVM算法

機器學(xué)習(xí)之SVM算法

SVM簡介

SVM 基本概念

非線性SVM算法原理

相關(guān)閱讀更多精彩內(nèi)容

友情鏈接更多精彩內(nèi)容

色偷偷精品伊人,欧洲久久精品,欧美综合婷婷骚逼,国产AV主播,国产最新探花在线,九色在线视频一区,伊人大交九 欧美,1769亚洲,黄色成人av

機器學(xué)習(xí)之SVM算法

SVM簡介

SVM 基本概念

非線性SVM算法原理

相關(guān)閱讀更多精彩內(nèi)容

友情鏈接更多精彩內(nèi)容

色偷偷精品伊人,欧洲久久精品,欧美综合婷婷骚逼,国产AV主播,国产最新探花在线,九色在线视频一区,伊人大交九欧美,1769亚洲,黄色成人av