1、Linear SVC

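A support vector machine (SVM) is a powerful and versatile machine learning model that can perform linear or nonlinear classification, regression, and even outlier detection. It is one of the most popular models in machine learning. SVMs are particularly well suited to classifying complex datasets of small or medium size.

Below we try out SVMs on the classic iris dataset, starting with SVC.

As always, we first set up the environment: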

# To support both python 2 and python 3
from __future__ import division, print_function, unicode_literals

# Common imports
import numpy as np
import os

# to make this notebook's output stable across runs
np.random.seed(42)

# To plot pretty figures
%matplotlib inline
import matplotlib
import matplotlib.pyplot as plt
plt.rcParams['axes.labelsize'] = 14
plt.rcParams['xtick.labelsize'] = 12
plt.rcParams['ytick.labelsize'] = 12

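# Where to save the figures; save_fig() is defined once here and called later
# A raw string keeps the backslash in the Windows path literal
PROJECT_ROOT_DIR = r"C:\Hands-on"
CHAPTER_ID = "Support Vector Machines"
IMAGES_PATH = os.path.join(PROJECT_ROOT_DIR, "images", CHAPTER_ID)
def save_fig(fig_id, tight_layout=True, fig_extension="png", resolution=300):
    # Create the target directory on first use so that savefig() does not fail
    if not os.path.isdir(IMAGES_PATH):
        os.makedirs(IMAGES_PATH)
    path = os.path.join(IMAGES_PATH, fig_id + "." + fig_extension)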
    print("Saving figure", fig_id)
    if tight_layout:
        plt.tight_layout()
    plt.savefig(path, format=fig_extension, dpi=resolution)

# Ignore useless warnings (see SciPy issue #5998)
import warnings
warnings.filterwarnings(action="ignore", message="^internal gelsd")

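Next we load the iris dataset and keep only two features, petal length and petal width, and only classes 0 and 1 (class 2 is dropped), so that we can do binary classification on two-dimensional feature vectors. We then build a model with sklearn's SVC and fit it to the data.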

from sklearn.svm import SVC
from sklearn import datasets

iris = datasets.load_iris()
X = iris["data"][:, (2, 3)]  # petal length, petal width
y = iris["target"]

setosa_or_versicolor = (y == 0) | (y == 1)
X = X[setosa_or_versicolor]
y = y[setosa_or_versicolor]

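# SVM Classifier model
# Note: C=float("inf") requests a hard margin; some newer scikit-learn releases
# may reject an infinite C, in which case a very large finite value such as
# 1e100 should behave the same on this data
svm_clf = SVC(kernel="linear", C=float("inf"))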
svm_clf.fit(X, y)

out:
SVC(C=inf, cache_size=200, class_weight=None, coef0=0.0,
  decision_function_shape='ovr', degree=3, gamma='auto_deprecated',
  kernel='linear', max_iter=-1, probability=False, random_state=None,
  shrinking=True, tol=0.001, verbose=False)

The output lists SVC's parameters; their meanings are described in the documentation: https://ogrisel.github.io/scikit-learn.org/sklearn-tutorial/modules/generated/sklearn.svm.SVC.html
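Beyond the constructor parameters, a fitted linear SVC also exposes the learned hyperplane. The following is a minimal sketch, not part of the original notebook, that refits the same two-feature iris problem and reads off the weights, the intercept, the support vectors, and the margin width 2/||w||. It assumes a very large finite C (1e100) as a stand-in for an infinite C; the names `X_demo`, `y_demo`, `w`, `b`, and `margin_width` are illustrative.

```python
import numpy as np
from sklearn import datasets
from sklearn.svm import SVC

iris = datasets.load_iris()
X_demo = iris["data"][:, (2, 3)]          # petal length, petal width
y_demo = iris["target"]
mask = (y_demo == 0) | (y_demo == 1)      # keep setosa and versicolor only
X_demo, y_demo = X_demo[mask], y_demo[mask]

# Assumption: a huge finite C approximates the hard-margin classifier
clf = SVC(kernel="linear", C=1e100)
clf.fit(X_demo, y_demo)

w = clf.coef_[0]                          # one weight per feature
b = clf.intercept_[0]                     # one intercept for the single boundary
margin_width = 2 / np.linalg.norm(w)      # distance between the two margin lines

print("w =", w, "b =", b)
print("support vectors:")
print(clf.support_vectors_)
print("margin width =", margin_width)
print("training accuracy =", clf.score(X_demo, y_demo))
```

Only the support vectors determine `w` and `b`; instances far from the boundary could be moved or removed without changing the fitted line.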

Next we plot the decision boundary found by SVC next to the boundaries produced by a few other models, to show why the SVC boundary is preferable.

# Bad models
x0 = np.linspace(0, 5.5, 200)
pred_1 = 5*x0 - 20
pred_2 = x0 - 1.8
pred_3 = 0.1 * x0 + 0.5

def plot_svc_decision_boundary(svm_clf, xmin, xmax):
    # coef_ holds the feature weights; with two classes there is a single
    # boundary, so it has one row, and coef_[0] takes that row
    w = svm_clf.coef_[0]
    # intercept_ holds the intercepts; one boundary means one intercept
    b = svm_clf.intercept_[0]

    # At the decision boundary, w0*x0 + w1*x1 + b = 0
    # => x1 = -w0/w1 * x0 - b/w1
    x0 = np.linspace(xmin, xmax, 200)
    decision_boundary = -w[0]/w[1] * x0 - b/w[1]
    # The margin lines satisfy w0*x0 + w1*x1 + b = +/-1, so their vertical
    # offset from the boundary is 1/w[1]; the denominator is w[1] only because
    # x1 is the vertical axis (it would be w[0] if x0 were the vertical axis)
    margin = 1/w[1]
    gutter_up = decision_boundary + margin
    gutter_down = decision_boundary - margin

    svs = svm_clf.support_vectors_
    # scatter() draws a marker at each (x, y) pair it is given;
    # here it highlights every support vector
    plt.scatter(svs[:, 0], svs[:, 1], s=180, facecolors='#FFAAAA')
    plt.plot(x0, decision_boundary, "k-", linewidth=2)
    plt.plot(x0, gutter_up, "k--", linewidth=2)
    plt.plot(x0, gutter_down, "k--", linewidth=2)

plt.figure(figsize=(12,2.7))

plt.subplot(121)
plt.plot(x0, pred_1, "g--", linewidth=2)
plt.plot(x0, pred_2, "m-", linewidth=2)
plt.plot(x0, pred_3, "r-", linewidth=2)
plt.plot(X[:, 0][y==1], X[:, 1][y==1], "bs", label="Iris-Versicolor")
plt.plot(X[:, 0][y==0], X[:, 1][y==0], "yo", label="Iris-Setosa")
plt.xlabel("Petal length", fontsize=14)
plt.ylabel("Petal width", fontsize=14)
plt.legend(loc="upper left", fontsize=14)
plt.axis([0, 5.5, 0, 2])

plt.subplot(122)
plot_svc_decision_boundary(svm_clf, 0, 5.5)
plt.plot(X[:, 0][y==1], X[:, 1][y==1], "bs")
plt.plot(X[:, 0][y==0], X[:, 1][y==0], "yo")
plt.xlabel("Petal length", fontsize=14)
plt.axis([0, 5.5, 0, 2])

save_fig("large_margin_classification_plot")
plt.show()

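Above we fed the raw data straight into SVC, which is acceptable because the two selected features share the same units and have similar ranges. In general, when features are on different scales the distance computation is distorted, so standardization should be the first preprocessing step; otherwise the resulting model can perform poorly. Here is a simple example: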

import numpy as np

X=np.array([[1,500],[3,300],[2,700],[4,600]])
y=np.array([1,1,0,0])
svm_clf = SVC(kernel="linear", C=float("inf"))
svm_clf.fit(X, y)

plt.figure(figsize=(8,4))
plt.plot(X[:, 0][y==1], X[:, 1][y==1], "g^", label="1")
plt.plot(X[:, 0][y==0], X[:, 1][y==0], "bs", label="0")
plot_svc_decision_boundary(svm_clf, 0, 5)
plt.xlabel("x0", fontsize=14)
plt.ylabel("x1", fontsize=14)
plt.legend(loc="upper left", fontsize=14)
plt.title("$C = {}$".format(svm_clf.C), fontsize=16)
plt.axis([0, 5, 0, 1000])

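When the two features differ greatly in scale, the decision boundary found by the SVM is almost a horizontal line: because the scales differ so much, horizontal distances are negligible, so the widest margin is essentially the vertical distance between the two dashed lines.

To avoid this, we should standardize the training data before training the model.

We can standardize the data directly: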

from sklearn.preprocessing import StandardScaler

standardScaler = StandardScaler()
# fit() computes the mean and standard deviation of each feature of the training data
standardScaler.fit(X)
# transform() then uses those statistics to standardize X
X_standard = standardScaler.transform(X)
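To see what standardization changes, here is a small sketch (not from the original notebook) that fits a linear SVC on the four toy points before and after scaling and compares the ratio |w0/w1|, which is the magnitude of the boundary's slope in each coordinate system. It assumes C=100 is large enough to act as a hard margin on this tiny dataset; `X_toy`, `y_toy`, and `slope_ratio` are illustrative names.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X_toy = np.array([[1, 500], [3, 300], [2, 700], [4, 600]], dtype=float)
y_toy = np.array([1, 1, 0, 0])

# fit_transform() is shorthand for fit() followed by transform()
X_toy_std = StandardScaler().fit_transform(X_toy)

# Assumption: C=100 behaves like a hard margin on these four points
raw_clf = SVC(kernel="linear", C=100).fit(X_toy, y_toy)
std_clf = SVC(kernel="linear", C=100).fit(X_toy_std, y_toy)

def slope_ratio(clf):
    """|w0 / w1|: a value close to 0 means a nearly horizontal boundary."""
    w = clf.coef_[0]
    return abs(w[0] / w[1])

print("column means after scaling:", X_toy_std.mean(axis=0))
print("column stds after scaling: ", X_toy_std.std(axis=0))
print("raw    |w0/w1| =", slope_ratio(raw_clf))
print("scaled |w0/w1| =", slope_ratio(std_clf))
```

After scaling, both features contribute to the distance on a comparable footing, so the boundary is no longer dominated by the large-valued feature.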

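Is there a more convenient way? Here we introduce sklearn's Pipeline. The intermediate steps of a Pipeline are sklearn-compatible transformers, and the last step is an estimator (the model). Every intermediate step must implement fit and transform, so all preprocessing can be wrapped inside the pipeline; the last step only needs to implement fit and is usually our model.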
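As a rough illustration of what a Pipeline does under the hood, the sketch below (an addition, not from the original notebook) fits a scaler and a classifier by hand and then through a Pipeline, and checks that both routes give the same predictions. It uses SVC with a linear kernel purely as an example estimator; `X_demo`, `y_demo`, `manual_scaler`, and `pipe` are illustrative names.

```python
import numpy as np
from sklearn import datasets
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

iris = datasets.load_iris()
X_demo = iris["data"][:, (2, 3)]               # petal length, petal width
y_demo = (iris["target"] == 2).astype(int)     # 1 if Iris-Virginica, else 0

# Manual route: fit the transformer, transform, then fit the estimator
manual_scaler = StandardScaler().fit(X_demo)
manual_clf = SVC(kernel="linear", C=1).fit(manual_scaler.transform(X_demo), y_demo)
manual_pred = manual_clf.predict(manual_scaler.transform(X_demo))

# Pipeline route: a single fit() call chains the same steps
pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("svc", SVC(kernel="linear", C=1)),
])
pipe.fit(X_demo, y_demo)
pipe_pred = pipe.predict(X_demo)

print("same predictions:", np.array_equal(manual_pred, pipe_pred))
```

The practical benefit is that `pipe.predict()` applies the fitted scaler automatically, so new data cannot accidentally bypass the preprocessing.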

Below we combine StandardScaler and LinearSVC with Pipeline:

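from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Reload iris for this example: the plots below separate Iris-Virginica from
# the other classes using petal length and petal width
iris = datasets.load_iris()
X = iris["data"][:, (2, 3)]  # petal length, petal width
y = (iris["target"] == 2).astype(np.float64)  # 1.0 if Iris-Virginica

scaler=StandardScaler()
svm_clf1=LinearSVC(C=1,loss="hinge",random_state=42)
svm_clf2=LinearSVC(C=100,loss="hinge",random_state=42)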

scaled_svm_clf1=Pipeline([
        ("scaler",scaler),
        ("linear_svc",svm_clf1),
    ])
scaled_svm_clf2 = Pipeline([
        ("scaler", scaler),
        ("linear_svc", svm_clf2),
    ])

scaled_svm_clf1.fit(X,y)
scaled_svm_clf2.fit(X,y)

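The code above uses LinearSVC. With loss="hinge", LinearSVC and SVC(kernel="linear") solve nearly the same problem and usually give very similar results, though not identical ones: LinearSVC is built on liblinear and also regularizes the intercept, whereas SVC is built on libsvm. Because LinearSVC supports only the linear kernel while SVC supports arbitrary kernels, their underlying solvers differ, and for a linear kernel LinearSVC is typically much faster than SVC(kernel="linear"), especially on large datasets.

In the parameter list we chose the hinge loss and set C, the coefficient of the error term, to 1 and 100 respectively, which trains two soft-margin SVM models for comparison.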
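To compare the two classes directly, here is a hedged sketch (not from the original notebook) that trains LinearSVC and SVC(kernel="linear") with the same C on the same standardized data and measures how often their predictions agree. The `max_iter=10000` setting and the names `linear_svc`, `kernel_svc`, and `agreement` are assumptions made for this illustration.

```python
import numpy as np
from sklearn import datasets
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC, LinearSVC

iris = datasets.load_iris()
X_demo = iris["data"][:, (2, 3)]               # petal length, petal width
y_demo = (iris["target"] == 2).astype(int)     # 1 if Iris-Virginica, else 0

linear_svc = Pipeline([
    ("scaler", StandardScaler()),
    # max_iter raised (assumption) to give liblinear room to converge
    ("clf", LinearSVC(C=1, loss="hinge", random_state=42, max_iter=10000)),
])
kernel_svc = Pipeline([
    ("scaler", StandardScaler()),
    ("clf", SVC(kernel="linear", C=1)),
])
linear_svc.fit(X_demo, y_demo)
kernel_svc.fit(X_demo, y_demo)

# Fraction of training instances on which both models predict the same class
agreement = np.mean(linear_svc.predict(X_demo) == kernel_svc.predict(X_demo))
print("LinearSVC coef:", linear_svc.named_steps["clf"].coef_[0])
print("SVC coef:      ", kernel_svc.named_steps["clf"].coef_[0])
print("prediction agreement:", agreement)
```

The coefficients are typically close but not equal, which is consistent with the two solvers optimizing slightly different objectives.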

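Note, however, that the feature weights and intercept learned on the standardized data differ from those that apply to the original data. To draw the decision boundary, we first need to convert the learned parameters back to the unscaled feature space: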

#Convert to unscaled parameters

# decision_function() returns the signed score w.x + b of each sample
# scaler.mean_: per-feature mean; scaler.scale_: per-feature standard deviation

# After standardization, the fitted boundary is w*(x-mean)/scale + b = 0
# To draw it on the original axes we need to adjust the parameters
# Rearranging gives: w*x/scale + b - w*mean/scale = 0
# So the new weights are w1 = w/scale and the new intercept is b1 = b - w*mean/scale
# (which is exactly the decision_function value at the point -scaler.mean_/scaler.scale_)
b1=svm_clf1.decision_function([-scaler.mean_/scaler.scale_])
b2=svm_clf2.decision_function([-scaler.mean_/scaler.scale_])
w1=svm_clf1.coef_[0]/scaler.scale_
w2=svm_clf2.coef_[0] / scaler.scale_
# Overwrite the model parameters with the values computed above
svm_clf1.intercept_=np.array([b1])
svm_clf2.intercept_=np.array([b2])
svm_clf1.coef_=np.array([w1])
svm_clf2.coef_=np.array([w2])

# Find support vectors (LinearSVC does not do this automatically)
t = y * 2 - 1  # map labels (0, 1) --> (-1, 1)
# The "support vectors" collected here are the points that violate the margin
support_vectors_idx1 = (t * (X.dot(w1) + b1) < 1).ravel()
support_vectors_idx2 = (t * (X.dot(w2) + b2) < 1).ravel()
svm_clf1.support_vectors_ = X[support_vectors_idx1]
svm_clf2.support_vectors_ = X[support_vectors_idx2]
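The conversion can be checked numerically. The sketch below (an addition, not from the original notebook) computes the unscaled weights and intercept with the same formulas, w_raw = w/scale and b_raw = b - w·mean/scale, and confirms that they reproduce the pipeline's decision scores on the raw data. The names `pipe`, `w_raw`, and `b_raw` and the `max_iter` setting are illustrative assumptions.

```python
import numpy as np
from sklearn import datasets
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

iris = datasets.load_iris()
X_demo = iris["data"][:, (2, 3)]               # petal length, petal width
y_demo = (iris["target"] == 2).astype(int)     # 1 if Iris-Virginica, else 0

pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("linear_svc", LinearSVC(C=1, loss="hinge", random_state=42, max_iter=10000)),
])
pipe.fit(X_demo, y_demo)

scaler_demo = pipe.named_steps["scaler"]
clf = pipe.named_steps["linear_svc"]

# Parameters expressed in the original (unscaled) feature space
w_raw = clf.coef_[0] / scaler_demo.scale_
b_raw = clf.intercept_[0] - np.dot(clf.coef_[0], scaler_demo.mean_ / scaler_demo.scale_)

raw_scores = X_demo.dot(w_raw) + b_raw         # scores from unscaled parameters
pipe_scores = pipe.decision_function(X_demo)   # scores from scaler + classifier

print("max abs difference:", np.max(np.abs(raw_scores - pipe_scores)))
```

Unlike the notebook code, this sketch leaves the fitted model untouched and keeps the unscaled parameters in separate variables, which avoids accidentally predicting with overwritten coefficients.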

With the parameters updated, we can draw the plots:

plt.figure(figsize=(12,3.2))

plt.subplot(121)
plt.plot(X[:, 0][y==1], X[:, 1][y==1], "g^", label="Iris-Virginica")
plt.plot(X[:, 0][y==0], X[:, 1][y==0], "bs", label="Iris-Versicolor")
plot_svc_decision_boundary(svm_clf1, 4, 6)
plt.xlabel("Petal length", fontsize=14)
plt.ylabel("Petal width", fontsize=14)
plt.legend(loc="upper left", fontsize=14)
plt.title("$C = {}$".format(svm_clf1.C), fontsize=16)
plt.axis([4, 6, 0.8, 2.8])

plt.subplot(122)
plt.plot(X[:, 0][y==1], X[:, 1][y==1], "g^")
plt.plot(X[:, 0][y==0], X[:, 1][y==0], "bs")
plot_svc_decision_boundary(svm_clf2, 4, 6)
plt.xlabel("Petal length", fontsize=14)
plt.title("$C = {}$".format(svm_clf2.C), fontsize=16)
plt.axis([4, 6, 0.8, 2.8])

save_fig("regularization_plot")

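On the left, with the lower C value, the margin is much wider, but many instances end up inside the margin.
On the right, with the higher C value, the classifier makes fewer margin violations and misclassifications, but ends up with a narrower margin.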

2、SVC with kernel

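Having covered the linear case, we now turn to a more complex situation: many datasets are not linearly separable. One way to handle nonlinear datasets is to add more features, such as polynomial features. In some cases this yields a linearly separable dataset.

A simple example illustrates this: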

X1D = np.linspace(-4, 4, 9).reshape(-1, 1)
X2D = np.c_[X1D, X1D**2]
y = np.array([0, 0, 1, 1, 1, 1, 1, 0, 0])

plt.figure(figsize=(11, 4))


plt.subplot(121)
# Show the grid
plt.grid(True, which='both')
# Draw a horizontal reference line parallel to the x axis, for example:
#plt.axhline(y=0.0, c="r", ls="--", lw=2)
# y: vertical position of the reference line
# c: line color
# ls: line style
# lw: line width
plt.axhline(y=0, color='k')
plt.plot(X1D[:, 0][y==0], np.zeros(4), "bs")
plt.plot(X1D[:, 0][y==1], np.zeros(5), "g^")
# plt.gca() returns the current axes object
# get_yaxis() returns its y-axis object
# set_ticks() sets the tick positions (an empty list hides them)
plt.gca().get_yaxis().set_ticks([])
plt.xlabel(r"$x_1$", fontsize=20)
plt.axis([-4.5, 4.5, -0.2, 0.2])


plt.subplot(122)
plt.grid(True, which='both')
plt.axhline(y=0, color='k')
plt.axvline(x=0, color='k')
plt.plot(X2D[:, 0][y==0], X2D[:, 1][y==0], "bs")
plt.plot(X2D[:, 0][y==1], X2D[:, 1][y==1], "g^")
plt.xlabel(r"$x_1$", fontsize=20)
plt.ylabel(r"$x_2$", fontsize=20, rotation=0)
plt.gca().get_yaxis().set_ticks([0, 4, 8, 12, 16])
plt.plot([-4.5, 4.5], [6.5, 6.5], "r--", linewidth=3)
plt.axis([-4.5, 4.5, -1, 17])

plt.subplots_adjust(right=1)
plt.title('Figure 5-5. Adding features to make a dataset linearly separable')
save_fig("higher_dimensions_plot", tight_layout=False)
plt.show()

The left plot shows a simple dataset with a single feature $x_1$. As you can see, this dataset is not linearly separable. But if you add a quadratic feature $x_2=x_1^2$, the resulting 2D dataset is linearly separable.
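The same point can be checked numerically. This sketch (an addition, with illustrative names and an assumed C=100) fits a linear SVC on the 1D data and on the 2D data with the extra feature $x_2=x_1^2$, and compares training accuracy.

```python
import numpy as np
from sklearn.svm import SVC

X1D_demo = np.linspace(-4, 4, 9).reshape(-1, 1)
X2D_demo = np.c_[X1D_demo, X1D_demo ** 2]      # add the feature x2 = x1^2
y_demo = np.array([0, 0, 1, 1, 1, 1, 1, 0, 0])

# Assumption: C=100 is large enough to find a separating line when one exists
acc_1d = SVC(kernel="linear", C=100).fit(X1D_demo, y_demo).score(X1D_demo, y_demo)
acc_2d = SVC(kernel="linear", C=100).fit(X2D_demo, y_demo).score(X2D_demo, y_demo)

print("accuracy with x1 only:     ", acc_1d)
print("accuracy with x1 and x1^2: ", acc_2d)
```

No single threshold on $x_1$ can separate a class that sits in the middle of the other, whereas in the $(x_1, x_2)$ plane a horizontal line such as $x_2 = 6.5$ does the job.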

Next we try this on a dataset that is not linearly separable, the moons dataset. First we generate the data:

from sklearn.datasets import make_moons
# Generate two interleaving half circles
X, y = make_moons(n_samples=100, noise=0.15, random_state=42)

def plot_dataset(X, y, axes):
    plt.plot(X[:, 0][y==0], X[:, 1][y==0], "bs")
    plt.plot(X[:, 0][y==1], X[:, 1][y==1], "g^")
    plt.axis(axes)
    plt.grid(True, which='both')
    plt.xlabel(r"$x_1$", fontsize=20)
    plt.ylabel(r"$x_2$", fontsize=20, rotation=0)

plot_dataset(X, y, [-1.5, 2.5, -1, 1.5])
plt.show()

Next we use PolynomialFeatures to add polynomial features, and train a LinearSVC model on the augmented data.

from sklearn.datasets import make_moons
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.svm import LinearSVC

# Pipeline expects a list of (name, step) pairs
polynomial_svm_clf = Pipeline([
        ("poly_features", PolynomialFeatures(degree=3)),
        ("scaler", StandardScaler()),
        ("svm_clf", LinearSVC(C=100, loss="hinge"))
    ])

polynomial_svm_clf.fit(X,y)

Finally we visualize the decision boundary found by the model:

def plot_predictions(clf, axes):
    x0s = np.linspace(axes[0], axes[1], 100)
    x1s = np.linspace(axes[2], axes[3], 100)
    x0, x1 = np.meshgrid(x0s, x1s)
    X = np.c_[x0.ravel(), x1.ravel()]
    y_pred = clf.predict(X).reshape(x0.shape)
    y_decision = clf.decision_function(X).reshape(x0.shape)
    plt.contourf(x0, x1, y_pred, cmap=plt.cm.brg, alpha=0.2)
    plt.contourf(x0, x1, y_decision, cmap=plt.cm.brg, alpha=0.1)

plot_predictions(polynomial_svm_clf, [-1.5, 2.5, -1, 1.5])
plot_dataset(X, y, [-1.5, 2.5, -1, 1.5])
plt.title('Figure 5-6. Linear SVM classifier using polynomial features')

save_fig("moons_polynomial_svc_plot")
plt.show()

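Adding polynomial features is simple to implement and works with all sorts of machine learning algorithms (not just SVMs), but:

With a low polynomial degree it cannot handle very complex datasets.
With a high polynomial degree it creates a huge number of features, making the model too slow.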
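To make the feature explosion concrete, the sketch below (an addition, not from the original notebook) counts the polynomial features produced for n input features and degree d. With the bias column included, the count is the binomial coefficient C(n + d, d); the helper name `n_poly_features` is hypothetical, and the formula is cross-checked against PolynomialFeatures on dummy input.

```python
from math import comb

import numpy as np
from sklearn.preprocessing import PolynomialFeatures

def n_poly_features(n_features, degree):
    """Number of monomials of total degree <= degree (bias term included)."""
    return comb(n_features + degree, degree)

# Cross-check the formula against PolynomialFeatures on a dummy sample
for n, d in [(2, 3), (2, 10), (10, 3)]:
    poly = PolynomialFeatures(degree=d).fit(np.zeros((1, n)))
    print("n =", n, "degree =", d,
          "sklearn:", poly.n_output_features_,
          "formula:", n_poly_features(n, d))

# The count grows quickly with both the input dimension and the degree
print("100 features, degree 5:", n_poly_features(100, 5))
```

With only the two moons features the expansion stays small, but for wider datasets the count quickly becomes impractical, which is the motivation for the kernel trick.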

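Fortunately, with SVMs we can apply the kernel trick. It gives the same result as if we had added many polynomial features, even of very high degree, without actually adding them, so there is no combinatorial explosion in the number of features. This trick is implemented by the SVC class. Let's test it on the moons dataset: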

from sklearn.svm import SVC

poly_kernel_svm_clf = Pipeline([
        ("scaler", StandardScaler()),
        ("svm_clf", SVC(kernel="poly", degree=3, coef0=1, C=5))
    ])

poly_kernel_svm_clf.fit(X, y)


poly100_kernel_svm_clf = Pipeline([
        ("scaler", StandardScaler()),
        ("svm_clf", SVC(kernel="poly", degree=10, coef0=100, C=5))
    ])

poly100_kernel_svm_clf.fit(X, y)

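The code above trains SVM classifiers with polynomial kernels: the first uses degree 3 and the second degree 10. coef0 is the constant term of the kernel function (for the other parameters see https://www.cnblogs.com/pinard/p/6117515.html); for the polynomial kernel it corresponds to $r$ in $K(x,z)=(\gamma\, x \cdot z + r)^d$. A suitable $r$ usually has to be chosen by cross-validation.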
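As a quick check of the polynomial kernel formula, this sketch (an addition; the sample points and the values of γ, r, and d are arbitrary illustrative choices) evaluates $(\gamma\, x \cdot z + r)^d$ by hand and compares it with scikit-learn's `polynomial_kernel`.

```python
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel

rng = np.random.RandomState(42)
A = rng.rand(5, 2)                 # 5 arbitrary points with 2 features
B = rng.rand(4, 2)                 # 4 arbitrary points with 2 features
gamma_demo, r, d = 0.5, 1.0, 3     # illustrative kernel hyperparameters

# Manual evaluation of K(x, z) = (gamma * x.z + r)^d for every pair
K_manual = (gamma_demo * A.dot(B.T) + r) ** d

# The same kernel from scikit-learn (coef0 plays the role of r)
K_sklearn = polynomial_kernel(A, B, degree=d, gamma=gamma_demo, coef0=r)

print("kernel matrix shape:", K_manual.shape)
print("match:", np.allclose(K_manual, K_sklearn))
```

The kernel only ever needs dot products of the original vectors, which is why the expanded polynomial features never have to be computed explicitly.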

Visualize the classification results of the two models:

plt.figure(figsize=(11, 4))

plt.subplot(121)
plot_predictions(poly_kernel_svm_clf, [-1.5, 2.5, -1, 1.5])
plot_dataset(X, y, [-1.5, 2.5, -1, 1.5])
plt.title(r"$d=3, r=1, C=5$", fontsize=18)

plt.subplot(122)
plot_predictions(poly100_kernel_svm_clf, [-1.5, 2.5, -1, 1.5])
plot_dataset(X, y, [-1.5, 2.5, -1, 1.5])
plt.title(r"$d=10, r=100, C=5$", fontsize=18)


save_fig("moons_kernelized_polynomial_svc_plot")
plt.show()

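Above we simply specified the hyperparameter values by hand. On a real problem such a model is unlikely to be the best one, so we need to search for suitable values; a common way to find them is grid search.

It is usually faster to start with a very coarse grid search,
and then run a finer grid search around the best values found.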
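A coarse grid search over the polynomial-kernel pipeline might look like the sketch below. This is an illustration rather than the author's procedure: the grid values, `cv=3`, and the names `X_moons`, `y_moons`, and `search` are assumptions.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X_moons, y_moons = make_moons(n_samples=100, noise=0.15, random_state=42)

pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("svm_clf", SVC(kernel="poly")),
])

# Assumption: a deliberately coarse, illustrative grid.
# Keys use the "<step name>__<parameter>" convention of Pipeline.
param_grid = {
    "svm_clf__degree": [2, 3, 5],
    "svm_clf__coef0": [0, 1, 10],
    "svm_clf__C": [0.1, 1, 10],
}

search = GridSearchCV(pipe, param_grid, cv=3, scoring="accuracy")
search.fit(X_moons, y_moons)

print("best parameters:", search.best_params_)
print("best cross-validated accuracy:", search.best_score_)
```

A second, finer grid would then be centered on `search.best_params_`, for example with C values spaced more tightly around the best C found here.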

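That concludes the demonstration of using a polynomial kernel on data that is not linearly separable. Besides polynomial features, we can also add features computed with a similarity function, which measures how much each instance resembles a particular landmark. For example, take the one-dimensional dataset discussed earlier and add two landmarks to it, at x1=-2 and x1=1. Next, we define the similarity function to be the Gaussian radial basis function (RBF) with γ = 0.3.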

def gaussian_rbf(x, landmark, gamma):
    return np.exp(-gamma * np.linalg.norm(x - landmark, axis=1)**2)

gamma = 0.3

x1s = np.linspace(-4.5, 4.5, 200).reshape(-1, 1)
x2s = gaussian_rbf(x1s, -2, gamma)
x3s = gaussian_rbf(x1s, 1, gamma)

XK = np.c_[gaussian_rbf(X1D, -2, gamma), gaussian_rbf(X1D, 1, gamma)]
yk = np.array([0, 0, 1, 1, 1, 1, 1, 0, 0])

plt.figure(figsize=(11, 4))

plt.subplot(121)
plt.grid(True, which='both')
plt.axhline(y=0, color='k')
plt.scatter(x=[-2, 1], y=[0, 0], s=150, alpha=0.5, c="red")
plt.plot(X1D[:, 0][yk==0], np.zeros(4), "bs")
plt.plot(X1D[:, 0][yk==1], np.zeros(5), "g^")
plt.plot(x1s, x2s, "g--")
plt.plot(x1s, x3s, "b:")
plt.gca().get_yaxis().set_ticks([0, 0.25, 0.5, 0.75, 1])
plt.xlabel(r"$x_1$", fontsize=20)
plt.ylabel(r"Similarity", fontsize=14)
# Add an annotation to the figure, mainly as a visual pointer
plt.annotate(r'$\mathbf{x}$',
             xy=(X1D[3, 0], 0),
             xytext=(-0.5, 0.20),
             ha="center",
             arrowprops=dict(facecolor='black', shrink=0.1),
             fontsize=18,
            )
plt.text(-2, 0.9, "$x_2$", ha="center", fontsize=20)
plt.text(1, 0.9, "$x_3$", ha="center", fontsize=20)
plt.axis([-4.5, 4.5, -0.1, 1.1])

plt.subplot(122)
plt.grid(True, which='both')
plt.axhline(y=0, color='k')
plt.axvline(x=0, color='k')
plt.plot(XK[:, 0][yk==0], XK[:, 1][yk==0], "bs")
plt.plot(XK[:, 0][yk==1], XK[:, 1][yk==1], "g^")
plt.xlabel(r"$x_2$", fontsize=20)
plt.ylabel(r"$x_3$  ", fontsize=20, rotation=0)
plt.annotate(r'$\phi\left(\mathbf{x}\right)$',
             xy=(XK[3, 0], XK[3, 1]),
             xytext=(0.65, 0.50),
             ha="center",
             arrowprops=dict(facecolor='black', shrink=0.1),
             fontsize=18,
            )
plt.plot([-0.1, 1.1], [0.57, -0.1], "r--", linewidth=3)
plt.axis([-0.1, 1.1, -0.1, 1.1])
    
plt.subplots_adjust(right=1)

save_fig("kernel_method_plot")
plt.show()
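As a worked check of the similarity features, take the annotated instance $x_1=-1$. It lies at distance 1 from the first landmark and distance 2 from the second, so its new features are $\exp(-0.3 \times 1^2)$ and $\exp(-0.3 \times 2^2)$. The sketch below (an addition with illustrative names) computes both by hand and compares them with scikit-learn's `rbf_kernel`.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

gamma_demo = 0.3
x = np.array([[-1.0]])                     # the annotated instance
landmarks = np.array([[-2.0], [1.0]])      # the two landmarks

# Manual formula: exp(-gamma * ||x - landmark||^2)
x2 = np.exp(-gamma_demo * (x[0, 0] - landmarks[0, 0]) ** 2)
x3 = np.exp(-gamma_demo * (x[0, 0] - landmarks[1, 0]) ** 2)

# Same computation with scikit-learn's RBF kernel
phi_sklearn = rbf_kernel(x, landmarks, gamma=gamma_demo)[0]

print("x2 =", round(x2, 2), "x3 =", round(x3, 2))   # approximately 0.74 and 0.30
print("match:", np.allclose([x2, x3], phi_sklearn))
```

The instance is therefore mapped to roughly (0.74, 0.30) in the right-hand plot, which is the point labelled $\phi(\mathbf{x})$.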

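As the example above shows, the choice of landmarks matters a great deal. The simplest approach is to create a landmark at the location of every instance in the dataset. This creates many dimensions and so increases the chance that the transformed training set is linearly separable. The downside is that a training set with m instances and n features is transformed into one with m instances and m features (assuming the original features are dropped). If the training set is very large, you end up with an equally large number of features.

Like the polynomial features method, the similarity features method can be used with any machine learning algorithm, but computing all the additional features can be expensive, especially on large training sets. Once again the kernel trick helps: it produces a result similar to adding many similarity features without actually adding them. Next we try the RBF kernel of the SVC class.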

from sklearn.svm import SVC

gamma1, gamma2 = 0.1, 5
C1, C2 = 0.001, 1000
hyperparams = (gamma1, C1), (gamma1, C2), (gamma2, C1), (gamma2, C2)

svm_clfs = []
for gamma, C in hyperparams:
    rbf_kernel_svm_clf = Pipeline([
            ("scaler", StandardScaler()),
            ("svm_clf", SVC(kernel="rbf", gamma=gamma, C=C))
        ])
    rbf_kernel_svm_clf.fit(X, y)
    svm_clfs.append(rbf_kernel_svm_clf)

plt.figure(figsize=(11, 7))

for i, svm_clf in enumerate(svm_clfs):
    plt.subplot(221 + i)
    plot_predictions(svm_clf, [-1.5, 2.5, -1, 1.5])
    plot_dataset(X, y, [-1.5, 2.5, -1, 1.5])
    gamma, C = hyperparams[i]
    plt.title(r"$\gamma = {}, C = {}$".format(gamma, C), fontsize=16)

save_fig("moons_rbf_svc_plot")
plt.show()

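The figure above shows models trained with different values of the hyperparameters gamma (γ) and C.

Increasing gamma (γ) makes the bell-shaped curve narrower, so each instance's range of influence is smaller: the decision boundary ends up more irregular, wiggling around individual instances.
Decreasing gamma (γ) makes the bell-shaped curve wider, so instances have a larger range of influence and the decision boundary is smoother.

So γ acts like a regularization hyperparameter:

If your model is overfitting, you should reduce it;
if it is underfitting, you should increase it (similar to the C hyperparameter).

My own understanding is that γ here plays a role similar to K in KNN. A large γ narrows the bell curve, which is like a smaller K, because only the points closest to the current landmark (the center of the bell curve) have high similarity; a small γ widens the bell curve, which is like a larger K, because all points reasonably near the current landmark have high similarity.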
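One way to observe this regularization effect is to compare training accuracy with cross-validated accuracy for several γ values. The sketch below is an illustration under assumptions: γ=100 is an extra value not used above, C is fixed at 1000, and `results` is an illustrative name.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X_moons, y_moons = make_moons(n_samples=100, noise=0.15, random_state=42)

results = {}
for gamma_demo in (0.1, 5, 100):       # 100 is an extra, illustrative value
    model = Pipeline([
        ("scaler", StandardScaler()),
        ("svm_clf", SVC(kernel="rbf", gamma=gamma_demo, C=1000)),
    ])
    # Accuracy on the data the model was trained on
    train_acc = model.fit(X_moons, y_moons).score(X_moons, y_moons)
    # Accuracy estimated on held-out folds (the pipeline is cloned internally)
    cv_acc = cross_val_score(model, X_moons, y_moons, cv=5).mean()
    # Total number of support vectors across both classes
    n_sv = model.named_steps["svm_clf"].n_support_.sum()
    results[gamma_demo] = (train_acc, cv_acc, n_sv)
    print("gamma =", gamma_demo, "train =", train_acc,
          "cv =", cv_acc, "support vectors =", n_sv)
```

A training accuracy that keeps rising while the cross-validated accuracy stalls or drops is the usual sign that γ is too large for the data.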

3、SVR

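SVMs are widely applicable: they support not only linear and nonlinear classification but also linear and nonlinear regression. The trick is to reverse the objective: instead of trying to fit the widest possible "street" (the margin) between two classes while limiting margin violations, SVM regression tries to fit as many instances as possible on the street while limiting margin violations (here, instances off the street).

The width of the street is controlled by the hyperparameter ε. Next we show two linear SVR models trained on some randomly generated linear data, one with a wide margin (ε=1.5) and one with a narrow margin (ε=0.5).

We use Scikit-Learn's LinearSVR class to perform linear SVR.

First, generate noisy linear data: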

np.random.seed(42)
m = 50
X = 2 * np.random.rand(m, 1)
y = (4 + 3 * X + np.random.randn(m, 1)).ravel()

We then train two linear SVRs with ε set to 1.5 and 0.5 respectively, and record the indices of all samples that lie on or outside the edge of the "street".

from sklearn.svm import LinearSVR

svm_reg1 = LinearSVR(epsilon=1.5, random_state=42)
svm_reg2 = LinearSVR(epsilon=0.5, random_state=42)
svm_reg1.fit(X, y)
svm_reg2.fit(X, y)

def find_support_vectors(svm_reg, X, y):
    y_pred = svm_reg.predict(X)
    off_margin = (np.abs(y - y_pred) >= svm_reg.epsilon)
    # argwhere returns the indices of the non-zero (True) entries
    return np.argwhere(off_margin)

svm_reg1.support_ = find_support_vectors(svm_reg1, X, y)
svm_reg2.support_ = find_support_vectors(svm_reg2, X, y)

Visualize the results:

def plot_svm_regression(svm_reg, X, y, axes):
    x1s = np.linspace(axes[0], axes[1], 100).reshape(100, 1)
    y_pred = svm_reg.predict(x1s)
    plt.plot(x1s, y_pred, "k-", linewidth=2, label=r"$\hat{y}$")
    plt.plot(x1s, y_pred + svm_reg.epsilon, "k--")
    plt.plot(x1s, y_pred - svm_reg.epsilon, "k--")
    # s sets the marker size (area in points^2); facecolors sets the marker fill color
    plt.scatter(X[svm_reg.support_], y[svm_reg.support_], s=200, facecolors='#FFAAAA')
    plt.plot(X, y, "bo")
    plt.xlabel(r"$x_1$", fontsize=18)
    plt.legend(loc="upper left", fontsize=18)
    plt.axis(axes)

plt.figure(figsize=(9, 4))
plt.subplot(121)
plot_svm_regression(svm_reg1, X, y, [0, 2, 3, 11])
plt.title(r"$\epsilon = {}$".format(svm_reg1.epsilon), fontsize=18)
plt.ylabel(r"$y$", fontsize=18, rotation=0)
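# Assumed anchor (undefined in the original listing): x = 1, next to the epsilon label
eps_x1 = 1
eps_y_pred = svm_reg1.predict([[eps_x1]])[0]
plt.annotate(
        '', xy=(eps_x1, eps_y_pred), xycoords='data',
        xytext=(eps_x1, eps_y_pred - svm_reg1.epsilon),
        textcoords='data', arrowprops={'arrowstyle': '<->', 'linewidth': 1.5}
    )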
plt.text(0.91, 5.6, r"$\epsilon$", fontsize=20)
plt.subplot(122)
plot_svm_regression(svm_reg2, X, y, [0, 2, 3, 11])
plt.title(r"$\epsilon = {}$".format(svm_reg2.epsilon), fontsize=18)
save_fig("svm_regression_plot")
plt.show()

As you can see, adding more training instances within the margin has little effect on the model's predictions, so the model is said to be ε-insensitive.
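The ε-insensitivity comes from the loss that SVM regression penalizes: errors smaller than ε cost nothing, and larger errors are penalized linearly in the amount by which they exceed ε. A minimal sketch of that loss, with made-up target and prediction values and an illustrative function name:

```python
import numpy as np

def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Zero inside the tube |y - y_hat| <= epsilon, linear outside it."""
    return np.maximum(0.0, np.abs(y_true - y_pred) - epsilon)

# Made-up values: absolute errors are 0.2, 1.0, 2.0 and 0.0
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.2, 1.0, 5.0, 4.0])

losses = epsilon_insensitive_loss(y_true, y_pred, epsilon=0.5)
print(losses)   # only the errors larger than 0.5 contribute: 0, 0.5, 1.5, 0
```

Instances inside the street contribute zero loss, which is why adding more of them leaves the fitted model essentially unchanged.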

To tackle nonlinear regression tasks, we use Scikit-Learn's SVR class; as with SVC, we supply a nonlinear kernel and regularize through C.

# Generate the data
np.random.seed(42)
m = 100
X = 2 * np.random.rand(m, 1) - 1
y = (0.2 + 0.1 * X + 0.5 * X**2 + np.random.randn(m, 1)/10).ravel()

# Train two models
from sklearn.svm import SVR

svm_poly_reg1 = SVR(kernel="poly", degree=2, C=100, epsilon=0.1)
svm_poly_reg2 = SVR(kernel="poly", degree=2, C=0.01, epsilon=0.1)
svm_poly_reg1.fit(X, y)
svm_poly_reg2.fit(X, y)

# Visualize the models
plt.figure(figsize=(9, 4))
plt.subplot(121)
plot_svm_regression(svm_poly_reg1, X, y, [-1, 1, 0, 1])
plt.title(r"$degree={}, C={}, \epsilon = {}$".format(svm_poly_reg1.degree, svm_poly_reg1.C, svm_poly_reg1.epsilon), fontsize=18)
plt.ylabel(r"$y$", fontsize=18, rotation=0)
plt.subplot(122)
plot_svm_regression(svm_poly_reg2, X, y, [-1, 1, 0, 1])
plt.title(r"$degree={}, C={}, \epsilon = {}$".format(svm_poly_reg2.degree, svm_poly_reg2.C, svm_poly_reg2.epsilon), fontsize=18)
save_fig("svm_with_polynomial_kernel_plot")
plt.show()

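The left plot has very little regularization (a large C value), while the right plot has much more (a small C value). Judging from the plots, the left model fits better. This is to be expected: the kernel is quadratic and the samples are points on a quadratic function plus random noise, so the model needs little regularization.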
