SVMs can perform linear or nonlinear classification, regression, and even outlier detection.

1. Linear SVM Classification

import numpy as np
import matplotlib.pyplot as plt
from sklearn.svm import SVC
from sklearn import datasets

iris = datasets.load_iris()
X = iris["data"][:, (2, 3)]  # petal length, petal width
y = iris["target"]

setosa_or_versicolor = (y == 0) | (y == 1)
X = X[setosa_or_versicolor]
y = y[setosa_or_versicolor]

# SVM Classifier model
svm_clf = SVC(kernel="linear", C=float("inf"))
svm_clf.fit(X, y)

# Bad models
x0 = np.linspace(0, 5.5, 200)
pred_1 = 5*x0 - 20
pred_2 = x0 - 1.8
pred_3 = 0.1 * x0 + 0.5

def plot_svc_decision_boundary(svm_clf, xmin, xmax):
    w = svm_clf.coef_[0]
    b = svm_clf.intercept_[0]

    # At the decision boundary, w0*x0 + w1*x1 + b = 0
    # => x1 = -w0/w1 * x0 - b/w1
    x0 = np.linspace(xmin, xmax, 200)
    decision_boundary = -w[0]/w[1] * x0 - b/w[1]

    margin = 1/w[1]
    gutter_up = decision_boundary + margin
    gutter_down = decision_boundary - margin

    svs = svm_clf.support_vectors_
    plt.scatter(svs[:, 0], svs[:, 1], s=180, facecolors='#FFAAAA')
    plt.plot(x0, decision_boundary, "k-", linewidth=2)
    plt.plot(x0, gutter_up, "k--", linewidth=2)
    plt.plot(x0, gutter_down, "k--", linewidth=2)

plt.figure(figsize=(12,2.7))

plt.subplot(121)
plt.plot(x0, pred_1, "g--", linewidth=2)
plt.plot(x0, pred_2, "m-", linewidth=2)
plt.plot(x0, pred_3, "r-", linewidth=2)
plt.plot(X[:, 0][y==1], X[:, 1][y==1], "bs", label="Iris-Versicolor")
plt.plot(X[:, 0][y==0], X[:, 1][y==0], "yo", label="Iris-Setosa")
plt.xlabel("Petal length", fontsize=14)
plt.ylabel("Petal width", fontsize=14)
plt.legend(loc="upper left", fontsize=14)
plt.axis([0, 5.5, 0, 2])

plt.subplot(122)
plot_svc_decision_boundary(svm_clf, 0, 5.5)
plt.plot(X[:, 0][y==1], X[:, 1][y==1], "bs")
plt.plot(X[:, 0][y==0], X[:, 1][y==0], "yo")
plt.xlabel("Petal length", fontsize=14)
plt.axis([0, 5.5, 0, 2])

plt.show()

[Figure: Iris dataset]

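The left plot shows the decision boundaries of three possible linear classifiers. The model whose boundary is drawn as a dashed line is so bad that it does not even separate the classes properly. The other two models work well on this dataset, but their decision boundaries come so close to the instances that they will probably not perform as well on new data. In contrast, the solid line in the right plot is the decision boundary of an SVM classifier: it not only separates the two classes but also stays as far away as possible from the closest training instances. You can think of an SVM classifier as fitting the widest possible street (bounded by the parallel dashed lines) between the classes. This is called large margin classification.
Note that adding more instances off the street does not affect the decision boundary, because the boundary is determined entirely by the instances located on the edge of the street. These instances are called support vectors (circled in the right plot).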
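
The claim that instances off the street do not move the boundary can be checked numerically. The following is a minimal sketch on made-up toy data (the points and the value C=1e6, used here to approximate a hard margin, are assumptions for illustration and not part of the Iris example):

```python
import numpy as np
from sklearn.svm import SVC

# Toy, linearly separable data (hypothetical values chosen for illustration)
X_toy = np.array([[1, 1], [1, 2], [2, 1], [4, 4], [4, 5], [5, 4]], dtype=float)
y_toy = np.array([0, 0, 0, 1, 1, 1])

# A very large C approximates a hard margin
clf = SVC(kernel="linear", C=1e6).fit(X_toy, y_toy)
w_before, b_before = clf.coef_[0].copy(), clf.intercept_[0]

# Add one instance per class, both well outside the street
X_more = np.vstack([X_toy, [[0, 0], [6, 6]]])
y_more = np.append(y_toy, [0, 1])
clf_more = SVC(kernel="linear", C=1e6).fit(X_more, y_more)
w_after, b_after = clf_more.coef_[0], clf_more.intercept_[0]

print("before:", w_before, b_before)
print("after: ", w_after, b_after)
print("support vectors:", clf_more.support_vectors_)
```

Up to solver tolerance, the weights and intercept should be the same in both fits, and the added points should not show up among the support vectors.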

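SVMs are sensitive to feature scales: when one feature spans a much larger range than another, the widest street is dictated mostly by the large-scale feature, so it is usually worth scaling the features (for example with StandardScaler) before training.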
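
As a rough illustration of this sensitivity, the sketch below fits the same linear SVC on raw and on standardized features. The four data points and C=100 are assumptions chosen only so that the two features have very different ranges:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical data: feature 0 spans about 1..5, feature 1 spans about 20..80
Xs = np.array([[1, 50], [5, 20], [3, 80], [5, 60]], dtype=float)
ys = np.array([0, 0, 1, 1])

# Fit on the raw features
raw_clf = SVC(kernel="linear", C=100).fit(Xs, ys)

# Fit on standardized features (zero mean, unit variance per column)
demo_scaler = StandardScaler()
Xs_scaled = demo_scaler.fit_transform(Xs)
scaled_clf = SVC(kernel="linear", C=100).fit(Xs_scaled, ys)

# The margin half-width is 1 / ||w||, measured in each model's own feature space
print("raw w:   ", raw_clf.coef_[0], "half-width:", 1 / np.linalg.norm(raw_clf.coef_[0]))
print("scaled w:", scaled_clf.coef_[0], "half-width:", 1 / np.linalg.norm(scaled_clf.coef_[0]))
```

Comparing the two weight vectors shows how much each feature contributes to the boundary before and after scaling.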

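Soft Margin Classification

If we strictly require that every instance be off the street and on the correct side, this is called hard margin classification. Hard margin classification has two problems: first, it only works if the data is linearly separable; second, it is sensitive to outliers. The figure below shows the iris dataset with just one additional outlier: in the left plot it is hard to find a hard margin at all, and such a model generalizes poorly.

[Figure: hard margin classification]

To avoid these problems, it is preferable to use a more flexible model. The goal is to find a good balance between keeping the street as wide as possible and limiting margin violations (that is, instances that end up in the middle of the street or even on the wrong side). This is called soft margin classification.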
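
One common way to quantify a margin violation is the slack max(0, 1 - t * (w.x + b)) with labels t in {-1, +1}. The text does not give this formula, so treat the sketch below as an assumption based on the standard soft margin formulation; the model and the four points are made up:

```python
import numpy as np

def margin_slack(X, t, w, b):
    """Slack per instance: max(0, 1 - t * (w.x + b)), with t in {-1, +1}."""
    return np.maximum(0.0, 1.0 - t * (X @ w + b))

# Hypothetical linear model and four instances
w_demo, b_demo = np.array([1.0, 0.0]), 0.0
X_demo = np.array([[2.0, 0.0], [0.5, 1.0], [-0.5, 0.0], [-3.0, 2.0]])
t_demo = np.array([1, 1, 1, -1])

slack = margin_slack(X_demo, t_demo, w_demo, b_demo)
off_street = slack == 0                   # correct side, outside the street
in_street = (slack > 0) & (slack <= 1)    # margin violation, still classified correctly
wrong_side = slack > 1                    # misclassified
print(slack, off_street, in_street, wrong_side)
```

A slack of zero means the instance respects the margin; anything above zero counts as a violation that the soft margin objective trades off against the street width.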

In Scikit-Learn's SVM classes, you control this balance with the C hyperparameter (the penalty coefficient): a smaller C leads to a wider street but more margin violations.

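import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Reload the data for this example: Iris-Virginica (1) versus the rest (0)
iris = datasets.load_iris()
X = iris["data"][:, (2, 3)]  # petal length, petal width
y = (iris["target"] == 2).astype(np.float64)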

scaler = StandardScaler()
svm_clf1 = LinearSVC(C=1, loss="hinge", random_state=42)
svm_clf2 = LinearSVC(C=100, loss="hinge", random_state=42)

scaled_svm_clf1 = Pipeline([
        ("scaler", scaler),
        ("linear_svc", svm_clf1),
    ])
scaled_svm_clf2 = Pipeline([
        ("scaler", scaler),
        ("linear_svc", svm_clf2),
    ])

scaled_svm_clf1.fit(X, y)
scaled_svm_clf2.fit(X, y)
# decision_function(X): Distance of the samples X to the separating hyperplane.
# Convert to unscaled parameters
b1 = svm_clf1.decision_function([-scaler.mean_ / scaler.scale_])
b2 = svm_clf2.decision_function([-scaler.mean_ / scaler.scale_])
w1 = svm_clf1.coef_[0] / scaler.scale_
w2 = svm_clf2.coef_[0] / scaler.scale_
svm_clf1.intercept_ = np.array([b1])
svm_clf2.intercept_ = np.array([b2])
svm_clf1.coef_ = np.array([w1])
svm_clf2.coef_ = np.array([w2])

# Find support vectors (LinearSVC does not do this automatically)
t = y * 2 - 1
support_vectors_idx1 = (t * (X.dot(w1) + b1) < 1).ravel()
support_vectors_idx2 = (t * (X.dot(w2) + b2) < 1).ravel()
svm_clf1.support_vectors_ = X[support_vectors_idx1]
svm_clf2.support_vectors_ = X[support_vectors_idx2]

plt.figure(figsize=(12,3.2))
plt.subplot(121)
plt.plot(X[:, 0][y==1], X[:, 1][y==1], "g^", label="Iris-Virginica")
plt.plot(X[:, 0][y==0], X[:, 1][y==0], "bs", label="Iris-Versicolor")
plot_svc_decision_boundary(svm_clf1, 4, 6)
plt.xlabel("Petal length", fontsize=14)
plt.ylabel("Petal width", fontsize=14)
plt.legend(loc="upper left", fontsize=14)
plt.title("$C = {}$".format(svm_clf1.C), fontsize=16)
plt.axis([4, 6, 0.8, 2.8])

plt.subplot(122)
plt.plot(X[:, 0][y==1], X[:, 1][y==1], "g^")
plt.plot(X[:, 0][y==0], X[:, 1][y==0], "bs")
plot_svc_decision_boundary(svm_clf2, 4, 6)
plt.xlabel("Petal length", fontsize=14)
plt.title("$C = {}$".format(svm_clf2.C), fontsize=16)
plt.axis([4, 6, 0.8, 2.8])

plt.show()

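[Figure: soft margin classification]

Alternatively, you could use SVC(kernel="linear", C=1), but it is much slower, especially on large training sets, so it is generally not recommended. Another option is the SGDClassifier class, as in SGDClassifier(loss="hinge", alpha=1/(m*C)), where m is the number of training instances. This applies stochastic gradient descent to train a linear SVM classifier. It does not converge as fast as LinearSVC, but it is useful for datasets that do not fit in memory and for online classification tasks.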
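
The three options can be put side by side as follows. This is a sketch under assumptions: C=5, the max_iter and tol values, and the variable names are chosen here for illustration, and the data is the Iris-Virginica-versus-rest task on petal length and width:

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import SGDClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC, LinearSVC

iris_data = datasets.load_iris()
X_iris = iris_data["data"][:, (2, 3)]                 # petal length, petal width
y_iris = (iris_data["target"] == 2).astype(np.int64)  # Iris-Virginica vs. the rest
X_iris_scaled = StandardScaler().fit_transform(X_iris)

C = 5                      # assumed value for illustration
m = len(X_iris_scaled)     # number of training instances

models = {
    "LinearSVC": LinearSVC(C=C, loss="hinge", max_iter=10000, random_state=42),
    "SVC": SVC(kernel="linear", C=C),
    "SGDClassifier": SGDClassifier(loss="hinge", alpha=1 / (m * C),
                                   max_iter=1000, tol=1e-3, random_state=42),
}

scores = {}
for name, model in models.items():
    model.fit(X_iris_scaled, y_iris)
    scores[name] = model.score(X_iris_scaled, y_iris)
    print(name, model.coef_[0], model.intercept_, scores[name])
```

The three models optimize closely related objectives, so their weights should be in the same ballpark, though not identical.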

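LinearSVC regularizes the bias term, so you should first center the training set by subtracting its mean. This is handled automatically if you use StandardScaler. Also, make sure you set the loss hyperparameter to "hinge", since it is not the default value. Finally, for better performance you should set the dual hyperparameter to False, unless there are more features than training instances. (Note that LinearSVC only accepts dual=False together with its default loss="squared_hinge"; with loss="hinge" it requires the dual formulation.)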

2. Nonlinear SVM Classification

[Figure: linearly non-separable vs. linearly separable data]

from sklearn.datasets import make_moons
X, y = make_moons(n_samples=100, noise=0.15, random_state=42)
'''
A simple toy dataset to visualize clustering and classification
algorithms. Read more in the :ref:`User Guide <sample_generators>`.

Parameters
----------
n_samples : int, optional (default=100)
    The total number of points generated.

shuffle : bool, optional (default=True)
    Whether to shuffle the samples.

noise : double or None (default=None)
    Standard deviation of Gaussian noise added to the data.

random_state : int, RandomState instance or None (default)
    Determines random number generation for dataset shuffling and noise.
    Pass an int for reproducible output across multiple function calls.
    See :term:`Glossary <random_state>`.

Returns
-------
X : array of shape [n_samples, 2]
    The generated samples.

y : array of shape [n_samples]
    The integer labels (0 or 1) for class membership of each sample.
'''
def plot_dataset(X, y, axes):
    plt.plot(X[:, 0][y==0], X[:, 1][y==0], "bs")
    plt.plot(X[:, 0][y==1], X[:, 1][y==1], "g^")
    plt.axis(axes)
    plt.grid(True, which='both')
    plt.xlabel(r"$x_1$", fontsize=20)
    plt.ylabel(r"$x_2$", fontsize=20, rotation=0)

plot_dataset(X, y, [-1.5, 2.5, -1, 1.5])
plt.show()

[Figure: moons dataset]

from sklearn.datasets import make_moons
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures

polynomial_svm_clf = Pipeline([
        ("poly_features", PolynomialFeatures(degree=3)),
        ("scaler", StandardScaler()),
        ("svm_clf", LinearSVC(C=10, loss="hinge", random_state=42))
    ])

polynomial_svm_clf.fit(X, y)

def plot_predictions(clf, axes):
    x0s = np.linspace(axes[0], axes[1], 100)
    x1s = np.linspace(axes[2], axes[3], 100)
    x0, x1 = np.meshgrid(x0s, x1s)
    X = np.c_[x0.ravel(), x1.ravel()]
    y_pred = clf.predict(X).reshape(x0.shape)
    y_decision = clf.decision_function(X).reshape(x0.shape)
    plt.contourf(x0, x1, y_pred, cmap=plt.cm.brg, alpha=0.2)
    plt.contourf(x0, x1, y_decision, cmap=plt.cm.brg, alpha=0.1)

plot_predictions(polynomial_svm_clf, [-1.5, 2.5, -1, 1.5])
plot_dataset(X, y, [-1.5, 2.5, -1, 1.5])

plt.show()


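Polynomial Kernel

Adding polynomial features is simple to implement and works well with all sorts of machine learning algorithms, not just SVMs. However, a low polynomial degree cannot deal with very complex datasets, while a high polynomial degree creates a huge number of features, which makes the model slow.
With SVMs you can apply a mathematical technique called the kernel trick. It gives the same result as if you had added many polynomial features, even of very high degree, but without the combinatorial explosion in the number of features, because no features are actually added.
A projection (mapping) is simply a function: z = f(x, y) maps x and y to z. The kernel function is tied to the specific mapping being used: it computes the dot product of two mapped vectors directly from the original vectors, without ever computing the mapping itself.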
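
A small numeric check makes the kernel trick concrete. The sketch below uses the second-degree polynomial kernel K(a, b) = (a . b)^2 on 2D vectors; the helper names and the two example vectors are made up for illustration:

```python
import math

def phi(x):
    """Explicit second-degree polynomial mapping of a 2D vector."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))

def poly2_kernel(a, b):
    """Second-degree polynomial kernel: K(a, b) = (a . b)^2."""
    return dot(a, b) ** 2

a_vec, b_vec = (2.0, 3.0), (4.0, -1.0)
explicit = dot(phi(a_vec), phi(b_vec))   # map both vectors first, then take the dot product
implicit = poly2_kernel(a_vec, b_vec)    # works on the original vectors only
print(explicit, implicit)
```

Both values agree (25 for these vectors, up to floating-point rounding), even though the kernel never builds the three-dimensional mapped vectors.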

from sklearn.svm import SVC

# coef0 controls how much the model is influenced by high-degree versus low-degree polynomial terms.
poly_kernel_svm_clf = Pipeline([
        ("scaler", StandardScaler()),
        ("svm_clf", SVC(kernel="poly", degree=3, coef0=1, C=5))
    ])
poly_kernel_svm_clf.fit(X, y)

poly100_kernel_svm_clf = Pipeline([
        ("scaler", StandardScaler()),
        ("svm_clf", SVC(kernel="poly", degree=10, coef0=100, C=5))
    ])
poly100_kernel_svm_clf.fit(X, y)

plt.figure(figsize=(11, 4))

plt.subplot(121)
plot_predictions(poly_kernel_svm_clf, [-1.5, 2.5, -1, 1.5])
plot_dataset(X, y, [-1.5, 2.5, -1, 1.5])
plt.title(r"$d=3, r=1, C=5$", fontsize=18)

plt.subplot(122)
plot_predictions(poly100_kernel_svm_clf, [-1.5, 2.5, -1, 1.5])
plot_dataset(X, y, [-1.5, 2.5, -1, 1.5])
plt.title(r"$d=10, r=100, C=5$", fontsize=18)

plt.show()

[Figure: SVM classifiers with a polynomial kernel]

Adding Similarity Features

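Another technique for tackling nonlinear problems is to add features computed with a similarity function, which measures how much each instance resembles a particular landmark.
For example, take the one-dimensional dataset discussed earlier and add two landmarks to it, at x1 = -2 and x1 = 1. Next, define the similarity function to be the Gaussian Radial Basis Function (RBF) with γ = 0.3.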

Gaussian RBF: φγ(x, ℓ) = exp(-γ × ‖x - ℓ‖^2)

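It is a bell-shaped function that varies from 0 (very far from the landmark) to 1 (at the landmark). Now we are ready to compute the new features. For example, look at the instance x1 = -1: it lies at a distance of 1 from the first landmark and 2 from the second. Its new features are therefore x2 = exp(-0.3 × 1^2) ≈ 0.74 and x3 = exp(-0.3 × 2^2) ≈ 0.30.
The plot on the right shows the transformed dataset (with the original feature dropped). As you can see, it is now linearly separable.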
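
The worked numbers above can be reproduced in a few lines. The function name gaussian_rbf is a hypothetical helper; the landmarks, γ and the instance are the ones from the text:

```python
import math

def gaussian_rbf(x, landmark, gamma):
    """Gaussian RBF similarity between a 1D point and a landmark."""
    return math.exp(-gamma * (x - landmark) ** 2)

rbf_gamma = 0.3
landmarks_1d = [-2.0, 1.0]
x1 = -1.0

# One new feature per landmark
x2, x3 = (gaussian_rbf(x1, lm, rbf_gamma) for lm in landmarks_1d)
print(round(x2, 2), round(x3, 2))  # 0.74 0.3
```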

[Figure: similarity features using the Gaussian RBF]

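In effect, gamma controls how the data is scaled once it has been projected into the higher-dimensional space: a larger gamma makes the bell curve narrower, so each instance's influence is more local, while a smaller gamma makes it wider.
This scaling in turn affects the separating hyperplane that is found (different loss functions penalize distances differently). It is also one of the reasons SVMs are sensitive to data scaling and normalization, since in the end a linear model is what gets fitted. This is why some people suggest a smaller gamma when the raw data is spread out and a larger gamma when it is densely packed. This is not a hard rule, of course, which is why we run a grid search.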
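
A grid search over gamma and C could look like the following sketch. The grid values and cv=5 are assumptions for illustration; the dataset call mirrors the make_moons call used earlier:

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X_moons, y_moons = make_moons(n_samples=100, noise=0.15, random_state=42)

rbf_pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("svm_clf", SVC(kernel="rbf")),
])

# Hypothetical grid; pipeline parameters are addressed as <step name>__<parameter>
param_grid = {
    "svm_clf__gamma": [0.01, 0.1, 1, 5],
    "svm_clf__C": [0.1, 1, 10, 100],
}

search = GridSearchCV(rbf_pipeline, param_grid, cv=5)
search.fit(X_moons, y_moons)
print(search.best_params_, search.best_score_)
```

Keeping the scaler inside the pipeline means it is refit on each training fold, so the cross-validation scores are not contaminated by the held-out fold.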

How do you select the landmarks? The simplest approach is to create a landmark at the location of each and every instance in the dataset.
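
To see what this choice implies, the sketch below builds the similarity features explicitly with every instance used as a landmark. The random data and the helper name rbf_features are assumptions for illustration:

```python
import numpy as np

def rbf_features(X, landmarks, gamma):
    """Matrix whose (i, j) entry is exp(-gamma * ||X[i] - landmarks[j]||^2)."""
    diff = X[:, np.newaxis, :] - landmarks[np.newaxis, :, :]
    sq_dist = np.sum(diff ** 2, axis=2)
    return np.exp(-gamma * sq_dist)

rng = np.random.default_rng(42)
X_small = rng.normal(size=(6, 2))   # hypothetical data: m = 6 instances, n = 2 features

# Every instance doubles as a landmark, so m instances yield m new features
F = rbf_features(X_small, X_small, gamma=0.3)
print(F.shape)  # (6, 6)
```

Each instance has similarity 1 with itself, so the diagonal is all ones, and a training set with m instances and n features turns into one with m instances and m features, which gets expensive for large m.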

Gaussian RBF Kernel

rbf_kernel_svm_clf = Pipeline([
        ("scaler", StandardScaler()),
        ("svm_clf", SVC(kernel="rbf", gamma=5, C=0.001))
    ])
rbf_kernel_svm_clf.fit(X, y)

from sklearn.svm import SVC

gamma1, gamma2 = 0.1, 5
C1, C2 = 0.001, 1000
hyperparams = (gamma1, C1), (gamma1, C2), (gamma2, C1), (gamma2, C2)

svm_clfs = []
for gamma, C in hyperparams:
    rbf_kernel_svm_clf = Pipeline([
            ("scaler", StandardScaler()),
            ("svm_clf", SVC(kernel="rbf", gamma=gamma, C=C))
        ])
    rbf_kernel_svm_clf.fit(X, y)
    svm_clfs.append(rbf_kernel_svm_clf)

plt.figure(figsize=(11, 7))

for i, svm_clf in enumerate(svm_clfs):
    plt.subplot(221 + i)
    plot_predictions(svm_clf, [-1.5, 2.5, -1, 1.5])
    plot_dataset(X, y, [-1.5, 2.5, -1, 1.5])
    gamma, C = hyperparams[i]
    plt.title(r"$\gamma = {}, C = {}$".format(gamma, C), fontsize=16)

plt.show()

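[Figure: SVM classifiers using an RBF kernel with different values of gamma and C]

Computational Complexity

[Figure: computational complexity comparison]

3. SVM Regression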

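The SVM algorithm is versatile: it supports not only linear and nonlinear classification but also linear and nonlinear regression. The trick is to reverse the objective: instead of trying to fit the widest possible street (the margin) between two classes while limiting margin violations, SVM regression tries to fit as many instances as possible on the street while limiting margin violations. The width of the street is controlled by the hyperparameter ε.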
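
One way to read "on the street" is through the ε-insensitive loss, which is zero for any prediction within ε of the target. The text does not state this formula, so the sketch below is an assumption based on the standard SVM regression formulation, with made-up numbers:

```python
import numpy as np

def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Zero inside the epsilon tube, linear in the excess error outside it."""
    return np.maximum(0.0, np.abs(y_true - y_pred) - epsilon)

# Hypothetical targets and predictions
y_true_demo = np.array([1.0, 2.0, 3.0])
y_pred_demo = np.array([1.2, 1.0, 5.0])

losses = epsilon_insensitive_loss(y_true_demo, y_pred_demo, epsilon=0.5)
print(losses)  # [0.  0.5 1.5]
```

The first prediction falls inside the street and costs nothing; the other two are penalized only for the part of the error that exceeds ε.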

np.random.seed(42)
m = 50
X = 2 * np.random.rand(m, 1)
y = (4 + 3 * X + np.random.randn(m, 1)).ravel()

from sklearn.svm import LinearSVR

svm_reg1 = LinearSVR(epsilon=1.5, random_state=42)
svm_reg2 = LinearSVR(epsilon=0.5, random_state=42)
svm_reg1.fit(X, y)
svm_reg2.fit(X, y)

def find_support_vectors(svm_reg, X, y):
    y_pred = svm_reg.predict(X)
    off_margin = (np.abs(y - y_pred) >= svm_reg.epsilon)
    return np.argwhere(off_margin)

svm_reg1.support_ = find_support_vectors(svm_reg1, X, y)
svm_reg2.support_ = find_support_vectors(svm_reg2, X, y)

def plot_svm_regression(svm_reg, X, y, axes):
    x1s = np.linspace(axes[0], axes[1], 100).reshape(100, 1)
    y_pred = svm_reg.predict(x1s)
    plt.plot(x1s, y_pred, "k-", linewidth=2, label=r"$\hat{y}$")
    plt.plot(x1s, y_pred + svm_reg.epsilon, "k--")
    plt.plot(x1s, y_pred - svm_reg.epsilon, "k--")
    plt.scatter(X[svm_reg.support_], y[svm_reg.support_], s=180, facecolors='#FFAAAA')
    plt.plot(X, y, "bo")
    plt.xlabel(r"$x_1$", fontsize=18)
    plt.legend(loc="upper left", fontsize=18)
    plt.axis(axes)

plt.figure(figsize=(9, 4))
plt.subplot(121)
plot_svm_regression(svm_reg1, X, y, [0, 2, 3, 11])
plt.title(r"$\epsilon = {}$".format(svm_reg1.epsilon), fontsize=18)
plt.ylabel(r"$y$", fontsize=18, rotation=0)
# Anchor point for the epsilon annotation arrow
eps_x1 = 1
eps_y_pred = svm_reg1.predict([[eps_x1]])[0]
plt.annotate(
        '', xy=(eps_x1, eps_y_pred), xycoords='data',
        xytext=(eps_x1, eps_y_pred - svm_reg1.epsilon),
        textcoords='data', arrowprops={'arrowstyle': '<->', 'linewidth': 1.5}
    )
plt.text(0.91, 5.6, r"$\epsilon$", fontsize=20)
plt.subplot(122)
plot_svm_regression(svm_reg2, X, y, [0, 2, 3, 11])
plt.title(r"$\epsilon = {}$".format(svm_reg2.epsilon), fontsize=18)
plt.show()

[Figure: how SVM regression works]

np.random.seed(42)
m = 100
X = 2 * np.random.rand(m, 1) - 1
y = (0.2 + 0.1 * X + 0.5 * X**2 + np.random.randn(m, 1)/10).ravel()

from sklearn.svm import SVR

svm_poly_reg1 = SVR(kernel="poly", degree=2, C=100, epsilon=0.1)
svm_poly_reg2 = SVR(kernel="poly", degree=2, C=0.01, epsilon=0.1)
svm_poly_reg1.fit(X, y)
svm_poly_reg2.fit(X, y)

plt.figure(figsize=(9, 4))
plt.subplot(121)
plot_svm_regression(svm_poly_reg1, X, y, [-1, 1, 0, 1])
plt.title(r"$degree={}, C={}, \epsilon = {}$".format(svm_poly_reg1.degree, svm_poly_reg1.C, svm_poly_reg1.epsilon), fontsize=18)
plt.ylabel(r"$y$", fontsize=18, rotation=0)
plt.subplot(122)
plot_svm_regression(svm_poly_reg2, X, y, [-1, 1, 0, 1])
plt.title(r"$degree={}, C={}, \epsilon = {}$".format(svm_poly_reg2.degree, svm_poly_reg2.C, svm_poly_reg2.epsilon), fontsize=18)

plt.show()

[Figure: SVM regression using a second-degree polynomial kernel]
