Introduction

This walkthrough uses the Wisconsin breast cancer dataset and models it with Gaussian naive Bayes. The dataset contains 569 case samples, each with 30 feature values, and the samples fall into two classes: malignant and benign.
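As a quick check of the figures above, the following sketch loads the dataset through scikit-learn's `load_breast_cancer` and prints its shape and class balance. The names `n_samples`, `n_features`, and `class_counts` are illustrative and not part of the original walkthrough.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

cancer = load_breast_cancer()
X, y = cancer.data, cancer.target

n_samples, n_features = X.shape

# np.bincount counts how many samples carry each integer label (0 and 1);
# target_names maps those labels to 'malignant' and 'benign'
class_counts = {
    str(name): int(count)
    for name, count in zip(cancer.target_names, np.bincount(y))
}

print("samples:", n_samples)
print("features:", n_features)
print("class counts:", class_counts)
```

The two classes are not evenly balanced, which is worth keeping in mind when reading the accuracy scores later on.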

Modeling

import numpy as np
from sklearn.datasets import load_breast_cancer  # Wisconsin breast cancer dataset
from sklearn.naive_bayes import GaussianNB  # Gaussian naive Bayes
from sklearn.model_selection import train_test_split  # dataset splitting utility
from sklearn.model_selection import learning_curve  # learning curve utility
from sklearn.model_selection import ShuffleSplit  # random split utility
import matplotlib.pyplot as plt

# Load the data
cancer = load_breast_cancer()

# Assign the feature values and the class labels to X and y
X, y = cancer.data, cancer.target

# Split the data into a training set and a test set
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=38)

gnb = GaussianNB()
gnb.fit(X_train, y_train)

print('Results:')
print('==========================')
print("Training set score: {:.2f}".format(gnb.score(X_train, y_train)))
print("Test set score: {:.2f}".format(gnb.score(X_test, y_test)))

# Define a function that plots the learning curve
def plot_learning_curve(estimator, title, X, y, ylim=None, cv=None, n_jobs=1, train_sizes=np.linspace(.1, 1.0, 5)):
    plt.figure()
    plt.title(title)
    if ylim is not None:
        plt.ylim(*ylim)
    plt.xlabel("Training examples")
    plt.ylabel("Score")
    train_sizes, train_scores, test_scores = learning_curve(
        estimator, X, y, cv=cv, n_jobs=n_jobs, train_sizes=train_sizes)
    # Average the scores over all cross-validation splits for each training size
    train_scores_mean = np.mean(train_scores, axis=1)
    test_scores_mean = np.mean(test_scores, axis=1)
    plt.grid()
    plt.plot(train_sizes, train_scores_mean, 'o-', color="r",
             label="Training score")
    plt.plot(train_sizes, test_scores_mean, 'o-', color="g",
             label="Cross-validation score")

    plt.legend(loc="lower right")
    return plt

title = "Learning Curves (Naive Bayes)"
# Set the number of random splits
cv = ShuffleSplit(n_splits=100, test_size=0.2, random_state=0)
estimator = GaussianNB()

plot_learning_curve(estimator, title, X, y, ylim=(0.9, 1.01), cv=cv, n_jobs=4)

plt.show()

The output is as follows:

Results:
==========================
Training set score: 0.95
Test set score: 0.94
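For a classifier, scikit-learn's `score` method returns the mean accuracy, that is, the fraction of samples whose predicted label matches the true label. The sketch below recomputes the test score by hand with the same split as above; `manual_accuracy` and `builtin_score` are illustrative names.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

cancer = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, random_state=38)

gnb = GaussianNB().fit(X_train, y_train)

# Fraction of test samples whose predicted class equals the true class
manual_accuracy = np.mean(gnb.predict(X_test) == y_test)
builtin_score = gnb.score(X_test, y_test)

print("manual accuracy: {:.2f}".format(manual_accuracy))
print("gnb.score:       {:.2f}".format(builtin_score))
```

Both lines should print the same value, since they compute the same quantity.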

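The learning curve shows the following:
On the training set, the score gradually drops as the number of samples grows, because the model has more data to fit and fitting all of it becomes harder.
The cross-validation score changes little as the number of samples grows, which suggests that Gaussian naive Bayes is not demanding about sample size when it comes to prediction. When only a small number of samples is available, Gaussian naive Bayes is worth considering.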

Figure: learning curve for Gaussian naive Bayes (5.4Breast_cancer.png)
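To read the same trend without the plot, the mean scores at each training-set size can be printed directly. This is a minimal sketch, assuming 20 random splits instead of the 100 used above purely to shorten the run; `train_mean` and `test_mean` are illustrative names.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import ShuffleSplit, learning_curve
from sklearn.naive_bayes import GaussianNB

X, y = load_breast_cancer(return_X_y=True)

# Assumption: 20 random splits instead of 100, only to keep the run short
cv = ShuffleSplit(n_splits=20, test_size=0.2, random_state=0)
train_sizes, train_scores, test_scores = learning_curve(
    GaussianNB(), X, y, cv=cv, train_sizes=np.linspace(0.1, 1.0, 5))

# Each row holds the scores of all splits for one training-set size
train_mean = train_scores.mean(axis=1)
test_mean = test_scores.mean(axis=1)

for size, tr, te in zip(train_sizes, train_mean, test_mean):
    print("n={:4d}  train={:.3f}  cv={:.3f}".format(int(size), tr, te))
```

Comparing the `train` and `cv` columns row by row gives the two curves in numeric form, so the observations above can be checked against actual values.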

Summary

Gaussian naive Bayes can be applied to datasets with continuous numeric features; if the features follow a normal distribution, the model tends to score higher.
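One way to probe the normality claim is to compare the cross-validated accuracy on the raw features with the accuracy after mapping each feature to an approximately normal shape. The sketch below is not part of the original walkthrough: the choice of `QuantileTransformer`, the 5-fold setup, and the names `raw_model` and `gaussianized_model` are all assumptions made for illustration. Note that the transform reshapes each feature's overall distribution, whereas the model assumes normality within each class, so any gain is not guaranteed.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import QuantileTransformer

X, y = load_breast_cancer(return_X_y=True)

raw_model = GaussianNB()

# Assumption: a quantile transform is one of several ways to make each
# feature look Gaussian; it is fitted inside the pipeline so that every
# cross-validation fold learns the mapping from its own training part only
gaussianized_model = make_pipeline(
    QuantileTransformer(n_quantiles=100, output_distribution="normal",
                        random_state=0),
    GaussianNB())

raw_scores = cross_val_score(raw_model, X, y, cv=5)
gaussianized_scores = cross_val_score(gaussianized_model, X, y, cv=5)

print("raw features:         {:.3f}".format(raw_scores.mean()))
print("gaussianized features: {:.3f}".format(gaussianized_scores.mean()))
```

Whether the second score comes out higher depends on the data, so the comparison is best read as an experiment rather than a rule.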
