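1. Types of error

Overfitting: the model is too complex and memorizes the training data, so it scores well on the training set but poorly on new data.
Underfitting: the model is too simple to capture the pattern, so it scores poorly on both the training set and new data.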

2. Model complexity graph
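
A model complexity graph plots the training score and the validation score against a parameter that controls model complexity. The following is a minimal sketch, assuming a synthetic make_moons dataset and a decision tree whose max_depth is the complexity parameter; both are illustrative choices, not taken from the original notes.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.model_selection import validation_curve
from sklearn.tree import DecisionTreeClassifier

# Synthetic two-class data (an assumption made for this sketch).
X, y = make_moons(n_samples=300, noise=0.3, random_state=0)

# Complexity parameter: the maximum depth of the tree.
depths = np.arange(1, 11)
train_scores, val_scores = validation_curve(
    DecisionTreeClassifier(random_state=0), X, y,
    param_name="max_depth", param_range=depths, cv=5,
)

# Each row holds the scores of the 5 folds for one depth; average them.
mean_train = train_scores.mean(axis=1)
mean_val = val_scores.mean(axis=1)

for depth, tr, va in zip(depths, mean_train, mean_val):
    print(f"max_depth={depth:2d}  train={tr:.3f}  validation={va:.3f}")
```

Reading the printed table from top to bottom is the same as reading the graph from left to right: a low score on both columns suggests underfitting, and a training score that keeps rising while the validation score stalls or drops suggests overfitting.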

3. Cross-validation set
Used to select the model.
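
One common way to obtain a cross-validation set is to split the data twice. This is a minimal sketch, assuming a synthetic dataset of 100 samples and a 60/20/20 split; the sizes and variable names are illustrative, not from the original notes.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# 100 synthetic samples (an assumption made for this sketch).
X, y = make_classification(n_samples=100, random_state=0)

# Hold out 20% of the data as the final test set.
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Split the remaining 80 samples again: 25% of them (20 samples)
# become the cross-validation set used to choose between models.
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.25, random_state=0
)

print(len(X_train), len(X_val), len(X_test))  # 60 20 20
```

Candidate models are trained on the training set and compared on the cross-validation set; the test set is touched only once, to evaluate the model that was finally chosen.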

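4. K-fold cross-validation

A very useful way to recycle the data.
In K-fold cross-validation, the data is split into K buckets.

In the illustrated example K = 4, so the model is trained K times.

Each time a different bucket serves as the test set and the remaining buckets form the training set. The K results are then averaged to give the final result for the model.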

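import numpy as np
from sklearn.model_selection import KFold

X = np.arange(12)  # 12 placeholder samples
kf = KFold(n_splits=4, shuffle=True)  # 4 folds, so each test set holds 3 of the 12 samples; shuffle=True randomizes the split
for train_indices, test_indices in kf.split(X):
    print(train_indices, test_indices)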

We recommend shuffling the data to remove any possible bias.
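
The averaging step can be sketched with cross_val_score, which trains and scores the model once per fold. This assumes a synthetic dataset and a logistic regression model, neither of which comes from the original notes.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

# Synthetic data (an assumption made for this sketch).
X, y = make_classification(n_samples=120, random_state=0)

# K = 4 shuffled folds; random_state makes the shuffle reproducible.
kf = KFold(n_splits=4, shuffle=True, random_state=0)

# One accuracy score per fold: train on 3 buckets, test on the 4th.
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=kf)

print("scores per fold:", scores)
print("average score:", scores.mean())
```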

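Learning curves

Detecting overfitting and underfitting with learning curves
We will train three models on a circular dataset:

Decision tree model
Logistic regression model
Support vector machine model

One of the models overfits, one underfits, and one is just right. First we write code to draw the learning curve of each model; then we look at the curves and decide which model each one belongs to.
To begin, recall what the learning curves of the three kinds of model look like. For an underfitting model, the training and cross-validation curves converge to a poor score. For a good model, they converge to a good score. For an overfitting model, they do not converge: a gap remains between a high training score and a lower cross-validation score.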
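
The plotting code itself is not reproduced in these notes. As a minimal sketch of the same idea, the block below computes learning-curve data for the three model types with sklearn's learning_curve. The make_circles dataset, the SVM's gamma value, and the number of training sizes are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn.datasets import make_circles
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Circular two-class data (an assumption; the original dataset is not given).
X, y = make_circles(n_samples=400, noise=0.15, factor=0.5, random_state=0)

models = {
    "Decision tree": DecisionTreeClassifier(random_state=0),
    "Logistic regression": LogisticRegression(),
    "SVM (rbf, gamma=1000)": SVC(kernel="rbf", gamma=1000),
}

curves = {}
for name, model in models.items():
    # Scores at 5 increasing training-set sizes, each averaged over 5 folds.
    sizes, train_scores, val_scores = learning_curve(
        model, X, y, cv=5, train_sizes=np.linspace(0.1, 1.0, 5),
        shuffle=True, random_state=0,
    )
    curves[name] = (sizes, train_scores.mean(axis=1), val_scores.mean(axis=1))

for name, (sizes, train_mean, val_mean) in curves.items():
    print(name)
    for n, tr, va in zip(sizes, train_mean, val_mean):
        print(f"  n={n:3d}  train={tr:.3f}  validation={va:.3f}")
```

Plotting train_mean and val_mean against sizes (for example with matplotlib) gives the learning curve of each model, which can then be compared with the three shapes described above.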

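Grid search

Grid search in sklearn
Grid search in sklearn is very simple. We will illustrate it with an example. Suppose we want to train a support vector machine and need to decide between the following parameters:

kernel: poly or rbf.
C: 0.1, 1, or 10.
The steps are as follows:

1. Import GridSearchCV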

from sklearn.model_selection import GridSearchCV

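2. Select the parameters
Now we pick the parameters we want to choose between and put them in a dictionary. In this dictionary the keys are the parameter names and the values are lists of the possible values for each parameter.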

parameters = {'kernel':['poly', 'rbf'],'C':[0.1, 1, 10]}

3. Create a scorer
We need to decide which metric will be used to score each candidate model. Here we use the F1 score.

from sklearn.metrics import make_scorer
from sklearn.metrics import f1_score
scorer = make_scorer(f1_score)

4. Create a GridSearch object with the parameters and the scorer, then use this object to fit the data.

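# Create the classifier to tune (a support vector machine, as in this example).
from sklearn.svm import SVC
clf = SVC()
# Create the object.
grid_obj = GridSearchCV(clf, parameters, scoring=scorer)
# Fit the data
grid_fit = grid_obj.fit(X, y)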

5. Get the best estimator

best_clf = grid_fit.best_estimator_
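
The five steps can be put together into one runnable script. This is a minimal sketch, assuming a synthetic dataset from make_classification in place of the X and y used above and an explicit cv=5; it is not the original course code.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import f1_score, make_scorer
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic binary classification data (an assumption made for this sketch).
X, y = make_classification(n_samples=200, random_state=0)

# Steps 2 and 3: the parameter grid and the F1 scorer.
parameters = {"kernel": ["poly", "rbf"], "C": [0.1, 1, 10]}
scorer = make_scorer(f1_score)

# Step 4: try all 2 x 3 = 6 combinations with 5-fold cross-validation.
grid_obj = GridSearchCV(SVC(), parameters, scoring=scorer, cv=5)
grid_fit = grid_obj.fit(X, y)

# Step 5: the best estimator, plus the parameters and score that selected it.
best_clf = grid_fit.best_estimator_
print("best parameters:", grid_fit.best_params_)
print("best cross-validated F1:", grid_fit.best_score_)
print("combinations tried:", len(grid_fit.cv_results_["params"]))
```

Because GridSearchCV refits the best combination on the full data by default, best_clf is ready to call predict on without any further fitting.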

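Example

Using grid search to improve a model
1. First, define some parameters to run the grid search over. We suggest max_depth, min_samples_leaf, and min_samples_split.

2. Make a scorer for the model using f1_score.

3. Run the grid search on the classifier using the parameters and the scorer.

4. Fit the data to the new classifier.

5. Plot the model and compute the f1_score.

6. If the model is not good enough, try changing the parameter ranges and fit again.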

from sklearn.metrics import f1_score, make_scorer
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

clf = DecisionTreeClassifier(random_state=42)

# TODO: Create the parameters list you wish to tune.
parameters = {'max_depth':[2,4,6,8,10],'min_samples_leaf':[2,4,6,8,10], 'min_samples_split':[2,4,6,8,10]}

# TODO: Make an f1_score scoring object.
scorer = make_scorer(f1_score)

# TODO: Perform grid search on the classifier using 'scorer' as the scoring method.
grid_obj = GridSearchCV(clf, parameters, scoring=scorer)

# TODO: Fit the grid search object to the training data and find the optimal parameters.
grid_fit = grid_obj.fit(X_train, y_train)

# Get the estimator.
best_clf = grid_fit.best_estimator_

# Fit the new model.
best_clf.fit(X_train, y_train)

# Make predictions using the new model.
best_train_predictions = best_clf.predict(X_train)
best_test_predictions = best_clf.predict(X_test)

# Calculate the f1_score of the new model.
print('The training F1 Score is', f1_score(y_train, best_train_predictions))
print('The testing F1 Score is', f1_score(y_test, best_test_predictions))

# Plot the new model (plot_model is a plotting helper that is not defined in this text).
plot_model(X, y, best_clf)

# Let's also explore what parameters ended up being used in the new model.
print(best_clf)
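
To see what the tuning changes, the example can be run end to end on stand-in data and compared with an untuned tree. This is a minimal sketch, assuming a synthetic dataset from make_classification and a hypothetical helper named report; the course's own data and its plotting helper are not reproduced here.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import f1_score, make_scorer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Stand-in data (an assumption made for this sketch).
X, y = make_classification(n_samples=400, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

def report(name, model):
    """Hypothetical helper: print and return the train and test F1 scores."""
    train_f1 = f1_score(y_train, model.predict(X_train))
    test_f1 = f1_score(y_test, model.predict(X_test))
    print(f"{name}: train F1={train_f1:.3f}, test F1={test_f1:.3f}")
    return train_f1, test_f1

# Baseline: a tree with default settings grows until it fits the training set.
baseline = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)
baseline_scores = report("default tree", baseline)

# Tuned: the same grid as in the example above.
parameters = {
    "max_depth": [2, 4, 6, 8, 10],
    "min_samples_leaf": [2, 4, 6, 8, 10],
    "min_samples_split": [2, 4, 6, 8, 10],
}
grid_fit = GridSearchCV(
    DecisionTreeClassifier(random_state=42), parameters,
    scoring=make_scorer(f1_score),
).fit(X_train, y_train)
tuned_scores = report("tuned tree", grid_fit.best_estimator_)

print("chosen parameters:", grid_fit.best_params_)
```

A large gap between the training and testing F1 scores of the default tree is the sign of overfitting that the grid search is meant to reduce; whether the tuned tree actually scores better on the test set depends on the data.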

Day 4 - Model Selection
