Regression

RMSE (Root Mean Square Error)

Measures the deviation between predicted (observed) values and true values, and is commonly used as a yardstick for the predictions of a machine learning model. If a few outliers deviate very strongly, they make the RMSE much worse even when they are very few in number.
$RMSE = \sqrt{\frac{1}{m} \sum_{i=1}^{m}(\hat{y_i}-y_i)^2}$

MSE (Mean Square Error)

The squared form is easy to differentiate, which is why MSE is often used as the loss function for linear regression.
$MSE = \frac{1}{m} \sum_{i=1}^{m} (\hat{y_i}-y_i)^2$

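L2 loss is sensitive to outliers. A model trained with MSE as its cost function has to minimize the error contributed by each outlier, so it is pulled toward the outliers; in effect the outliers receive a larger weight, which can hurt the overall fit.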

MAE (Mean Absolute Error)

The mean of the absolute errors. Every error contributes in proportion to its size, so MAE reflects the typical magnitude of the prediction error directly.
$MAE = \frac{1}{m} \sum_{i=1}^{m} |\hat{y_i}-y_i|$

Compared with MSE, MAE is more robust when the data contain outliers that would otherwise distort the predictions.
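The difference in outlier sensitivity can be seen on a toy example. The sketch below is only an illustration: the data are hypothetical, and the helper names `mse`, `rmse` and `mae` are chosen here, not taken from any library. Every prediction is first off by exactly 1, then a single prediction is moved far away.

```python
import math

def mse(y_true, y_pred):
    """Mean of the squared errors."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Square root of the MSE, in the same unit as the target."""
    return math.sqrt(mse(y_true, y_pred))

def mae(y_true, y_pred):
    """Mean of the absolute errors."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical toy data: every prediction is off by exactly 1.
y_true = [10.0, 12.0, 14.0, 16.0, 18.0]
y_pred = [11.0, 13.0, 15.0, 17.0, 19.0]

# Same data, but the last prediction is now off by 21 (a single outlier).
y_pred_outlier = [11.0, 13.0, 15.0, 17.0, 39.0]

clean = (mse(y_true, y_pred), rmse(y_true, y_pred), mae(y_true, y_pred))
dirty = (mse(y_true, y_pred_outlier),
         rmse(y_true, y_pred_outlier),
         mae(y_true, y_pred_outlier))

print("without outlier (MSE, RMSE, MAE):", clean)
print("with outlier    (MSE, RMSE, MAE):", dirty)
```

With one outlier among five samples, the MSE goes from 1 to 89 while the MAE goes from 1 to 5, which is the sense in which the squared loss gives outliers more weight.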

SD (Standard Deviation)

The square root of the variance. It measures how dispersed a set of values is.
$SD = \sqrt{\frac{1}{m} \sum_{i=1}^{m} (x_i-\bar{x})^2}$, where $\bar{x}$ is the mean of $x$.
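As a quick check of the formula, here is a small sketch on hypothetical numbers. The formula divides by m, which corresponds to the population standard deviation (`statistics.pstdev` in the Python standard library); `statistics.stdev` divides by m-1 instead and gives a slightly larger value.

```python
import math
import statistics

# Hypothetical sample, chosen so that the result is a round number.
x = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mean_x = sum(x) / len(x)
# Direct translation of the formula: divide by m, then take the square root.
sd_manual = math.sqrt(sum((xi - mean_x) ** 2 for xi in x) / len(x))

sd_population = statistics.pstdev(x)  # divides by m, matches the formula
sd_sample = statistics.stdev(x)       # divides by m - 1

print(sd_manual, sd_population, sd_sample)
```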

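R2 (R-Squared), goodness of fit

$R^2 = SSR/SST = 1 - SSE/SST$
where SST = SSR + SSE (this identity holds for a least-squares fit with an intercept).

SST (total sum of squares): $S S_{\text {tot}}=\sum\left(y_{i}-\overline{y}\right)^{2}$
SSR (regression sum of squares): $S S_{\text {reg}}=\sum\left(\hat{y}_{i}-\overline{y}\right)^{2}$
SSE (error sum of squares, i.e. the residual sum of squares): $S S_{\text {res}}=\sum\left(y_{i}-\hat{y}_{i}\right)^{2}$

Here $\overline{y}$ is the mean of $y$, which gives the expression for $R^2$:
$R^{2}=1-\frac{S S_{\text {res}}}{S S_{\text {tot}}}=1-\frac{\sum\left(y_{i}-\hat{y}_{i}\right)^{2}}{\sum\left(y_{i}-\overline{y}\right)^{2}}$

$R^2$ is the proportion of the variation in the dependent variable that the regression explains through the independent variables. For a least-squares fit with an intercept, evaluated on the data it was fitted to, it lies between 0 and 1 (on held-out data it can be negative). The closer $R^2$ is to 1, the larger the share of the regression sum of squares in the total sum of squares, the closer the regression line is to the observations, and the better the fit. This is why $R^2$ is also called the goodness-of-fit statistic.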
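The decomposition can be verified numerically. The following is a minimal sketch on hypothetical data, using a closed-form simple linear regression; `fit_line` is a helper written for this illustration, not a library function.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b*x (closed form, with intercept)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_mean) ** 2 for xi in x)
    b = s_xy / s_xx
    a = y_mean - b * x_mean
    return a, b

# Hypothetical data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

a, b = fit_line(x, y)
y_hat = [a + b * xi for xi in x]
y_mean = sum(y) / len(y)

ss_tot = sum((yi - y_mean) ** 2 for yi in y)              # SST
ss_reg = sum((fi - y_mean) ** 2 for fi in y_hat)          # SSR
ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # SSE

r2 = 1 - ss_res / ss_tot
print("SST:", ss_tot, "SSR + SSE:", ss_reg + ss_res)
print("R^2:", r2, "SSR/SST:", ss_reg / ss_tot)
```

Both expressions for $R^2$ agree here because the line was fitted by least squares with an intercept and is evaluated on its own training data.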

Error = Bias + Variance

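Error reflects the accuracy of the model as a whole. Bias reflects the gap between the model's output on the samples and the true values, that is, the precision of the model itself. Variance reflects the gap between each individual output of the model and the expected output, that is, the stability of the model. The heading above is the usual shorthand; for squared error the expected error decomposes more precisely into Bias² + Variance + irreducible noise.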

Classification

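Log loss is not well suited as an evaluation metric when the classes are imbalanced.
ROC-AUC can serve as an evaluation metric when positives and negatives are imbalanced.
If we want the minority class to be predicted correctly, use ROC-AUC as the evaluation metric.
F1-Score and the PR curve are suitable evaluation metrics when positive samples are very rare.
F1-Score and the PR curve are suitable evaluation metrics when FP matters more than FN.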

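In TP, FP, TN and FN, the first letter (T or F) says whether the classification result is correct, and the second letter (P or N) says whether the classifier predicted positive or negative.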

1. Accuracy

Correctly predicted samples / all samples = (TP+TN) / total

from sklearn.metrics import accuracy_score
accuracy = accuracy_score(y_test, y_predict)

2. Precision

Of the samples predicted positive, how many are truly positive. A positive prediction arises in two ways: a positive sample predicted positive (TP), or a negative sample predicted positive (FP).
$Precision = \frac{TP}{TP+FP}$

from sklearn.metrics import precision_score
precision = precision_score(y_test, y_predict)

3. Recall

Of the actual positive samples, how many are predicted correctly. An actual positive ends up in one of two ways: predicted positive (TP), or predicted negative (FN):
$Recall = \frac{TP}{TP+FN}$

from sklearn.metrics import recall_score
recall = recall_score(y_test, y_predict)
# With the default average='binary' this returns a single float; pass average=None to get a per-class array of recalls.

4. F1

The harmonic mean of precision and recall.
$F_{1}=2 \cdot \frac{\text { precision } \cdot \text {recall}}{\text {precision}+\text {recall}}$

$F1=\frac{ 2TP }{ 2TP+FP+FN }$

from sklearn.metrics import f1_score
f1_score(y_test, y_predict)

Suppose 90% of all samples are positive and 10% are negative, so the data are severely imbalanced. Simply predicting every sample as positive gives:
Accuracy: 90%
Precision: 90%
Recall: 100%
F1: 18/19
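These four numbers can be reproduced from the confusion counts. The sketch below assumes a hypothetical total of 100 samples (90 positive, 10 negative, all predicted positive) and uses exact fractions so that 18/19 shows up as such; `binary_metrics` is a helper named for this illustration.

```python
from fractions import Fraction

def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from the four confusion counts.

    Assumes tp + fp > 0 and tp + fn > 0, otherwise a ratio is undefined.
    """
    total = tp + fp + fn + tn
    accuracy = Fraction(tp + tn, total)
    precision = Fraction(tp, tp + fp)
    recall = Fraction(tp, tp + fn)
    f1 = Fraction(2 * tp, 2 * tp + fp + fn)
    return accuracy, precision, recall, f1

# 90 positives and 10 negatives, everything predicted positive:
# all positives become TP, all negatives become FP.
accuracy, precision, recall, f1 = binary_metrics(tp=90, fp=10, fn=0, tn=0)

print("accuracy :", accuracy)
print("precision:", precision)
print("recall   :", recall)
print("F1       :", f1)
```

A classifier that has learned nothing still scores well on all four metrics here, which is why the majority class should not be treated as the positive class without thought.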

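In practice the numbers of positive and negative samples are often very imbalanced. When the ratio of positives to negatives changes, the P-R curve changes a great deal, whereas the ROC curve reflects the quality of the model itself more stably.
If the researcher wants to see more of how the model performs on one specific dataset, the P-R curve reflects that performance more directly.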

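5. PR curve

The PR curve connects the (recall, precision) points obtained at different thresholds.

The closer the curve is to the upper-right corner, the better the performance.

The PR curve and the ROC curve both use TPR (recall), and both can be summarized by the area under the curve to measure a classifier. They differ in that the ROC curve uses FPR while the PR curve uses precision,
so both quantities of the PR curve focus on the positive class. In class-imbalanced problems the positive class is usually what matters, which is why the PR curve is widely considered preferable to the ROC curve in that setting.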
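The practical difference between precision and FPR can be illustrated with a small sketch. It assumes hypothetical scores and a fixed threshold of 0.5, and compares a balanced toy set with the same set in which every negative appears ten times; `precision_recall_fpr` is a helper written for this example.

```python
def precision_recall_fpr(y_true, scores, threshold):
    """Precision, recall (TPR) and FPR when scores >= threshold count as positive."""
    tp = fp = fn = tn = 0
    for label, score in zip(y_true, scores):
        predicted_positive = score >= threshold
        if label == 1 and predicted_positive:
            tp += 1
        elif label == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fp), tp / (tp + fn), fp / (fp + tn)

# Hypothetical balanced toy set: 2 positives, 2 negatives.
y_true = [1, 1, 0, 0]
scores = [0.9, 0.6, 0.7, 0.2]
balanced = precision_recall_fpr(y_true, scores, threshold=0.5)

# Same score distribution per class, but every negative appears 10 times.
y_true_imb = [1, 1] + [0, 0] * 10
scores_imb = [0.9, 0.6] + [0.7, 0.2] * 10
imbalanced = precision_recall_fpr(y_true_imb, scores_imb, threshold=0.5)

print("balanced   (precision, recall, FPR):", balanced)
print("imbalanced (precision, recall, FPR):", imbalanced)
```

Recall and FPR are each computed within a single true class, so they do not move when the class ratio changes. Precision mixes both classes and drops from 2/3 to 1/6, so a PR point shifts with the class ratio while the corresponding ROC point stays put.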

6. ROC (Receiver Operating Characteristic) curve

The cut-off point is adjusted dynamically, starting from the highest score (strictly, from positive infinity, which corresponds to the origin of the ROC curve) and moving down to the lowest score. Each cut-off yields one FPR and one TPR; plotting the point for every cut-off on the ROC plane and connecting the points gives the final ROC curve.

The ROC is a probability curve, and the AUC expresses how well the positive and negative classes can be separated.

The upper-left corner is best.

TPR (True Positive Rate), i.e. recall
Among the actual positives, the proportion predicted positive: TPR = TP/(TP+FN).

FPR (False Positive Rate)
Among the actual negatives, the proportion predicted positive: FPR = FP/(TN+FP).

For an ideal classifier TPR = 1 and FPR = 0. The closer the ROC curve is to the upper-left corner, the better the model, that is, the closer the AUC is to 1.

Cut-off points (thresholds)
The threshold that separates positive predictions from negative ones.
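The threshold sweep described in this section can be sketched in a few lines of plain Python. This is an illustration on a four-sample toy set, not a replacement for `sklearn.metrics.roc_curve`; the helper names `roc_points` and `trapezoid_auc` are chosen here, and both classes are assumed to be present.

```python
def roc_points(y_true, scores):
    """(FPR, TPR) pairs obtained by lowering the threshold from high to low."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    points = [(0.0, 0.0)]  # threshold = +inf: nothing is predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points

def trapezoid_auc(points):
    """Area under the piecewise-linear curve through the given points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Toy data: labels 0/1 and the predicted probability of the positive class.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

points = roc_points(y_true, scores)
roc_auc_value = trapezoid_auc(points)

print(points)
print(roc_auc_value)
```

The curve starts at the origin, passes through one point per distinct score, and ends at (1, 1) once the threshold has dropped to the lowest score.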

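7. AUC

Computation: draw one positive sample at random from the positive set and one negative sample from the negative set; the AUC is the probability that the positive sample's predicted score is greater than the negative sample's.

Example: for the samples (A, B, C, D, E),
with labels (0, 1, 1, 0, 1),
model A predicts (0.2, 0.4, 0.7, 0.3, 0.5)
and model B predicts (0.1, 0.3, 0.9, 0.2, 0.5).
Model A and model B have the same AUC.
There are 3*2 = 6 sample pairs (one positive and one negative sample form a pair):
(B, A), (B, D), (C, A), (C, D), (E, A), (E, D).
For model A the corresponding scores are (0.4, 0.2), (0.4, 0.3), (0.7, 0.2), (0.7, 0.3), (0.5, 0.2), (0.5, 0.3),
so its AUC is (1+1+1+1+1+1)/6 = 1. By the same reasoning, model B's AUC is also 1.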
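The pair-counting computation in this example translates directly into code. The sketch below uses the labels and scores from the example; `pairwise_auc` is a helper named for this illustration, and counting a tie as half a win is an assumption (a common convention that does not matter for this data).

```python
def pairwise_auc(labels, scores):
    """Fraction of (positive, negative) pairs in which the positive scores higher."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # assumed convention: a tie counts as half
    return wins / (len(positives) * len(negatives))

labels = [0, 1, 1, 0, 1]                # samples A, B, C, D, E
model_a = [0.2, 0.4, 0.7, 0.3, 0.5]
model_b = [0.1, 0.3, 0.9, 0.2, 0.5]

auc_a = pairwise_auc(labels, model_a)
auc_b = pairwise_auc(labels, model_b)

print("AUC of model A:", auc_a)
print("AUC of model B:", auc_b)
```

Only the ordering of the scores matters: the two models output different probabilities but rank every positive above every negative, so both reach the same AUC.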

The AUC is the area under the ROC curve. Clearly, the larger the AUC, the better the classifier.

AUC = 1: a perfect classifier; with this model there is at least one threshold that yields perfect predictions. In the vast majority of prediction tasks no perfect classifier exists. 0.5 < AUC < 1: better than random guessing. AUC = 0.5: the same as random guessing. AUC < 0.5: worse than random guessing.

Binary-class classification

import numpy as np
np.random.seed(10)
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.preprocessing import label_binarize
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_curve

X, y = make_classification(n_samples=80000)
# print(X.shape, y.shape)
# (80000, 20) (80000,)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5)

X_train, X_train_lr, y_train, y_train_lr = train_test_split(X_train, y_train, test_size=0.5)

from keras.models import Sequential
from keras.layers import Dense
from sklearn.metrics import auc

model = Sequential()
model.add(Dense(20, input_dim=20, activation='relu'))
model.add(Dense(40, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=5, batch_size=100, verbose=1)

y_pred = model.predict(X_test).ravel()
print(y_pred.shape)

fpr, tpr, thresholds = roc_curve(y_test, y_pred)

roc_auc = auc(fpr, tpr)



plt.figure(1)
plt.plot([0, 1], [0, 1], 'k--')
plt.plot(fpr, tpr, label='Keras (area = {:.3f})'.format(roc_auc))
plt.xlabel('False positive rate')
plt.ylabel('True positive rate')
plt.title('ROC curve')
plt.legend(loc='best')
plt.show()
# Zoom in view of the upper left corner.
plt.figure(2)
plt.xlim(0, 0.2)
plt.ylim(0.8, 1)
plt.plot([0, 1], [0, 1], 'k--')
plt.plot(fpr, tpr, label='Keras (area = {:.3f})'.format(roc_auc))
plt.xlabel('False positive rate')
plt.ylabel('True positive rate')
plt.title('ROC curve (zoomed in at top left)')
plt.legend(loc='best')
plt.show()


# (Optional) Prediction probability density function(PDF)

import numpy as np
from scipy.interpolate import UnivariateSpline
from matplotlib import pyplot as plt

def plot_pdf(y_pred, y_test, name=None, smooth=500):
    """Plot smoothed score histograms for the positive and the negative class."""
    for label, scores in (('positive', y_pred[y_test == 1]),
                          ('negative', y_pred[y_test == 0])):
        n_bins = scores.shape[0] // smooth  # about `smooth` samples per bin
        counts, edges = np.histogram(scores, bins=n_bins)
        centers = edges[:-1] + (edges[1] - edges[0]) / 2  # convert bin edges to centers
        spline = UnivariateSpline(centers, counts, s=n_bins)  # smooth the histogram
        plt.plot(centers, spline(centers), label=label)
    plt.xlim([0.0, 1.0])
    plt.xlabel('predicted probability')
    plt.ylabel('density')
    plt.title('PDF-{}'.format(name))
    plt.legend(loc='best')
    plt.show()

plot_pdf(y_pred, y_test, 'Keras')

Macro-averaging and micro-averaging:

Purpose: evaluation for multi-class classification.
Macro-average: compute the metric for each class first, then take the arithmetic mean over all classes.

Micro-average: pool every instance in the dataset, regardless of class, into one global confusion matrix, then compute the metric from it.
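The two averages can differ considerably when class sizes differ. Here is a minimal sketch for precision, using hypothetical per-class (TP, FP) counts for three classes named A, B and C.

```python
# Hypothetical per-class counts: class -> (TP, FP).
counts = {"A": (90, 10), "B": (5, 5), "C": (1, 9)}

# Macro: compute precision per class, then average the per-class values.
per_class_precision = {c: tp / (tp + fp) for c, (tp, fp) in counts.items()}
macro_precision = sum(per_class_precision.values()) / len(per_class_precision)

# Micro: pool the counts of all classes first, then compute one precision.
total_tp = sum(tp for tp, _ in counts.values())
total_fp = sum(fp for _, fp in counts.values())
micro_precision = total_tp / (total_tp + total_fp)

print("per class:", per_class_precision)
print("macro    :", macro_precision)
print("micro    :", micro_precision)
```

Macro-averaging weights every class equally, so the two small, poorly predicted classes pull it down to 0.5. Micro-averaging weights every instance equally, so the large class A dominates and the result is 0.8.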

Multi-class classification


from sklearn.datasets import make_classification
from sklearn.preprocessing import label_binarize
from keras.models import Sequential
from keras.layers import Dense
import numpy as np
from numpy import interp  # scipy.interp is deprecated; numpy.interp is the drop-in replacement
import matplotlib.pyplot as plt
from itertools import cycle
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_curve, auc

# Three classes in total
n_classes = 3

X, y = make_classification(n_samples=80000, n_features=20, n_informative=3, n_redundant=0, n_classes=n_classes,
    n_clusters_per_class=2)
# print(X.shape, y.shape)
# print(X[0], y[0])
# (80000, 20) (80000,)
# [-1.90920853 -1.30052757 -0.76903467 -3.2546519  -0.02947816  0.14105006
#   0.43556031 -0.81300607 -0.94553296 -0.92774495  1.49041451 -0.4443121
#  -1.16342165 -0.32997815 -1.02907045 -0.39950447 -0.711287    0.51382424
#   2.88822258 -2.0935274 ] 
# 1

# Binarize the output (equivalent to one-hot encoding)
y = label_binarize(y, classes=[0, 1, 2])
# print(y.shape, y[0])
# (80000, 3) [0 1 0]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5)
model = Sequential()
model.add(Dense(20, input_dim=20, activation='relu'))
model.add(Dense(40, activation='relu'))
model.add(Dense(3, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=1, batch_size=100, verbose=1)

y_pred = model.predict(X_test)
# print(y_pred.shape)
# (40000, 3)

# Compute ROC curve and ROC area for each class
fpr = dict()
tpr = dict()
roc_auc = dict()
for i in range(n_classes):
    # Example: scores = np.array([0.1, 0.4, 0.35, 0.8])
    # fpr, tpr, thresholds = metrics.roc_curve(y, scores, pos_label=2)
    # y holds the true labels and scores holds the predicted probability of the positive class for each
    # sample (0.1 means the first sample is predicted positive with probability 0.1), so y and scores
    # have one entry per sample. pos_label=2 means label 2 is the positive class; all other labels are negative.
    # Thresholds are then taken from the scores in descending order, so the first one is 0.8.
    # With label=[1,1,2,2] and scores=[0.1,0.4,0.35,0.8], the candidate thresholds are [0.8,0.4,0.35,0.1].
    # At threshold 0.8 the predictions are [0,0,0,1]. The two samples labeled 1 score 0.1 and 0.4, below 0.8,
    # so they are predicted negative and really are negative: two TN. Of the two samples labeled 2, the one
    # scoring 0.35 is predicted negative although it is positive: one FN. The one scoring 0.8 >= 0.8 is a TP.
    # Result at threshold 0.8: 1 TP, 2 TN, 1 FN, 0 FP.
    fpr[i], tpr[i], thresholds = roc_curve(y_test[:, i], y_pred[:, i])  # (40000,)
    # print(fpr[i].shape)# (5491,)# (6562,)# (4271,)
    roc_auc[i] = auc(fpr[i], tpr[i])
    

# Compute the micro-average ROC curve and its area.
# .ravel() flattens a multi-dimensional array into one dimension.
fpr["micro"], tpr["micro"]  , thresholds = roc_curve(y_test.ravel(), y_pred.ravel())  #  (120000,)
roc_auc["micro"] = auc(fpr["micro"], tpr["micro"])

# Compute the macro-average ROC curve and its area.
# First, collect all false positive rates.
# np.unique() removes duplicates and returns the sorted values.
# print(np.concatenate([fpr[i] for i in range(n_classes)]).shape) (16324,)
all_fpr = np.unique(np.concatenate([fpr[i] for i in range(n_classes)]))  # (7901,)
# Then interpolate every per-class ROC curve at these points.
# np.zeros_like() creates an all-zero array with the same shape as its argument.
mean_tpr = np.zeros_like(all_fpr)
for i in range(n_classes):
    mean_tpr += interp(all_fpr, fpr[i], tpr[i])
    
# Finally, average and compute the AUC.
mean_tpr /= n_classes
fpr["macro"] = all_fpr
tpr["macro"] = mean_tpr
roc_auc["macro"] = auc(fpr["macro"], tpr["macro"])

# Plot all ROC curves
plt.figure(1)
plt.plot(fpr["micro"], tpr["micro"], color='deeppink', linestyle=':', linewidth=4,
         label='micro-average ROC curve (area = {0:0.2f})'.format(roc_auc["micro"]))

plt.plot(fpr["macro"], tpr["macro"],color='navy', linestyle=':', linewidth=4,
         label='macro-average ROC curve (area = {0:0.2f})'.format(roc_auc["macro"]))

colors = cycle(['aqua', 'darkorange', 'cornflowerblue'])
for i, color in zip(range(n_classes), colors):
    plt.plot(fpr[i], tpr[i], color=color, linewidth=2,
             label='ROC curve of class {0} (area = {1:0.2f})'.format(i, roc_auc[i]))

plt.plot([0, 1], [0, 1], 'k--', linewidth=2)
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Some extension of Receiver Operating Characteristic to multi-class')
plt.legend(loc='best')
plt.show()


# Zoom in view of the upper left corner.
plt.figure(2)
plt.xlim(0, 0.2)
plt.ylim(0.8, 1)
plt.plot(fpr["micro"], tpr["micro"],color='deeppink', linestyle=':', linewidth=4,
         label='micro-average ROC curve (area = {0:0.2f})'.format(roc_auc["micro"]))

plt.plot(fpr["macro"], tpr["macro"],color='navy', linestyle=':', linewidth=4,
         label='macro-average ROC curve (area = {0:0.2f})'.format(roc_auc["macro"]))

colors = cycle(['aqua', 'darkorange', 'cornflowerblue'])
for i, color in zip(range(n_classes), colors):
    plt.plot(fpr[i], tpr[i], color=color, linewidth=2,
             label='ROC curve of class {0} (area = {1:0.2f})'.format(i, roc_auc[i]))

plt.plot([0, 1], [0, 1], 'k--', linewidth=2)
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC curve (zoomed in at top left)')
plt.legend(loc='best')
plt.show()

8. Confusion matrix:

Each column of the confusion matrix represents a predicted class, and each row represents the true class of the data.

def plot_confusion_matrix(title, y_true, y_pred, labels):
    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.metrics import confusion_matrix
    
    cm = confusion_matrix(y_true, y_pred)
    
    # np.newaxis inserts a new axis of length 1 at the position where it appears in the index.
    # x1 = np.array([1, 2, 3, 4, 5])
    # the shape of x1 is (5,)
    # x1_new = x1[:, np.newaxis]
    # now, the shape of x1_new is (5, 1)

    # Normalize each row so that the entries of one true class sum to 1.
    cm_normalized = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
    # print (cm, '\n\n', cm_normalized)
    # [[1 0 0 0 0]                           
    #  [0 1 0 0 0]
    #  [0 0 1 0 0]
    #  [0 0 0 1 0]
    #  [0 0 0 0 1]]

    #  [[1. 0. 0. 0. 0.]
    #  [0. 1. 0. 0. 0.]
    #  [0. 0. 1. 0. 0.]
    #  [0. 0. 0. 1. 0.]
    #  [0. 0. 0. 0. 1.]]
    tick_marks = np.array(range(len(labels))) + 0.5
    #  [0.5 1.5 2.5 3.5 4.5 5.5]
    np.set_printoptions(precision=2)
    
    plt.figure(figsize=(10, 8), dpi=120)
    ind_array = np.arange(len(labels))
    x, y = np.meshgrid(ind_array, ind_array)
    # print(ind_array, '\n\n', x, '\n\n', y)
    # [0 1 2 3 4 5] 

    #  [[0 1 2 3 4 5]
    #  [0 1 2 3 4 5]
    #  [0 1 2 3 4 5]
    #  [0 1 2 3 4 5]
    #  [0 1 2 3 4 5]
    #  [0 1 2 3 4 5]] 

    #  [[0 0 0 0 0 0]
    #  [1 1 1 1 1 1]
    #  [2 2 2 2 2 2]
    #  [3 3 3 3 3 3]
    #  [4 4 4 4 4 4]
    #  [5 5 5 5 5 5]]
    intFlag = 0  # 0: annotate cells with normalized floats; nonzero: annotate with integer counts
    for x_val, y_val in zip(x.flatten(), y.flatten()):
        # plt.text() writes the annotation at the given cell coordinates.

        if (intFlag):
            c = cm[y_val][x_val]
            plt.text(x_val, y_val, "%d" % (c,), color='red', fontsize=8, va='center', ha='center')

        else:
            c = cm_normalized[y_val][x_val]
            if (c > 0.01):
                plt.text(x_val, y_val, "%0.2f" % (c,), color='red', fontsize=7, va='center', ha='center')
            else:
                plt.text(x_val, y_val, "%d" % (0,), color='red', fontsize=7, va='center', ha='center')
    cmap = plt.cm.binary
    if(intFlag):
        plt.imshow(cm, interpolation='nearest', cmap=cmap)
    else:
        plt.imshow(cm_normalized, interpolation='nearest', cmap=cmap)
    plt.gca().set_xticks(tick_marks, minor=True)
    plt.gca().set_yticks(tick_marks, minor=True)
    plt.gca().xaxis.set_ticks_position('none')
    plt.gca().yaxis.set_ticks_position('none')
    plt.grid(True, which='minor', linestyle='-')
    plt.gcf().subplots_adjust(bottom=0.15)
    plt.title(title)
    plt.colorbar()
    xlocations = np.array(range(len(labels)))
    plt.xticks(xlocations, labels, rotation=90)
    plt.yticks(xlocations, labels)
    plt.ylabel('Index of True Classes')
    plt.xlabel('Index of Predict Classes')
    plt.savefig('confusion_matrix.jpg', dpi=300)
    plt.show()
title='Confusion Matrix'
labels = ['A', 'B', 'C', 'F', 'G']
y_true = [1, 2, 3, 4, 5]# np.loadtxt(r'/home/dingtom/a.txt')
y_pred = [1, 2, 3, 4, 5]# np.loadtxt(r'/home/dingtom/b.txt')
plot_confusion_matrix(title, y_true, y_pred, labels)
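To make the row and column convention concrete, the following sketch builds a confusion matrix by hand on hypothetical labels and normalizes it row by row. The helpers `confusion_counts` and `normalize_rows` are written for this illustration and only mimic the layout (rows are true classes, columns are predicted classes) described in this section.

```python
def confusion_counts(y_true, y_pred, classes):
    """Count matrix with one row per true class and one column per predicted class."""
    index = {c: i for i, c in enumerate(classes)}
    cm = [[0] * len(classes) for _ in classes]
    for t, p in zip(y_true, y_pred):
        cm[index[t]][index[p]] += 1
    return cm

def normalize_rows(cm):
    """Divide each row by its sum; rows of classes with no samples stay at zero."""
    normalized = []
    for row in cm:
        row_sum = sum(row)
        normalized.append([v / row_sum if row_sum else 0.0 for v in row])
    return normalized

# Hypothetical labels for a 3-class problem.
classes = ["bird", "cat", "dog"]
y_true = ["cat", "cat", "cat", "dog", "dog", "bird"]
y_pred = ["cat", "cat", "dog", "dog", "dog", "cat"]

cm = confusion_counts(y_true, y_pred, classes)
cm_normalized = normalize_rows(cm)

for name, row in zip(classes, cm):
    print(name, row)
```

After row normalization the diagonal entry of each row is the recall of that class, since it is the share of that class's true samples that were predicted correctly.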

Reference:
https://github.com/Tony607/ROC-Keras/blob/master/ROC-Keras.ipynb
