
Note: for plotting PR curves, see http://www.cnblogs.com/aquastone/p/random-classifier.html

When doing binary classification on matrix or array data, how do you plot the ROC curve?

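1) First, the definition of the ROC curve:

ROC stands for "Receiver Operating Characteristic". The curve originated in the radar signal analysis used to detect enemy aircraft during World War II. It is a combined measure of sensitivity and specificity. A continuous variable is cut at many different threshold values, and each threshold yields one pair of sensitivity and specificity. Plotting sensitivity on the vertical axis against (1 - specificity) on the horizontal axis gives the curve. The larger the area under the curve, the better the discrimination. On the ROC curve, the point closest to the upper-left corner of the plot is the threshold at which sensitivity and specificity are both high.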

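2) How to construct the ROC curve:

Sort the samples by the classifier's predicted score. Then, going through them in order, use each sample's score in turn as the cut-off for predicting the positive class, and compute FPR and TPR at each cut-off. Plotting FPR on the horizontal axis and TPR on the vertical axis gives the ROC curve. So FPR and TPR have to be computed first. They are defined as: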

FPR = FP/(FP+TN)

TPR = TP/(TP+FN)

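TP, FN, FP and TN are defined in the table below, which shows the confusion matrix of a binary classification problem:

Fig. 1. Confusion matrix of classification results

Table source: http://blog.csdn.net/ai_vivi/article/details/43836641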

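TP: true positive. The sample is actually positive and is identified as positive.

FN: false negative (a miss). The sample is actually positive but is identified as negative.

FP: false positive (a false alarm). The sample is actually negative but is identified as positive.

TN: true negative. The sample is actually negative and is identified as negative.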
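The four counts and the two rates built from them, TPR = TP/(TP+FN) and FPR = FP/(FP+TN), can be computed by hand as a quick check. The following is a minimal sketch; the helper name `confusion_counts` and the label and prediction arrays are made up for illustration, and it assumes labels coded as 1 (positive) and 0 (negative).

```python
import numpy as np


def confusion_counts(y_true, y_pred):
    """Return (TP, FN, FP, TN) for binary labels where 1 is the positive class."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))  # positive, predicted positive
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))  # positive, predicted negative
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))  # negative, predicted positive
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))  # negative, predicted negative
    return tp, fn, fp, tn


# Hypothetical true labels and hard predictions, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

tp, fn, fp, tn = confusion_counts(y_true, y_pred)
tpr = tp / (tp + fn)  # share of actual positives that were found
fpr = fp / (fp + tn)  # share of actual negatives that were flagged
print(f"TP={tp} FN={fn} FP={fp} TN={tn} TPR={tpr} FPR={fpr}")
```

Note that TPR is normalised by the number of actual positives and FPR by the number of actual negatives, so the two rates never share a denominator.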

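3) ROC curve illustration

Fig. 2. ROC curve illustration

Feeding the samples into the classifier gives each sample a predicted score. Setting different cut-off points on that score selects different subsets of samples as positive. In the illustration, the classification result at each threshold corresponds to one point (FPR, TPR). At the largest threshold every sample is classified as negative, which gives the lower-left point (0,0). At the smallest threshold every sample is classified as positive, which gives the upper-right point (1,1). As the threshold moves from its largest to its smallest value, TP and FP both grow.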
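The threshold sweep described here can be written out directly. This is a sketch of the idea rather than sklearn's implementation: the function name `roc_points` and the toy data are assumptions, labels are assumed to be 1 (positive) and 0 (negative), and a sample is predicted positive when its score is greater than or equal to the threshold.

```python
import numpy as np


def roc_points(y_true, y_score):
    """Sweep the threshold from high to low and collect (FPR, TPR) points."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score, dtype=float)
    n_pos = np.sum(y_true == 1)
    n_neg = np.sum(y_true == 0)
    # Start above every score (nothing predicted positive), then visit each
    # distinct score from largest to smallest.
    thresholds = np.concatenate(([np.inf], np.sort(np.unique(y_score))[::-1]))
    points = []
    for t in thresholds:
        predicted_positive = y_score >= t
        tp = np.sum(predicted_positive & (y_true == 1))
        fp = np.sum(predicted_positive & (y_true == 0))
        points.append((float(fp / n_neg), float(tp / n_pos)))
    return points


# Hypothetical labels and scores, for illustration only.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

points = roc_points(y_true, y_score)
for fpr, tpr in points:
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

The first point is (0, 0) and the last is (1, 1), matching the two extreme thresholds described above. Each lower threshold can only add predicted positives, so neither coordinate ever decreases.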

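Once we have the curve, what metric tells us how good it is? The answer is AUC (Area Under ROC Curve), the area under the ROC curve. If a classifier separates the samples perfectly, its AUC = 1. If the model is a simple random guess, its AUC = 0.5, which corresponds to the straight line y = x in the figure. In addition, if one classifier is better than another, the area under its curve is larger.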
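The two reference values, 1 for a perfect classifier and 0.5 for random guessing, can be checked by integrating the curve with the trapezoidal rule. A minimal sketch follows; the helper name `area_under_curve` is an assumption made for illustration, and the two curves are idealised point lists rather than output from a real model.

```python
import numpy as np


def area_under_curve(fpr, tpr):
    """Trapezoidal area under a curve given by points sorted by increasing FPR."""
    fpr = np.asarray(fpr, dtype=float)
    tpr = np.asarray(tpr, dtype=float)
    widths = np.diff(fpr)                       # horizontal step of each segment
    mean_heights = (tpr[1:] + tpr[:-1]) / 2.0   # average height of each segment
    return float(np.sum(widths * mean_heights))


# Perfect separation: the curve goes straight up to (0, 1), then right to (1, 1).
perfect = area_under_curve([0.0, 0.0, 1.0], [0.0, 1.0, 1.0])

# Random guessing: the diagonal y = x.
chance = area_under_curve([0.0, 1.0], [0.0, 1.0])

print(perfect, chance)
```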

4) How to plot the ROC curve with Python's sklearn

The sklearn.metrics.roc_curve function does most of the work.

First, its usage:

fpr, tpr, thresholds = sklearn.metrics.roc_curve(y_true, y_score, pos_label=None, sample_weight=None, drop_intermediate=True)

Parameters (source: the official sklearn documentation):

y_true: array, shape = [n_samples]

True binary labels in range {0, 1} or {-1, 1}. If labels are not binary, pos_label should be explicitly given.

That is, the array of true labels.

y_score : array, shape = [n_samples]

Target scores, can either be probability estimates of the positive class, confidence values, or non-thresholded measure of decisions (as returned by “decision_function” on some classifiers).

That is, the array of scores predicted by the model.

pos_label : int or str, default=None

Label considered as positive and others are considered negative.

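That is, the label value that is treated as the positive class. It is a label value, not a count.

For example, with label = [1,2,3,4] and pos_label = 2, only 2 is treated as positive and all other labels are negative.

With label = ['a','a','b','c'] and pos_label = 'b', 'b' is treated as positive and all other labels are negative.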
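To make the role of pos_label concrete, the sketch below calls roc_curve on string labels and compares the result with binarising the labels by hand. The scores are invented for illustration; only the label list comes from the text above.

```python
import numpy as np
from sklearn.metrics import roc_curve

y = np.array(['a', 'a', 'b', 'c'])
scores = np.array([0.1, 0.2, 0.8, 0.35])  # hypothetical scores for class 'b'

# Treat 'b' as the positive class; 'a' and 'c' both count as negative.
fpr, tpr, thresholds = roc_curve(y, scores, pos_label='b')

# The same curve, obtained by converting the labels to 0/1 first.
y_binary = (y == 'b').astype(int)
fpr_manual, tpr_manual, _ = roc_curve(y_binary, scores)

print(fpr, tpr)
print(np.array_equal(fpr, fpr_manual), np.array_equal(tpr, tpr_manual))
```

pos_label selects exactly one label value. Everything else, however many distinct values there are, is pooled into the negative class.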

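sample_weight : array-like of shape = [n_samples], optional

Sample weights.

That is, a weight for each sample. Each sample contributes to the TP and FP counts in proportion to its weight. It does not select a subset of the samples for the calculation.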
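The effect of sample_weight can be seen by comparing a weighted call with an unweighted call on data in which the heavier sample is physically repeated. This is a sketch with made-up labels, scores and weights; integer weights are used only so that the repetition check is possible.

```python
import numpy as np
from sklearn.metrics import auc, roc_curve

# Hypothetical data, for illustration only.
y = np.array([0, 0, 1, 1])
scores = np.array([0.1, 0.4, 0.35, 0.8])
weights = np.array([1, 3, 1, 1])  # the second sample counts three times

# Unweighted curve.
fpr_u, tpr_u, _ = roc_curve(y, scores)
auc_unweighted = auc(fpr_u, tpr_u)

# Weighted curve: every sample is still used, but with a different weight.
fpr_w, tpr_w, _ = roc_curve(y, scores, sample_weight=weights)
auc_weighted = auc(fpr_w, tpr_w)

# Same thing by repeating each sample as many times as its weight.
fpr_r, tpr_r, _ = roc_curve(np.repeat(y, weights), np.repeat(scores, weights))
auc_repeated = auc(fpr_r, tpr_r)

print(auc_unweighted, auc_weighted, auc_repeated)
```

The weighted AUC is lower here because the up-weighted negative sample (score 0.4) outranks one of the positives (score 0.35), so that mistake counts three times.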

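drop_intermediate : boolean, optional (default=True)

Whether to drop some suboptimal thresholds which would not appear on a plotted ROC curve. This is useful in order to create lighter ROC curves.

That is, thresholds whose points would not change the shape of the plotted curve can be dropped, so the curve is described by fewer points. This does not improve or otherwise change the classifier's measured performance.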
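A small experiment shows what drop_intermediate does in practice. The data below is made up for illustration; it is chosen so that several consecutive points lie on one straight segment of the curve.

```python
import numpy as np
from sklearn.metrics import auc, roc_curve

# Hypothetical data: four negatives with low scores, two positives with high scores.
y = np.array([0, 0, 0, 0, 1, 1])
scores = np.array([0.1, 0.2, 0.3, 0.4, 0.8, 0.9])

# Keep every distinct threshold.
fpr_full, tpr_full, thr_full = roc_curve(y, scores, drop_intermediate=False)

# Default behaviour: thresholds that do not change the curve's shape are dropped.
fpr_drop, tpr_drop, thr_drop = roc_curve(y, scores, drop_intermediate=True)

auc_full = auc(fpr_full, tpr_full)
auc_drop = auc(fpr_drop, tpr_drop)

print(len(fpr_full), len(fpr_drop))  # number of points on each curve
print(auc_full, auc_drop)            # the area is the same either way
```

Only the number of points differs between the two calls. The dropped points sit on straight segments between the kept ones, so the plotted curve and its area are unchanged.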

Returns:

thresholds : array, shape = [n_thresholds]

Decreasing thresholds on the decision function used to compute fpr and tpr. thresholds[0] represents no instances being predicted and is arbitrarily set to max(y_score) + 1.

That is, the thresholds that were used, in decreasing order.

fpr : array, shape = [>2]

Increasing false positive rates such that element i is the false positive rate of predictions with score >= thresholds[i].

That is, the fpr computed at each threshold.

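tpr : array, shape = [>2]

Increasing true positive rates such that element i is the true positive rate of predictions with score >= thresholds[i].

That is, the tpr computed at each threshold.

This gives a set of tpr and fpr values from which the ROC curve can be drawn. The AUC can be obtained with the function auc(fpr, tpr), whose return value is the AUC.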

5) A worked example:

import numpy as np

import matplotlib.pyplot as plt

from sklearn import metrics

from sklearn.metrics import auc

# y holds the true labels of the data

y = np.array([1,1,2,3])

# scores holds the scores predicted by the classifier

scores = np.array([0.1, 0.2, 0.35, 0.8])

# compute fpr, tpr and thresholds, treating label 2 as the positive class

fpr, tpr, thresholds = metrics.roc_curve(y, scores, pos_label=2)

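The returned values are as follows: fpr = [0, 0.333, 0.333, 1] and tpr = [0, 0, 1, 1]. After the initial entry thresholds[0], the thresholds are [0.8, 0.35, 0.1].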

With a set of fpr and tpr values in hand, the ROC curve for this test can be plotted:

plt.plot(fpr,tpr,marker = 'o')

plt.show()

This gives the ROC curve:

Fig. 4. ROC curve

Compute the AUC:

AUC = auc(fpr, tpr)

The result is AUC = 0.67
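For reuse, the steps of this example can be gathered into one function that also draws the y = x random-guess line for comparison. This is a sketch, not part of sklearn: the function name `plot_roc`, the axis labels and the legend text are choices made here. The Agg backend is selected only so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import auc, roc_curve


def plot_roc(y_true, y_score, pos_label):
    """Plot the ROC curve with the random-guess diagonal; return (figure, AUC)."""
    fpr, tpr, _ = roc_curve(y_true, y_score, pos_label=pos_label)
    roc_auc = auc(fpr, tpr)

    fig, ax = plt.subplots()
    ax.plot(fpr, tpr, marker="o", label=f"ROC (AUC = {roc_auc:.2f})")
    ax.plot([0, 1], [0, 1], linestyle="--", label="random guess (y = x)")
    ax.set_xlabel("False positive rate")
    ax.set_ylabel("True positive rate")
    ax.legend(loc="lower right")
    return fig, roc_auc


# Same data as the worked example above.
y = np.array([1, 1, 2, 3])
scores = np.array([0.1, 0.2, 0.35, 0.8])

fig, roc_auc = plot_roc(y, scores, pos_label=2)
print(round(roc_auc, 2))
```

To view or keep the figure, call plt.show() in an interactive session or fig.savefig with a file name of your choice.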
