ML: KNN Notes

Using a Jupyter notebook.

%matplotlib qt
import numpy as np
from sklearn import metrics
from sklearn.neighbors import KNeighborsClassifier
  1. Read the txt data; the last column is the label
data = []
labels = []
with open('data\\datingTestSet.txt') as f:
    for line in f:
        tokens = line.strip().split('\t')
        data.append([float(tk) for tk in tokens[:-1]])
        labels.append(tokens[-1])

data[1:10]
np.unique(labels)
array(['didntLike', 'largeDoses', 'smallDoses'],
dtype='|S10')
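As an aside, the same parsing can be done in one pass with NumPy's `genfromtxt`. A minimal sketch, using a hypothetical two-row sample in the same tab-separated format (the values below are illustrative, not taken from the real file):

```python
import io
import numpy as np

# Hypothetical sample in the same layout: three numeric features, then a string label
sample = ("40920\t8.326976\t0.953952\tlargeDoses\n"
          "14488\t7.153469\t1.673904\tsmallDoses\n")

# Read everything as strings, then split features from the label column
raw = np.genfromtxt(io.StringIO(sample), delimiter='\t', dtype=str)
x = raw[:, :-1].astype(float)   # all columns except the last are features
labels = raw[:, -1]             # the last column is the string label
```

With the real file, replace `io.StringIO(sample)` by the path `'data\\datingTestSet.txt'`.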

  2. Convert the string labels to numeric labels
x = np.array(data)
labels = np.array(labels)
y = np.zeros(labels.shape)
y[labels=='didntLike'] = 1
y[labels=='smallDoses'] = 2
y[labels=='largeDoses'] = 3
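The three boolean-mask assignments above can also be done with scikit-learn's `LabelEncoder`; a minimal sketch (note it assigns codes in sorted class order, so the numbering differs from the manual 1/2/3 mapping):

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

labels = np.array(['didntLike', 'smallDoses', 'largeDoses', 'smallDoses'])

le = LabelEncoder()
y = le.fit_transform(labels)  # codes follow sorted(classes): didntLike=0, largeDoses=1, smallDoses=2
```

For KNN the particular integer codes do not matter, only that each class gets a distinct value.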
  3. Before the data is normalized
model = KNeighborsClassifier(n_neighbors=3)
model.fit(x,y)
print(model)
expected = y
predicted = model.predict(x)
print(metrics.classification_report(expected, predicted, target_names=['didntLike','smallDoses','largeDoses']))
print(metrics.confusion_matrix(expected, predicted))

Result:

KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='minkowski',
metric_params=None, n_jobs=1, n_neighbors=3, p=2,
weights='uniform')
             precision    recall  f1-score   support

   didntLike      0.89      0.85      0.87       342
  smallDoses      0.93      0.98      0.96       331
  largeDoses      0.82      0.83      0.82       327

 avg / total      0.88      0.88      0.88      1000

[[289   0  53]
 [  1 325   5]
 [ 33  24 270]]

  4. Normalize the data to the [0, 1] range
from sklearn import preprocessing
min_max_scaler = preprocessing.MinMaxScaler()
X_train_minmax = min_max_scaler.fit_transform(x)
X_train_minmax
array([[ 0.44832535,  0.39805139,  0.56233353],
       [ 0.15873259,  0.34195467,  0.98724416],
       [ 0.28542943,  0.06892523,  0.47449629],
       ..., 
       [ 0.29115949,  0.50910294,  0.51079493],
       [ 0.52711097,  0.43665451,  0.4290048 ],
       [ 0.47940793,  0.3768091 ,  0.78571804]])
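`MinMaxScaler` computes, per column, `(x - min) / (max - min)`. A minimal sketch on made-up data, checking the manual formula against the scaler:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Toy data: the second feature has a much larger numeric range than the first
x = np.array([[1.0, 200.0],
              [2.0, 400.0],
              [3.0, 300.0]])

# Manual per-column min-max scaling
x_min = x.min(axis=0)
x_max = x.max(axis=0)
x_manual = (x - x_min) / (x_max - x_min)

# sklearn's scaler should produce the same result
x_sklearn = MinMaxScaler().fit_transform(x)
```

After scaling, both columns lie in [0, 1], so neither dominates the Euclidean distance that KNN uses.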
  5. Split the data into training and test sets
from sklearn.model_selection import train_test_split
# Split training data and test data
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size = 0.2)  
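A self-contained sketch of what `train_test_split` does here, on toy data (the `random_state` argument, not used above, makes the split reproducible):

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(20).reshape(10, 2)  # 10 samples, 2 features
y = np.arange(10)

# test_size=0.2 holds out 20% of the rows (2 of 10) for testing
x_tr, x_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)
```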
  6. Results after normalization (n_neighbors=3: the K of K-nearest neighbors is set to 3)
x_train, x_test, y_train, y_test = train_test_split(X_train_minmax, y, test_size = 0.2)  
model = KNeighborsClassifier(n_neighbors=3)
model.fit(x_train,y_train)
print(model)
expected = y_test
predicted = model.predict(x_test)
print(metrics.classification_report(expected, predicted, target_names=['didntLike','smallDoses','largeDoses']))
print(metrics.confusion_matrix(expected, predicted))

Result:

KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='minkowski',
metric_params=None, n_jobs=1, n_neighbors=3, p=2,
weights='uniform')
             precision    recall  f1-score   support

   didntLike      0.97      1.00      0.99        68
  smallDoses      0.93      1.00      0.96        51
  largeDoses      1.00      0.93      0.96        81

 avg / total      0.97      0.97      0.97       200

[[68  0  0]
 [ 0 51  0]
 [ 2  4 75]]

Summary:
The results after normalization differ greatly from those before. KNN classifies by Euclidean distance, so before normalization the feature with the largest numeric range dominates the distance computation; scaling every feature to [0, 1] lets each contribute equally. (Note also that the first report is evaluated on the training data itself, while the second uses a held-out 20% test set.)
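Beyond comparing one split, 5-fold cross-validation gives a more stable estimate and can be used to choose K. A minimal sketch on the built-in iris dataset (substituting for the dating data, which is not bundled with sklearn):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import MinMaxScaler

X, y = load_iris(return_X_y=True)
X = MinMaxScaler().fit_transform(X)  # normalize first, as above

# Mean 5-fold accuracy for several candidate values of K
scores = {k: cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5).mean()
          for k in (1, 3, 5, 7)}
best_k = max(scores, key=scores.get)
```

(Strictly, fitting the scaler on all data before cross-validating leaks a little information; a `Pipeline` avoids that, but this keeps the sketch short.)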
