Numeric Mapping and One-Hot Encoding

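In machine learning and deep learning, an input feature that is categorical has to be converted to numbers before it can take part in the network's weighted sums and the activations that follow. Depending on whether the categorical feature carries a sense of magnitude, it can be handled in one of two ways: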

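If the categorical feature carries magnitude and should enter later computations as numbers of different sizes (for example a size or a model grade), assign numeric codes directly through a mapping.
If the categorical feature carries no magnitude, expand it into one new feature per category and set the feature matching the input value to 1. This handles the categorical feature and also lets it take part in the computation. The method is called one-hot encoding.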

One-hot is a group of bits among which the legal combinations of values are only those with a single high (1) bit and all the others low (0). A similar implementation in which all bits are '1' except one '0' is sometimes called one-cold. - Wiki
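To make the definition concrete, here is a minimal sketch of one-hot encoding written directly in NumPy. The helper name `one_hot` and the sample indices are illustrative assumptions, not part of the original post; the only requirement is that each category has already been given an integer index starting at 0.

```python
import numpy as np

def one_hot(indices, num_classes):
    """Return a (len(indices), num_classes) array with a single 1 per row."""
    indices = np.asarray(indices, dtype=int)
    # Row i of the identity matrix is the one-hot vector for class i,
    # so indexing the identity matrix by the labels encodes all of them at once.
    return np.eye(num_classes, dtype=int)[indices]

# Hypothetical example: three categories indexed 0, 1, 2.
encoded = one_hot([1, 2, 0], num_classes=3)
print(encoded)
```

Every row contains exactly one 1, which is the property the definition above describes.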

One-hot Encoding for categorical data

Both approaches are easy to carry out with Pandas:

import pandas as pd

df = pd.DataFrame([
    ['green', 'M', 10.1, 'class1'],   
    ['red', 'L', 13.5, 'class2'],   
    ['blue', 'XL', 15.3, 'class1']], 
    columns=['color', 'size', 'prize', 'class_label'])  
df

Out[2]:
    color   size prize  class_label
0   green   M    10.1   class1
1   red     L    13.5   class2
2   blue    XL   15.3   class1

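Apply a numeric mapping to size, a feature whose categories have a meaningful magnitude. The class label carries no magnitude, but because it has only two categories it is also mapped here for demonstration. Note that it could equally be handled the way color is handled later: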

In [3]:
# mapping the size
size_mapping = {'XL': 3, 'L': 2,  'M': 1} 
df['size'] = df['size'].map(size_mapping)  

# mapping the class
# note: set() has no guaranteed order, so which label gets 0 or 1 may differ between runs
class_mapping = {label: index for index, label in enumerate(set(df['class_label']))}
df['class_label'] = df['class_label'].map(class_mapping)

df
Out[3]:
    color   size    prize   class_label
0   green   1       10.1    1
1   red     2       13.5    0
2   blue    3       15.3    1
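A mapping like this can be reversed when the original labels are needed again, for example to report predictions. The sketch below rebuilds a small Series rather than reusing `df`, and the name `inverse_size_mapping` is an assumption introduced for illustration.

```python
import pandas as pd

size_mapping = {'XL': 3, 'L': 2, 'M': 1}
# Invert the dictionary: numeric code -> original label.
inverse_size_mapping = {code: label for label, code in size_mapping.items()}

# Encode a small illustrative column, then decode it again.
encoded = pd.Series(['M', 'L', 'XL']).map(size_mapping)
decoded = encoded.map(inverse_size_mapping)
print(decoded.tolist())
```

The inversion only works because every label maps to a distinct number; two labels sharing a code could not be told apart afterwards.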

Apply one-hot encoding with pd.get_dummies( ) to the color column, whose categories have no magnitude, then drop the original color column from the encoded data:

In [8]:
one_hot_encoded = pd.concat([df, pd.get_dummies(df['color'], prefix='color')], axis=1)
one_hot_encoded
Out[8]:
    color   size    prize   class_label color_blue  color_green color_red
0   green   1       10.1    1           0           1           0
1   red     2       13.5    0           0           0           1
2   blue    3       15.3    1           1           0           0

In [10]:
one_hot_encoded.drop('color', axis=1)
Out[10]:
    size    prize   class_label color_blue  color_green color_red
0   1       10.1    1           0           1           0
1   2       13.5    0           0           0           1
2   3       15.3    1           1           0           0
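The concat-then-drop steps can also be collapsed into one call by passing the `columns` argument of `pd.get_dummies`, which encodes the listed columns and removes the originals. The following is a sketch on a reduced, illustrative DataFrame; `dtype=int` is passed because some pandas versions return boolean indicator columns by default.

```python
import pandas as pd

# Reduced illustrative frame: one categorical column and one numeric column.
df = pd.DataFrame({
    'color': ['green', 'red', 'blue'],
    'size': [1, 2, 3],
})

# Encode 'color' in place of the original column; untouched columns come first.
encoded = pd.get_dummies(df, columns=['color'], prefix='color', dtype=int)
print(encoded)
```

The result has the columns size, color_blue, color_green, and color_red, with no leftover color column to drop.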

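In Keras, one-hot encoding can be done with np_utils.to_categorical( ) as follows (recent Keras releases expose the same function directly as keras.utils.to_categorical, which is the import to try if np_utils is not available in your installation):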

In [1]
from keras.utils import np_utils

# y_train and y_test are assumed to be integer label arrays (values 0-9) loaded earlier
# print first ten (integer-valued) training labels
print('Integer-valued labels:')
print(y_train[:10])

# one-hot encode the labels
y_train = np_utils.to_categorical(y_train, 10)
y_test = np_utils.to_categorical(y_test, 10)

# print first ten (one-hot) training labels
print('One-hot labels:')
print(y_train[:10])

Out[1]

Integer-valued labels:
[5 0 4 1 9 2 1 3 1 4]
One-hot labels:
[[ 0.  0.  0.  0.  0.  1.  0.  0.  0.  0.]
 [ 1.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  1.  0.  0.  0.  0.  0.]
 [ 0.  1.  0.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  1.]
 [ 0.  0.  1.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  1.  0.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  1.  0.  0.  0.  0.  0.  0.]
 [ 0.  1.  0.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  1.  0.  0.  0.  0.  0.]]
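Going the other way, from one-hot rows back to integer labels, is an argmax over each row. The sketch below uses only NumPy and rebuilds the ten labels printed above with an identity-matrix lookup instead of calling Keras, so it stands in for the output of to_categorical rather than reproducing that function.

```python
import numpy as np

# The ten integer labels shown in the output above.
labels = np.array([5, 0, 4, 1, 9, 2, 1, 3, 1, 4])

# Stand-in for the one-hot matrix: row i of the identity is the vector for class i.
one_hot = np.eye(10)[labels]

# The position of the single 1 in each row is the original label.
recovered = one_hot.argmax(axis=1)
print(recovered)
```

The same argmax step is what turns a network's per-class output scores into a predicted label.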
