Classifying Discrete Data with a CNN (Image Classification as the Example)
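Convolution
- Convolution can be regarded as an effective way to extract features from an image.
- Typically a square convolution kernel slides over the input feature map with a specified stride, traversing every pixel of the input feature map. At each step the kernel overlaps a region of the input feature map; the overlapping elements are multiplied pairwise, summed, and the bias is added, which yields one pixel of the output feature map.
- The depth (number of channels) of the input feature map determines the depth of the kernels in the current layer.
- The number of kernels in the current layer determines the depth of that layer's output feature map.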
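The multiply, sum, and add-bias step described above can be sketched in plain Python for a single-channel input and a single kernel. This is only an illustration under assumptions: the function name `conv2d_single` and the sample values are hypothetical, no padding is applied, and a real layer would also sum over all input channels.

```python
def conv2d_single(image, kernel, bias=0.0, stride=1):
    """Slide a square kernel over a 2-D list without padding."""
    k = len(kernel)
    out_h = (len(image) - k) // stride + 1
    out_w = (len(image[0]) - k) // stride + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Multiply the overlapping elements pairwise and sum them.
            total = sum(
                image[r * stride + i][c * stride + j] * kernel[i][j]
                for i in range(k)
                for j in range(k)
            )
            # Adding the bias gives one pixel of the output feature map.
            row.append(total + bias)
        output.append(row)
    return output


image = [
    [1, 2, 0, 1],
    [0, 1, 3, 2],
    [2, 1, 0, 1],
    [1, 0, 2, 3],
]
kernel = [
    [1, 0],
    [0, -1],
]

stride1_map = conv2d_single(image, kernel, bias=1, stride=1)  # 3x3 output
stride2_map = conv2d_single(image, kernel, bias=1, stride=2)  # 2x2 output
print(stride1_map)
print(stride2_map)
```

A 4x4 input with a 2x2 kernel gives a 3x3 output at stride 1 and a 2x2 output at stride 2, which shows how the stride controls the output size.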
Receptive field
- Receptive field: the size of the region in the original input image that each pixel of an output feature map in the convolutional network maps back to.
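As a rough illustration of this definition, the receptive field of a stack of layers can be computed from each layer's kernel size and stride. The helper `receptive_field` below is a hypothetical name, and the sketch assumes square kernels and ignores padding effects at the image border.

```python
def receptive_field(layers):
    """layers: list of (kernel_size, stride) pairs, ordered from input to output."""
    field = 1  # before any layer, one pixel only sees itself
    jump = 1   # distance in input pixels between neighbouring outputs
    for kernel_size, stride in layers:
        # Each layer widens the field by (kernel_size - 1) steps of the current jump.
        field += (kernel_size - 1) * jump
        jump *= stride
    return field


two_3x3 = receptive_field([(3, 1), (3, 1)])         # two stacked 3x3 convolutions
one_5x5 = receptive_field([(5, 1)])                 # one 5x5 convolution
conv_then_pool = receptive_field([(5, 1), (2, 2)])  # 5x5 convolution, then 2x2 pooling with stride 2
print(two_3x3, one_5x5, conv_then_pool)
```

Two stacked 3x3 convolutions cover the same 5x5 input region as a single 5x5 convolution, which is one reason deep networks often prefer small kernels.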
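Zero padding (padding)
- With zero padding, the convolution output keeps the same spatial size as the input feature map (for a stride of 1).
- TF expresses zero padding with the parameter padding='SAME' (pad with zeros) or padding='VALID' (no padding).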
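The effect of the two padding modes on the output size can be checked with a small helper. This is a sketch under the usual TensorFlow sizing rule (SAME: input size divided by stride, rounded up; VALID: input size minus kernel size plus one, divided by stride, rounded up). The function name `conv_output_size` is hypothetical.

```python
import math


def conv_output_size(input_size, kernel_size, stride=1, padding="valid"):
    """Output length along one spatial dimension (height or width)."""
    mode = padding.lower()
    if mode == "same":
        # Zero padding is added so that only the stride shrinks the output.
        return math.ceil(input_size / stride)
    if mode == "valid":
        # No padding: the kernel must fit entirely inside the input.
        return math.ceil((input_size - kernel_size + 1) / stride)
    raise ValueError("padding must be 'same' or 'valid'")


same_size = conv_output_size(32, 5, stride=1, padding="same")
valid_size = conv_output_size(32, 5, stride=1, padding="valid")
pooled_size = conv_output_size(32, 2, stride=2, padding="same")
print(same_size, valid_size, pooled_size)
```

For a 32-pixel side and a 5x5 kernel at stride 1, SAME keeps 32 while VALID shrinks the side to 28.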
Describing a convolution layer in TF
tf.keras.layers.Conv2D(
    filters=number of kernels,
    kernel_size=kernel size,  # an integer side length for a square kernel, or (kernel height h, kernel width w)
    strides=stride,  # an integer if both directions match, or (vertical stride h, horizontal stride w); default 1
    padding="same" or "valid",  # "same" uses zero padding, "valid" does not (default)
    activation="relu" or "sigmoid" or "tanh" or "softmax", etc.,  # leave this out if a BN layer follows
    input_shape=(height, width, channels)  # dimensions of the input feature map; optional
)
For example, the three Conv2D and MaxPool2D pairs below show equivalent ways of passing the same arguments:
model=tf.keras.models.Sequential([
    Conv2D(6,5,padding='valid',activation='sigmoid'),
    MaxPool2D(2,2),
    Conv2D(6,(5,5),padding='valid',activation='sigmoid'),
    MaxPool2D(2,2),
    Conv2D(filters=6,kernel_size=(5,5),padding='valid',activation='sigmoid'),
    MaxPool2D(pool_size=(2,2),strides=2),
    Flatten(),
    Dense(10,activation='softmax')
])
Batch normalization (BN)
- Standardization: transform the data so that it has mean 0 and standard deviation 1.
- Batch normalization: standardize one mini-batch of data at a time so that it returns to zero mean and unit variance. It is commonly placed between the convolution and the activation.
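- After batch normalization, the $i$-th pixel of the output feature map of the $k$-th kernel satisfies
$${H'}_i^{k} = \frac{H_i^{k} - \mu_{batch}^{k}}{\sigma_{batch}^{k}}$$
In this formula:
$H_i^{k}$: the $i$-th pixel of the output feature map of the $k$-th kernel, before batch normalization
$\mu_{batch}^{k}$: the mean of all pixels in the batch's output feature maps of the $k$-th kernel, before batch normalization
$\sigma_{batch}^{k}$: the standard deviation of all pixels in the batch's output feature maps of the $k$-th kernel, before batch normalization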
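However, this plain standardization forces the feature data to follow a standard normal distribution, concentrating it in the roughly linear region at the center of the activation function, so the activation loses its nonlinearity. BN therefore introduces two trainable parameters for each kernel: a scale factor $\gamma_k$ and an offset factor $\beta_k$.
During backpropagation, $\gamma_k$ and $\beta_k$ are trained together with the other trainable parameters. They adjust the width and the offset of the standardized feature distribution, which preserves the network's nonlinear expressive power.
The BN layer sits after the convolution layer and before the activation layer.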
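The standardize-then-scale-and-shift computation for one kernel can be sketched as follows. This is a simplified illustration, not the Keras implementation. It assumes all pixels produced by one kernel across the batch are flattened into one list, and it adds a small `epsilon` for numerical stability (an assumption not stated in the notes). The names `batch_norm`, `gamma`, and `beta` are illustrative, and only training-time behaviour is modelled.

```python
import math


def batch_norm(values, gamma=1.0, beta=0.0, epsilon=1e-3):
    """Standardize one kernel's pixels over the batch, then scale and shift."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(variance + epsilon)  # epsilon avoids division by zero
    # gamma widens or narrows the distribution, beta moves its center.
    return [gamma * (v - mean) / std + beta for v in values]


def mean_and_std(values):
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return mean, std


pixels = [1.0, 2.0, 3.0, 4.0, 5.0, 9.0]
plain_mean, plain_std = mean_and_std(batch_norm(pixels))
scaled_mean, scaled_std = mean_and_std(batch_norm(pixels, gamma=2.0, beta=0.5))
print(plain_mean, plain_std)
print(scaled_mean, scaled_std)
```

With the default `gamma=1` and `beta=0` the output has mean close to 0 and standard deviation close to 1. With `gamma=2` and `beta=0.5` the same data ends up centered near 0.5 with a standard deviation near 2.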
Describing batch normalization in TF: tf.keras.layers.BatchNormalization()
model=tf.keras.models.Sequential([
    Conv2D(filters=6,kernel_size=(5,5),padding='same'),  # convolution layer
    BatchNormalization(),  # BN layer
    Activation('relu'),  # activation layer
    MaxPool2D(pool_size=(2,2),strides=2,padding='same'),  # pooling layer
    Dropout(0.2),  # dropout layer
])
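Pooling
- Pooling reduces the amount of feature data in a convolutional network. The two main methods are max pooling and average pooling: max pooling extracts image texture, while average pooling preserves background features.
- Describing pooling in TF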
tf.keras.layers.MaxPool2D(
    pool_size=pooling window size,  # an integer side length for a square window, or (window height h, window width w)
    strides=pooling stride,  # an integer, or (vertical stride h, horizontal stride w); defaults to pool_size
    padding='valid' or 'same'  # "same" uses zero padding, "valid" does not (default)
)
-------------------------------------------------------------------------
tf.keras.layers.AveragePooling2D(
    pool_size=pooling window size,  # an integer side length for a square window, or (window height h, window width w)
    strides=pooling stride,  # an integer, or (vertical stride h, horizontal stride w); defaults to pool_size
    padding='valid' or 'same'  # "same" uses zero padding, "valid" does not (default)
)
--------------------------------------------------------------------------
model=tf.keras.models.Sequential([
    Conv2D(filters=6,kernel_size=(5,5),padding='same'),  # convolution layer
    BatchNormalization(),  # BN layer
    Activation('relu'),  # activation layer
    MaxPool2D(pool_size=(2,2),strides=2,padding='same'),  # pooling layer
    Dropout(0.2),  # dropout layer
])
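To make the difference between the two pooling methods concrete, here is a minimal plain-Python sketch on a single-channel feature map. The function `pool2d` and the sample values are hypothetical. It applies no padding and defaults the stride to the window size, mirroring the default described above.

```python
def pool2d(feature_map, pool_size=2, stride=None, mode="max"):
    """Max or average pooling over a 2-D list, without padding."""
    if mode not in ("max", "avg"):
        raise ValueError("mode must be 'max' or 'avg'")
    stride = pool_size if stride is None else stride
    out_h = (len(feature_map) - pool_size) // stride + 1
    out_w = (len(feature_map[0]) - pool_size) // stride + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Collect the values covered by the pooling window.
            window = [
                feature_map[r * stride + i][c * stride + j]
                for i in range(pool_size)
                for j in range(pool_size)
            ]
            if mode == "max":
                row.append(max(window))
            else:
                row.append(sum(window) / len(window))
        output.append(row)
    return output


activations = [
    [1, 3, 2, 0],
    [5, 7, 1, 1],
    [0, 2, 8, 4],
    [2, 0, 6, 2],
]
max_pooled = pool2d(activations, mode="max")
avg_pooled = pool2d(activations, mode="avg")
print(max_pooled)
print(avg_pooled)
```

Both variants turn the 4x4 map into a 2x2 map. Max pooling keeps only the strongest response in each window, while average pooling blends all four values.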
Dropout
- To mitigate overfitting, a fraction of the hidden-layer neurons is temporarily dropped from the network at a given rate during training. When the network is used for inference, all neurons are restored.
- Describing dropout in TF: `tf.keras.layers.Dropout(drop probability)`
model=tf.keras.models.Sequential([
    Conv2D(filters=6,kernel_size=(5,5),padding='same'),  # convolution layer
    BatchNormalization(),  # BN layer
    Activation('relu'),  # activation layer
    MaxPool2D(pool_size=(2,2),strides=2,padding='same'),  # pooling layer
    Dropout(0.2),  # dropout layer; 0.2 means 20% of the neurons are dropped at random
])
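The train-versus-inference behaviour of dropout can be sketched in plain Python. This is an illustration, not the Keras code. It uses the "inverted dropout" convention that the Keras layer documents, where surviving activations are scaled by 1 / (1 - rate) during training so that nothing needs rescaling at inference. The function name and the seeded `rng` argument are assumptions for the example.

```python
import random


def dropout(values, rate, training, rng):
    """Zero each value with probability `rate` while training; pass through otherwise."""
    if not training:
        # At inference time every neuron is kept unchanged.
        return list(values)
    keep_scale = 1.0 / (1.0 - rate)
    # Kept values are scaled up so the expected sum matches the input.
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]


rng = random.Random(0)  # seeded only to make the sketch repeatable
hidden = [1.0] * 10
train_out = dropout(hidden, rate=0.2, training=True, rng=rng)
infer_out = dropout(hidden, rate=0.2, training=False, rng=rng)
print(train_out)
print(infer_out)
```

During training each output is either 0 or 1.25 (that is, 1 / 0.8), while the inference call returns the input untouched.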
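Convolutional neural network
- A convolutional neural network uses convolution kernels to extract features from the input and then feeds the extracted features into a fully connected network for recognition and prediction. Feature extraction consists of convolution, batch normalization, activation, and pooling, usually followed by dropout. This feature extractor can be remembered as CBAPD: C for the convolution Conv2D, B for batch normalization BN, A for the Activation layer, P for Pooling, and D for Dropout.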
The cifar10 dataset
- Provides 50,000 color images of $32 \times 32$ pixels in ten classes, with labels, for training
- Provides 10,000 color images of $32 \times 32$ pixels in ten classes, with labels, for testing
import tensorflow as tf
from matplotlib import pyplot as plt
import numpy as np
np.set_printoptions(threshold=np.inf)
cifar10 = tf.keras.datasets.cifar10
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
# Visualize the first element of the training set's input features
plt.imshow(x_train[0])  # draw the image
plt.show()
# Print the first element of the training set's input features
#print("x_train[0]:\n", x_train[0])
# Print the first element of the training set's labels
#print("y_train[0]:\n", y_train[0])
# Print the shape of the training set's input features
print("x_train.shape:\n", x_train.shape)
# Print the shape of the training set's labels
print("y_train.shape:\n", y_train.shape)
# Print the shape of the test set's input features
print("x_test.shape:\n", x_test.shape)
# Print the shape of the test set's labels
print("y_test.shape:\n", y_test.shape)

[Figure: output_1_0.png, the first training image x_train[0] as drawn by plt.imshow]
x_train.shape:
(50000, 32, 32, 3)
y_train.shape:
(50000, 1)
x_test.shape:
(10000, 32, 32, 3)
y_test.shape:
(10000, 1)
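Example: building a convolutional neural network
Convolution stage
- C (kernels: $6 \times 5 \times 5$, stride: 1, padding: same)
- B (yes)
- A (relu)
- P (max, window: $2 \times 2$, stride: 2, padding: same)
- D (0.2)
Flatten
Dense (units: 128, activation: relu, Dropout: 0.2)
Dense (units: 10, activation: softmax)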
Implementing five classic convolutional networks: LeNet, AlexNet, VGGNet, InceptionNet, ResNet
import tensorflow as tf
import os
import numpy as np
from matplotlib import pyplot as plt
from tensorflow.keras.layers import Conv2D, BatchNormalization, Activation, MaxPool2D, Dropout, Flatten, Dense
from tensorflow.keras import Model
np.set_printoptions(threshold=np.inf)
cifar10 = tf.keras.datasets.cifar10
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
class Baseline(Model):
    def __init__(self):
        super(Baseline, self).__init__()
        self.c1 = Conv2D(filters=6, kernel_size=(5, 5), padding='same')  # convolution layer
        self.b1 = BatchNormalization()  # BN layer
        self.a1 = Activation('relu')  # activation layer
        self.p1 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')  # pooling layer
        self.d1 = Dropout(0.2)  # dropout layer
        self.flatten = Flatten()
        self.f1 = Dense(128, activation='relu')
        self.d2 = Dropout(0.2)
        self.f2 = Dense(10, activation='softmax')

    def call(self, x):
        # Feature extraction: C -> B -> A -> P -> D
        x = self.c1(x)
        x = self.b1(x)
        x = self.a1(x)
        x = self.p1(x)
        x = self.d1(x)
        # Classification head: flatten, then two fully connected layers
        x = self.flatten(x)
        x = self.f1(x)
        x = self.d2(x)
        y = self.f2(x)
        return y
model = Baseline()
model.compile(optimizer='adam',
loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
metrics=['sparse_categorical_accuracy'])
checkpoint_save_path = "./checkpoint/Baseline.ckpt"
if os.path.exists(checkpoint_save_path + '.index'):
    print('-------------load the model-----------------')
    model.load_weights(checkpoint_save_path)
cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path,
                                                 save_weights_only=True,
                                                 save_best_only=True)
history = model.fit(x_train, y_train, batch_size=32, epochs=5, validation_data=(x_test, y_test), validation_freq=1,
                    callbacks=[cp_callback])
model.summary()
# print(model.trainable_variables)
# Write the name, shape, and values of every trainable variable to a text file
with open('./weights.txt', 'w') as file:
    for v in model.trainable_variables:
        file.write(str(v.name) + '\n')
        file.write(str(v.shape) + '\n')
        file.write(str(v.numpy()) + '\n')
############################################### show ###############################################
# Plot the accuracy and loss curves for the training and validation sets
acc = history.history['sparse_categorical_accuracy']
val_acc = history.history['val_sparse_categorical_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']
plt.subplot(1, 2, 1)
plt.plot(acc, label='Training Accuracy')
plt.plot(val_acc, label='Validation Accuracy')
plt.title('Training and Validation Accuracy')
plt.legend()
plt.subplot(1, 2, 2)
plt.plot(loss, label='Training Loss')
plt.plot(val_loss, label='Validation Loss')
plt.title('Training and Validation Loss')
plt.legend()
plt.show()
-------------load the model-----------------
Epoch 1/5
1563/1563 [==============================] - 15s 9ms/step - loss: 1.2475 - sparse_categorical_accuracy: 0.5558 - val_loss: 1.4672 - val_sparse_categorical_accuracy: 0.5108
Epoch 2/5
1563/1563 [==============================] - 15s 10ms/step - loss: 1.2192 - sparse_categorical_accuracy: 0.5673 - val_loss: 1.1591 - val_sparse_categorical_accuracy: 0.5880
Epoch 3/5
1563/1563 [==============================] - 15s 10ms/step - loss: 1.1987 - sparse_categorical_accuracy: 0.5734 - val_loss: 1.1963 - val_sparse_categorical_accuracy: 0.5760
Epoch 4/5
1563/1563 [==============================] - 15s 10ms/step - loss: 1.1813 - sparse_categorical_accuracy: 0.5805 - val_loss: 1.1741 - val_sparse_categorical_accuracy: 0.5820
Epoch 5/5
1563/1563 [==============================] - 15s 10ms/step - loss: 1.1692 - sparse_categorical_accuracy: 0.5839 - val_loss: 1.1454 - val_sparse_categorical_accuracy: 0.5940
Model: "baseline_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_1 (Conv2D) multiple 456
_________________________________________________________________
batch_normalization_1 (Batch multiple 24
_________________________________________________________________
activation_1 (Activation) multiple 0
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 multiple 0
_________________________________________________________________
dropout_2 (Dropout) multiple 0
_________________________________________________________________
flatten_1 (Flatten) multiple 0
_________________________________________________________________
dense_2 (Dense) multiple 196736
_________________________________________________________________
dropout_3 (Dropout) multiple 0
_________________________________________________________________
dense_3 (Dense) multiple 1290
=================================================================
Total params: 198,506
Trainable params: 198,494
Non-trainable params: 12
_________________________________________________________________
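The parameter counts in this summary can be reproduced by hand. The sketch below is illustrative arithmetic only, with hypothetical helper names. It assumes the 32x32x3 CIFAR-10 input, the 5x5 convolution with 6 kernels, the 2x2 stride-2 pooling with "same" padding (which halves each side to 16), and that each BN channel holds two trainable values (scale and offset) plus two non-trainable running statistics.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    # One weight per kernel element per input channel, plus one bias per kernel.
    return kernel_h * kernel_w * in_channels * filters + filters


def batch_norm_params(channels):
    trainable = 2 * channels      # scale and offset per channel
    non_trainable = 2 * channels  # running mean and variance per channel
    return trainable, non_trainable


def dense_params(inputs, units):
    # One weight per input-unit pair, plus one bias per unit.
    return inputs * units + units


conv = conv2d_params(5, 5, 3, 6)
bn_trainable, bn_frozen = batch_norm_params(6)
flat = 16 * 16 * 6  # 32x32 halved by pooling, with 6 channels
dense1 = dense_params(flat, 128)
dense2 = dense_params(128, 10)

total = conv + bn_trainable + bn_frozen + dense1 + dense2
trainable_total = total - bn_frozen
print(conv, bn_trainable + bn_frozen, dense1, dense2)
print(total, trainable_total, bn_frozen)
```

The results match the summary: 456, 24, 196736, and 1290 per layer, with 198,506 parameters in total, of which 198,494 are trainable and 12 are not.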

[Figure: output_3_1.png, training and validation accuracy (left) and loss (right) curves]