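Name: 王咫毅
Student ID: 19021211150
【Qianniu Overview】This post uses TensorFlow to build a convolutional neural network (CNN), trains it on the MNIST dataset, then evaluates it on the MNIST test set and reports the accuracy.
【Qianniu Keywords】mnist cnn
【Qianniu Questions】How do you load the MNIST dataset and train on it? How is recognition accuracy defined?
【Qianniu Body】
Reposted from: TensorFlow Notes (5): MNIST Handwriting Recognition Series, Part 2 - FANG_YANG - cnblogs (博客园)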
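Preface#
This post assumes some background. For the basics, read the earlier TensorFlow notes first; at a minimum, read MNIST Handwriting Recognition Series, Part 1 and the introduction to CNNs. Material already covered there is not explained again in detail, only mentioned where needed. The Jupyter notebook is available at this Baidu Cloud link so you can download it and experiment; the password is 5dx9.
Practice#
First, import the modules we need. The code in this post uses the TensorFlow 1.x API.
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
Then load the MNIST dataset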
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
If the output looks like the figure below, the dataset was loaded successfully:

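If anything about loading the MNIST dataset is unclear, see here. Next we define two functions, one that creates weights and one that creates biases
def weight_variable(shape):
    # Small random values drawn from a truncated normal distribution
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    # Small positive constant for every bias
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)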
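Notes:
Weights should be initialized with a small amount of noise (standard deviation stddev=0.1) to break symmetry and avoid zero gradients. Because we use ReLU neurons, it is good practice to initialize the biases with a small positive value, which avoids neurons whose output is always 0 (dead neurons). So that we do not repeat the initialization code while building the model, we wrap it in these two functions.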
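To make the weight initialization concrete, here is a minimal pure-Python sketch of truncated-normal sampling: draw from a normal distribution and re-draw any value that lands more than two standard deviations from the mean. The function name truncated_normal and its seed parameter are illustrative assumptions; this is not TensorFlow's implementation, only the same idea on a small scale.

```python
import random

def truncated_normal(n, stddev=0.1, mean=0.0, seed=None):
    """Draw n samples from N(mean, stddev^2), re-drawing any sample that
    falls more than two standard deviations away from the mean.
    (Hypothetical helper for illustration only.)"""
    rng = random.Random(seed)
    samples = []
    while len(samples) < n:
        value = rng.gauss(mean, stddev)
        if abs(value - mean) <= 2 * stddev:
            samples.append(value)
    return samples

# As many values as a 5x5 kernel bank with 1 input and 32 output channels needs
weights = truncated_normal(5 * 5 * 1 * 32, stddev=0.1, seed=0)
print(len(weights), min(weights), max(weights))
```

Every sample stays within [-0.2, 0.2], so no weight starts out unusually large, while the values still differ enough to break symmetry.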
Next, define the two functions conv2d and max_pool_2x2
def conv2d(x, W):
    # strides = [1, x_movement, y_movement, 1]
    # Must have strides[0] = strides[3] = 1
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
    # strides = [1, x_movement, y_movement, 1]
    # ksize = [1, pool_op_length, pool_op_width, 1]
    # Must have ksize[0] = ksize[3] = 1
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
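Notes:
conv2d takes the image x to be convolved and the convolution kernel W. Inside the function, strides sets the step of the kernel, as annotated in the comments: the kernel moves one pixel at a time along both the x and y axes, so both strides are 1. padding is set to 'SAME', which zero-pads the input so that, with stride 1, the output has the same height and width as the input.
max_pool_2x2 takes the convolved image x. ksize is the pooling window; because this is 2x2 max pooling, its length and width are both 2. The stride is 2 along both the x and y axes, so each spatial dimension is halved. 'SAME' padding is used here as well.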
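The effect of 'SAME' padding on the image size can be checked with a little arithmetic. The sketch below assumes the usual rule that the output length along one axis is ceil(input / stride); the helper names same_output_size and same_total_padding are made up for this illustration.

```python
def same_output_size(input_size, stride):
    """Output length along one axis under 'SAME' padding: ceil(input / stride)."""
    return -(-input_size // stride)  # integer ceiling division

def same_total_padding(input_size, kernel_size, stride):
    """Total zero padding along one axis needed to reach that output length."""
    out = same_output_size(input_size, stride)
    return max((out - 1) * stride + kernel_size - input_size, 0)

conv_out = same_output_size(28, stride=1)            # 5x5 convolution, stride 1
pool_out = same_output_size(conv_out, stride=2)      # 2x2 max pooling, stride 2
pool_out_again = same_output_size(pool_out, stride=2)  # pooling applied a second time
print(conv_out, pool_out, pool_out_again)            # 28 14 7

print(same_total_padding(28, kernel_size=5, stride=1))  # 4, i.e. 2 pixels per side
print(same_total_padding(28, kernel_size=2, stride=2))  # 0, no padding needed
```

So a stride-1 convolution keeps a 28x28 image at 28x28, and each 2x2 pooling step halves it, first to 14x14 and then to 7x7.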
Next, we use placeholders to define the inputs: the image batch xs, the corresponding labels ys, and the dropout keep probability keep_prob
xs = tf.placeholder(tf.float32, [None, 784])  # 28x28
ys = tf.placeholder(tf.float32, [None, 10])
keep_prob = tf.placeholder(tf.float32)
Because tf.nn.conv2d and tf.nn.max_pool require the input image to be a 4-D tensor, we reshape xs to meet that requirement
x_image = tf.reshape(xs, [-1, 28, 28, 1])  # [n_samples, 28, 28, 1]
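Notes:
x_image is a 4-D tensor with dimensions [batch, height, width, channels]. The batch size is inferred from the first dimension of xs (that is what -1 means), the height and width are both 28, and the channel count is 1 because the images are grayscale; for RGB images it would be 3.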
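As a small illustration of what the reshape does to a single image, the sketch below turns a flat list of 784 values into a nested 28x28x1 structure in row-major order, which is the order tf.reshape uses. The function name reshape_to_image is a hypothetical helper, and the batch dimension is left out for brevity.

```python
def reshape_to_image(flat, height=28, width=28):
    """Arrange a flat list of height*width pixels as [height][width][1],
    reading the flat list row by row (row-major order)."""
    if len(flat) != height * width:
        raise ValueError("expected %d pixels, got %d" % (height * width, len(flat)))
    return [[[flat[r * width + c]] for c in range(width)] for r in range(height)]

flat = list(range(784))            # stand-in for one flattened MNIST image
image = reshape_to_image(flat)
print(len(image), len(image[0]), len(image[0][0]))  # 28 28 1
print(image[1][0][0])              # 28: the first pixel of the second row
```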
Next we start building the convolutional neural network, beginning with the first convolutional layer and the first pooling layer
## conv1 layer ##
W_conv1 = weight_variable([5, 5, 1, 32])  # patch 5x5, in size 1, out size 32
b_conv1 = bias_variable([32])
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)  # output size 28x28x32
## pool1 layer ##
h_pool1 = max_pool_2x2(h_conv1)  # output size 14x14x32
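Notes:
The convolution kernel is 5x5. With 1 input channel and 32 output channels, there are 32 different kernels. W_conv1 and x_image are passed to conv2d, the bias is added, and the result is wrapped in the ReLU function. ReLU usually trains much faster than saturating activations such as sigmoid, because its derivative is exactly 1 for every positive input, while the derivative of sigmoid is never larger than 0.25 and approaches 0 for inputs of large magnitude. During backpropagation, a very small derivative means the parameters update very slowly.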
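The difference between the two derivatives is easy to see numerically. This is a small standalone sketch using only the standard definitions of sigmoid and ReLU; the function names are chosen here for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    """Derivative of sigmoid: s(x) * (1 - s(x)), largest at x = 0."""
    s = sigmoid(x)
    return s * (1.0 - s)

def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0

for x in (0.5, 2.0, 5.0, 10.0):
    print("x=%5.1f  relu'=%.1f  sigmoid'=%.6f" % (x, relu_grad(x), sigmoid_grad(x)))
```

The ReLU column stays at 1.0 for every positive input, while the sigmoid column shrinks quickly as x grows, which is the slow-update effect described in the notes.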
To extract higher-level features we need a deeper network, so we add a second convolutional layer and a second pooling layer. The principle is the same as above
## conv2 layer ##
W_conv2 = weight_variable([5, 5, 32, 64])  # patch 5x5, in size 32, out size 64
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)  # output size 14x14x64
## pool2 layer ##
h_pool2 = max_pool_2x2(h_conv2)  # output size 7x7x64
The features are now extracted, so we move on to the fully connected layers that make the prediction. Before building them, we need to reshape h_pool2, because a fully connected layer takes a 2-D input of shape [n_samples, features], not a 4-D tensor.
h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])  # [n_samples, 7, 7, 64] ->> [n_samples, 7*7*64]
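Notes:
This turns the 4-D tensor into a 2-D tensor. The first dimension is the number of samples and the second is the input features, so the fully connected layer has 7*7*64 = 3136 input neurons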
The fully connected part starts by mapping the 7*7*64 inputs to 1024 hidden neurons
## fc1 layer ##
W_fc1 = weight_variable([7*7*64, 1024])
b_fc1 = bias_variable([1024])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
# dropout
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
Notes:
This is the same as a conventional neural network layer, with one difference from the earlier notes: dropout is applied at the end to reduce overfitting
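For readers who have not met dropout before, here is a minimal pure-Python sketch of the idea. It assumes the "inverted dropout" convention, where each value is kept with probability keep_prob and the kept values are scaled by 1/keep_prob so the expected sum stays the same. The function below is a hypothetical illustration, not the TensorFlow implementation.

```python
import random

def dropout(values, keep_prob, rng):
    """Keep each value with probability keep_prob and scale it by 1/keep_prob;
    set the others to 0. With keep_prob=1 the input comes back unchanged."""
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]

rng = random.Random(0)
activations = [1.0] * 10000

dropped = dropout(activations, keep_prob=0.5, rng=rng)
kept_fraction = sum(1 for v in dropped if v != 0.0) / len(dropped)
mean_after = sum(dropped) / len(dropped)
print(kept_fraction, mean_after)   # roughly 0.5 and roughly 1.0

unchanged = dropout(activations, keep_prob=1.0, rng=rng)
print(unchanged == activations)    # True
```

This is why the same network can be used with keep_prob 0.5 during training and keep_prob 1 during testing without rescaling anything by hand.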
Then we add another fully connected layer that maps the 1024 neurons to 10 output neurons, followed by a softmax layer that gives the probability of each class
W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])
prediction = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)
Notes:
The principle is the same as above, with a softmax added. If softmax is unfamiliar, see the earlier notes or the linked wiki article
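As a quick reminder of what softmax computes, here is a small standalone sketch. Subtracting the largest logit before exponentiating is a common numerical-stability trick added here as an assumption; it does not change the result.

```python
import math

def softmax(logits):
    """Turn a list of scores into probabilities that sum to 1."""
    largest = max(logits)                       # shift for numerical stability
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print(probs)
print(sum(probs))                   # approximately 1.0
print(softmax([1000.0, 1000.0]))    # [0.5, 0.5], no overflow
```

The largest score always receives the largest probability, so taking the argmax of the softmax output gives the predicted digit.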
Next we compute the cross-entropy loss and define train_step
cross_entropy = tf.reduce_mean(-tf.reduce_sum(ys * tf.log(prediction), reduction_indices=[1]))  # loss
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
Notes:
Unlike the earlier notes, this one uses the AdamOptimizer. This model is computationally expensive, and with GradientDescentOptimizer the loss decreases too slowly, so AdamOptimizer is used instead
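The loss expression can be reproduced by hand for a tiny batch. The sketch below follows the same formula, the mean over samples of -sum(y * log(p)), in plain Python. The eps clipping is an assumption added here to avoid log(0); it is not part of the code in this post.

```python
import math

def cross_entropy(labels, probs, eps=1e-10):
    """Mean over samples of -sum(y * log(p)), with one-hot labels.
    eps guards against log(0) and is an addition for this sketch."""
    total = 0.0
    for y_row, p_row in zip(labels, probs):
        total += -sum(y * math.log(max(p, eps)) for y, p in zip(y_row, p_row))
    return total / len(labels)

labels = [[0, 1, 0], [1, 0, 0]]
confident = [[0.05, 0.90, 0.05], [0.80, 0.10, 0.10]]
uncertain = [[0.40, 0.30, 0.30], [0.30, 0.40, 0.30]]
print(cross_entropy(labels, confident))   # small loss
print(cross_entropy(labels, uncertain))   # larger loss
```

Only the probability assigned to the correct class contributes to each sample's loss, so the loss falls as the network becomes more confident in the right answer.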
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
The lines above are standard boilerplate and need no further comment. Next we define a function that measures accuracy on the test set; it is used later
def compute_accuracy(v_xs, v_ys):
    global prediction
    y_pre = sess.run(prediction, feed_dict={xs: v_xs, keep_prob: 1})
    correct_prediction = tf.equal(tf.argmax(y_pre, 1), tf.argmax(v_ys, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    result = sess.run(accuracy, feed_dict={xs: v_xs, ys: v_ys, keep_prob: 1})
    return result
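Notes:
The function takes the test images v_xs and the corresponding labels v_ys. global prediction states that prediction refers to the tensor defined earlier at module level (Python only requires global when a function assigns to a module-level name, so the function also works without this line). No dropout is applied at test time, which is why keep_prob is 1. The rest is the same as in the previous note. The function returns the accuracy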
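To answer the question of how the recognition rate is defined: accuracy is the fraction of samples whose most probable predicted class matches the labeled class. The following pure-Python sketch mirrors the argmax, equal and mean steps of compute_accuracy on a tiny hand-made example; the helper names and the sample data are illustrative only.

```python
def argmax(row):
    """Index of the largest entry in a list."""
    return max(range(len(row)), key=row.__getitem__)

def accuracy(predictions, one_hot_labels):
    """Fraction of rows where the predicted class equals the labeled class."""
    correct = sum(
        1 for p, y in zip(predictions, one_hot_labels) if argmax(p) == argmax(y)
    )
    return correct / len(one_hot_labels)

predictions = [
    [0.1, 0.8, 0.1],   # predicts class 1
    [0.7, 0.2, 0.1],   # predicts class 0
    [0.2, 0.3, 0.5],   # predicts class 2
]
one_hot_labels = [
    [0, 1, 0],         # class 1: correct
    [0, 0, 1],         # class 2: wrong
    [0, 0, 1],         # class 2: correct
]
print(accuracy(predictions, one_hot_labels))  # 2 of 3 correct
```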
All the preparation is done, so now we train and test. The test set is evaluated once every 50 training steps. This takes a while, so be patient
for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={xs: batch_xs, ys: batch_ys, keep_prob: 0.5})
    if i % 50 == 0:
        print(compute_accuracy(mnist.test.images, mnist.test.labels))
The output is as follows:

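Reflections: it finally finished. The program ran for a little over forty minutes and my computer was barely usable the whole time, so a GPU really does help. The final accuracy was 97.37%. I think it can go higher, since training had not fully converged; try running more iterations. I would rather not run this kind of program on my own computer; it belongs on a GPU or a server. Be prepared for the wait if you run it yourself
Full code below (it can be run directly):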
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
# digits 0 to 9
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
def compute_accuracy(v_xs, v_ys):
    global prediction
    y_pre = sess.run(prediction, feed_dict={xs: v_xs, keep_prob: 1})
    correct_prediction = tf.equal(tf.argmax(y_pre, 1), tf.argmax(v_ys, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    result = sess.run(accuracy, feed_dict={xs: v_xs, ys: v_ys, keep_prob: 1})
    return result

def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

def conv2d(x, W):
    # strides = [1, x_movement, y_movement, 1]
    # Must have strides[0] = strides[3] = 1
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
    # strides = [1, x_movement, y_movement, 1]
    # ksize = [1, pool_op_length, pool_op_width, 1]
    # Must have ksize[0] = ksize[3] = 1
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

# define placeholder for inputs to network
xs = tf.placeholder(tf.float32, [None, 784])  # 28x28
ys = tf.placeholder(tf.float32, [None, 10])
keep_prob = tf.placeholder(tf.float32)
x_image = tf.reshape(xs, [-1, 28, 28, 1])
# print(x_image.shape)  # [n_samples, 28, 28, 1]
## conv1 layer ##
W_conv1 = weight_variable([5, 5, 1, 32])  # patch 5x5, in size 1, out size 32
b_conv1 = bias_variable([32])
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)  # output size 28x28x32
h_pool1 = max_pool_2x2(h_conv1)                           # output size 14x14x32
## conv2 layer ##
W_conv2 = weight_variable([5, 5, 32, 64])  # patch 5x5, in size 32, out size 64
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)  # output size 14x14x64
h_pool2 = max_pool_2x2(h_conv2)                           # output size 7x7x64
## flat h_pool2 ##
h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])  # [n_samples, 7, 7, 64] ->> [n_samples, 7*7*64]
## fc1 layer ##
W_fc1 = weight_variable([7*7*64, 1024])
b_fc1 = bias_variable([1024])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
## fc2 layer ##
W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])
prediction = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)
cross_entropy = tf.reduce_mean(-tf.reduce_sum(ys * tf.log(prediction),
                                              reduction_indices=[1]))  # loss
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={xs: batch_xs, ys: batch_ys, keep_prob: 0.5})
    if i % 50 == 0:
        print(compute_accuracy(mnist.test.images, mnist.test.labels))
Conclusion#
That wraps up recognition on the MNIST dataset. I hope readers of this post get something out of it. Finally, as always, my own ability is limited; if you find mistakes, please point them out so we can learn together. Thank you!
References#
[1] https://www.tensorflow.org/versions/r1.0/api_docs/python/
[2] http://www.tensorfly.cn/tfdoc/tutorials/mnist_pros.html
Author: FANG_YANG
Source: https://www.cnblogs.com/fydeblog/p/7455233.html
License: This site uses the CC BY 4.0 Creative Commons license. When reposting, credit the author and the source in a prominent place in the article.