MNIST Handwritten Digit Recognition with TensorFlow (1): Convolutional Neural Networks in Plain Language
MNIST Handwritten Digit Recognition with TensorFlow (2): Getting Started
MNIST Handwritten Digit Recognition with TensorFlow (3): Neural Networks
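A few words first
Sorry this third post took so long. Honestly, it was because I was lazy, not because I was busy, so I have added some extra material this time by way of apology. Enough chatter, let's get started.
1. As mentioned earlier, the plain training method can already recognize digits, but its accuracy is not high enough, so we need to improve it. The official MNIST site lists many combinations of methods together with their test accuracy in a table we can choose from. To keep the tutorial simple, Google's official tutorial uses the simplest convolutional neural network to raise the accuracy. The idea is to strengthen the features of the image (outlines, for example). I have only just learned this myself, so if I can follow it, it really is fairly simple.
2. All of my code was written against TensorFlow 0.7. I recommend reading the previous two posts first.
The overall flow is much the same as before, except that the convolutional neural network is applied before the softmax step, so I will not go through everything again. Here I only cover the convolutional part.
As the first post explained, our convolutional neural network goes convolution -> pooling -> fully connected.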
import tensorflow as tf
# convolution
def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')
# TensorFlow already ships a conv2d op; we wrap it in our own function to fix the stride at 1 and to use 'SAME' padding, which zero-pads the edges so the output keeps the input's height and width
# pooling
def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
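To make the two helpers above less of a black box, here is a minimal pure-Python sketch of what a stride-1 'SAME' convolution and a 2x2 max pooling do to a single-channel image. The names `conv2d_same_plain` and `max_pool_2x2_plain` and the tiny 4x4 input are made up for illustration; the real TensorFlow ops also handle batches and channels, so treat this as a rough picture rather than the actual implementation.

```python
def conv2d_same_plain(image, kernel):
    """Stride-1 'SAME' convolution of one 2-D image with one odd-sized 2-D kernel."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    pad_h, pad_w = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            total = 0.0
            for di in range(kh):
                for dj in range(kw):
                    r, c = i + di - pad_h, j + dj - pad_w
                    # Positions outside the image count as zero (zero padding).
                    if 0 <= r < h and 0 <= c < w:
                        total += image[r][c] * kernel[di][dj]
            out[i][j] = total
    return out


def max_pool_2x2_plain(image):
    """2x2 max pooling with stride 2; assumes even height and width."""
    h, w = len(image), len(image[0])
    return [
        [max(image[i][j], image[i][j + 1], image[i + 1][j], image[i + 1][j + 1])
         for j in range(0, w, 2)]
        for i in range(0, h, 2)
    ]


image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
# A 3x3 kernel whose only non-zero weight is the centre leaves the image unchanged.
identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
conv_out = conv2d_same_plain(image, identity)
pool_out = max_pool_2x2_plain(conv_out)
print(len(conv_out), len(conv_out[0]))  # 4 4
print(pool_out)  # [[6.0, 8.0], [14.0, 16.0]]
```

The convolution keeps the 4x4 size because of the zero padding, and the pooling halves each side by keeping only the largest value in every 2x2 block.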
tf.nn.conv2d(input, filter, strides, padding, use_cudnn_on_gpu=None, name=None)
Computes a 2-D convolution given 4-D input and filter tensors.
Given an input tensor of shape [batch, in_height, in_width, in_channels] and a filter / kernel tensor of shape [filter_height, filter_width, in_channels, out_channels], this op performs the following:
Flattens the filter to a 2-D matrix with shape [filter_height * filter_width * in_channels, output_channels].
Extracts image patches from the input tensor to form a virtual tensor of shape [batch, out_height, out_width, filter_height * filter_width * in_channels].
For each patch, right-multiplies the filter matrix and the image patch vector.
In detail,
output[b, i, j, k] =
    sum_{di, dj, q} input[b, strides[1] * i + di, strides[2] * j + dj, q] * filter[di, dj, q, k]
Must have strides[0] = strides[3] = 1. For the most common case of the same horizontal and vertical strides, strides = [1, stride, stride, 1].
Args:
input: A Tensor. Must be one of the following types: float32, float64.
filter: A Tensor. Must have the same type as input.
strides: A list of ints. 1-D of length 4. The stride of the sliding window for each dimension of input.
padding: A string from: “SAME”, “VALID”. The type of padding algorithm to use.
use_cudnn_on_gpu: An optional bool. Defaults to True.
name: A name for the operation (optional).
Returns:
A Tensor. Has the same type as input.
tf.nn.max_pool(value, ksize, strides, padding, name=None)
Performs the max pooling on the input.
Args:
value: A 4-D Tensor with shape [batch, height, width, channels] and type float32, float64, qint8, quint8, qint32.
ksize: A list of ints that has length >= 4. The size of the window for each dimension of the input tensor.
strides: A list of ints that has length >= 4. The stride of the sliding window for each dimension of the input tensor.
padding: A string, either ‘VALID’ or ‘SAME’. The padding algorithm.
name: Optional name for the operation.
Returns:
A Tensor with the same type as value. The max pooled output tensor.
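Initialize the weight and bias matrices. The weights start as small random values drawn from a truncated normal distribution (standard deviation 0.1) and the biases start at 0.1; the useful values are only learned later, during training.
def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)
def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    # print(tf.Variable(initial).eval())
    return tf.Variable(initial)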
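For a feel of what these starting values look like, here is a small pure-Python sketch of a truncated normal draw: sample from a normal distribution and re-draw anything more than two standard deviations from the mean. The helper name `truncated_normal_plain` and the filter shape 5x5x1x32 are assumptions chosen for illustration, not values taken from this post.

```python
import random


def truncated_normal_plain(n, stddev=0.1, mean=0.0, rng=None):
    """Draw n normal samples, re-drawing any that fall more than
    two standard deviations away from the mean."""
    if rng is None:
        rng = random.Random()
    samples = []
    while len(samples) < n:
        value = rng.gauss(mean, stddev)
        if abs(value - mean) <= 2 * stddev:
            samples.append(value)
    return samples


rng = random.Random(0)
# Assumed example: as many weights as a 5x5 filter with 1 input and 32 output channels.
weights = truncated_normal_plain(5 * 5 * 1 * 32, stddev=0.1, rng=rng)
# One bias per output channel, all starting at 0.1.
biases = [0.1] * 32
print(len(weights), len(biases))
print(max(abs(w) for w in weights) <= 0.2)
```

Starting from small non-zero values like these, rather than all zeros, means the units in a layer do not all begin identical.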
# Two rounds of convolution and pooling
h_conv1 = tf.nn.relu(conv2d(x_image, w_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)
h_conv2 = tf.nn.relu(conv2d(h_pool1, w_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)
Next comes the fully connected layer, again using the ReLU activation function (ReLU is covered below).
h_pool2_flat = tf.reshape(h_pool2, [-1,7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, w_fc1) + b_fc1)
# To prevent overfitting, dropout is used here to switch off some connections (dropout is covered below)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
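If the `7*7*64` in the reshape looks like magic, the arithmetic can be sketched as follows. With 'SAME' padding the output size is the input size divided by the stride, rounded up, so the two stride-1 convolutions keep the size and the two stride-2 poolings halve it. The helper `same_output_size` is a made-up name, and the 28x28 input is the standard MNIST image size rather than something stated in this snippet.

```python
import math


def same_output_size(size, stride):
    """Spatial output size under 'SAME' padding: ceil(size / stride)."""
    return math.ceil(size / stride)


size = 28                          # MNIST images are 28x28
size = same_output_size(size, 1)   # first convolution, stride 1  -> 28
size = same_output_size(size, 2)   # first pooling, stride 2      -> 14
size = same_output_size(size, 1)   # second convolution, stride 1 -> 14
size = same_output_size(size, 2)   # second pooling, stride 2     -> 7
channels = 64                      # output channels of the second convolution
flat_length = size * size * channels
print(size, flat_length)  # 7 3136
```

So every image reaches the fully connected layer as a vector of 3136 values, and the `-1` in the reshape lets TensorFlow work out the batch dimension on its own.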
The result is then handled the same way as before: apply softmax and train to obtain the parameters.
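There are many activation functions; the three most common are sigmoid, tanh and ReLU.
Sigmoid maps the data into the range 0 to 1.
#### Formula
$$\sigma(x) = \frac{1}{1 + e^{-x}}$$
#### Graph
[Figure: plot of the sigmoid function]
Tanh maps the data into the range -1 to 1.
[Figure: plot of the tanh function]
ReLU sets values below 0 to 0 and leaves values above 0 unchanged.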
For details, see http://blog.csdn.net/u012526120/article/details/49149317
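The three mappings described above (into 0 to 1, into -1 to 1, and zeroing out negatives) are what are usually called sigmoid, tanh and ReLU. A minimal pure-Python sketch, just to see the numbers; the function names here are my own choice for the example:

```python
import math


def sigmoid(x):
    """Squash x into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def tanh(x):
    """Squash x into the open interval (-1, 1)."""
    return math.tanh(x)


def relu(x):
    """Negative inputs become 0; non-negative inputs pass through."""
    return max(0.0, x)


for x in (-2.0, 0.0, 2.0):
    print(x, round(sigmoid(x), 4), round(tanh(x), 4), relu(x))
```

At 0 the sigmoid gives 0.5 and tanh gives 0, while ReLU simply passes positive inputs through unchanged.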
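1. In school maths we often used the method of undetermined coefficients: given a few points on a quadratic function, solve for the parameters of that quadratic.
2. The same idea applies here. We train on the training set and end up with parameters, which really means finding a model (a function) that fits the curve of the original data. Put bluntly, we pretend the original data all lie on the function we computed.
3. But that is not good enough, because we also need to predict unseen data. If all (or nearly all) of the original data points lie exactly on the function, that is overfitting, and the model is misled by quirks of the training data. What we actually want is a function that captures the general trend.
4. So dropout is used to cut some of the full connections for certain nodes (you can think of it as removing some nodes) in order to prevent overfitting.
For details, see http://www.cnblogs.com/tornadomeet/p/3258122.html
That is enough rambling from me; go and read the code. The comments note the dimensions of some of the variables, so you can work through it step by step and compute the shapes yourself.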
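As a rough picture of "removing some nodes", here is a small pure-Python sketch of dropout: each value is kept with probability `keep_prob` and scaled up by `1 / keep_prob`, otherwise it is set to 0. The helper name `dropout_plain` and the all-ones input are assumptions for illustration; this mimics the idea only and is not the TensorFlow implementation.

```python
import random


def dropout_plain(values, keep_prob, rng):
    """Keep each value with probability keep_prob, scaled by 1 / keep_prob;
    every other value becomes 0."""
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


rng = random.Random(0)
activations = [1.0] * 10
dropped = dropout_plain(activations, keep_prob=0.5, rng=rng)
print(dropped)  # every entry is either 0.0 or 2.0

# With keep_prob = 1.0 nothing is dropped, which is the setting used when evaluating.
print(dropout_plain(activations, keep_prob=1.0, rng=rng) == activations)  # True
```

Scaling the surviving values keeps the expected sum of the layer roughly the same, so the following layer sees inputs of a similar magnitude whether or not dropout is active.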
https://github.com/wlmnzf/tensorflow-train/blob/master/mnist/cnn_mnist.py