Convolution and Pooling (Stanford Exercise Walkthrough)

Preface

The report was written in LaTeX, and since I don't know how to upload the PDF I converted it to images and posted those here. The .tex source is also included below; anyone who wants to use this template for their own report is welcome to take it directly. (By the way, parts of the LaTeX got parsed when I pasted it; if anyone knows how to make Markdown output a block as plain text, please leave a comment.)

Content

[15 page images of the rendered report: ConvolutionandPooling_1.Png through ConvolutionandPooling_15.Png]

The .tex file

\documentclass[a4paper, 11pt]{article}

%%%%%% Packages %%%%%%
\usepackage{CJKutf8}
\usepackage{graphicx}
\usepackage[unicode]{hyperref}
\usepackage{xcolor}
\usepackage{cite}
\usepackage{indentfirst}
\usepackage{listings}
\usepackage[framed,numbered,autolinebreaks,useliterate]{mcode}
\lstset{language=Matlab}% listings language is MATLAB
\lstset{breaklines}% automatically wrap long code lines
\lstset{extendedchars=false}% keeps CJK section titles and headers from vanishing when code spans pages


%%%%%% Font size commands (Chinese type sizes) %%%%%%
\newcommand{\chuhao}{\fontsize{42pt}{\baselineskip}\selectfont}
\newcommand{\xiaochuhao}{\fontsize{36pt}{\baselineskip}\selectfont}
\newcommand{\yihao}{\fontsize{28pt}{\baselineskip}\selectfont}
\newcommand{\erhao}{\fontsize{21pt}{\baselineskip}\selectfont}
\newcommand{\xiaoerhao}{\fontsize{18pt}{\baselineskip}\selectfont}
\newcommand{\sanhao}{\fontsize{15.75pt}{\baselineskip}\selectfont}
\newcommand{\sihao}{\fontsize{14pt}{\baselineskip}\selectfont}
\newcommand{\xiaosihao}{\fontsize{12pt}{\baselineskip}\selectfont}
\newcommand{\wuhao}{\fontsize{10.5pt}{\baselineskip}\selectfont}
\newcommand{\xiaowuhao}{\fontsize{9pt}{\baselineskip}\selectfont}
\newcommand{\liuhao}{\fontsize{7.875pt}{\baselineskip}\selectfont}
\newcommand{\qihao}{\fontsize{5.25pt}{\baselineskip}\selectfont}

%%%% section formatting %%%%
\makeatletter
\renewcommand\section{\@startsection{section}{1}{\z@}%
{-1.5ex \@plus -.5ex \@minus -.2ex}%
{.5ex \@plus .1ex}%
{\normalfont\sihao\CJKfamily{hei}}}
\makeatother

%%%% subsection formatting %%%%
\makeatletter
\renewcommand\subsection{\@startsection{subsection}{1}{\z@}%
{-1.25ex \@plus -.5ex \@minus -.2ex}%
{.4ex \@plus .1ex}%
{\normalfont\xiaosihao\CJKfamily{hei}}}
\makeatother

%%%% subsubsection formatting %%%%
\makeatletter
\renewcommand\subsubsection{\@startsection{subsubsection}{1}{\z@}%
{-1ex \@plus -.5ex \@minus -.2ex}%
{.3ex \@plus .1ex}%
{\normalfont\xiaosihao\CJKfamily{hei}}}
\makeatother

%%%% Indent the first line of every paragraph %%%%
\makeatletter
\let\@afterindentfalse\@afterindenttrue
\@afterindenttrue
\makeatother
\setlength{\parindent}{2em}  % two-character indent for Chinese text


%%%% Redefine the page margins to match Chinese publishing conventions %%%%
\addtolength{\topmargin}{-54pt}
\setlength{\oddsidemargin}{0.63cm}  % 3.17cm - 1 inch
\setlength{\evensidemargin}{\oddsidemargin}
\setlength{\textwidth}{14.66cm}
\setlength{\textheight}{24.00cm}    % 24.62

%%%% Line spacing and paragraph spacing %%%%
\linespread{1.4}
% \setlength{\parskip}{1ex}
\setlength{\parskip}{0.5\baselineskip}


%%%% Begin document %%%%
\begin{document}
\begin{CJK}{UTF8}{gbsn}

%%%% Theorem-style environments %%%%
\newtheorem{example}{例}             % numbered globally
\newtheorem{algorithm}{算法}
\newtheorem{theorem}{定理}[section]  % numbered within sections
\newtheorem{definition}{定義}
\newtheorem{axiom}{公理}
\newtheorem{property}{性質(zhì)}
\newtheorem{proposition}{命題}
\newtheorem{lemma}{引理}
\newtheorem{corollary}{推論}
\newtheorem{remark}{注解}
\newtheorem{condition}{條件}
\newtheorem{conclusion}{結(jié)論}
\newtheorem{assumption}{假設(shè)}

%%%% Chinese names for fixed strings %%%%
\renewcommand{\contentsname}{目錄}  % Contents
\renewcommand{\abstractname}{摘要}  % Abstract
\renewcommand{\refname}{參考文獻(xiàn)}   % References
\renewcommand{\indexname}{索引}     % Index
\renewcommand{\figurename}{圖}      % Figure
\renewcommand{\tablename}{表}       % Table
\renewcommand{\appendixname}{附錄}  % Appendix
% (note: do not \renewcommand{\algorithm} here; it would clobber the
% algorithm theorem environment defined above)


%%%% Title, author, affiliation, email %%%%
\title{\textbf{卷積與池化解析報(bào)告\\Report of Convolution and Pooling}}
\author{梁譽(yù)譯\footnote{Email: lwcoder@outlook.com; Student ID: 2014141463106}\\[2ex]
\xiaosihao School of Software, Sichuan University\\[2ex]
}
\date{November 2016}





%%%% Body %%%%
\maketitle

\tableofcontents
\newpage


\section{Question}
This exercise first completes the convolution and pooling code, then trains a softmax classifier on the features obtained by passing the dataset through the convolutional and pooling layers, classifying the four object categories in the dataset (airplane, car, cat, dog). It follows the Stanford tutorial page Exercise: Convolution and Pooling.

\section{Data}
There are two data files: a subset of the STL-10 dataset provided by UFLDL, and STL10Features, the features learned in the previous exercise.


\section{Algorithm}
When working with large images (say $96\times96$), a fully connected network in the classical style produces an enormous number of parameters, and the training time such a parameter count demands is essentially unacceptable. A convolutional neural network is a kind of artificial neural network whose weight sharing and local connectivity, reminiscent of biological neural networks, reduce the model's complexity and cut the number of weights. Two concepts are central to convolutional networks, and they are the core of this walkthrough: convolution and pooling.
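The parameter savings described above can be made concrete with a quick back-of-the-envelope calculation (a sketch; the 100-unit hidden layer and the $8\times8$ kernel are illustrative numbers, not values taken from this exercise):

```python
# Parameter count: fully connected layer vs. shared-weight (convolutional) layer.
# Illustrative numbers: a 96x96 grayscale image, 100 hidden units / 100 kernels.
image_pixels = 96 * 96

# Fully connected: every hidden unit has one weight per pixel.
fc_weights = image_pixels * 100          # 921,600 weights

# Convolutional with weight sharing: each of the 100 kernels is 8x8,
# and the same 64 weights are reused at every image location.
conv_weights = 8 * 8 * 100               # 6,400 weights

print(fc_weights, conv_weights, fc_weights // conv_weights)
```

Even at this toy scale, weight sharing shrinks the weight count by two orders of magnitude, which is exactly why convolutional layers make large images tractable.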
\subsection{Convolution}
Convolution cannot be discussed without first mentioning local receptive fields. As noted above, a convolutional neural network (CNN) reduces the number of parameters through weight sharing and local connectivity. Human perception is generally held to proceed from the local to the global, and the spatial statistics of images behave the same way: nearby pixels are strongly correlated, while distant pixels are only weakly related. Each neuron therefore has no need to perceive the whole image; it perceives a local region, and higher layers combine the local information into a global picture. Local connectivity is likewise inspired by the structure of the biological visual system, where neurons in the visual cortex receive information locally (they respond only to stimuli in particular regions). Figure~\ref{fig:1} shows a fully connected layer on the left and a locally connected layer on the right.
\begin{figure}[htbp]
\centering
\includegraphics[width=5in]{images/1.jpg}
\caption{Fully connected (left) vs. locally connected (right)}
\label{fig:1}
\end{figure}

The second idea is weight sharing, which can be viewed as a position-independent way of extracting features. The underlying assumption is that the statistics of one part of an image are the same as those of any other part, so features learned on one patch can also be applied everywhere else: the same learned feature is used at every location in the image. Weight sharing reduces the number of network parameters and greatly improves learning efficiency.

\begin{figure}[htbp]
\centering
\includegraphics[width=5in]{images/2.png}
\caption{Illustration of the convolution process}
\label{fig:2}
\end{figure}




\subsection{Pooling}
\indent Pooling is a form of downsampling, as shown in Figure~\ref{fig:3}. After convolution extracts image features, those features are used for classification. One could train the classifier on all extracted features directly, but that usually incurs an enormous amount of computation, so after obtaining the convolved features we reduce their dimensionality by pooling: the convolved feature map is divided into disjoint $n\times n$ regions, and the maximum (or mean) of each region represents that part of the pooled feature map.
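The mean-pooling scheme just described can be sketched in a few lines (NumPy here rather than the exercise's MATLAB; the $4\times4$ map and $2\times2$ regions are illustrative):

```python
import numpy as np

def mean_pool(features, pool_dim):
    """Mean-pool a 2D feature map over disjoint pool_dim x pool_dim regions."""
    rows, cols = features.shape
    out_r, out_c = rows // pool_dim, cols // pool_dim
    # Crop so the map divides evenly, then average each block via reshape.
    cropped = features[:out_r * pool_dim, :out_c * pool_dim]
    return cropped.reshape(out_r, pool_dim, out_c, pool_dim).mean(axis=(1, 3))

fmap = np.arange(16, dtype=float).reshape(4, 4)
pooled = mean_pool(fmap, 2)   # each 2x2 block replaced by its average
```

The reshape trick groups each disjoint block along two axes and averages them in one vectorized step, which is equivalent to the four nested loops in cnnPool.m below.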

\begin{figure}[htbp]
\centering
\includegraphics[width=5in]{images/3.png}
\caption{Illustration of the pooling process}
\label{fig:3}
\end{figure}


\subsection{Summary}
In short, convolution and pooling combine three structural ideas (local receptive fields, weight sharing, and subsampling) to obtain a degree of invariance to translation, scale, and deformation, while greatly reducing the number of parameters to train.




\section{Analysis of code}

\subsection{cnnConvolve.m}

\begin{lstlisting}
function convolvedFeatures = cnnConvolve(patchDim, numFeatures, images, W, b, ZCAWhite, meanPatch)
%cnnConvolve Returns the convolution of the features given by W and b with
%the given images
%
% Parameters:
%  patchDim - patch (feature) dimension
%  numFeatures - number of features
%  images - large images to convolve with, matrix in the form
%           images(r, c, channel, image number)
%  W, b - W, b for features from the sparse autoencoder
%  ZCAWhite, meanPatch - ZCAWhitening and meanPatch matrices used for
%                        preprocessing
%
% Returns:
%  convolvedFeatures - matrix of convolved features in the form
%                      convolvedFeatures(featureNum, imageNum, imageRow, imageCol)
\end{lstlisting}
The function cnnConvolve takes the patch (kernel) dimension, the number of kernels (features), the images, the weight matrix W and bias b, and the two preprocessing matrices ZCAWhite and meanPatch.
\begin{lstlisting}
patchSize = patchDim*patchDim;
assert(numFeatures == size(W,1), 'W should have numFeatures rows');
numImages = size(images, 4);% size of dim 4: number of images
imageDim = size(images, 1);% size of dim 1: number of image rows
imageChannels = size(images, 3);% size of dim 3: number of channels
assert(patchSize*imageChannels == size(W,2), 'W should have patchSize*imageChannels cols');
\end{lstlisting}
This computes the kernel size, extracts the needed quantities (image count, image dimension, channel count), and adds defensive assertions.

\begin{lstlisting}
% Instructions:
%   Convolve every feature with every large image here to produce the 
%   numFeatures x numImages x (imageDim - patchDim + 1) x (imageDim - patchDim + 1) 
%   matrix convolvedFeatures, such that 
%   convolvedFeatures(featureNum, imageNum, imageRow, imageCol) is the
%   value of the convolved featureNum feature for the imageNum image over
%   the region (imageRow, imageCol) to (imageRow + patchDim - 1, imageCol + patchDim - 1)
%
% Expected running times: 
%   Convolving with 100 images should take less than 3 minutes 
%   Convolving with 5000 images should take around an hour
%   (So to save time when testing, you should convolve with less images, as
%   described earlier)

% -------------------- YOUR CODE HERE --------------------
% Precompute the matrices that will be used during the convolution. Recall
% that you need to take into account the whitening and mean subtraction
% steps

WT = W*ZCAWhite;% equivalent network weights with whitening folded in
b_mean = b - WT*meanPatch;% bias correction needed because the input patches are not mean-subtracted

% --------------------------------------------------------
\end{lstlisting}
Because the inputs are full-size images, every convolution would otherwise have to redo the whitening inside the network's weight computation; precomputing these two quantities folds that work in once. Each feature learned by a hidden unit then yields, for every image, one slightly smaller feature map.
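The precomputation rests on the identity $\sigma\!\big(W\,ZCA\,(x-\mu)+b\big)=\sigma\!\big((W\,ZCA)\,x+(b-W\,ZCA\,\mu)\big)$, so whitening and mean subtraction fold into the kernel and bias once instead of being applied per patch. A quick NumPy check of that identity (the shapes are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((5, 12))       # autoencoder weights (features x patch size)
ZCA = rng.standard_normal((12, 12))    # ZCA whitening matrix
mu = rng.standard_normal(12)           # mean patch
b = rng.standard_normal(5)
x = rng.standard_normal(12)            # a raw (unwhitened) patch

# Per-patch preprocessing: mean-subtract, whiten, then apply W and b.
slow = W @ (ZCA @ (x - mu)) + b

# Precomputed form used in cnnConvolve: WT = W*ZCAWhite, b_mean = b - WT*meanPatch.
WT = W @ ZCA
b_mean = b - WT @ mu
fast = WT @ x + b_mean

assert np.allclose(slow, fast)
```

Since the two pre-activations agree, applying the sigmoid afterward gives identical activations either way.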

\begin{lstlisting}
convolvedFeatures = zeros(numFeatures, numImages, imageDim - patchDim + 1, imageDim - patchDim + 1);
for imageNum = 1:numImages
  for featureNum = 1:numFeatures

    % convolution of image with feature matrix for each channel
    convolvedImage = zeros(imageDim - patchDim + 1, imageDim - patchDim + 1);
    for channel = 1:imageChannels

      % Obtain the feature (patchDim x patchDim) needed during the convolution
      % ---- YOUR CODE HERE ----
      offset = (channel-1)*patchSize;
      feature = reshape(WT(featureNum,offset+1:offset+patchSize), patchDim, patchDim);% pull out one kernel (weight patch)

      % Flip the feature matrix because of the definition of convolution, as explained later
      feature = flipud(fliplr(squeeze(feature)));

      % Obtain the image
      im = squeeze(images(:, :, channel, imageNum));% pull out one image

      % Convolve "feature" with "im", adding the result to convolvedImage
      % be sure to do a "valid" convolution
      % ---- YOUR CODE HERE ----
      convolvedoneChannel = conv2(im, feature, 'valid');
      convolvedImage = convolvedImage + convolvedoneChannel;% sum the 3 channels: they act like 3 feature maps, as for inputs to layer 2+ of a CNN
      
      % ------------------------
    end
    % Subtract the bias unit (correcting for the mean subtraction as well)
    % Then, apply the sigmoid function to get the hidden activation
    % ---- YOUR CODE HERE ----
    convolvedImage = sigmoid(convolvedImage+b_mean(featureNum));
    % ------------------------
    % The convolved feature is the sum of the convolved values for all channels
    convolvedFeatures(featureNum, imageNum, :, :) = convolvedImage;
  end
end
\end{lstlisting}
The convolution proceeds as in Figure~\ref{fig:2}: every image is filtered with every kernel (each RGB image has three channels, so one kernel produces three per-channel feature maps, which are summed here). conv2 convolves the kernel with the current image im, and shape 'valid' means no zero padding at the borders is considered. The convolved map plus the mean-corrected bias ($Wx+b$) is then fed through the activation function to produce this layer's output.
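conv2 flips the kernel before sliding it (that is the mathematical definition of convolution), which is why the code pre-flips `feature`: the two flips cancel, leaving plain correlation with the learned weights. A NumPy sketch of a 'valid' convolution showing this (scipy.signal.convolve2d behaves the same way; the sizes here are illustrative):

```python
import numpy as np

def conv2_valid(im, kernel):
    """'valid' 2D convolution: flip the kernel, then slide and sum products."""
    k = np.flipud(np.fliplr(kernel))          # convolution flips the kernel
    kr, kc = k.shape
    out_r = im.shape[0] - kr + 1              # no zero padding: output shrinks
    out_c = im.shape[1] - kc + 1
    out = np.empty((out_r, out_c))
    for r in range(out_r):
        for c in range(out_c):
            out[r, c] = np.sum(im[r:r + kr, c:c + kc] * k)
    return out

im = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.array([[1.0, 0.0], [0.0, -1.0]])

# Pre-flipping the kernel cancels the internal flip, leaving pure
# correlation with `kernel` -- exactly the trick cnnConvolve relies on.
corr = conv2_valid(im, np.flipud(np.fliplr(kernel)))
```

For this image, each correlation output is `im[r, c] - im[r+1, c+1]`, a constant, which makes the cancellation easy to verify by hand.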


\subsection{cnnPool.m}
\begin{lstlisting}
function pooledFeatures = cnnPool(poolDim, convolvedFeatures)
%cnnPool Pools the given convolved features
%
% Parameters:
%  poolDim - dimension of pooling region
%  convolvedFeatures - convolved features to pool (as given by cnnConvolve)
%                      convolvedFeatures(featureNum, imageNum, imageRow, imageCol)
%
% Returns:
%  pooledFeatures - matrix of pooled features in the form
%                   pooledFeatures(featureNum, imageNum, poolRow, poolCol)
%     

numImages = size(convolvedFeatures, 2);% number of images
numFeatures = size(convolvedFeatures, 1);% number of features
convolvedDim = size(convolvedFeatures, 3);% rows of each feature map
resultDim  = floor(convolvedDim / poolDim);
pooledFeatures = zeros(numFeatures, numImages, resultDim, resultDim);

% -------------------- YOUR CODE HERE --------------------
% Instructions:
%   Now pool the convolved features in regions of poolDim x poolDim,
%   to obtain the 
%   numFeatures x numImages x (convolvedDim/poolDim) x (convolvedDim/poolDim) 
%   matrix pooledFeatures, such that
%   pooledFeatures(featureNum, imageNum, poolRow, poolCol) is the 
%   value of the featureNum feature for the imageNum image pooled over the
%   corresponding (poolRow, poolCol) pooling region 
%   (see http://ufldl/wiki/index.php/Pooling )
%   
%   Use mean pooling here.
% -------------------- YOUR CODE HERE --------------------
for imageNum = 1:numImages
    for featureNum = 1:numFeatures
        for poolRow = 1:resultDim
            offsetRow = 1+(poolRow-1)*poolDim;
            for poolCol = 1:resultDim
                offsetCol = 1+(poolCol-1)*poolDim;
                patch = convolvedFeatures(featureNum,imageNum,offsetRow:offsetRow+poolDim-1,...
                    offsetCol:offsetCol+poolDim-1);% extract one pooling region
                pooledFeatures(featureNum,imageNum,poolRow,poolCol) = mean(patch(:));% mean pooling
            end
        end
    end
end

end
\end{lstlisting}
Pooling proceeds as in Figure~\ref{fig:3}: the function takes the pooling-region size and the convolved feature maps and returns the pooled features. Inside the loops, each region is mean-pooled with mean(patch(:)), yielding the pooled feature maps.

\subsection{cnnExercise.m}
\begin{lstlisting}
%%======================================================================
%% STEP 0: Initialization
%  Here we initialize some parameters used for the exercise.

imageDim = 64;         % image dimension
imageChannels = 3;     % number of channels (rgb, so 3)

patchDim = 8;          % patch dimension
numPatches = 50000;    % number of patches

visibleSize = patchDim * patchDim * imageChannels;  % number of input units ,8*8*3=192
outputSize = visibleSize;   % number of output units
hiddenSize = 400;           % number of hidden units 

epsilon = 0.1;           % epsilon for ZCA whitening

poolDim = 19;          % dimension of pooling region
\end{lstlisting}
First the parameters are initialized: image size, channel count, kernel size, number of hidden units, pooling size, and so on.
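With these settings the downstream shapes follow directly: a $64\times64$ image convolved 'valid' with an $8\times8$ patch gives a $57\times57$ feature map, pooling over disjoint $19\times19$ regions gives $\lfloor 57/19\rfloor = 3$, so each of the 400 features contributes a $3\times3$ pooled map. A quick check of that arithmetic:

```python
# Shape arithmetic implied by the STEP 0 parameters.
image_dim, patch_dim, pool_dim, hidden_size = 64, 8, 19, 400

conv_dim = image_dim - patch_dim + 1      # 'valid' convolution output size
pooled_dim = conv_dim // pool_dim         # number of disjoint 19x19 regions per side

# Length of the feature vector each image contributes to the softmax input.
features_per_image = hidden_size * pooled_dim * pooled_dim
```

This matches the 400x2000x3x3 and 400x3200x3x3 array sizes noted in the STEP 3 comments below, and gives a 3600-dimensional input per image for the softmax classifier.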

\begin{lstlisting}
%%======================================================================
%% STEP 1: Train a sparse autoencoder (with a linear decoder) to learn 
%  features from color patches. If you have completed the linear decoder
%  execise, use the features that you have obtained from that exercise, 
%  loading them into optTheta. Recall that we have to keep around the 
%  parameters used in whitening (i.e., the ZCA whitening matrix and the
%  meanPatch)

% --------------------------- YOUR CODE HERE --------------------------
% Train the sparse autoencoder and fill the following variables with 
% the optimal parameters:

optTheta =  zeros(2*hiddenSize*visibleSize+hiddenSize+visibleSize, 1);% total number of parameters acting on a patch
ZCAWhite =  zeros(visibleSize, visibleSize);
meanPatch = zeros(visibleSize, 1);
load STL10Features.mat;
% Display and check to see that the features look good
W = reshape(optTheta(1:visibleSize * hiddenSize), hiddenSize, visibleSize);
b = optTheta(2*hiddenSize*visibleSize+1:2*hiddenSize*visibleSize+hiddenSize);

\end{lstlisting}
This loads the features previously learned with the sparse autoencoder (with a linear decoder).
\begin{lstlisting}
load stlTrainSubset.mat % loads numTrainImages, trainImages, trainLabels

%% Use only the first 8 images for testing
convImages = trainImages(:, :, :, 1:8); 

% NOTE: Implement cnnConvolve in cnnConvolve.m first! (W and b are already in matrix/vector form)
convolvedFeatures = cnnConvolve(patchDim, hiddenSize, convImages, W, b, ZCAWhite, meanPatch);
\end{lstlisting}
This loads the training subset, takes its first 8 images, and runs the completed cnnConvolve on them.


\begin{lstlisting}
%% STEP 2b: Checking your convolution
%  To ensure that you have convolved the features correctly, we have
%  provided some code to compare the results of your convolution with
%  activations from the sparse autoencoder

% For 1000 random points
for i = 1:1000    
    featureNum = randi([1, hiddenSize]);% pick a random feature
    imageNum = randi([1, 8]);% pick a random image
    imageRow = randi([1, imageDim - patchDim + 1]);% pick a random point
    imageCol = randi([1, imageDim - patchDim + 1]);    
   
    % from the 8 images pick one at random, then take the patch whose top-left corner is the chosen point
    patch = convImages(imageRow:imageRow + patchDim - 1, imageCol:imageCol + patchDim - 1, :, imageNum);
    patch = patch(:); % stacked in column order           
    patch = patch - meanPatch;
    patch = ZCAWhite * patch;% whiten the patch with the same parameters
    
    features = feedForwardAutoencoder(optTheta, hiddenSize, visibleSize, patch); % compute the autoencoder activation for this patch

    if abs(features(featureNum, 1) - convolvedFeatures(featureNum, imageNum, imageRow, imageCol)) > 1e-9
        fprintf('Convolved feature does not match activation from autoencoder\n');
        fprintf('Feature Number    : %d\n', featureNum);
        fprintf('Image Number      : %d\n', imageNum);
        fprintf('Image Row         : %d\n', imageRow);
        fprintf('Image Column      : %d\n', imageCol);
        fprintf('Convolved feature : %0.5f\n', convolvedFeatures(featureNum, imageNum, imageRow, imageCol));
        fprintf('Sparse AE feature : %0.5f\n', features(featureNum, 1));       
        error('Convolved feature does not match activation from autoencoder');
    end 
end

disp('Congratulations! Your convolution code passed the test.');
\end{lstlisting}
The correctness of the convolution is checked against the previously implemented sparse autoencoder.

\begin{lstlisting}
%% STEP 2c: Implement pooling
%  Implement pooling in the function cnnPool in cnnPool.m

% NOTE: Implement cnnPool in cnnPool.m first!
pooledFeatures = cnnPool(poolDim, convolvedFeatures);

%% STEP 2d: Checking your pooling
%  To ensure that you have implemented pooling, we will use your pooling
%  function to pool over a test matrix and check the results.

testMatrix = reshape(1:64, 8, 8);% the numbers 1..64 arranged into a matrix, column-major
% directly compute the mean-pooled values
expectedMatrix = [mean(mean(testMatrix(1:4, 1:4))) mean(mean(testMatrix(1:4, 5:8))); ...
                  mean(mean(testMatrix(5:8, 1:4))) mean(mean(testMatrix(5:8, 5:8))); ];
            
testMatrix = reshape(testMatrix, 1, 1, 8, 8);

% squeeze drops the singleton dimensions
pooledFeatures = squeeze(cnnPool(4, testMatrix));% 4 means pooling over 4x4 regions

if ~isequal(pooledFeatures, expectedMatrix)
    disp('Pooling incorrect');
    disp('Expected');
    disp(expectedMatrix);
    disp('Got');
    disp(pooledFeatures);
else
    disp('Congratulations! Your pooling code passed the test.');
end
\end{lstlisting}
The pooling implementation is tested directly by building an $8\times8$ matrix of the numbers 1 to 64 and mean-pooling it over $4\times4$ regions.
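The expected values in that check can be reproduced by hand: reshape(1:64, 8, 8) fills column-major, so entry $(r, c)$ equals $r + 8(c-1)$, and each $4\times4$ block mean is the mean of $r$ plus $8$ times the mean of $c-1$. A NumPy reproduction (note order='F' to match MATLAB's column-major reshape):

```python
import numpy as np

# MATLAB's reshape(1:64, 8, 8) is column-major; order='F' reproduces it.
test_matrix = np.arange(1, 65).reshape(8, 8, order='F')

# Block means for the four disjoint 4x4 quadrants.
expected = np.array([
    [test_matrix[0:4, 0:4].mean(), test_matrix[0:4, 4:8].mean()],
    [test_matrix[4:8, 0:4].mean(), test_matrix[4:8, 4:8].mean()],
])
```

Getting these exact values (14.5, 46.5, 18.5, 50.5) out of cnnPool is a strong sign the region indexing is right.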

\begin{lstlisting}
%%======================================================================
%% STEP 3: Convolve and pool with the dataset
%  In this step, you will convolve each of the features you learned with
%  the full large images to obtain the convolved features. You will then
%  pool the convolved features to obtain the pooled features for
%  classification.
%
%  Because the convolved features matrix is very large, we will do the
%  convolution and pooling 50 features at a time to avoid running out of
%  memory. Reduce this number if necessary

stepSize = 50;
assert(mod(hiddenSize, stepSize) == 0, 'stepSize should divide hiddenSize');% hiddenSize/stepSize must be an integer; here the work is split into 8 passes

load stlTrainSubset.mat % loads numTrainImages, trainImages, trainLabels
load stlTestSubset.mat  % loads numTestImages,  testImages,  testLabels

pooledFeaturesTrain = zeros(hiddenSize, numTrainImages, ...% imageDim is the full image size, 64 here
    floor((imageDim - patchDim + 1) / poolDim), ... % poolDim is the pooling region size, 19 here (one pool per 19x19 block)
    floor((imageDim - patchDim + 1) / poolDim) );% pooledFeaturesTrain ends up 400x2000x3x3
pooledFeaturesTest = zeros(hiddenSize, numTestImages, ...
    floor((imageDim - patchDim + 1) / poolDim), ...
    floor((imageDim - patchDim + 1) / poolDim) );% pooledFeaturesTest ends up 400x3200x3x3

tic();

\end{lstlisting}
This loads the training and test subsets and preallocates the arrays that will hold the pooled features used later for classification.

\begin{lstlisting}
for convPart = 1:(hiddenSize / stepSize)% extract features from the raw images in batches of stepSize hidden units
    
    featureStart = (convPart - 1) * stepSize + 1;% first feature of this batch
    featureEnd = convPart * stepSize;% last feature of this batch
    
    fprintf('Step %d: features %d to %d\n', convPart, featureStart, featureEnd);  
    Wt = W(featureStart:featureEnd, :);
    bt = b(featureStart:featureEnd);    
    
    fprintf('Convolving and pooling train images\n');
    convolvedFeaturesThis = cnnConvolve(patchDim, stepSize, ...% argument 2 is the number of "hidden" units in this batch
        trainImages, Wt, bt, ZCAWhite, meanPatch);
    pooledFeaturesThis = cnnPool(poolDim, convolvedFeaturesThis);
    pooledFeaturesTrain(featureStart:featureEnd, :, :, :) = pooledFeaturesThis;   
    toc();
    clear convolvedFeaturesThis pooledFeaturesThis;% free these large variables; the test set is processed next
    
    fprintf('Convolving and pooling test images\n');
    convolvedFeaturesThis = cnnConvolve(patchDim, stepSize, ...
        testImages, Wt, bt, ZCAWhite, meanPatch);
    pooledFeaturesThis = cnnPool(poolDim, convolvedFeaturesThis);
    pooledFeaturesTest(featureStart:featureEnd, :, :, :) = pooledFeaturesThis;   
    toc();

    clear convolvedFeaturesThis pooledFeaturesThis;

end
% You might want to save the pooled features since convolution and pooling takes a long time
save('cnnPooledFeatures.mat', 'pooledFeaturesTrain', 'pooledFeaturesTest');
toc();
\end{lstlisting}
The training and test images are passed through the convolution and pooling layers, and the resulting pooled features are stored in the corresponding variables.

\begin{lstlisting}
%% STEP 4: Use pooled features for classification
%  Now, you will use your pooled features to train a softmax classifier,
%  using softmaxTrain from the softmax exercise.
%  Training the softmax classifer for 1000 iterations should take less than
%  10 minutes.

% Add the path to your softmax solution, if necessary
% addpath /path/to/solution/

% Setup parameters for softmax
softmaxLambda = 1e-4;% weight decay (regularization) coefficient
numClasses = 4;
% Reshape the pooledFeatures to form an input vector for softmax
softmaxX = permute(pooledFeaturesTrain, [1 3 4 2]);% permute reorders the dimensions, moving images last
softmaxX = reshape(softmaxX, numel(pooledFeaturesTrain) / numTrainImages,...% numel(pooledFeaturesTrain) / numTrainImages
                        numTrainImages);                                    % is the feature-vector length per image                                                             
    
softmaxY = trainLabels;

options = struct;
options.maxIter = 200;
softmaxModel = softmaxTrain(numel(pooledFeaturesTrain) / numTrainImages,...% first argument is inputSize
    numClasses, softmaxLambda, softmaxX, softmaxY, options);
\end{lstlisting}
Softmax is a multi-class classifier; the pooled training-set features and the training labels are fed in to train it.
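At prediction time, softmaxPredict simply picks the class with the highest score under the learned weights. A minimal NumPy sketch of that prediction step (the weight matrix `theta` and the inputs here are illustrative toys, not the trained model):

```python
import numpy as np

def softmax_predict(theta, X):
    """Return the most probable class for each column of X.

    theta: (numClasses, inputSize) weight matrix
    X:     (inputSize, numExamples), one pooled-feature column per example
    """
    scores = theta @ X
    scores -= scores.max(axis=0)              # stabilize before exponentiating
    probs = np.exp(scores)
    probs /= probs.sum(axis=0)                # column-wise softmax probabilities
    return probs.argmax(axis=0)               # 0-based class indices

theta = np.array([[2.0, 0.0], [0.0, 2.0]])    # toy 2-class, 2-feature model
X = np.array([[3.0, 0.5], [1.0, 4.0]])        # two example columns
pred = softmax_predict(theta, X)
```

Since argmax is unaffected by the normalization, training is where the probabilities actually matter; prediction reduces to a per-column maximum of the linear scores.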
\begin{lstlisting}
%% STEP 5: Test classifer
%  Now you will test your trained classifer against the test images

softmaxX = permute(pooledFeaturesTest, [1 3 4 2]);
softmaxX = reshape(softmaxX, numel(pooledFeaturesTest) / numTestImages, numTestImages);
softmaxY = testLabels;

[pred] = softmaxPredict(softmaxModel, softmaxX);
acc = (pred(:) == softmaxY(:));
acc = sum(acc) / size(acc, 1);
fprintf('Accuracy: %2.3f%%\n', acc * 100);% compute prediction accuracy

% You should expect to get an accuracy of around 80% on the test images.
\end{lstlisting}
Finally, the trained softmax classifier predicts classes for the test set, and accuracy is computed by comparing the predictions against the test labels.

Final result (I did not have an adequate environment to run the full training myself; the figure below is the one obtained by the code's author):

Accuracy: 80.406\%

which is in line with the roughly 80\% reported in the tutorial.



\section{Summary}
The local connectivity and weight sharing of convolution and pooling greatly reduce the number of network parameters, which matters most when processing large images. Convolution and pooling also give the representation a degree of translation invariance: under convolution, a shifted image yields correspondingly shifted features, and taking the max or mean during pooling adds a further measure of invariance.



\end{CJK}
\end{document}