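1. First, installing Caffe is tedious; I will write a detailed tutorial when I have time.
For now, here is the official installation guide: http://caffe.berkeleyvision.org/installation.html
2. Once Caffe is installed, read through the official examples carefully and run them step by step. The links are: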
1).http://caffe.berkeleyvision.org/gathered/examples/mnist.html
2).http://caffe.berkeleyvision.org/gathered/examples/cifar10.html
……
3. After that, study the following examples of using Caffe from Python to learn how Caffe is used.
1). http://nbviewer.jupyter.org/github/BVLC/caffe/blob/master/examples/01-learning-lenet.ipynb
2). http://nbviewer.jupyter.org/github/BVLC/caffe/blob/master/examples/00-classification.ipynb
3).http://www.cnblogs.com/empty16/p/4878164.html
……
4. Below, I use the following face database to train and test a face recognition model with Caffe:
http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html
It contains 40 subjects with 10 face images each, as shown in the figure below:

The official site links to the Model Zoo, where I found that a pretrained face recognition model already exists and can be used directly.
Download link:
http://www.robots.ox.ac.uk/%7Evgg/software/vgg_face/src/vgg_face_caffe.tar.gz
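The VGG Face Descriptor site provides both the model and the source code; see its instructions for details. The basic workflow should be fairly simple:
- In the script, set the path to the Caffe library, the .caffemodel file, and the input data, then call the network's test (forward) function to obtain the network output.
- Run the script.
If the usage notes shipped with the source code are not clear enough, refer to the Jupyter Notebook Viewer examples. The basic workflow should be the same as for the ImageNet classification task. The dataset used to train the model is described in Section 3 of the VGG Face Descriptor paper.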
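The workflow above can be sketched roughly as follows. This is only an illustration, assuming pycaffe has been built and that the archive unpacks to vgg_face_caffe/ containing VGG_FACE_deploy.prototxt and VGG_FACE.caffemodel. The helper names top_k and classify_face are hypothetical, and the preprocessing (BGR channel order, 0-255 scale, no mean subtraction) follows the general pattern of the official classification notebook rather than anything confirmed for this model, so check the model's own instructions before relying on it.

```python
def top_k(scores, k=5):
    """Return the indices of the k largest scores, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def classify_face(image_path, deploy_path, weights_path, caffe_python_dir=None):
    """Run one forward pass and return the top-5 class indices."""
    import sys
    if caffe_python_dir:
        # Step 1: tell Python where the Caffe library lives.
        sys.path.insert(0, caffe_python_dir)
    import caffe  # imported lazily so top_k works without Caffe installed

    caffe.set_mode_cpu()
    # Step 2: load the network definition and the .caffemodel weights.
    net = caffe.Net(deploy_path, weights_path, caffe.TEST)

    # Step 3: prepare the input (HxWxC RGB in [0, 1] -> CxHxW BGR in [0, 255]).
    # The BGR order and the missing mean subtraction are assumptions.
    input_name = net.inputs[0]
    transformer = caffe.io.Transformer(
        {input_name: net.blobs[input_name].data.shape})
    transformer.set_transpose(input_name, (2, 0, 1))
    transformer.set_raw_scale(input_name, 255)
    transformer.set_channel_swap(input_name, (2, 1, 0))
    image = caffe.io.load_image(image_path)
    net.blobs[input_name].data[...] = transformer.preprocess(input_name, image)

    # Step 4: run the forward (test) pass and read the first output blob.
    output = net.forward()
    scores = output[net.outputs[0]][0]
    return top_k([float(s) for s in scores])
```

A call would look like classify_face("face.jpg", "vgg_face_caffe/VGG_FACE_deploy.prototxt", "vgg_face_caffe/VGG_FACE.caffemodel"), where face.jpg is a placeholder path. The returned indices refer to the identities the pretrained model was trained on, not to the 40 subjects of the database used here.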
Next, because the face images are grayscale, they must first be converted to RGB images with OpenCV before they can be used with VGG. The Python code is as follows:
import os
import cv2


def convert_gray_img_to_rgb(base_dir, dir_pre_str, dir_range_list,
                            dir_post_str, file_format, partion_list):
    """Convert each grayscale image to a 3-channel .jpg next to the original.

    partion_list[0] holds the image indices of the "train" split and
    partion_list[1] those of the "tst" split.
    """
    split_names = ["train", "tst"]
    for i in dir_range_list:
        for index, partion_list_part in enumerate(partion_list):
            # e.g. <base_dir>/train/s1
            subject_dir = os.path.join(base_dir, split_names[index],
                                       dir_pre_str + str(i) + dir_post_str)
            for k in partion_list_part:
                file_input_path = os.path.join(subject_dir, str(k) + file_format)
                # Flag 0 loads the image as single-channel grayscale.
                img = cv2.imread(file_input_path, 0)
                if img is None:
                    print("Skipping unreadable file: " + file_input_path)
                    continue
                # Replicate the gray channel so the image has 3 channels.
                img = cv2.cvtColor(img, cv2.COLOR_GRAY2RGB)
                out_file = os.path.join(subject_dir, str(k) + ".jpg")
                cv2.imwrite(out_file, img)


if __name__ == '__main__':
    source_dir = "/Users/Ren/Downloads/att_faces_back"
    dir_pre_str = "s"
    dir_range_list = range(1, 41)
    test_partion_list = [7, 8, 9, 10]
    train_partion_list = [1, 2, 3, 4, 5, 6]
    dir_post_str = ""
    file_format = ".pgm"
    convert_gray_img_to_rgb(source_dir, dir_pre_str, dir_range_list,
                            dir_post_str, file_format,
                            [train_partion_list, test_partion_list])
For this database, the face data must first be split into a training set and a test set and then converted to LMDB format. For the procedure, see: http://www.cnblogs.com/dupuleng/articles/4370236.html. My script is below; I saved it as examples/att_faces/create_att_faces.sh
#!/usr/bin/env sh
# Create the att_faces lmdb inputs
# N.B. set the paths to the att_faces train + tst data dirs
EXAMPLE=examples/att_faces
DATA=data/att_faces
TOOLS=build/tools
DBTYPE=lmdb
TRAIN_DATA_ROOT=$DATA/train/
TEST_DATA_ROOT=$DATA/tst/
ROOT=./
# Set RESIZE=true to resize the images to 224x224. Leave as false if images have
# already been resized using another tool.
RESIZE=true
if $RESIZE; then
  RESIZE_HEIGHT=224
  RESIZE_WIDTH=224
else
  RESIZE_HEIGHT=0
  RESIZE_WIDTH=0
fi
if [ ! -d "$TRAIN_DATA_ROOT" ]; then
  echo "Error: TRAIN_DATA_ROOT is not a path to a directory: $TRAIN_DATA_ROOT"
  echo "Set the TRAIN_DATA_ROOT variable in create_att_faces.sh to the path" \
       "where the att_faces training data is stored."
  exit 1
fi
if [ ! -d "$TEST_DATA_ROOT" ]; then
  echo "Error: TEST_DATA_ROOT is not a path to a directory: $TEST_DATA_ROOT"
  echo "Set the TEST_DATA_ROOT variable in create_att_faces.sh to the path" \
       "where the att_faces test data is stored."
  exit 1
fi
echo "Creating train lmdb..."
rm -rf $EXAMPLE/att_faces_train_$DBTYPE $EXAMPLE/att_faces_tst_$DBTYPE
GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $ROOT \
    $DATA/train.txt \
    $EXAMPLE/att_faces_train_$DBTYPE
echo "Creating tst lmdb..."
rm -f $EXAMPLE/mean.binaryproto
GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $ROOT \
    $DATA/tst.txt \
    $EXAMPLE/att_faces_tst_$DBTYPE
echo "Computing image mean..."
./build/tools/compute_image_mean -backend=$DBTYPE \
    $EXAMPLE/att_faces_train_$DBTYPE $EXAMPLE/mean.binaryproto
echo "Done."
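The script reads data/att_faces/train.txt and data/att_faces/tst.txt, which list one image per line in the form "path label". A minimal sketch for generating them is shown here, assuming the directory layout written by the conversion script (train/sN/K.jpg and tst/sN/K.jpg under data/att_faces) and zero-based labels, so that subject s1 becomes label 0. The function names build_listing and write_listing are hypothetical.

```python
def build_listing(data_root, split, subjects, image_ids):
    """Return 'path label' lines for one split ('train' or 'tst')."""
    lines = []
    for subject in subjects:
        label = subject - 1  # labels must lie in 0..num_output-1
        for image_id in image_ids:
            path = "/".join([data_root, split,
                             "s%d" % subject, "%d.jpg" % image_id])
            lines.append("%s %d" % (path, label))
    return lines


def write_listing(lines, out_path):
    """Write the listing, one 'path label' entry per line."""
    with open(out_path, "w") as handle:
        handle.write("\n".join(lines) + "\n")


# Same split as the conversion script: images 1-6 train, 7-10 test.
train_lines = build_listing("data/att_faces", "train",
                            range(1, 41), [1, 2, 3, 4, 5, 6])
test_lines = build_listing("data/att_faces", "tst",
                           range(1, 41), [7, 8, 9, 10])
```

Calling write_listing(train_lines, "data/att_faces/train.txt") and write_listing(test_lines, "data/att_faces/tst.txt") from the Caffe root then produces the two files. The paths are relative to the Caffe root because the script passes ROOT=./ to convert_imageset.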
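Next, build the network definition for this data, using models/finetune_flickr_style/train_val.prototxt as the template and filling it in with the layers from vgg_face_caffe/VGG_FACE_deploy.prototxt. That means adding the data input layers, changing the number of outputs of the last fully connected layer, and updating the old Caffe syntax. The resulting file is as follows: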
name: "VGG_FACE_16_Net"
layer {
name: "data"
type: "ImageData"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
mirror: true
crop_size: 224
mean_file: "examples/att_faces/mean.binaryproto"
}
image_data_param {
source: "data/att_faces/train.txt"
batch_size: 1
new_height: 224
new_width: 224
}
}
layer {
name: "data"
type: "ImageData"
top: "data"
top: "label"
include {
phase: TEST
}
transform_param {
mirror: false
crop_size: 224
mean_file: "examples/att_faces/mean.binaryproto"
}
image_data_param {
source: "data/att_faces/tst.txt"
batch_size: 1
new_height: 224
new_width: 224
}
}
layer {
name: "conv1_1"
type: "Convolution"
bottom: "data"
top: "conv1_1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 64
kernel_size: 3
pad: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1_1"
type: "ReLU"
bottom: "conv1_1"
top: "conv1_1"
}
layer {
name: "conv1_2"
type: "Convolution"
bottom: "conv1_1"
top: "conv1_2"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 64
kernel_size: 3
pad: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1_2"
type: "ReLU"
bottom: "conv1_2"
top: "conv1_2"
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1_2"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "conv2_1"
type: "Convolution"
bottom: "pool1"
top: "conv2_1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 128
kernel_size: 3
pad: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu2_1"
type: "ReLU"
bottom: "conv2_1"
top: "conv2_1"
}
layer {
name: "conv2_2"
type: "Convolution"
bottom: "conv2_1"
top: "conv2_2"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 128
kernel_size: 3
pad: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu2_2"
type: "ReLU"
bottom: "conv2_2"
top: "conv2_2"
}
layer {
name: "pool2"
type: "Pooling"
bottom: "conv2_2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "conv3_1"
type: "Convolution"
bottom: "pool2"
top: "conv3_1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
kernel_size: 3
pad: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3_1"
type: "ReLU"
bottom: "conv3_1"
top: "conv3_1"
}
layer {
name: "conv3_2"
type: "Convolution"
bottom: "conv3_1"
top: "conv3_2"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
kernel_size: 3
pad: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3_2"
type: "ReLU"
bottom: "conv3_2"
top: "conv3_2"
}
layer {
name: "conv3_3"
type: "Convolution"
bottom: "conv3_2"
top: "conv3_3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
kernel_size: 3
pad: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3_3"
type: "ReLU"
bottom: "conv3_3"
top: "conv3_3"
}
layer {
name: "pool3"
type: "Pooling"
bottom: "conv3_3"
top: "pool3"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "conv4_1"
type: "Convolution"
bottom: "pool3"
top: "conv4_1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 512
kernel_size: 3
pad: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu4_1"
type: "ReLU"
bottom: "conv4_1"
top: "conv4_1"
}
layer {
name: "conv4_2"
type: "Convolution"
bottom: "conv4_1"
top: "conv4_2"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 512
kernel_size: 3
pad: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu4_2"
type: "ReLU"
bottom: "conv4_2"
top: "conv4_2"
}
layer {
name: "conv4_3"
type: "Convolution"
bottom: "conv4_2"
top: "conv4_3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 512
kernel_size: 3
pad: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu4_3"
type: "ReLU"
bottom: "conv4_3"
top: "conv4_3"
}
layer {
name: "pool4"
type: "Pooling"
bottom: "conv4_3"
top: "pool4"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "conv5_1"
type: "Convolution"
bottom: "pool4"
top: "conv5_1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 512
kernel_size: 3
pad: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu5_1"
type: "ReLU"
bottom: "conv5_1"
top: "conv5_1"
}
layer {
name: "conv5_2"
type: "Convolution"
bottom: "conv5_1"
top: "conv5_2"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 512
kernel_size: 3
pad: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu5_2"
type: "ReLU"
bottom: "conv5_2"
top: "conv5_2"
}
layer {
name: "conv5_3"
type: "Convolution"
bottom: "conv5_2"
top: "conv5_3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 512
kernel_size: 3
pad: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu5_3"
type: "ReLU"
bottom: "conv5_3"
top: "conv5_3"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5_3"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "fc6"
type: "InnerProduct"
bottom: "pool5"
top: "fc6"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 4096
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc6"
top: "fc6"
}
layer {
name: "drop6"
type: "Dropout"
bottom: "fc6"
top: "fc6"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc7"
type: "InnerProduct"
bottom: "fc6"
top: "fc7"
# Note that lr_mult can be set to 0 to disable any fine-tuning of this, and any other, layer
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 4096
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "relu7"
type: "ReLU"
bottom: "fc7"
top: "fc7"
}
layer {
name: "drop7"
type: "Dropout"
bottom: "fc7"
top: "fc7"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc8_flickr"
type: "InnerProduct"
bottom: "fc7"
top: "fc8_flickr"
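# This layer is new (40 outputs, one per subject) and starts from random weights, while the others are already trained.
# propagate_down: false stops gradients from flowing back into fc7, so the pretrained layers below are left unchanged.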
propagate_down: false
inner_product_param {
num_output: 40
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "accuracy"
type: "Accuracy"
bottom: "fc8_flickr"
bottom: "label"
top: "accuracy"
include {
phase: TEST
}
}
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "fc8_flickr"
bottom: "label"
top: "loss"
}
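The thirteen convolution layers above differ only in name, bottom blob, and num_output, so a file like this can also be generated instead of edited by hand. The following is a minimal sketch using plain string formatting (it does not use Caffe's own Python API); conv_relu_block and build_conv_stack are hypothetical helpers, and the compact one-line param blocks are equivalent to the multi-line form used above.

```python
CONV_TEMPLATE = """layer {{
  name: "{name}"
  type: "Convolution"
  bottom: "{bottom}"
  top: "{name}"
  param {{ lr_mult: 1 decay_mult: 1 }}
  param {{ lr_mult: 2 decay_mult: 0 }}
  convolution_param {{
    num_output: {num_output}
    kernel_size: 3
    pad: 1
    weight_filler {{ type: "gaussian" std: 0.01 }}
    bias_filler {{ type: "constant" value: 0 }}
  }}
}}
layer {{
  name: "{relu}"
  type: "ReLU"
  bottom: "{name}"
  top: "{name}"
}}
"""

POOL_TEMPLATE = """layer {{
  name: "{name}"
  type: "Pooling"
  bottom: "{bottom}"
  top: "{name}"
  pooling_param {{ pool: MAX kernel_size: 2 stride: 2 }}
}}
"""

# (stage number, number of conv layers in the stage, num_output)
STAGES = [(1, 2, 64), (2, 2, 128), (3, 3, 256), (4, 3, 512), (5, 3, 512)]


def conv_relu_block(name, bottom, num_output):
    """Return the prototxt text for one Convolution layer plus its ReLU."""
    relu = name.replace("conv", "relu")
    return CONV_TEMPLATE.format(name=name, bottom=bottom,
                                num_output=num_output, relu=relu)


def build_conv_stack(first_bottom="data"):
    """Return the prototxt text for conv1_1 through pool5."""
    blocks = []
    bottom = first_bottom
    for stage, count, num_output in STAGES:
        for index in range(1, count + 1):
            name = "conv%d_%d" % (stage, index)
            blocks.append(conv_relu_block(name, bottom, num_output))
            bottom = name  # the next layer reads this layer's output
        pool = "pool%d" % stage
        blocks.append(POOL_TEMPLATE.format(name=pool, bottom=bottom))
        bottom = pool
    return "".join(blocks)
```

Printing build_conv_stack() gives the section from conv1_1 to pool5; the data layers, the fully connected layers, and the loss and accuracy layers would still be written by hand.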
Copy models/finetune_flickr_style/solver.prototxt and adapt the copy to the current problem. The main changes are:
net: "models/finetune/train_val.prototxt"
test_iter: 100
test_interval: 100
# lr for fine-tuning should be lower than when starting from scratch
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
# stepsize should also be lower, as we're closer to being done
stepsize: 2000
display: 20
max_iter: 10000
momentum: 0.9
weight_decay: 0.0005
snapshot: 1000
snapshot_prefix: "models/finetune/finetune"
# uncomment the following to default to CPU mode solving
#solver_mode: CPU
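To see what these settings mean in practice: with lr_policy "step", Caffe uses a learning rate of base_lr * gamma^floor(iter / stepsize). The sketch below prints that schedule and estimates how many passes over the data the run makes, assuming the 240 training and 160 test images produced by the 6/4 split and the batch_size of 1 from the network definition. The function names are hypothetical.

```python
def step_lr(iteration, base_lr=0.001, gamma=0.1, stepsize=2000):
    """Learning rate under Caffe's "step" policy."""
    return base_lr * gamma ** (iteration // stepsize)


def epochs(max_iter, batch_size, num_images):
    """Number of passes over the data set made by max_iter iterations."""
    return max_iter * batch_size / float(num_images)


for it in (0, 2000, 4000, 9999):
    print("iter %5d  lr %.0e" % (it, step_lr(it)))
print("training epochs: %.1f" % epochs(10000, 1, 240))
print("test images per test pass: %d of 160" % (100 * 1))
```

With test_iter: 100 and a test batch size of 1, each test pass evaluates only 100 of the 160 test images; setting test_iter to 160 would cover all of them.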
Finally, fine-tune the model on your own data. The command is:
./build/tools/caffe train -solver models/finetune/solver.prototxt -weights models/vgg_face_caffe/VGG_FACE.caffemodel -gpu 0