The content and code of this article come from *Deep Learning on Keras*, a very good book. If your English is up to it, I recommend reading the book directly; if you are short on time, this series of articles may help instead. The articles are my own translation plus some thoughts of my own; my level is limited and there are surely shortcomings, so please point them out. The translation is copyrighted; if you want to repost it, please contact me first.
My own field is numerical computation, and I plan to move toward combining deep learning with computational problems; if you work in a related area, you are welcome to get in touch.
Articles in this series:
1. Build your first neural network
2. Where to find a trained network
3. [Keras in practice] Boston housing-price prediction
4. The Keras functional API
5. Using Keras callbacks
6. Machine learning basics I: the four branches of machine learning
7. Machine learning basics II: evaluating machine-learning models
8. Machine learning basics III: data preprocessing, feature engineering and feature learning
9. Machine learning basics IV: overfitting and underfitting
10. Machine learning basics V: the general workflow of machine learning
11. Deep learning for computer vision: an introduction to convolutional neural networks
12. Deep learning for computer vision: training a convnet from scratch
13. Deep learning for computer vision: using a pretrained network
14. Neural networks for computer vision: visualizing what convnets learn
為什么需要API
After the earlier classification and regression examples, it might look as if we can already handle everything in the world. But don't get too excited: so far we have only introduced the method of stacking layers one after another (the Sequential model). That method is very useful, but it becomes powerless in the following three situations:
- models with multiple inputs
- models with multiple outputs
- models where non-adjacent layers are connected (graph-like topologies)
Multiple inputs
The book gives a price-prediction example: we have three information sources from which to obtain price-related information:
- 用戶提交的數(shù)據(jù)
- 顧客口述的文本資料
- 相關(guān)圖片

Of course, we could train three separate models and then take a weighted average, but we have no guarantee that the three input sources are independent of one another; quite possibly, crossing their information gives a better result.
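To make this concrete, here is a minimal sketch of such a three-input model built with the functional API. The input shapes and layer sizes are hypothetical, invented purely for illustration; they are not from the book:
from keras import Input, layers
from keras.models import Model
# Three hypothetical information sources (shapes invented for illustration)
user_data = Input(shape=(32,), name='user_data')      # data submitted by the user
text_data = Input(shape=(100,), name='text_data')     # encoded text description
image_data = Input(shape=(64,), name='image_data')    # extracted image features
# One small branch per source
u = layers.Dense(16, activation='relu')(user_data)
t = layers.Dense(16, activation='relu')(text_data)
i = layers.Dense(16, activation='relu')(image_data)
# Cross the three sources instead of averaging three separate models
merged = layers.concatenate([u, t, i], axis=-1)
price = layers.Dense(1, name='price')(merged)
model = Model([user_data, text_data, image_data], price)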
Multiple outputs
For example, given data about a collection of books, we may want to predict both the genre and the publication date. These, too, could be trained separately, but again we cannot guarantee their independence, so training them together may work better.

類似于圖的結(jié)構(gòu)


Figure 1 and Figure 2 (not reproduced here) show, respectively, a module with several parallel convolution branches and a residual network. Both are beyond what our earlier method can handle, which is why we now formally introduce the functional API for building these slightly more complex networks.
How to use the functional API
from keras import Input, layers
# This is a tensor.
input_tensor = Input(shape=(32,))
# A layer is a function.
dense = layers.Dense(32, activation='relu')
# A layer may be called on a tensor, and it returns a tensor.
output_tensor = dense(input_tensor)
Here we have defined dense as a function: input_tensor is a tensor, and applying dense to it outputs another tensor.
from keras.models import Sequential, Model
from keras import layers
from keras import Input
# A Sequential model, which you already know all about.
seq_model = Sequential()
seq_model.add(layers.Dense(32, activation='relu', input_shape=(64,)))
seq_model.add(layers.Dense(32, activation='relu'))
seq_model.add(layers.Dense(10, activation='softmax'))
# Its functional equivalent
input_tensor = Input(shape=(64,))
x = layers.Dense(32, activation='relu')(input_tensor)
x = layers.Dense(32, activation='relu')(x)
output_tensor = layers.Dense(10, activation='softmax')(x)
# The Model class turns an input tensor and output tensor into a model
model = Model(input_tensor, output_tensor)
# Let's look at it!
model.summary()
Here we use model.summary() to inspect the network we have built:
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         (None, 64)                0
_________________________________________________________________
dense_1 (Dense)              (None, 32)                2080
_________________________________________________________________
dense_2 (Dense)              (None, 32)                1056
_________________________________________________________________
dense_3 (Dense)              (None, 10)                330
=================================================================
Total params: 3,466
Trainable params: 3,466
Non-trainable params: 0
_________________________________________________________________
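A Model built this way is compiled and trained exactly like a Sequential one. As a quick sketch, following the book, with random dummy data whose shapes simply match the (None, 64) input and (None, 10) output above:
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
import numpy as np
# Dummy data, just to have something to fit on
x_train = np.random.random((1000, 64))
y_train = np.random.random((1000, 10))
model.fit(x_train, y_train, epochs=10, batch_size=128)
score = model.evaluate(x_train, y_train)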
[Example] A question-answering system

構(gòu)建問答系統(tǒng)
這個網(wǎng)絡(luò)會根據(jù)輸入的問題和給的參考文獻來自己找答案,通過lstm層處理以后用concatenate,將二者合起來訓(xùn)練得出答案,屬于典型的多輸入問題。
from keras.models import Model
from keras import layers
from keras import Input
text_vocabulary_size = 10000
question_vocabulary_size = 10000
answer_vocabulary_size = 500
# Our text input is a variable-length sequence of integers.
# Note that we can optionally name our inputs!
text_input = Input(shape=(None,), dtype='int32', name='text')
# Which we embed into a sequence of vectors of size 64
embedded_text = layers.Embedding(text_vocabulary_size, 64)(text_input)
# Which we encoded in a single vector via a LSTM
encoded_text = layers.LSTM(32)(embedded_text)
# Same process (with different layer instances) for the question
question_input = Input(shape=(None,), dtype='int32', name='question')
embedded_question = layers.Embedding(question_vocabulary_size, 32)(question_input)
encoded_question = layers.LSTM(16)(embedded_question)
# We then concatenate the encoded question and encoded text
concatenated = layers.concatenate([encoded_text, encoded_question], axis=-1)
# And we add a softmax classifier on top
answer = layers.Dense(answer_vocabulary_size, activation='softmax')(concatenated)
# At model instantiation, we specify the two inputs and the output:
model = Model([text_input, question_input], answer)
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['acc'])
This is easy to understand: we treat layers as functions, define the two input tensors separately, and then, following the order of the graph, call each layer as a function with the previous layer's output as its argument.
生成隨機數(shù)據(jù)作為輸入輸出
import numpy as np
from keras.utils import to_categorical
num_samples = 1000
max_length = 100
# Let's generate some dummy Numpy data
text = np.random.randint(1, text_vocabulary_size, size=(num_samples, max_length))
question = np.random.randint(1, question_vocabulary_size, size=(num_samples, max_length))
# Answers are one-hot encoded, not integers
answers = np.random.randint(0, answer_vocabulary_size, size=num_samples)
answers = to_categorical(answers, answer_vocabulary_size)
# Fitting using a list of inputs
model.fit([text, question], answers, epochs=10, batch_size=128)
# Fitting using a dictionary of inputs (only if inputs were named!)
model.fit({'text': text, 'question': question}, answers,
          epochs=10, batch_size=128)
That is all it takes: generate some random data and feed it to the network.
[Example] The multi-output case

from keras import layers
from keras import Input
from keras.models import Model
vocabulary_size = 50000
num_income_groups = 10
posts_input = Input(shape=(None,), dtype='int32', name='posts')
embedded_posts = layers.Embedding(vocabulary_size, 256)(posts_input)
x = layers.Conv1D(128, 5, activation='relu')(embedded_posts)
x = layers.MaxPooling1D(5)(x)
x = layers.Conv1D(256, 5, activation='relu')(x)
x = layers.Conv1D(256, 5, activation='relu')(x)
x = layers.MaxPooling1D(5)(x)
x = layers.Conv1D(256, 5, activation='relu')(x)
x = layers.Conv1D(256, 5, activation='relu')(x)
x = layers.GlobalMaxPooling1D()(x)
x = layers.Dense(128, activation='relu')(x)
# Note that we are giving names to the output layers.
age_prediction = layers.Dense(1, name='age')(x)
income_prediction = layers.Dense(num_income_groups, activation='softmax', name='income')(x)
gender_prediction = layers.Dense(1, activation='sigmoid', name='gender')(x)
model = Model(posts_input, [age_prediction, income_prediction, gender_prediction])
上面代碼也沒什么難度,可以理解為除了最后輸出那一層加的不一樣以外,之前提取特征呀什么的都一樣,接下來就是compile的選取:
model.compile(optimizer='rmsprop',
              loss=['mse', 'categorical_crossentropy', 'binary_crossentropy'])
# Equivalent (only possible if you gave names to the output layers!):
model.compile(optimizer='rmsprop',
              loss={'age': 'mse',
                    'income': 'categorical_crossentropy',
                    'gender': 'binary_crossentropy'})
Two ways of writing the losses are given here; the second one works only if the output layers were given names.
model.compile(optimizer='rmsprop',
              loss=['mse', 'categorical_crossentropy', 'binary_crossentropy'],
              loss_weights=[0.25, 1., 10.])
# Equivalent (only possible if you gave names to the output layers!):
model.compile(optimizer='rmsprop',
              loss={'age': 'mse',
                    'income': 'categorical_crossentropy',
                    'gender': 'binary_crossentropy'},
              loss_weights={'age': 0.25,
                            'income': 1.,
                            'gender': 10.})
When we have preferences, we assign weights to the losses; here the weights say we care most about the gender prediction (the large weight also compensates for binary crossentropy values usually being much smaller in scale than the MSE used for age).
# posts, age_targets, income_targets and gender_targets are assumed to be Numpy arrays
model.fit(posts, [age_targets, income_targets, gender_targets],
          epochs=10, batch_size=64)
# Equivalent (only possible if you gave names to the output layers!):
model.fit(posts, {'age': age_targets,
                  'income': income_targets,
                  'gender': gender_targets},
          epochs=10, batch_size=64)
喂數(shù)據(jù)。
圖模樣的網(wǎng)絡(luò)構(gòu)建
Although the functional API is a powerful tool with which we can build almost any graph, there is one class of graph it cannot handle: graphs with cycles. So everything we discuss below is a directed acyclic graph.
from keras import layers
# We assume the existence of a 4D input tensor `x`
# Every branch has the same stride value (2), which is necessary to keep all
# branch outputs the same size, so as to be able to concatenate them;
# `padding='same'` is also needed so the spatial sizes match exactly.
branch_a = layers.Conv2D(128, 1, activation='relu', strides=2, padding='same')(x)
# In this branch, the striding occurs in the spatial convolution layer
branch_b = layers.Conv2D(128, 1, activation='relu', padding='same')(x)
branch_b = layers.Conv2D(128, 3, activation='relu', strides=2, padding='same')(branch_b)
# In this branch, the striding occurs in the average pooling layer
# (pooling layers take no `activation` argument)
branch_c = layers.AveragePooling2D(3, strides=2, padding='same')(x)
branch_c = layers.Conv2D(128, 3, activation='relu', padding='same')(branch_c)
branch_d = layers.Conv2D(128, 1, activation='relu', padding='same')(x)
branch_d = layers.Conv2D(128, 3, activation='relu', padding='same')(branch_d)
branch_d = layers.Conv2D(128, 3, activation='relu', strides=2, padding='same')(branch_d)
# Finally, we concatenate the branch outputs to obtain the module output
output = layers.concatenate([branch_a, branch_b, branch_c, branch_d], axis=-1)
Honestly, once you have grasped the idea of layers as functions, this looks like child's play; just pay attention to how concatenate is used, as in the small sketch below.
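To see concretely what concatenate does, here is a tiny sketch with made-up shapes: it joins tensors along the given axis, here the channel axis, so two 128-channel inputs become one 256-channel output. All other axes must match.
from keras import Input, layers
from keras import backend as K
a = Input(shape=(16, 16, 128))
b = Input(shape=(16, 16, 128))
# Only the concatenation axis grows; the others must agree
c = layers.concatenate([a, b], axis=-1)
print(K.int_shape(c))  # (None, 16, 16, 256)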
殘差網(wǎng)絡(luò)的構(gòu)建
from keras import layers
# We assume the existence of a 4D input tensor `x`
x = ...
y = layers.Conv2D(128, 3, activation='relu', padding='same')(x)
y = layers.Conv2D(128, 3, activation='relu', padding='same')(y)
y = layers.MaxPooling2D(2, strides=2)(y)
# We use a 1x1 convolution to linearly downsample
# the original `x` tensor to the same shape as `y`
residual = layers.Conv2D(128, 1, strides=2, padding='same')(x)
# We add the residual tensor back to the output features
y = layers.add([y, residual])
關(guān)鍵只要會用add就ok了
We also run into situations where a layer needs to be shared, and Keras makes this easy to implement too.
from keras import layers
from keras import Input
from keras.models import Model
# We instantiate a single LSTM layer, once
lstm = layers.LSTM(32)
# Building the left branch of the model
# -------------------------------------
# Inputs are variable-length sequences of vectors of size 128
left_input = Input(shape=(None, 128))
left_output = lstm(left_input)
# Building the right branch of the model
# --------------------------------------
right_input = Input(shape=(None, 128))
# When we call an existing layer instance,
# we are reusing its weights
right_output = lstm(right_input)
# Building the classifier on top
# ------------------------------
merged = layers.concatenate([left_output, right_output], axis=-1)
predictions = layers.Dense(1, activation='sigmoid')(merged)
# Instantiating and training the model
# ------------------------------------
model = Model([left_input, right_input], predictions)
# When you train such a model, the weights of the `lstm` layer
# are updated based on both inputs.
model.fit([left_data, right_data], targets)
In essence, you define the layer to be shared as a single layer instance in advance, and then call it like a function wherever it needs to be inserted.
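For the model.fit call above to actually run, the model still needs to be compiled and the arrays defined. Here is a minimal sketch with hypothetical dummy data; the sample count and sequence length are invented, not from the book:
import numpy as np
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])
# Hypothetical dummy data: 1000 pairs of sequences of 20 vectors of size 128
left_data = np.random.random((1000, 20, 128))
right_data = np.random.random((1000, 20, 128))
targets = np.random.randint(0, 2, size=(1000, 1))
With these in place, calling model.fit([left_data, right_data], targets) updates the weights of the single shared lstm layer based on both inputs.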