[Deep Learning DL-PyTorch] 8. Inference and Validation

Inference and Validation

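After training a neural network, you can use it to make predictions. This is usually called inference, a term borrowed from statistics. However, neural networks tend to perform too well on the training data and then fail to generalize to data they have not seen. This is called overfitting, and it impairs inference performance. To test for overfitting during training, we measure performance on data that is not in the training set, called the validation set. While monitoring validation performance during training, we use regularization, such as dropout, to avoid overfitting.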

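The test set contains images similar to those in the training set. Typically, 10-20% of the original dataset is held out for testing and validation, and the rest is used for training.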
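As a rough sketch of such a split, assuming a toy `TensorDataset` of 1,000 random samples and an 80/10/10 ratio (both chosen here purely for illustration, not taken from the text), `torch.utils.data.random_split` can divide one dataset into training, validation, and test subsets:

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Toy stand-in dataset (hypothetical): 1000 fake 28x28 images with 10 classes
features = torch.randn(1000, 1, 28, 28)
targets = torch.randint(0, 10, (1000,))
dataset = TensorDataset(features, targets)

# Hold out 10% for validation and 10% for testing; train on the rest
n_total = len(dataset)
n_val = int(0.1 * n_total)
n_test = int(0.1 * n_total)
n_train = n_total - n_val - n_test

# A seeded generator makes the split reproducible
generator = torch.Generator().manual_seed(0)
train_set, val_set, test_set = random_split(
    dataset, [n_train, n_val, n_test], generator=generator
)

print(len(train_set), len(val_set), len(test_set))  # 800 100 100
```

The code later in this post takes a shortcut and monitors the validation loss on the Fashion-MNIST test split instead of carving out a separate validation set.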

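The goal of validation is to measure the model's performance on data outside the training set. What counts as performance is up to the developer. It is usually accuracy, the percentage of classes the network predicts correctly. Other options include precision and recall, and top-5 error rate. We will focus on accuracy here. The first step is a forward pass with one batch from the test set.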
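To make the accuracy calculation concrete, here is a minimal sketch that uses hand-written log-probabilities in place of a real model's output. The tensor values and labels are made up for illustration; the steps (exponentiate, take the top class, compare with the labels, average) mirror what the training loop later in this post does.

```python
import torch

# Hypothetical model output: log-probabilities for 4 samples over 3 classes
log_ps = torch.log(torch.tensor([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],
]))
labels = torch.tensor([0, 1, 2, 1])  # true classes; the last sample is misclassified

# Convert log-probabilities back to probabilities
ps = torch.exp(log_ps)

# topk(1) returns the highest probability and its class index for each row
top_p, top_class = ps.topk(1, dim=1)  # both have shape (4, 1)

# Reshape labels to (4, 1) so the comparison is element-wise, not broadcast to (4, 4)
equals = top_class == labels.view(*top_class.shape)

# Booleans must be cast to float before taking the mean
accuracy = equals.float().mean().item()
print(f"Accuracy: {accuracy:.2f}")  # Accuracy: 0.75
```

The `labels.view(*top_class.shape)` step matters: comparing a `(4, 1)` tensor with a `(4,)` tensor would broadcast to `(4, 4)` and give a meaningless result.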

1. Overfitting

If we look at the training and validation losses as the network trains, we can see a phenomenon called overfitting.


[Figure: overfitting.png, training and validation loss over training epochs]

The network learns the training set better and better, so the training loss keeps falling. However, it starts to have trouble generalizing to data outside the training set, so the validation loss rises. The ultimate goal of any deep learning model is to make predictions on new data, so we should aim for the lowest validation loss possible. One option is to use the version of the model with the lowest validation loss, here the one at around 8-10 training epochs. This strategy is called early stopping. In practice, you would save the model frequently while training and later choose the checkpoint with the lowest validation loss.

The most common method for reducing overfitting (other than early stopping) is dropout, where we randomly drop input units. This forces the network to share information between weights, increasing its ability to generalize to new data. Adding dropout in PyTorch is straightforward with the nn.Dropout module.
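The early stopping idea can be sketched as follows. This is only an illustration under stated assumptions: the model is a tiny stand-in `nn.Linear`, the validation losses are made-up numbers, and the training step itself is left out. What it shows is the bookkeeping: snapshot the weights whenever the validation loss improves, then restore the best snapshot at the end.

```python
import copy
import torch
from torch import nn

torch.manual_seed(0)

# Tiny stand-in model (hypothetical); any nn.Module works the same way
model = nn.Linear(4, 2)

# Made-up validation losses per epoch: they fall, then rise as overfitting sets in
val_losses = [0.90, 0.62, 0.51, 0.47, 0.49, 0.55, 0.63]

best_loss = float("inf")
best_epoch = None
best_state = None

for epoch, val_loss in enumerate(val_losses, start=1):
    # ... one epoch of training would update the model here ...

    # Keep a snapshot of the weights whenever the validation loss improves
    if val_loss < best_loss:
        best_loss = val_loss
        best_epoch = epoch
        best_state = copy.deepcopy(model.state_dict())

# Restore the weights from the epoch with the lowest validation loss
model.load_state_dict(best_state)
print(f"Best epoch: {best_epoch}, validation loss: {best_loss:.2f}")
# Best epoch: 4, validation loss: 0.47
```

For long training runs you would more likely write each snapshot to disk with `torch.save(model.state_dict(), path)` rather than keep it in memory; the in-memory copy is used here only to keep the sketch self-contained.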

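During training we want dropout to prevent overfitting, but during inference we want to use the entire network. So we need to turn dropout off during validation, testing, and whenever we use the network to make predictions. To do this, call model.eval(), which puts the model in evaluation mode so that the dropout probability is effectively 0. Calling model.train() puts the model back in training mode and turns dropout on again. In general, the validation loop follows this pattern: turn off gradients, set the model to evaluation mode, compute the validation loss and metric, then set the model back to training mode.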

# turn off gradients
with torch.no_grad():
    
    # set model to evaluation mode
    model.eval()
    
    # validation pass here
    for images, labels in testloader:
        ...

# set model back to train mode
model.train()
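To see what switching modes actually does to a dropout layer, here is a small sketch on a standalone `nn.Dropout` module. The drop probability of 0.5 and the all-ones input are arbitrary choices for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)

dropout = nn.Dropout(p=0.5)
x = torch.ones(1000)

# Training mode: roughly half the entries are zeroed,
# and the survivors are scaled by 1 / (1 - p) = 2
dropout.train()
train_out = dropout(x)

# Evaluation mode: dropout passes its input through unchanged
dropout.eval()
eval_out = dropout(x)

print("zeros in train mode:", int((train_out == 0).sum()))
print("values kept in train mode:", train_out[train_out != 0].unique().tolist())
print("eval output unchanged:", torch.equal(eval_out, x))
```

The rescaling in training mode keeps the expected activation the same in both modes, which is why no extra correction is needed at inference time.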

2. Inference

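Now that the model is trained, we can use it for inference. We have done this before, but now we need to remember to put the model in inference mode with model.eval(). You will also want to turn off autograd with the torch.no_grad() context.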

%matplotlib inline
%config InlineBackend.figure_format = 'retina'

import matplotlib.pyplot as plt

import torch
from torchvision import datasets, transforms

# Define a transform to normalize the data
transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.5,), (0.5,))])
# Download and load the training data
trainset = datasets.FashionMNIST('~/.pytorch/F_MNIST_data/', download=True, train=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)

# Download and load the test data
testset = datasets.FashionMNIST('~/.pytorch/F_MNIST_data/', download=True, train=False, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=64, shuffle=True)

from torch import nn, optim
import torch.nn.functional as F
##  Define your model with dropout added
class Classifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 256)
        self.fc2 = nn.Linear(256, 128)
        self.fc3 = nn.Linear(128, 64)
        self.fc4 = nn.Linear(64, 10)
        
        #Dropout module with 0.2 drop probability
        self.dropout = nn.Dropout(p = 0.2)
        
    def forward(self, x):
        x = x.view(x.shape[0], -1)
        
        #with dropout
        x = self.dropout(F.relu(self.fc1(x)))
        x = self.dropout(F.relu(self.fc2(x)))
        x = self.dropout(F.relu(self.fc3(x)))
        
        x = F.log_softmax(self.fc4(x), dim=1)
        
        return x
##Train your model with dropout, and monitor the training progress with the validation loss and accuracy
model = Classifier()
criterion = nn.NLLLoss()
optimizer = optim.Adam(model.parameters(), lr=0.003)

epochs = 30
steps = 0

train_losses, test_losses = [],[]
for e in range(epochs):
    running_loss = 0
    
    model.train()
    for images, labels in trainloader:
        optimizer.zero_grad()
        
        log_ps = model(images)
        loss = criterion(log_ps, labels)
        loss.backward()
        optimizer.step()
        
        running_loss += loss.item()
    
    else:
        test_loss = 0
        accuracy = 0
        with torch.no_grad():
            model.eval()
            for images, labels in testloader:
                log_ps = model(images)
                loss = criterion(log_ps, labels)
                test_loss += loss.item()
                
                ps = torch.exp(log_ps)
                top_p, top_class = ps.topk(1, dim=1)
                equals = top_class == labels.view(*top_class.shape)
                accuracy += torch.mean(equals.type(torch.FloatTensor)).item()
         
        train_losses.append(running_loss/len(trainloader))
        test_losses.append(test_loss/len(testloader))
        print("Epoch: {}/{}..".format(e+1, epochs),
              "Training Loss: {:.3f}..".format(running_loss/len(trainloader)),
              "Test Loss: {:.3f}..".format(test_loss/len(testloader)),
              "Test Accuracy: {:.3f}".format(accuracy/len(testloader))
        )

 
        
plt.plot(train_losses, label='Training loss')
plt.plot(test_losses, label='Validation loss')
plt.legend(frameon=False)
# Import helper module (should be in the repo)
import helper

# Test out your network!

model.eval()

dataiter = iter(testloader)
images, labels = next(dataiter)
img = images[0]
# Convert 2D image to 1D vector
img = img.view(1, 784)

# Calculate the class probabilities (softmax) for img
with torch.no_grad():
    output = model(img)

ps = torch.exp(output)

# Plot the image and probabilities
helper.view_classify(img.view(1, 28, 28), ps, version='Fashion')
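The `helper` module comes from the course repository and may not be on hand. As a minimal sketch of reading the prediction without it, the class probabilities can be mapped to a Fashion-MNIST label name directly. The `ps` tensor below is a made-up example standing in for the model output, and `class_names` is a list introduced here for illustration.

```python
import torch

# Fashion-MNIST class names, indexed by label
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

# Hypothetical class probabilities for one image, shape (1, 10)
ps = torch.tensor([[0.01, 0.02, 0.05, 0.02, 0.10, 0.00, 0.05, 0.00, 0.70, 0.05]])

# Pick the most likely class and look up its name
top_p, top_class = ps.topk(1, dim=1)
predicted = class_names[top_class.item()]
print(f"Predicted: {predicted} (p = {top_p.item():.2f})")
# Predicted: Bag (p = 0.70)
```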

