Chapter 1: Regression
1. Getting the dataset
Fashion-MNIST is a 10-class clothing-classification dataset. It is fairly small, which is why we use it here.
import torch
import torchvision
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
import time
import sys
sys.path.append("D:\anaconda\Lib") # 為了導(dǎo)?上層?錄的d2lzh_pytorch
import d2lzh_pytorch as d2l
mnist_train = torchvision.datasets.FashionMNIST(root='D:/program/vs code/動(dòng)手學(xué)/Datasets/FashionMNIST', train=True, download=True, transform=transforms.ToTensor())
mnist_test = torchvision.datasets.FashionMNIST(root='D:/program/vs code/動(dòng)手學(xué)/Datasets/FashionMNIST', train=False, download=True, transform=transforms.ToTensor())
print(type(mnist_train))
print(len(mnist_train), len(mnist_test))
feature, label = mnist_train[0]
print(feature.shape, label)
d2lzh_pytorch bundles the helper functions used throughout these experiments.
Output:
<class 'torchvision.datasets.mnist.FashionMNIST'>
60000 10000
torch.Size([1, 28, 28]) 9
feature.shape is Channel x Height x Width. There are 10 classes, indexed from 0, so the printed label 9 is the tenth class.
Fashion-MNIST contains 10 categories in total. We can map the numeric labels to text labels using the helper defined earlier in d2lzh_pytorch:
def get_fashion_mnist_labels(labels):
    text_labels = ['t-shirt', 'trouser', 'pullover', 'dress', 'coat',
                   'sandal', 'shirt', 'sneaker', 'bag', 'ankle boot']
    return [text_labels[int(i)] for i in labels]
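For example, passing a few numeric labels returns the corresponding text labels:
print(get_fashion_mnist_labels([0, 5, 9]))  # ['t-shirt', 'sandal', 'ankle boot']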
Define a function that draws multiple images and their labels in a single row.
def show_fashion_mnist(images, labels):
    d2l.use_svg_display()
    # the underscore denotes a variable we ignore (do not use)
    _, figs = plt.subplots(1, len(images), figsize=(12, 12))
    for f, img, lbl in zip(figs, images, labels):
        f.imshow(img.view((28, 28)).numpy())
        f.set_title(lbl)
        f.axes.get_xaxis().set_visible(False)
        f.axes.get_yaxis().set_visible(False)
    plt.show()
打?。?/p>
X, y = [], []  # create two new lists
for i in range(10):
    X.append(mnist_train[i][0])  # append each image to the end of the list
    y.append(mnist_train[i][1])
show_fashion_mnist(X, get_fashion_mnist_labels(y))  # show the first 10 training examples with their converted labels

2. Reading mini-batches
batch_size = 256
if sys.platform.startswith('win'):
    num_workers = 0  # 0 means no extra worker processes are used to speed up data loading
else:
    num_workers = 4
train_iter = torch.utils.data.DataLoader(mnist_train, batch_size=batch_size, shuffle=True, num_workers=num_workers)
test_iter = torch.utils.data.DataLoader(mnist_test, batch_size=batch_size, shuffle=False, num_workers=num_workers)
start = time.time()
for X, y in train_iter:
    continue
print('%.2f sec' % (time.time() - start))
This subsection is straightforward: it shows how to load the data. When loading, you can set the batch size, the number of worker processes, whether memory is pinned for faster transfer to the GPU, and so on.
Parameter list of torch.utils.data.DataLoader (only a handful are used in practice):
class torch.utils.data.DataLoader(dataset, batch_size=1, shuffle=False, sampler=None, batch_sampler=None, num_workers=0, collate_fn=<function default_collate>, pin_memory=False, drop_last=False, timeout=0, worker_init_fn=None)
Meanings (a short sketch after this list exercises the most common ones):
1. dataset (Dataset):
The input dataset object. The name suggests a database-style container (C# also has a DataSet class); in PyTorch it is a torch.utils.data.Dataset, the source of the raw samples.
2. batch_size (int):
How many samples are fed in per step; defaults to 1. PyTorch does not feed the model one sample at a time during training (too inefficient) but one batch at a time; this parameter sets the batch size. Setting it to 1 recovers sample-by-sample processing (the PyTorch default).
3. shuffle (bool):
Whether to reshuffle the data at every epoch; defaults to False. Shuffling the input order makes the samples more independent of one another, but if the data has sequential structure, do not set it to True.
4. collate_fn (callable):
Merges a list of individual samples into a mini-batch of tensors; the default, default_collate, stacks the sample tensors along a new batch dimension. The default is usually fine.
5. batch_sampler (Sampler):
Batch sampling strategy; defaults to None. It yields a batch of indices at a time (note: indices, not data). It is mutually exclusive with batch_size, shuffle, sampler, and drop_last, because it takes over the entire batching logic itself.
6. sampler (Sampler):
Defaults to None. Draws input samples from the dataset according to a user-defined strategy. If a sampler is specified, shuffle must be False.
7. num_workers (int):
Defaults to 0. The number of subprocesses used to load data; 0 means the data is loaded in the main process. Note that this value must be non-negative.
8. pin_memory (bool):
Defaults to False. If True, tensors are copied into page-locked (pinned) host memory before being returned, which speeds up subsequent transfers to the GPU.
9. drop_last (bool):
Defaults to False. With a fixed batch_size, the last batch may be smaller than the rest; this flag controls whether that incomplete batch is dropped.
10. timeout (numeric):
Defaults to 0. Sets the timeout for collecting a batch from the workers; if no data arrives within this time, an error is raised. The value must be non-negative.
11. worker_init_fn (callable):
Defaults to None. If not None, it is called in each worker subprocess with the worker id as its argument, after seeding and before data loading.
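Most of these can be left at their defaults. A minimal sketch exercising the commonly used ones on the mnist_train dataset defined above (the pin_memory and drop_last values here are chosen only for illustration):
loader = torch.utils.data.DataLoader(
    mnist_train,
    batch_size=256,    # samples per batch
    shuffle=True,      # reshuffle at every epoch
    num_workers=0,     # load in the main process (safe on Windows)
    pin_memory=True,   # page-locked memory for faster host-to-GPU copies
    drop_last=True)    # discard the final, smaller batch
for X, y in loader:
    print(X.shape, y.shape)  # torch.Size([256, 1, 28, 28]) torch.Size([256])
    break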
Data-loading module
def load_data_fashion_mnist(batch_size, resize=None, root='~/Datasets/FashionMNIST'):
    """Download the fashion mnist dataset and then load into memory."""
    trans = []
    if resize:
        trans.append(torchvision.transforms.Resize(size=resize))
    trans.append(torchvision.transforms.ToTensor())
    transform = torchvision.transforms.Compose(trans)
    mnist_train = torchvision.datasets.FashionMNIST(root=root, train=True, download=True, transform=transform)
    mnist_test = torchvision.datasets.FashionMNIST(root=root, train=False, download=True, transform=transform)
    if sys.platform.startswith('win'):
        num_workers = 0  # 0 means no extra worker processes are used to speed up data loading
    else:
        num_workers = 4
    train_iter = torch.utils.data.DataLoader(mnist_train, batch_size=batch_size, shuffle=True, num_workers=num_workers)
    test_iter = torch.utils.data.DataLoader(mnist_test, batch_size=batch_size, shuffle=False, num_workers=num_workers)
    return train_iter, test_iter
This module simply returns train_iter and test_iter; the function doing the real work is torch.utils.data.DataLoader, covered in the previous section.
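Typical usage (the resize value here is only an example):
train_iter, test_iter = load_data_fashion_mnist(256, resize=96)
for X, y in train_iter:
    print(X.shape)  # torch.Size([256, 1, 96, 96])
    break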
3. softmax
import torch
import torchvision
import numpy as np
import matplotlib.pyplot as plt
import sys
sys.path.append(r"D:\anaconda\Lib")  # so that d2lzh_pytorch in the parent directory can be imported
import d2lzh_pytorch as d2l
batch_size = 256
train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size,root="D:/program/vs code/動(dòng)手學(xué)/Datasets/FashionMNIST")
num_inputs = 784  # the model's input vector has length 28*28 = 784
num_outputs = 10  # there are 10 classes in total
W = torch.tensor(np.random.normal(0, 0.01, (num_inputs, num_outputs)), dtype=torch.float)  # normal distribution; the arguments are loc, scale, size
b = torch.zeros(num_outputs, dtype=torch.float)  # initialize the bias to all zeros
W.requires_grad_(requires_grad=True)  # True so that gradients are computed for this tensor
b.requires_grad_(requires_grad=True)
def softmax(X):
    X_exp = X.exp()  # exponentiate every element
    partition = X_exp.sum(dim=1, keepdim=True)  # sum the elements of each row (dim=1), keeping both dimensions in the result (keepdim=True)
    return X_exp / partition  # broadcasting divides every row by its sum
X = torch.rand((2, 5))
X_prob = softmax(X)
print(X_prob, X_prob.sum(dim=1))


Output:
tensor([[0.1894, 0.2517, 0.1524, 0.2252, 0.1813],
[0.1592, 0.1721, 0.2165, 0.1756, 0.2765]]) tensor([1., 1.])
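The notes stop short of assembling W, b, and softmax into a model. Following the book's pattern, the softmax-regression net would look roughly like this (a sketch, not code from the original notes):
def net(X):
    # flatten each 1 x 28 x 28 image into a length-784 row, then apply the affine map and softmax
    return softmax(torch.mm(X.view((-1, num_inputs)), W) + b)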
A simple plot to observe the shape of the softmax output:
x = torch.linspace(-10, 10, 200)
y = softmax(x.view(1, -1))  # the softmax above expects a 2-D input, so pass a single row
print(x, y)
plt.plot(x.numpy(), y.view(-1).numpy())
plt.show()

4. Training
Data loading
import torch
from torch import nn
from torch.nn import init
import numpy as np
import torchvision
import torchvision.transforms as transforms
import sys
sys.path.append("D:\anaconda\Lib")
import d2lzh_pytorch as d2l
# data-loading section
batch_size = 256
num_workers = 0
mnist_train = torchvision.datasets.FashionMNIST(root='D:/program/vs code/動(dòng)手學(xué)/Datasets/FashionMNIST', train=True, download=True, transform=transforms.ToTensor())
mnist_test = torchvision.datasets.FashionMNIST(root='D:/program/vs code/動(dòng)手學(xué)/Datasets/FashionMNIST', train=False, download=True, transform=transforms.ToTensor())
train_iter = torch.utils.data.DataLoader(mnist_train, batch_size=batch_size, shuffle=True, num_workers=num_workers)
test_iter = torch.utils.data.DataLoader(mnist_test, batch_size=batch_size, shuffle=False, num_workers=num_workers)
num_inputs = 784
num_outputs = 10
Model definition
This is the simplest possible linear network:
class LinearNet(nn.Module):
    def __init__(self, num_inputs, num_outputs):
        super(LinearNet, self).__init__()
        self.linear = nn.Linear(num_inputs, num_outputs)
    def forward(self, x):  # x shape: (batch, 1, 28, 28)
        y = self.linear(x.view(x.shape[0], -1))
        return y
net = LinearNet(num_inputs, num_outputs)
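Equivalently, the flattening step can be moved into its own layer so the whole model fits in a Sequential container (a sketch; FlattenLayer is an assumed name here, mirroring the helper the d2l book defines):
class FlattenLayer(nn.Module):
    def forward(self, x):
        return x.view(x.shape[0], -1)  # keep the batch dimension, flatten the rest

net_seq = nn.Sequential(FlattenLayer(), nn.Linear(num_inputs, num_outputs))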
Parameter initialization
# randomly initialize the weights from a normal distribution with mean 0 and standard deviation 0.01
init.normal_(net.linear.weight, mean=0, std=0.01)
init.constant_(net.linear.bias, val=0)
loss = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(net.parameters(), lr=0.1)
num_epochs = 5
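Note that nn.CrossEntropyLoss fuses LogSoftmax and NLLLoss, so the network outputs raw logits rather than probabilities. A quick equivalence check (the logits and targets below are made up for illustration):
logits = torch.randn(3, 10)  # fake batch: 3 samples, 10 classes
target = torch.tensor([1, 4, 9])
manual = -torch.log_softmax(logits, dim=1)[range(3), target].mean()
print(torch.allclose(manual, nn.CrossEntropyLoss()(logits, target)))  # True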
Accuracy computation module
# evaluates classification accuracy
def evaluate_accuracy(data_iter, net):
    acc_sum, n = 0.0, 0
    for X, y in data_iter:
        # argmax(dim=1) returns, for each row of the output matrix, the column index of the
        # largest value, i.e. the predicted class label
        acc_sum += (net(X).argmax(dim=1) == y).float().sum().item()
        n += y.shape[0]
    return acc_sum / n
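A tiny demonstration of the argmax step (the scores are made up):
scores = torch.tensor([[0.1, 0.8, 0.1],
                       [0.7, 0.2, 0.1]])
print(scores.argmax(dim=1))  # tensor([1, 0]): the predicted class for each row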
Training module
def train_ch3(net, train_iter, test_iter, loss, num_epochs, batch_size,
              params=None, lr=None, optimizer=None):
    for epoch in range(num_epochs):
        train_l_sum, train_acc_sum, n = 0.0, 0.0, 0
        for X, y in train_iter:
            y_hat = net(X)  # forward pass
            l = loss(y_hat, y).sum()
            # zero the gradients
            if optimizer is not None:
                optimizer.zero_grad()
            elif params is not None and params[0].grad is not None:
                for param in params:
                    param.grad.data.zero_()
            l.backward()  # backward pass: compute the gradients
            if optimizer is None:
                d2l.sgd(params, lr, batch_size)
            else:
                optimizer.step()
            # accumulate loss and accuracy statistics
            train_l_sum += l.item()
            train_acc_sum += (y_hat.argmax(dim=1) == y).sum().item()
            n += y.shape[0]
        test_acc = evaluate_accuracy(test_iter, net)
        print('epoch %d, loss %.4f, train acc %.3f, test acc %.3f'
              % (epoch + 1, train_l_sum / n, train_acc_sum / n, test_acc))
train_ch3(net, train_iter, test_iter, loss, num_epochs, batch_size, None, None, optimizer)
epoch 1, loss 0.0031, train acc 0.751, test acc 0.797
epoch 2, loss 0.0022, train acc 0.815, test acc 0.809
epoch 3, loss 0.0021, train acc 0.826, test acc 0.816
epoch 4, loss 0.0020, train acc 0.832, test acc 0.820
epoch 5, loss 0.0019, train acc 0.837, test acc 0.796
epoch 6, loss 0.0019, train acc 0.839, test acc 0.826
epoch 7, loss 0.0018, train acc 0.842, test acc 0.830
epoch 8, loss 0.0018, train acc 0.845, test acc 0.831
epoch 9, loss 0.0018, train acc 0.846, test acc 0.825
epoch 10, loss 0.0017, train acc 0.849, test acc 0.832
epoch 11, loss 0.0017, train acc 0.850, test acc 0.835
epoch 12, loss 0.0017, train acc 0.850, test acc 0.837
epoch 13, loss 0.0017, train acc 0.852, test acc 0.832
epoch 14, loss 0.0017, train acc 0.853, test acc 0.840
epoch 15, loss 0.0017, train acc 0.853, test acc 0.838
epoch 16, loss 0.0017, train acc 0.854, test acc 0.803
epoch 17, loss 0.0017, train acc 0.855, test acc 0.838
epoch 18, loss 0.0017, train acc 0.855, test acc 0.839
epoch 19, loss 0.0017, train acc 0.856, test acc 0.835
epoch 20, loss 0.0016, train acc 0.857, test acc 0.833
epoch 21, loss 0.0016, train acc 0.857, test acc 0.840
epoch 22, loss 0.0016, train acc 0.857, test acc 0.838
epoch 23, loss 0.0016, train acc 0.858, test acc 0.843
epoch 24, loss 0.0016, train acc 0.858, test acc 0.840
epoch 25, loss 0.0016, train acc 0.859, test acc 0.838
epoch 26, loss 0.0016, train acc 0.859, test acc 0.839
epoch 27, loss 0.0016, train acc 0.859, test acc 0.842
epoch 28, loss 0.0016, train acc 0.860, test acc 0.840
epoch 29, loss 0.0016, train acc 0.860, test acc 0.840
epoch 30, loss 0.0016, train acc 0.859, test acc 0.834
epoch 31, loss 0.0016, train acc 0.861, test acc 0.827
epoch 32, loss 0.0016, train acc 0.860, test acc 0.823
epoch 33, loss 0.0016, train acc 0.861, test acc 0.841
epoch 34, loss 0.0016, train acc 0.862, test acc 0.842
epoch 35, loss 0.0016, train acc 0.862, test acc 0.838
epoch 36, loss 0.0016, train acc 0.862, test acc 0.834
epoch 37, loss 0.0016, train acc 0.861, test acc 0.836
epoch 38, loss 0.0016, train acc 0.863, test acc 0.838
epoch 39, loss 0.0016, train acc 0.863, test acc 0.843
epoch 40, loss 0.0016, train acc 0.863, test acc 0.842
epoch 41, loss 0.0016, train acc 0.863, test acc 0.843
epoch 42, loss 0.0016, train acc 0.864, test acc 0.844
epoch 43, loss 0.0016, train acc 0.863, test acc 0.839
epoch 44, loss 0.0016, train acc 0.863, test acc 0.831
epoch 45, loss 0.0016, train acc 0.864, test acc 0.843
epoch 46, loss 0.0016, train acc 0.864, test acc 0.835
epoch 47, loss 0.0016, train acc 0.863, test acc 0.840
epoch 48, loss 0.0015, train acc 0.864, test acc 0.844
epoch 49, loss 0.0015, train acc 0.865, test acc 0.829
epoch 50, loss 0.0015, train acc 0.864, test acc 0.842
Training stabilizes by around epoch 50; epochs beyond 50 are not shown.
Prediction module
X, y = next(iter(test_iter))
true_labels = d2l.get_fashion_mnist_labels(y.numpy())
pred_labels = d2l.get_fashion_mnist_labels(net(X).argmax(dim=1).numpy())
titles = [true + '\n' + pred for true, pred in zip(true_labels, pred_labels)]
d2l.show_fashion_mnist(X[0:9], titles[0:9])  # show_fashion_mnist was described earlier

Switching the model
A two-layer linear network with a ReLU in between raises test accuracy by about 6 points:
class MYNet(nn.Module):
    def __init__(self, num_inputs, num_outputs, hidden_layer=256):  # hidden width; the original notes leave this value unspecified
        super(MYNet, self).__init__()
        self.linear1 = nn.Linear(num_inputs, hidden_layer)
        self.relu1 = torch.nn.ReLU()
        self.linear2 = nn.Linear(hidden_layer, num_outputs)
    def forward(self, x):  # x shape: (batch, 1, 28, 28)
        y = self.linear1(x.view(x.shape[0], -1))
        y = self.relu1(y)
        y = self.linear2(y)
        return y
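Training then proceeds exactly as before, only with the new model (a sketch mirroring the earlier setup; the initialization repeats what was done for LinearNet):
net = MYNet(num_inputs, num_outputs)
for layer in (net.linear1, net.linear2):
    init.normal_(layer.weight, mean=0, std=0.01)
    init.constant_(layer.bias, val=0)
optimizer = torch.optim.SGD(net.parameters(), lr=0.1)
train_ch3(net, train_iter, test_iter, loss, num_epochs, batch_size, None, None, optimizer)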
epoch 1, loss 0.0036, train acc 0.689, test acc 0.751
epoch 2, loss 0.0022, train acc 0.799, test acc 0.807
epoch 3, loss 0.0020, train acc 0.824, test acc 0.779
epoch 4, loss 0.0018, train acc 0.838, test acc 0.823
epoch 5, loss 0.0017, train acc 0.845, test acc 0.835
epoch 6, loss 0.0016, train acc 0.852, test acc 0.838
epoch 7, loss 0.0016, train acc 0.856, test acc 0.846
epoch 8, loss 0.0015, train acc 0.860, test acc 0.809
epoch 9, loss 0.0015, train acc 0.865, test acc 0.851
epoch 10, loss 0.0015, train acc 0.868, test acc 0.857
epoch 11, loss 0.0014, train acc 0.871, test acc 0.859
epoch 12, loss 0.0014, train acc 0.873, test acc 0.840
epoch 13, loss 0.0014, train acc 0.876, test acc 0.857
epoch 14, loss 0.0013, train acc 0.878, test acc 0.866
epoch 15, loss 0.0013, train acc 0.880, test acc 0.865
epoch 16, loss 0.0013, train acc 0.880, test acc 0.849
epoch 17, loss 0.0013, train acc 0.884, test acc 0.861
epoch 18, loss 0.0013, train acc 0.885, test acc 0.869
epoch 19, loss 0.0012, train acc 0.887, test acc 0.866
epoch 20, loss 0.0012, train acc 0.889, test acc 0.859
epoch 21, loss 0.0012, train acc 0.891, test acc 0.866
epoch 22, loss 0.0012, train acc 0.891, test acc 0.864
epoch 23, loss 0.0012, train acc 0.893, test acc 0.868
epoch 24, loss 0.0011, train acc 0.894, test acc 0.874
epoch 25, loss 0.0011, train acc 0.895, test acc 0.872
epoch 26, loss 0.0011, train acc 0.897, test acc 0.872
epoch 27, loss 0.0011, train acc 0.898, test acc 0.866
epoch 28, loss 0.0011, train acc 0.899, test acc 0.880
epoch 29, loss 0.0011, train acc 0.900, test acc 0.861
epoch 30, loss 0.0011, train acc 0.902, test acc 0.880
epoch 31, loss 0.0011, train acc 0.903, test acc 0.813
epoch 32, loss 0.0010, train acc 0.905, test acc 0.878
epoch 33, loss 0.0010, train acc 0.904, test acc 0.875
epoch 34, loss 0.0010, train acc 0.907, test acc 0.878
epoch 35, loss 0.0010, train acc 0.906, test acc 0.881
epoch 36, loss 0.0010, train acc 0.908, test acc 0.878
epoch 37, loss 0.0010, train acc 0.910, test acc 0.867
epoch 38, loss 0.0010, train acc 0.910, test acc 0.884
epoch 39, loss 0.0010, train acc 0.912, test acc 0.877
epoch 40, loss 0.0010, train acc 0.913, test acc 0.881
epoch 41, loss 0.0009, train acc 0.914, test acc 0.874
epoch 42, loss 0.0009, train acc 0.915, test acc 0.880
epoch 43, loss 0.0009, train acc 0.914, test acc 0.857
epoch 44, loss 0.0009, train acc 0.916, test acc 0.876
epoch 45, loss 0.0009, train acc 0.916, test acc 0.884
epoch 46, loss 0.0009, train acc 0.918, test acc 0.874
epoch 47, loss 0.0009, train acc 0.918, test acc 0.885
epoch 48, loss 0.0009, train acc 0.919, test acc 0.888
epoch 49, loss 0.0009, train acc 0.920, test acc 0.876
epoch 50, loss 0.0009, train acc 0.920, test acc 0.882
Chapter 2: Experiencing Overfitting and Underfitting
import torch
import numpy as np
import sys
import matplotlib as mth
import matplotlib.pyplot as plt
import pylab
sys.path.append("D:\anaconda\Lib")
import d2lzh_pytorch as d2l
n_train, n_test, true_w, true_b = 100, 100, [1.2, -3.4, 5.6], 5
features = torch.randn((n_train + n_test, 1))  # 200 random inputs, to be split into training and test sets
poly_features = torch.cat((features, torch.pow(features, 2), torch.pow(features, 3)), 1)  # torch.pow raises each element to a power
labels = (true_w[0] * poly_features[:, 0] + true_w[1] * poly_features[:, 1]
          + true_w[2] * poly_features[:, 2] + true_b)  # evaluating the polynomial on the 200 inputs gives the labels
labels += torch.tensor(np.random.normal(0, 0.01, size=labels.size()), dtype=torch.float)  # add Gaussian noise with mean 0 and standard deviation 0.01 to every label
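A quick shape check of the generated data (purely illustrative):
print(features.shape, poly_features.shape, labels.shape)
# torch.Size([200, 1]) torch.Size([200, 3]) torch.Size([200])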
# plotting helper
def semilogy(x_vals, y_vals, x_label, y_label, x2_vals=None, y2_vals=None,
             legend=None, figsize=(3.5, 2.5)):
    d2l.set_figsize(figsize)
    d2l.plt.xlabel(x_label)
    d2l.plt.ylabel(y_label)
    d2l.plt.semilogy(x_vals, y_vals)
    if x2_vals and y2_vals:
        d2l.plt.semilogy(x2_vals, y2_vals, linestyle=':')
        d2l.plt.legend(legend)
num_epochs, loss = 100, torch.nn.MSELoss()
def fit_and_plot(train_features, test_features, train_labels, test_labels):
    net = torch.nn.Linear(train_features.shape[-1], 1)
    # a single linear layer suffices: the input size is the number of feature columns and the output is a scalar y
    # per the Linear docs, PyTorch initializes the parameters itself, so no manual initialization is needed
    batch_size = min(10, train_labels.shape[0])
    dataset = torch.utils.data.TensorDataset(train_features, train_labels)
    # TensorDataset wraps the given tensors (samples and labels) into a dataset; the inputs must be Tensors
    train_iter = torch.utils.data.DataLoader(dataset, batch_size, shuffle=True)  # load the dataset just defined
    optimizer = torch.optim.SGD(net.parameters(), lr=0.01)
    train_ls, test_ls = [], []  # losses
    for _ in range(num_epochs):
        for X, y in train_iter:
            l = loss(net(X), y.view(-1, 1))  # reshape the labels into a column vector
            optimizer.zero_grad()
            l.backward()
            optimizer.step()
        train_labels = train_labels.view(-1, 1)
        test_labels = test_labels.view(-1, 1)
        train_ls.append(loss(net(train_features), train_labels).item())
        test_ls.append(loss(net(test_features), test_labels).item())
    print('final epoch: train loss', train_ls[-1], 'test loss', test_ls[-1])
    semilogy(range(1, num_epochs + 1), train_ls, 'epochs', 'loss',
             range(1, num_epochs + 1), test_ls, ['train', 'test'])
    print('weight:', net.weight.data,
          '\nbias:', net.bias.data)
fig = plt.figure(figsize = (16, 4))
ax1 = plt.subplot(1,3,1)
plt.sca(ax1)
fit_and_plot(poly_features[:n_train, :], poly_features[n_train:, :],
labels[:n_train], labels[n_train:])
ax2 = plt.subplot(1,3,2)
plt.sca(ax2)
fit_and_plot(features[:n_train, :], features[n_train:, :], labels[:n_train],
labels[n_train:])
ax3 = plt.subplot(1,3,3)
plt.sca(ax3)
fit_and_plot(poly_features[0:2, :], poly_features[n_train:, :], labels[0:2],
labels[n_train:])
pylab.show()
final epoch: train loss 8.301918569486588e-05 test loss 0.00011247050133533776
weight: tensor([[ 1.1989, -3.3984, 5.5999]])
bias: tensor([4.9993])
final epoch: train loss 198.7726593017578 test loss 295.92022705078125
weight: tensor([[18.8718]])
bias: tensor([1.1912])
final epoch: train loss 0.8291055560112 test loss 380.7915344238281
weight: tensor([[2.0050, 1.7987, 1.8384]])
bias: tensor([3.1670])

The target function is:
y = 1.2x - 3.4x^2 + 5.6x^3 + 5 + ε, where the noise term ε follows a normal distribution with mean 0 and standard deviation 0.01.
Underfitting is induced by fitting a degree-1 (linear) model to data generated by this cubic polynomial (the second fit_and_plot call, which passes features instead of poly_features).
Overfitting is induced by making the training data insufficient: the third call trains on only two samples.
Chapter 3: Model Construction
1. By subclassing nn.Module
This is the most flexible (and most verbose) approach:
class MLP(nn.Module):
    def __init__(self, ...):   # layer definitions go here
    def forward(self, x):      # the forward computation goes here
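Filled in, a minimal version might look like this (the layer sizes are illustrative):
class MLP(nn.Module):
    def __init__(self, num_inputs=784, num_hiddens=256, num_outputs=10):
        super(MLP, self).__init__()
        self.hidden = nn.Linear(num_inputs, num_hiddens)   # hidden layer
        self.act = nn.ReLU()
        self.output = nn.Linear(num_hiddens, num_outputs)  # output layer
    def forward(self, x):
        return self.output(self.act(self.hidden(x)))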
2. Using the Module subclass Sequential:
net = MySequential(
nn.Linear(784, 256),
nn.ReLU(),
nn.Linear(256, 10),
)
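MySequential is not defined in these notes; it is the book's hand-rolled equivalent of nn.Sequential (which works just as well here). A sketch of the book's implementation:
from collections import OrderedDict

class MySequential(nn.Module):
    def __init__(self, *args):
        super(MySequential, self).__init__()
        if len(args) == 1 and isinstance(args[0], OrderedDict):
            for key, module in args[0].items():
                self.add_module(key, module)  # register each named submodule
        else:
            for idx, module in enumerate(args):
                self.add_module(str(idx), module)
    def forward(self, input):
        # apply the registered submodules in insertion order
        for module in self._modules.values():
            input = module(input)
        return input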