Data and full code for this note: https://github.com/ChenWentai/PyTorch
This note uses PyTorch to build a multi-layer neural network and apply it to a concrete classification problem.
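1 Data preparation
The data is the diabetes classification dataset diabetes.csv. It is a typical classification dataset with 768 samples. Each sample has 8 features, which are different physical measurements of the subject, and a label of 0 or 1 indicating whether the subject has diabetes. A preview of the dataset is shown below:

Figure 1. Diabetes dataset
First, load the dataset and preprocess it: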
import torch
import torch.nn.functional as F
import torch.nn.init as init
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Read the data from a local file: the last column is the label, the rest are features
xy = pd.read_csv('./diabetes.csv').values
x = xy[:, 0:-1]
y = xy[:, -1]
# Split into training data and test data
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=2018)
# Standardize the features: fit the scaler on the training set only,
# then reuse the same statistics on the test set to avoid leaking test information
ss = StandardScaler()
x_train = torch.tensor(ss.fit_transform(x_train))
x_test = torch.tensor(ss.transform(x_test))
# torch.autograd.Variable is no longer needed; plain tensors are enough
y_train = torch.tensor(y_train)
y_test = torch.tensor(y_test)
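To make the standardization step concrete, here is a minimal NumPy sketch of the usual convention: the mean and standard deviation are estimated from the training split and then reused unchanged on the test split. The toy arrays are made up for illustration and are not taken from diabetes.csv.

```python
import numpy as np

# Toy data (made up): 6 training rows and 2 test rows with 2 features each
x_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0],
                    [4.0, 40.0], [5.0, 50.0], [6.0, 60.0]])
x_test = np.array([[3.5, 35.0], [7.0, 70.0]])

# Statistics are estimated from the training split only
mean = x_train.mean(axis=0)
std = x_train.std(axis=0)  # population std (ddof=0), the convention StandardScaler uses

x_train_std = (x_train - mean) / std
x_test_std = (x_test - mean) / std  # same mean/std reused, nothing is re-fit

print(x_train_std.mean(axis=0))  # approximately 0 for each feature
print(x_test_std)                # test rows expressed in training-set units
```

Re-fitting the scaler on the test split would instead force the test features to have zero mean and unit variance on their own, so the model would see test inputs on a slightly different scale from the one it was trained on.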
2 Building and training the neural network
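class Model(torch.nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        # Four fully connected layers: 8 input features -> 10 -> 10 -> 10 -> 1 output logit
        self.l1 = torch.nn.Linear(8, 10)
        self.l2 = torch.nn.Linear(10, 10)
        self.l3 = torch.nn.Linear(10, 10)
        self.l4 = torch.nn.Linear(10, 1)

    def forward(self, x):
        x = torch.tanh(self.l1(x.float()))
        x = torch.tanh(self.l2(x))
        x = torch.tanh(self.l3(x))
        # No sigmoid here: BCEWithLogitsLoss expects raw logits
        x = self.l4(x)
        return x

# NOTE: weights_init is called below, but its definition is not shown in this note.
# This version is an assumed stand-in (Xavier initialization), not necessarily the original.
def weights_init(m):
    if isinstance(m, torch.nn.Linear):
        init.xavier_uniform_(m.weight)
        init.zeros_(m.bias)

model = Model()
model.apply(weights_init)
criterion = torch.nn.BCEWithLogitsLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-1, momentum=0.9)
Loss = []
Acc = []
EPOCHS = 5000
for epoch in range(EPOCHS):
    # Full-batch forward pass on the training set
    y_pred = model(x_train)
    loss = criterion(y_pred, y_train.float().view(-1, 1))
    # A logit >= 0 corresponds to a predicted probability >= 0.5
    preds = y_pred >= 0
    corrects = torch.sum(preds.byte() == y_train.view(-1, 1).byte())
    acc = corrects.item() / len(x_train)
    if epoch % 100 == 0:
        print("corrects:", corrects.item())
        print("epoch = {0}, loss = {1}, acc = {2}".format(epoch, loss.item(), acc))
    # Store plain Python floats so the curves can be plotted directly
    Loss.append(loss.item())
    Acc.append(acc)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()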
After training, acc on the training set can reach 1.0.
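The training code thresholds the raw network output at 0 rather than at 0.5. The sketch below, using made-up logits and labels, illustrates why this is consistent with BCEWithLogitsLoss: the loss applies the sigmoid internally, and a logit of 0 maps to a probability of exactly 0.5.

```python
import torch

# Made-up logits (raw outputs of the last Linear layer) and 0/1 labels
logits = torch.tensor([[-2.0], [-0.5], [0.0], [1.5]])
labels = torch.tensor([[0.0], [1.0], [1.0], [1.0]])

# BCEWithLogitsLoss applies the sigmoid internally ...
loss_with_logits = torch.nn.BCEWithLogitsLoss()(logits, labels)
# ... so it matches BCELoss applied to explicit sigmoid probabilities
probs = torch.sigmoid(logits)
loss_explicit = torch.nn.BCELoss()(probs, labels)

# Thresholding logits at 0 gives the same classes as thresholding probabilities at 0.5
preds_from_logits = logits >= 0
preds_from_probs = probs >= 0.5

print(torch.allclose(loss_with_logits, loss_explicit))
print(torch.equal(preds_from_logits, preds_from_probs))
```

Working with logits and letting the loss handle the sigmoid is generally the numerically safer choice, which is presumably why the model's forward pass returns the output of the last linear layer without an activation.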
Plot the loss and acc curves:
plt.plot(range(len(Loss)), Loss)
plt.ylabel('loss')
plt.xlabel('epochs')
plt.show()
plt.plot(range(len(Acc)), Acc)
plt.ylabel('acc')
plt.xlabel('epochs')
plt.show()

Figure: loss versus epochs

Figure: acc versus epochs
3 Test data
Performance on the test set is as follows:
# Evaluate without tracking gradients
with torch.no_grad():
    y_pred = model(x_test)
# A logit >= 0 corresponds to a predicted probability >= 0.5
preds = y_pred >= 0
corrects = torch.sum(preds.byte() == y_test.view(-1, 1).byte())
acc = corrects.item() / len(x_test)
print("corrects:", corrects.item())
print("acc = {}".format(acc))
corrects: 160
acc = 0.6926406926406926
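Accuracy alone does not show which class is being misclassified. As a rough sketch, the helper below (a hypothetical function, not part of the original code) turns binary logits into accuracy plus confusion counts. The logits and labels are made up for illustration; with the real model one would pass the output of model(x_test) and y_test instead.

```python
import torch

def evaluate_logits(logits, labels):
    """Return accuracy and confusion counts for binary logits (hypothetical helper)."""
    preds = (logits.view(-1) >= 0).long()  # logit >= 0 means predicted class 1
    labels = labels.view(-1).long()
    tp = int(((preds == 1) & (labels == 1)).sum())  # true positives
    tn = int(((preds == 0) & (labels == 0)).sum())  # true negatives
    fp = int(((preds == 1) & (labels == 0)).sum())  # false positives
    fn = int(((preds == 0) & (labels == 1)).sum())  # false negatives
    acc = (tp + tn) / len(labels)
    return acc, {"tp": tp, "tn": tn, "fp": fp, "fn": fn}

# Made-up example: 5 samples, 3 of them classified correctly
logits = torch.tensor([[2.0], [-1.0], [0.3], [-0.2], [1.1]])
labels = torch.tensor([1.0, 0.0, 0.0, 1.0, 1.0])
acc, counts = evaluate_logits(logits, labels)
print(acc, counts)
```

For a medical label such as diabetes, the split between false negatives and false positives is often more informative than the single accuracy figure.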