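Complete code for this note: https://github.com/ChenWentai/PyTorch/blob/master/task3_logistic.py
1. Preparing the data
This task uses logistic regression to solve a binary classification problem. For logistic regression the labels are 0 and 1 (rather than 1 and -1). The training data with y=0 are drawn from a normal distribution with mean 2 and variance 1, and the training data with y=1 from a normal distribution with mean -2 and variance 1.
The data setup follows Liam Coder's blog: https://blog.csdn.net/out_of_memory_error/article/details/81275651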
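import torch
import matplotlib.pyplot as plt
from torch.autograd import Variable  # deprecated no-op wrapper since PyTorch 0.4; plain tensors behave the same

N = torch.ones(100, 2)  # template of ones: 100 samples per class, 2 features
x0 = Variable(torch.normal(2*N, 1))   # class 0: mean 2, standard deviation 1
y0 = Variable(torch.zeros(100, 1))    # labels for class 0
x1 = Variable(torch.normal(-2*N, 1))  # class 1: mean -2, standard deviation 1
y1 = Variable(torch.ones(100, 1))     # labels for class 1
x = torch.cat((x0, x1), 0).type(torch.FloatTensor)
y = torch.cat((y0, y1), 0).type(torch.FloatTensor)

# Scatter plot of the two classes
fig, ax = plt.subplots()
labels = ['class 0', 'class 1']
ax.scatter(x.numpy()[0:len(x0), 0], x.numpy()[0:len(x0), 1], label=labels[0])
ax.scatter(x.numpy()[len(x0):len(x), 0], x.numpy()[len(x0):len(x), 1], label=labels[1])
ax.legend()
plt.show()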
The data are distributed as follows:

Figure: data distribution
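For repeatable experiments it can help to fix the random seed and wrap the sampling in a function. The sketch below is one way to do that, not the original script: `make_data` and its parameters are hypothetical names, and it samples with `torch.randn` plus a shift instead of `torch.normal`.

```python
import torch

def make_data(n_per_class=100, mean=2.0, seed=0):
    """Sample two Gaussian blobs: class 0 around (+mean, +mean), class 1 around (-mean, -mean)."""
    gen = torch.Generator().manual_seed(seed)  # local generator, global RNG state is untouched
    x0 = mean + torch.randn(n_per_class, 2, generator=gen)   # class 0, unit variance
    x1 = -mean + torch.randn(n_per_class, 2, generator=gen)  # class 1, unit variance
    x = torch.cat((x0, x1), 0)
    y = torch.cat((torch.zeros(n_per_class, 1), torch.ones(n_per_class, 1)), 0)
    return x, y

x_demo, y_demo = make_data()
print(x_demo.shape, y_demo.shape)                   # torch.Size([200, 2]) torch.Size([200, 1])
print(x_demo[:100].mean(0), x_demo[100:].mean(0))   # sample means, close to (2, 2) and (-2, -2)
```

Calling `make_data()` twice with the same seed returns identical tensors, which makes loss curves comparable between runs.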
2. Implementing logistic regression with PyTorch tensors
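Logistic regression finds the optimal parameter values by maximum likelihood. Written as a cost to minimize, the negative log-likelihood is:
$$J(w) = -\sum_{i=1}^{N}\Big[\, y_i \log \sigma(w^\top x_i) + (1 - y_i)\log\big(1 - \sigma(w^\top x_i)\big) \Big]$$
where $N$ is the number of samples and $\sigma(z) = 1/(1 + e^{-z})$ is the logistic function. Gradient descent then yields the optimal value of the parameter $w$. Note that $w$ here includes the bias $b$ (each $x_i$ is extended with a constant 1).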
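Gradient descent needs the derivative of the cost. As a sketch, write $z_i = w^\top x_i + b$ with the bias kept separate and $a_i = \sigma(z_i)$, and take the mean form of $J$ that the code in this section uses (the sum form differs only by the factor $N$). The chain rule with $\sigma'(z) = \sigma(z)(1 - \sigma(z))$ then gives:

```latex
J(w, b) = -\frac{1}{N}\sum_{i=1}^{N}\Big[\, y_i \log a_i + (1 - y_i)\log(1 - a_i) \Big],
\qquad a_i = \sigma(z_i) = \frac{1}{1 + e^{-z_i}}

\frac{\partial J}{\partial z_i} = \frac{1}{N}\,(a_i - y_i),
\qquad
\frac{\partial J}{\partial w} = \frac{1}{N}\sum_{i=1}^{N}(a_i - y_i)\,x_i,
\qquad
\frac{\partial J}{\partial b} = \frac{1}{N}\sum_{i=1}^{N}(a_i - y_i)
```

These are the quantities that autograd places in `w.grad` and `b.grad` after `J.backward()`.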
(1) Solve for the parameters w and b by gradient descent
# Initialize w and b
w = torch.zeros(2, 1, requires_grad=True)
b = torch.zeros(1, 1, requires_grad=True)
EPOCHS = 200
likelihood = []
lr = 0.01
for epoch in range(EPOCHS):
    A = 1 / (1 + torch.exp(-(x.mm(w) + b)))  # logistic function
    J = -torch.mean(y * torch.log(A) + (1 - y) * torch.log(1 - A))  # negative mean log-likelihood
    likelihood.append(-J.item())  # record the mean log-likelihood
    J.backward()  # gradients of J with respect to w and b
    with torch.no_grad():  # parameter updates must not be tracked by autograd
        w -= lr * w.grad  # update w
        b -= lr * b.grad  # update b
    w.grad.zero_()  # reset the gradients, otherwise they accumulate
    b.grad.zero_()
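The gradients that `J.backward()` produces can be cross-checked against the closed form (1/N) Xᵀ(A - y) for w and the mean of (A - y) for b. The sketch below does this on a small toy dataset; the names `x_chk`, `y_chk`, `w_chk` and `b_chk` are made up for this check and are not part of the original script.

```python
import torch

torch.manual_seed(0)
# Toy data with the same layout as the note: 2 features, labels in {0, 1}
x_chk = torch.cat((2 + torch.randn(50, 2), -2 + torch.randn(50, 2)), 0)
y_chk = torch.cat((torch.zeros(50, 1), torch.ones(50, 1)), 0)

w_chk = torch.full((2, 1), 0.1, requires_grad=True)
b_chk = torch.zeros(1, 1, requires_grad=True)

A_chk = torch.sigmoid(x_chk.mm(w_chk) + b_chk)
J_chk = -torch.mean(y_chk * torch.log(A_chk) + (1 - y_chk) * torch.log(1 - A_chk))
J_chk.backward()  # autograd gradients land in w_chk.grad and b_chk.grad

# Closed-form gradients of the mean negative log-likelihood
with torch.no_grad():
    grad_w = x_chk.t().mm(A_chk - y_chk) / len(x_chk)  # shape (2, 1)
    grad_b = (A_chk - y_chk).mean().reshape(1, 1)      # shape (1, 1)

print(torch.allclose(w_chk.grad, grad_w, atol=1e-5))
print(torch.allclose(b_chk.grad, grad_b, atol=1e-5))
```

If both comparisons agree, the hand-written update rule and autograd are computing the same step.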
(2) Plot the log-likelihood (-J) over the epochs:
import matplotlib.pyplot as plt
plt.plot(likelihood)
plt.ylabel("mean log-likelihood (-J)")
plt.xlabel("epoch")
plt.show()

Figure: log-likelihood (-J) versus epoch
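P.S. The cost here is computed as J = -torch.mean(y*torch.log(A) + (1-y)*torch.log(1-A)), so the sum in the formula above is replaced by a mean. My personal guess is that this suits PyTorch's differentiation rules. With torch.sum(), the likelihood turns into nan during gradient descent. The exact cause still needs further investigation.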
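One likely cause of the nan, offered here as an assumption rather than something the original script verifies: with torch.sum() the gradient is N = 200 times larger than with torch.mean(), so at the same learning rate the weights grow quickly and the sigmoid saturates. In float32 the output then rounds to exactly 0 or 1, torch.log returns -inf, and 0 * (-inf) evaluates to nan. The sketch below reproduces that arithmetic for a single large logit and compares it with `torch.nn.functional.binary_cross_entropy_with_logits`, which works on the logit directly; `z_big` and `y_one` are illustrative names.

```python
import torch
import torch.nn.functional as F

z_big = torch.tensor([[30.0]])  # a large logit, as produced by large weights
y_one = torch.tensor([[1.0]])   # true label 1

A_sat = 1 / (1 + torch.exp(-z_big))  # rounds to exactly 1.0 in float32
naive = -(y_one * torch.log(A_sat) + (1 - y_one) * torch.log(1 - A_sat))  # contains 0 * log(0)
stable = F.binary_cross_entropy_with_logits(z_big, y_one, reduction="sum")

print(A_sat.item() == 1.0)        # True
print(torch.isnan(naive).item())  # True
print(stable.item())              # finite and very close to 0
```

Lowering the learning rate by the same factor N, or computing the loss from logits as above, are two possible ways to keep the sum form usable.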
(3) Plot the decision boundary:
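# Convert the learned parameters to plain floats so matplotlib can plot them
w1_val, w2_val, b_val = w[0].item(), w[1].item(), b.item()
xa = list(range(-4, 5))
xb = []
for item in xa:
    xb.append(-(b_val + item * w1_val) / w2_val)  # solve w1*x1 + w2*x2 + b = 0 for x2
fig, ax = plt.subplots()
labels = ['class 0', 'class 1']
ax.scatter(x.numpy()[0:len(x0), 0], x.numpy()[0:len(x0), 1], label=labels[0])
ax.scatter(x.numpy()[len(x0):len(x), 0], x.numpy()[len(x0):len(x), 1], label=labels[1])
ax.legend()
ax.plot(xa, xb)
plt.show()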

Figure: decision boundary
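The plotted line is the set of points where the model is undecided: the predicted probability is 0.5 exactly when w1*x1 + w2*x2 + b = 0, which gives x2 = -(w1*x1 + b)/w2. A small sketch in plain Python illustrates this; the helper `boundary_x2` and the parameter values are hypothetical and chosen only for illustration.

```python
import math

def boundary_x2(w1, w2, b, x1):
    """Return x2 on the line w1*x1 + w2*x2 + b = 0 (requires w2 != 0)."""
    return -(w1 * x1 + b) / w2

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical parameter values, not taken from a trained model
w1_h, w2_h, b_h = -0.8, -0.7, 0.05
for x1 in (-4, 0, 4):
    x2 = boundary_x2(w1_h, w2_h, b_h, x1)
    p = sigmoid(w1_h * x1 + w2_h * x2 + b_h)  # probability of class 1 on the line
    print(x1, round(x2, 3), round(p, 3))      # p is 0.5 at every point on the boundary
```

Points on one side of the line get a probability above 0.5 (class 1), points on the other side below 0.5 (class 0).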
3. Implementing logistic regression with nn.Module
(1) Build the nn model and solve for the parameters w and b by gradient descent
from torch import nn

class Logistic(nn.Module):
    def __init__(self):
        super(Logistic, self).__init__()
        self.linear = nn.Linear(2, 1)  # two input features, one output logit
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        y_pred = self.linear(x)
        y_pred = self.sigmoid(y_pred)  # probability of class 1
        return y_pred

model = Logistic()
criterion = nn.BCELoss()  # binary cross-entropy on probabilities
optimizer = torch.optim.SGD(model.parameters(), lr=0.001)
EPOCHS = 1000
costs = []
for epoch in range(EPOCHS):
    out = model(x)  # forward pass on the full training set
    loss = criterion(out, y)
    costs.append(loss.item())
    optimizer.zero_grad()  # clear gradients from the previous step
    loss.backward()
    optimizer.step()
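With its default mean reduction, `nn.BCELoss` computes the same quantity as the hand-written J in section 2, so the two implementations optimize the same objective. The sketch below compares both on a few made-up probabilities and labels (`probs` and `targets` are illustrative names).

```python
import torch
from torch import nn

torch.manual_seed(0)
probs = torch.rand(8, 1) * 0.98 + 0.01          # predicted probabilities in (0.01, 0.99)
targets = torch.randint(0, 2, (8, 1)).float()   # labels in {0, 1}

# Hand-written negative mean log-likelihood, as in section 2
manual = -torch.mean(targets * torch.log(probs) + (1 - targets) * torch.log(1 - probs))
builtin = nn.BCELoss()(probs, targets)          # default reduction is "mean"

print(torch.allclose(manual, builtin, atol=1e-6))
```

One difference worth knowing: `nn.BCELoss` clamps its log terms to be at least -100, so it stays finite where the hand-written version would produce -inf or nan.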
(2) Plot the loss function:
import matplotlib.pyplot as plt
plt.plot(range(len(costs)), costs)  # loss value at each epoch
plt.xlabel("epoch")
plt.ylabel("loss")
plt.show()

Figure: loss versus epoch
(3) Plot the decision boundary:
w1, w2 = model.linear.weight[0].detach().tolist()  # plain floats, detached from autograd
b = model.linear.bias.item()
plot_x = range(-5, 6, 1)
plot_y = [-(w1*item + b)/w2 for item in plot_x]  # points on the line w1*x1 + w2*x2 + b = 0
fig, ax = plt.subplots()
labels = ['class 0', 'class 1']
ax.scatter(x.numpy()[0:len(x0), 0], x.numpy()[0:len(x0), 1], label=labels[0])
ax.scatter(x.numpy()[len(x0):len(x), 0], x.numpy()[len(x0):len(x), 1], label=labels[1])
ax.legend()
ax.plot(plot_x, plot_y)
plt.show()

Figure: decision boundary
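Besides looking at the boundary, the fit can be summarized by the classification accuracy. The sketch below thresholds the predicted probability at 0.5; the `accuracy` helper, the stand-in `demo_model` with hand-set weights, and the four sample points are assumptions for illustration, not part of the original script.

```python
import torch
from torch import nn

def accuracy(model, x, y, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the 0/1 label."""
    model.eval()
    with torch.no_grad():  # no gradients needed for evaluation
        pred = (model(x) >= threshold).float()
    return (pred == y).float().mean().item()

# Stand-in model with hand-set weights (hypothetical values, not trained)
demo_model = nn.Sequential(nn.Linear(2, 1), nn.Sigmoid())
with torch.no_grad():
    demo_model[0].weight.copy_(torch.tensor([[-1.0, -1.0]]))
    demo_model[0].bias.zero_()

x_eval = torch.tensor([[2.0, 2.0], [3.0, 1.0], [-2.0, -2.0], [-1.0, -3.0]])
y_eval = torch.tensor([[0.0], [0.0], [1.0], [1.0]])
print(accuracy(demo_model, x_eval, y_eval))  # 1.0
```

The same helper should work with the trained model from this section, for example `accuracy(model, x, y)`.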