Although PyTorch is already quite easy to use, there is still boilerplate code, such as the loop that iterates over training steps. Borrowing the API design philosophy of Keras, we can wrap the commonly used PyTorch code snippets as well.
Inherit from nn.Module.
compile receives the optimizer and the loss function. Note that they are only instantiated when compile is called.
fit is an sklearn-style training interface. In loss = self.loss_fn(y_pred, y_train), y_pred must be passed first and y_train second; the loss function expects the target (the second argument) to carry no gradient function.
The following three steps are the core of training: in every iteration the gradients must be zeroed, otherwise they accumulate; then backpropagate through the loss with loss.backward(); finally take one step along the gradient direction with self.optimizer.step().
predict: when using the model after training, it must be switched to evaluation mode by calling .eval().
self.optimizer.zero_grad()
loss.backward()
self.optimizer.step()
import torch
from torch import nn
from torch import optim

class BaseModel(nn.Module):
    def __init__(self):
        super(BaseModel, self).__init__()

    def predict(self, x_test):
        # switch to evaluation mode and disable gradient tracking for inference
        self.eval()
        with torch.no_grad():
            y_pred = self(x_test)
        return y_pred

    def save(self, path):
        torch.save(self.state_dict(), path)

    def compile(self, optimizer, loss, metrics=None):
        # the optimizer and loss are only instantiated here, once the model's
        # parameters exist
        self.optimizer = optimizer(self.parameters(), lr=1e-4)
        self.loss_fn = loss()
        if metrics is not None:
            self.metrics = metrics()

    def fit(self, x_train, y_train):
        # start training
        num_epochs = 1000
        for epoch in range(num_epochs):
            y_pred = self(x_train)
            # backward: zero the gradients, backpropagate, then take one step
            self.optimizer.zero_grad()
            # the argument order matters: prediction first, target second
            loss = self.loss_fn(y_pred, y_train)
            loss.backward()
            self.optimizer.step()
            if (epoch + 1) % 20 == 0:
                print('Epoch[{}/{}], loss: {:.6f}'
                      .format(epoch + 1, num_epochs, loss.item()))
Building on this BaseModel, reworking the linear model from the previous section makes the code very concise.
The model definition code does not need to change:
# Linear regression model
class LinearRegression(BaseModel):
    def __init__(self):
        super(LinearRegression, self).__init__()
        self.linear = nn.Linear(1, 1)  # input and output are both 1-dimensional

    def forward(self, x):
        out = self.linear(x)
        return out
Initializing the model takes only two lines of code:
model = LinearRegression()
model.compile(optimizer=optim.SGD,loss=nn.MSELoss)
準(zhǔn)備數(shù)據(jù)并訓(xùn)練,訓(xùn)練只需要調(diào)用一下fit就好了,相當(dāng)?shù)暮啙崳?/p>
import numpy as np
x_train = np.array([[3.3], [4.4], [5.5], [6.71], [6.93], [4.168],
[9.779], [6.182], [7.59], [2.167], [7.042],
[10.791], [5.313], [7.997], [3.1]])
y_train = np.array([[1.7], [2.76], [2.09], [3.19], [1.694], [1.573],
[3.366], [2.596], [2.53], [1.221], [2.827],
[3.465], [1.65], [2.904], [1.3]])
model.fit(torch.Tensor(x_train),torch.Tensor(y_train))
最后是模型的應(yīng)用,對結(jié)果進行繪制:
y_pred = model.predict(torch.Tensor(x_train))
import matplotlib.pyplot as plt
plt.plot(x_train, y_pred.data.numpy(), label='Fitting Line')
plt.legend()
plt.show()
model.save('./mymodel.pth')
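For symmetry with save, here is a minimal sketch of restoring the persisted weights; load_state_dict is the standard PyTorch call, and the path simply mirrors the save call above:
# sketch: reload the trained weights into a freshly constructed model
restored = LinearRegression()
restored.load_state_dict(torch.load('./mymodel.pth'))
restored.eval()  # switch to evaluation mode before inference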
Logistic regression is a generalized linear model, so it has much in common with multiple linear regression. The model form is essentially the same: both contain wx+b, where w and b are the parameters to be estimated. The difference lies in the dependent variable. Multiple linear regression uses wx+b directly as the dependent variable, i.e. y = wx+b, whereas logistic regression maps wx+b to a latent state p through a function L, p = L(wx+b), and then decides the value of the dependent variable by comparing p with 1-p. If L is the logistic function, the model is logistic regression; if L is a polynomial function, the model is polynomial regression.
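As a quick illustration of the mapping p = L(wx+b) when L is the logistic (sigmoid) function, here is a small sketch; the parameter values are made up purely for demonstration:
import torch
w, b = torch.tensor(2.0), torch.tensor(-1.0)  # hypothetical parameters w and b
x = torch.tensor(0.8)
z = w * x + b                    # the linear part wx+b
p = torch.sigmoid(z)             # latent state p = L(wx+b)
print(p.item(), 1 - p.item())    # the class is decided by comparing p with 1-p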
Although the name of the logistic regression model contains the word "regression", it is a classification algorithm. We use the MNIST image dataset to verify its classification performance.
The implementation is almost identical to the linear model; the difference is the loss function: loss=nn.CrossEntropyLoss, the cross-entropy loss.
Input parameters: in_dim corresponds to the N*784 input, and the number of output classes n_class is 10.
# Define the Logistic Regression model
class Logstic_Regression(BaseModel):
    def __init__(self, in_dim, n_class):
        super(Logstic_Regression, self).__init__()
        self.logstic = nn.Linear(in_dim, n_class)

    def forward(self, x):
        out = self.logstic(x)
        return out
To prepare the MNIST dataset, the DataLoader under torch.utils.data provides a good abstraction for data preprocessing; of course, we can also implement and extend it ourselves.
from torchvision import datasets,transforms
from torch.utils.data import DataLoader
train_dataset = datasets.MNIST(
root='./data', train=True, transform=transforms.ToTensor(), download=True)
test_dataset = datasets.MNIST(
root='./data', train=False, transform=transforms.ToTensor())
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False)
To train the model with a DataLoader, we add a fit_dataloader function. It targets the case where the dataset is large: the MNIST training set has 60,000 images, which cannot all be processed in memory at once. We set batch_size=32, so each read fetches 32 images as one training batch.
def fit_dataloader(self, loader):
    num_epochs = 3
    for epoch in range(num_epochs):
        for i, data in enumerate(loader):
            img, y_train = data
            # flatten each image into a vector of length 784
            y_pred = self(img.view(img.size()[0], -1))
            # backward
            self.optimizer.zero_grad()
            loss = self.loss_fn(y_pred, y_train)
            loss.backward()
            self.optimizer.step()
            if i % 300 == 0:
                print('Epoch[{}/{}], loss: {:.6f}'
                      .format(epoch + 1, num_epochs, loss.item()))
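Putting the pieces together, a minimal sketch of how the logistic regression model could be compiled and trained with this helper, assuming fit_dataloader has been added to BaseModel and train_loader is the loader defined above (the learning rate stays at the 1e-4 hard-coded in compile):
model = Logstic_Regression(28 * 28, 10)
model.compile(optimizer=optim.SGD, loss=nn.CrossEntropyLoss)
model.fit_dataloader(train_loader)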
MLP (Multi-Layer Perceptron) is a feed-forward artificial neural network that maps a set of input vectors to a set of output vectors. An MLP can be viewed as a directed graph made up of multiple layers of nodes, with each layer fully connected to the next. Apart from the input nodes, every node is a neuron (or processing unit) with a nonlinear activation function. MLPs are typically trained with a supervised learning method known as backpropagation; by adding a hidden layer in the middle, they overcome the perceptron's inability to recognize linearly inseparable data.
Model implementation:
n_input is the input dimension, i.e. the input is batch_size*n_input
n_hidden_1 is the input dimension of the hidden layer
n_hidden_2 is the output dimension of the hidden layer
n_output is the number of output classes
All three layers are fully connected layers
Train the network with SGD (stochastic gradient descent)
Use cross-entropy loss for the multi-class problem
import torch
from torch import nn
from torch import optim

class MLP(BaseModel):
    def __init__(self, n_input, n_hidden_1, n_hidden_2, n_output):
        super(MLP, self).__init__()
        self.layer1 = nn.Linear(n_input, n_hidden_1)
        self.layer2 = nn.Linear(n_hidden_1, n_hidden_2)
        self.layer3 = nn.Linear(n_hidden_2, n_output)

    def forward(self, x):
        out = self.layer1(x)
        out = self.layer2(out)
        out = self.layer3(out)
        return out
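Note that, as written, the three Linear layers are stacked with no activation in between, so the whole network still computes a linear map. A common variant, sketched below under that assumption (the class name MLPWithReLU is ours, not from the original), inserts a ReLU after each hidden layer:
# sketch: the same MLP with ReLU activations between the hidden layers
class MLPWithReLU(BaseModel):
    def __init__(self, n_input, n_hidden_1, n_hidden_2, n_output):
        super(MLPWithReLU, self).__init__()
        self.layer1 = nn.Linear(n_input, n_hidden_1)
        self.layer2 = nn.Linear(n_hidden_1, n_hidden_2)
        self.layer3 = nn.Linear(n_hidden_2, n_output)
        self.relu = nn.ReLU()  # nonlinear activation between layers

    def forward(self, x):
        out = self.relu(self.layer1(x))
        out = self.relu(self.layer2(out))
        out = self.layer3(out)
        return out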
Model invocation:
from torchvision import datasets,transforms
from torch.utils.data import DataLoader
train_dataset = datasets.MNIST(
root='./data', train=True, transform=transforms.ToTensor(), download=True)
test_dataset = datasets.MNIST(
root='./data', train=False, transform=transforms.ToTensor())
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False)
model = MLP(28*28,300,100,10)
model.compile(optimizer=optim.SGD,loss=nn.CrossEntropyLoss)
model.fit_dataloader(train_loader)
Improve the fit_dataloader function: accumulate the loss for each batch and compute accuracy statistics.
def fit_dataloader(self, loader):
    num_epochs = 10
    for epoch in range(num_epochs):
        batch_loss = 0.0
        batch_acc = 0.0
        step = 300
        for i, data in enumerate(loader):
            img, y_train = data
            y_pred = self(img.view(img.size()[0], -1))
            # accuracy on the current batch (torch.max returns the maximum
            # along dimension 1, and the second return value is its index)
            _, pred = torch.max(y_pred, 1)
            num_correct = (pred == y_train).sum()
            batch_acc += num_correct.item()
            # backward
            self.optimizer.zero_grad()
            loss = self.loss_fn(y_pred, y_train)
            # accumulate the loss of the current batch
            batch_loss += loss.item()
            loss.backward()
            self.optimizer.step()
            if (i + 1) % step == 0:
                print('Epoch[{}/{}],batch:{}, avg loss: {:.6f},train acc:{:.6f}'
                      .format(epoch + 1, num_epochs, i + 1,
                              batch_loss / step, batch_acc / (step * 32)))
                batch_loss = 0.0
                batch_acc = 0.0
This makes it clear that during training the loss decreases gradually while the accuracy climbs steadily.
Epoch[1/10],batch:300, avg loss: 2.286891,train acc:0.163438
Epoch[1/10],batch:600, avg loss: 2.283830,train acc:0.160000
Epoch[1/10],batch:900, avg loss: 2.276441,train acc:0.180312
......
Epoch[10/10],batch:300, avg loss: 1.938921,train acc:0.677188
Epoch[10/10],batch:600, avg loss: 1.928737,train acc:0.679896
Epoch[10/10],batch:900, avg loss: 1.920588,train acc:0.690208
Epoch[10/10],batch:1200, avg loss: 1.912090,train acc:0.691562
Epoch[10/10],batch:1500, avg loss: 1.908464,train acc:0.689583
Epoch[10/10],batch:1800, avg loss: 1.892407,train acc:0.696667
To evaluate the model, extend BaseModel with a new predict_dataloader function:
def predict_dataloader(self, loader):
    self.eval()
    total_loss = 0.0
    acc = 0.0
    with torch.no_grad():
        for data in loader:
            img, y_train = data
            img = img.view(img.size(0), -1)
            y_pred = self(img)
            loss = self.loss_fn(y_pred, y_train)
            total_loss += loss.item()
            _, pred = torch.max(y_pred, 1)
            num_correct = (pred == y_train).sum()
            acc += num_correct.item()
    print(total_loss / len(loader), acc / 32 / len(loader))
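For example, evaluated on the test set loader prepared earlier (assuming predict_dataloader has been added to BaseModel):
model.predict_dataloader(test_loader)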
The output shows that after 10 epochs of training, the model reaches an overall accuracy of 67.16% on the test set.
1.9051747440149227 0.6716253993610224
關(guān)于作者:魏佳斌,互聯(lián)網(wǎng)產(chǎn)品/技術(shù)總監(jiān),北京大學(xué)光華管理學(xué)院(MBA),特許金融分析師(CFA),資深產(chǎn)品經(jīng)理/碼農(nóng)。偏愛python,深度關(guān)注互聯(lián)網(wǎng)趨勢,人工智能,AI金融量化。致力于使用最前沿的認知技術(shù)去理解這個復(fù)雜的世界。AI量化開源項目:
聯(lián)系客服