A BP (Back Propagation) neural network is a multilayer feedforward network trained with the error back-propagation algorithm. Its learning rule is gradient descent: the weights and biases (thresholds) of the network are adjusted iteratively through back propagation so as to minimize the sum of squared errors of the network output. The topology of a BP neural network consists of an input layer, one or more hidden layers and an output layer. Learning proceeds in two phases: forward propagation of the input signal and back propagation of the error.
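As a minimal sketch of this learning rule (separate from the iris example below, with purely illustrative values), the update w = w - lr * dE/dw can be applied to a single weight:

# Toy sketch of the BP learning rule for one linear neuron and one sample.
# All values are illustrative and not part of the iris example below.
x, target = 1.5, 0.8          # input and desired output
w, lr = 0.1, 0.02             # initial weight and learning rate
for step in range(200):
    y = w * x                 # forward propagation (no activation, for simplicity)
    grad = (y - target) * x   # dE/dw for the squared error E = 0.5 * (y - target) ** 2
    w = w - lr * grad         # gradient descent update driven by the propagated error
print(w * x)                  # the prediction approaches the target

A real BP network repeats this update for every weight and bias in every layer, with the error propagated backwards through the activations.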
(Figure: BP neural network structure diagram)
The following uses the iris data set as an example and implements a BP neural network with PyTorch.
1. Data set. The iris data set is a classic data set that is often used as an example in statistical learning and machine learning. It contains 150 records in three classes, with 50 records per class. Each record has four features: sepal length, sepal width, petal length and petal width. From these four features we can predict whether a flower belongs to Iris setosa, Iris versicolor or Iris virginica.
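For a quick look at these fields (an optional check, not required by the steps below), the data set can be inspected directly with scikit-learn:

from sklearn import datasets

iris = datasets.load_iris()
print(iris.data.shape)       # (150, 4): 150 records, 4 features each
print(iris.feature_names)    # sepal length/width, petal length/width
print(iris.target_names)     # ['setosa' 'versicolor' 'virginica']
print(iris.target[:5])       # class labels encoded as 0, 1, 2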
2. Hyperparameter settings. Hyperparameters cannot be learned by training; they are chosen from experience or by searching over candidate combinations for the best one (a small search sketch follows the settings below).
lr = 0.02        # Learning rate
epochs = 300     # Number of training epochs
n_feature = 4    # Input features (the four features of the iris data)
n_hidden = 20    # Hidden layer size
n_output = 3     # Outputs (the three iris classes)
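If one does want to search for a good combination, a simple grid search is enough at this data size. The sketch below assumes a hypothetical helper train_and_eval(lr, n_hidden) that runs the training steps described later and returns the test accuracy; it is not defined in this article:

best = None
for lr_try in (0.1, 0.02, 0.005):
    for n_hidden_try in (10, 20, 40):
        acc = train_and_eval(lr_try, n_hidden_try)   # hypothetical helper, see steps 3-6
        if best is None or acc > best[0]:
            best = (acc, lr_try, n_hidden_try)
print(best)   # best accuracy and the hyperparameters that produced it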
3. Data preparation. Split the data into a training set (80%) and a test set (20%): the training set is used to learn the network weights, and the test set is used to estimate the generalization error of the network. When normalizing, the test set must be scaled with the minimum and maximum values computed on the training set. Finally, the data are converted to tensors so that training can be carried out.
iris = datasets.load_iris()   # Load the iris data set

# Split into training and test sets
x_train0, x_test0, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.2, random_state=22)

# Min-max normalization; the test set uses the minima and maxima obtained on the training set
x_train = np.zeros(np.shape(x_train0))
x_test = np.zeros(np.shape(x_test0))
for i in range(4):
    xMax = np.max(x_train0[:, i])
    xMin = np.min(x_train0[:, i])
    x_train[:, i] = (x_train0[:, i] - xMin) / (xMax - xMin)
    x_test[:, i] = (x_test0[:, i] - xMin) / (xMax - xMin)

# Convert data to tensors
x_train = torch.FloatTensor(x_train)
y_train = torch.LongTensor(y_train)
x_test = torch.FloatTensor(x_test)
y_test = torch.LongTensor(y_test)
4. Define the BP neural network. The network has three layers: one input layer, one hidden layer and one output layer. The hidden layer uses the ReLU activation function and the output layer uses softmax.
class bpnnModel(torch.nn.Module):
    def __init__(self, n_feature, n_hidden, n_output):
        super(bpnnModel, self).__init__()
        self.hidden = torch.nn.Linear(n_feature, n_hidden)   # Hidden layer
        self.out = torch.nn.Linear(n_hidden, n_output)       # Output layer

    def forward(self, x):
        x = Fun.relu(self.hidden(x))            # Hidden layer activation: ReLU (sigmoid or tanh could also be used)
        out = Fun.softmax(self.out(x), dim=1)   # Output layer activation: softmax
        return out
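As a quick sanity check (not part of the original code, and using a randomly generated batch), the model can be instantiated and run forward; each output row is a probability vector over the three classes:

net = bpnnModel(n_feature=4, n_hidden=20, n_output=3)
demo = torch.rand(5, 4)      # 5 made-up samples with 4 features each
probs = net(demo)
print(probs.shape)           # torch.Size([5, 3])
print(probs.sum(dim=1))      # each row sums to 1 because of the softmax output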
5. Define the optimizer and loss function. The optimizer is Adam, and the loss function is cross entropy, which is commonly used for classification problems.
net = bpnnModel(n_feature=n_feature, n_hidden=n_hidden, n_output=n_output)
optimizer = torch.optim.Adam(net.parameters(), lr=lr)   # Adam optimizer
loss_func = torch.nn.CrossEntropyLoss()                 # Cross-entropy loss, commonly used for multi-class classification
6. Training and testing. The loop runs for epochs = 300 iterations; each iteration passes all training data through the network, updates the weights by error back propagation, and then evaluates the accuracy on the test set.
loss_steps = np.zeros(epochs)       # Loss value of each epoch
accuracy_steps = np.zeros(epochs)   # Test-set accuracy of each epoch

for epoch in range(epochs):
    y_pred = net(x_train)               # Forward pass
    loss = loss_func(y_pred, y_train)   # Compare output with labels
    optimizer.zero_grad()               # Reset gradients
    loss.backward()                     # Back propagation
    optimizer.step()                    # Update the weights with the optimizer
    loss_steps[epoch] = loss.item()     # Save the loss

    # Compute the test-set accuracy; no gradients are needed here
    with torch.no_grad():
        y_pred = net(x_test)            # Test-set predictions
        correct = (torch.argmax(y_pred, dim=1) == y_test).type(torch.FloatTensor)
        accuracy_steps[epoch] = correct.mean()   # Test-set accuracy

print("Iris test set prediction accuracy", accuracy_steps[-1])
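Once trained, the network can classify a new measurement. The sketch below uses made-up sepal/petal values and reuses the variables from the steps above (x_train0, net, iris); note that the new sample must be normalized with the training-set minima and maxima from step 3:

new_raw = np.array([[5.1, 3.5, 1.4, 0.2]])   # illustrative measurements, in cm
new = np.zeros_like(new_raw)
for i in range(4):
    xMax = np.max(x_train0[:, i])
    xMin = np.min(x_train0[:, i])
    new[:, i] = (new_raw[:, i] - xMin) / (xMax - xMin)
with torch.no_grad():
    probs = net(torch.FloatTensor(new))
    pred = torch.argmax(probs, dim=1).item()
print(iris.target_names[pred])               # e.g. 'setosa'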
Plotting the loss and accuracy curves shows that, as the training loss decreases, the test-set accuracy rises, but after about 100 epochs it essentially stops improving.
The complete code is given below:
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.model_selection import train_test_split   # Train/test split function
import torch
import torch.nn.functional as Fun

# 0. Hyperparameter settings
lr = 0.02
epochs = 300
n_feature = 4
n_hidden = 20
n_output = 3

# 1. Data preparation
iris = datasets.load_iris()
x_train0, x_test0, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.2, random_state=22)
x_train = np.zeros(np.shape(x_train0))
x_test = np.zeros(np.shape(x_test0))
for i in range(4):
    xMax = np.max(x_train0[:, i])
    xMin = np.min(x_train0[:, i])
    x_train[:, i] = (x_train0[:, i] - xMin) / (xMax - xMin)
    x_test[:, i] = (x_test0[:, i] - xMin) / (xMax - xMin)
x_train = torch.FloatTensor(x_train)
y_train = torch.LongTensor(y_train)
x_test = torch.FloatTensor(x_test)
y_test = torch.LongTensor(y_test)

# 2. Define the BP neural network
class bpnnModel(torch.nn.Module):
    def __init__(self, n_feature, n_hidden, n_output):
        super(bpnnModel, self).__init__()
        self.hidden = torch.nn.Linear(n_feature, n_hidden)   # Hidden layer
        self.out = torch.nn.Linear(n_hidden, n_output)       # Output layer

    def forward(self, x):
        x = Fun.relu(self.hidden(x))            # Hidden layer activation: ReLU
        out = Fun.softmax(self.out(x), dim=1)   # Output layer activation: softmax
        return out

# 3. Define the optimizer and loss function
net = bpnnModel(n_feature=n_feature, n_hidden=n_hidden, n_output=n_output)
optimizer = torch.optim.Adam(net.parameters(), lr=lr)   # Adam optimizer
loss_func = torch.nn.CrossEntropyLoss()                 # Cross-entropy loss for multi-class classification

# 4. Training
loss_steps = np.zeros(epochs)
accuracy_steps = np.zeros(epochs)
for epoch in range(epochs):
    y_pred = net(x_train)               # Forward pass
    loss = loss_func(y_pred, y_train)   # Compare output with labels
    optimizer.zero_grad()               # Reset gradients
    loss.backward()                     # Back propagation
    optimizer.step()                    # Update the weights with the optimizer
    loss_steps[epoch] = loss.item()     # Save the loss
    with torch.no_grad():
        y_pred = net(x_test)
        correct = (torch.argmax(y_pred, dim=1) == y_test).type(torch.FloatTensor)
        accuracy_steps[epoch] = correct.mean()
print("Iris test set prediction accuracy", accuracy_steps[-1])

# 5. Plot the loss and accuracy curves
fig_name = 'Iris_dataset_classify_BPNN'
fontsize = 15
fig, (ax1, ax2) = plt.subplots(2, figsize=(15, 12), sharex=True)
ax1.plot(accuracy_steps)
ax1.set_ylabel("test accuracy", fontsize=fontsize)
ax1.set_title(fig_name, fontsize='xx-large')
ax2.plot(loss_steps)
ax2.set_ylabel("train loss", fontsize=fontsize)
ax2.set_xlabel("epochs", fontsize=fontsize)
plt.tight_layout()
plt.savefig(fig_name + '.png')
plt.show()
References:
https://www.jianshu.com/p/52b86c774b0b