PyTorch
- In the previous section, linear regression was implemented by hand. The core problem of linear regression is to find the value of w for which the loss function is smallest.
- PyTorch encapsulates many of the functions involved, so we can use them directly.
loss function
handwritten loss function
```python
def loss(y, y_pred):
    """Loss function: mean of (true value - predicted value)^2"""
    return ((y_pred - y)**2).mean()
```
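For comparison, here is a minimal sketch of the by-hand training loop referred to above, assuming the model y_pred = w * x, so the derivative of the loss with respect to w is 2 * mean(x * (w*x - y)); the helper name gradient is illustrative, not from the original:

```python
import torch

X = torch.tensor([1, 2, 3, 4], dtype=torch.float32)
Y = torch.tensor([2, 4, 6, 8], dtype=torch.float32)
w = 0.0  # plain Python float, updated by hand

def forward(x):
    return w * x

def gradient(x, y, y_pred):
    # d/dw of mean((w*x - y)^2) = 2 * mean(x * (w*x - y))
    return (2 * x * (y_pred - y)).mean()

learning_rate = 0.01
for epoch in range(100):
    y_pred = forward(X)
    l = ((y_pred - Y)**2).mean()  # same as the handwritten loss above
    # manual gradient descent step: move w against the gradient
    w -= learning_rate * gradient(X, Y, y_pred).item()
    if epoch % 20 == 0:
        print(f'epoch: {epoch}, w: {w:.4f}, loss: {l:.6f}')
```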
Loss function provided by PyTorch
This loss function computes the mean of (predicted value - true value)^2, i.e. the mean squared error (MSE).
torch.nn.MSELoss() returns a callable loss object; when it is called, its arguments are:
- the predicted values for all the data
- the corresponding true values
- (swapping the two arguments gives the same result, because the error is squared)
```python
import torch

X = torch.tensor([1, 2, 3, 4], dtype=torch.float32)
Y = torch.tensor([2, 4, 6, 8], dtype=torch.float32)
w = torch.tensor(0.0, dtype=torch.float32, requires_grad=True)

def forward(x):
    return w * x

# The mean squared error measures the distance between the predicted and true values
loss = torch.nn.MSELoss()

# Compute the loss for the current w
y_pre = forward(X)
l = loss(y_pre, Y)
print(f"loss value at this time: {l}")
```
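As a quick self-contained check of the note above that the argument order does not matter:

```python
import torch

loss = torch.nn.MSELoss()
a = torch.tensor([1.0, 2.0, 3.0])
b = torch.tensor([2.0, 4.0, 6.0])
print(loss(a, b))  # MSE(prediction, target)
print(loss(b, a))  # same value with the arguments swapped
```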
optimizer
The optimizer replaces the handwritten gradient descent step: PyTorch computes the gradients automatically, and the optimizer uses them to update the parameters.
Define the optimizer
The torch.optim module contains several optimizers, most of them improvements on basic gradient descent that can reach good parameter values faster, e.g. SGD, Adam, RMSProp, and momentum variants of SGD. Plain SGD (stochastic gradient descent) is used here.
- The first argument is the collection of parameters to be updated by backpropagation; since there may be several, it is passed as a list (or any iterable)
- The lr keyword argument is the learning rate
optimizer = torch.optim.SGD([w], lr=learning_rate)
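The other optimizers mentioned above are created the same way; only the constructor (and its hyperparameters) changes. A sketch, assuming w and learning_rate are defined as above; the momentum value shown is just an illustrative choice:

```python
optimizer = torch.optim.Adam([w], lr=learning_rate)               # Adam
optimizer = torch.optim.RMSprop([w], lr=learning_rate)            # RMSProp
optimizer = torch.optim.SGD([w], lr=learning_rate, momentum=0.9)  # SGD with momentum
```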
complete linear regression
l.backward() backpropagates through the loss to compute the gradient (the partial derivative of the loss value l with respect to w)
optimizer.step() updates the w parameter through the optimizer, taking one step in the direction opposite to the gradient
optimizer.zero_grad() clears the stored gradients, preventing accumulated gradients from corrupting later updates
```python
# Create x, y data and a custom w parameter
X = torch.tensor([1, 2, 3, 4], dtype=torch.float32)
Y = torch.tensor([2, 4, 6, 8], dtype=torch.float32)
w = torch.tensor(0.0, dtype=torch.float32, requires_grad=True)

# Define the learning rate and the number of training iterations
learning_rate = 0.001
n_iters = 1000

# Create the loss function
loss = torch.nn.MSELoss()

# Create the optimizer: pass in the w parameter (the loss will be differentiated
# with respect to it) and the learning rate
optimizer = torch.optim.SGD([w], lr=learning_rate)

# Forward propagation function (the model)
def forward(x):
    """Forward propagation function"""
    return w * x

# Train the model
for epoch in range(n_iters):
    # Get the predicted values through forward propagation
    y_pred = forward(X)
    # Get the loss value through the loss function
    l = loss(y_pred, Y)
    # Backpropagation computes the gradients
    l.backward()
    # Update w through the optimizer: take one step against the gradient
    optimizer.step()
    # Clear the gradients to prevent accumulation from corrupting later updates
    optimizer.zero_grad()
    if epoch % 100 == 0:
        # Print the current epoch, the w parameter and the loss value
        print(f'epoch: {epoch}, w: {w}, loss: {l:.8f}')
```
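After training, w should have moved close to 2 (the data follows y = 2x), so a quick prediction for a new input can be appended to the end of the script above:

```python
# with w close to 2, the prediction for x = 5 should be close to 10
print(f'prediction for x = 5: {forward(torch.tensor(5.0)).item():.3f}')
```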
model building
Building a model replaces the handwritten forward propagation function.
torch.nn.Linear(input_size, output_size) implements the linear model
- input_size: the dimension of the input data
- output_size: the dimension of the output data
model.parameters() returns the trainable parameters of the model (here the weight w and the bias b)
```python
# Create x, y data
X = torch.tensor([[1], [2], [3], [4]], dtype=torch.float32)
Y = torch.tensor([[2], [4], [6], [8]], dtype=torch.float32)

# Test set
X_test = torch.tensor([5], dtype=torch.float32)

# Define the model parameters
n_samples, n_features = X.shape
print(n_features)

# The input and output dimensions are the same here, so both arguments are n_features
model = torch.nn.Linear(n_features, n_features)

# Define the learning rate and the number of training iterations
learning_rate = 0.01
n_iters = 1000

# Create the loss function
loss = torch.nn.MSELoss()

# Create the optimizer: pass in the parameters the model needs to update
# (the loss will be differentiated with respect to them) and the learning rate
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

# Train the model
for epoch in range(n_iters):
    # Get the predicted values through forward propagation
    y_pred = model(X)
    # Get the loss value through the loss function
    l = loss(y_pred, Y)
    # Backpropagation computes the gradients
    l.backward()
    # Update the parameters through the optimizer: one step against the gradient
    optimizer.step()
    # Clear the gradients to prevent accumulation from corrupting later updates
    optimizer.zero_grad()
    if epoch % 100 == 0:
        # Get the values of the w and b parameters from the model
        w, b = model.parameters()
        # Print the current epoch, the w parameter and the loss value
        print(f'epoch: {epoch}, w: {w[0,0].item()}, loss: {l}')
```
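Since w and b now live inside the Linear layer, they can also be read directly from the layer's weight and bias attributes (a small check that can be appended to the script above):

```python
print(model.weight)              # learned w, a (1, 1) tensor, close to 2
print(model.bias)                # learned b, a (1,) tensor, close to 0
print(list(model.parameters()))  # the same two tensors returned by model.parameters()
```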
After training, use the test set to check the learned parameters:
test_model = model(X_test)
Since the training data follows y = 2x, the learned model is approximately y = 2x + b with b close to 0, so the prediction for the input 5 is very close to 10.
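During testing no gradients are needed, so the test-set prediction is often wrapped in torch.no_grad(); a minimal sketch, continuing the script above (the variable name y_test_pred is illustrative):

```python
with torch.no_grad():
    y_test_pred = model(X_test)
print(f'prediction for x = 5: {y_test_pred.item():.3f}')  # should be close to 10
```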
Summary:
The PyTorch model training process (a compact sketch follows this list):
- Get the training set and determine the input and output dimensions
- Create an appropriate model based on the input and output dimensions
- Create a loss function
- Create an optimizer
- Train the model: forward propagation ➡ back propagation ➡ gradient descent (parameter update) ➡ loop
- Use the test set to evaluate the model
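As a recap of these steps, a minimal end-to-end sketch using the same toy data as above:

```python
import torch

# 1. Training set and its input/output dimensions
X = torch.tensor([[1], [2], [3], [4]], dtype=torch.float32)
Y = torch.tensor([[2], [4], [6], [8]], dtype=torch.float32)
n_samples, n_features = X.shape

# 2. Model matching the input and output dimensions
model = torch.nn.Linear(n_features, n_features)

# 3. Loss function
loss = torch.nn.MSELoss()

# 4. Optimizer
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# 5. Training loop: forward -> backward -> gradient descent -> repeat
for epoch in range(1000):
    l = loss(model(X), Y)
    l.backward()
    optimizer.step()
    optimizer.zero_grad()

# 6. Evaluate on the test set
with torch.no_grad():
    print(model(torch.tensor([[5.0]])))  # should be close to 10
```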
Machine learning does not find the solution in a single step; it is a process of gradually approaching the solution.