Machine Learning: PyTorch Model Training

PyTorch

  • In the previous article, linear regression was implemented by hand; the linear regression problem comes down to finding the value of w that minimizes the loss function.
  • PyTorch already encapsulates many of these functions, and we can use them directly.

loss function

handwritten loss function

def loss(y, y_pred):
    """Loss function"""
    # mean of (predicted value - true value)^2
    return ((y_pred - y) ** 2).mean()

Loss function provided by PyTorch

The loss function computes the mean of (predicted value - true value)^2, i.e. the mean squared error (MSE).

Arguments of the callable returned by torch.nn.MSELoss():

  1. the predicted values for all the data
  2. all of the true values
  3. (the two arguments can be swapped without changing the result, because the squared error (a - b)^2 equals (b - a)^2)
import torch

X = torch.tensor([1, 2, 3, 4], dtype=torch.float32)
Y = torch.tensor([2, 4, 6, 8], dtype=torch.float32)
w = torch.tensor(0.0, dtype=torch.float32, requires_grad=True)

def forward(x):
    return w * x

# The mean squared error measures the distance between the predicted and true values
loss = torch.nn.MSELoss()
# Calculate the current loss
y_pred = forward(X)
l = loss(y_pred, Y)
print(f"loss value at this time: {l}")

optimizer

The optimizer replaces the handwritten gradient descent algorithm: PyTorch computes the gradients automatically, and the optimizer uses them to update the parameters.

Define the optimizer

The torch.optim module contains several optimizers, most of which are improved algorithms built on basic gradient descent that can find good parameter values faster, e.g. SGD, Adam, Momentum, and RMSprop. Here the SGD (stochastic gradient descent) algorithm is used.

  1. The first argument is the list of parameters to update via backpropagation; there may be several, so it is passed as a list.
  2. The lr keyword argument is the learning rate.
optimizer = torch.optim.SGD([w], lr=learning_rate)
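For plain SGD with a single parameter, optimizer.step() amounts to the handwritten update rule from before. A rough sketch of the equivalent manual step (not PyTorch's actual implementation):

# Roughly what optimizer.step() does for plain SGD (after l.backward() has filled w.grad):
with torch.no_grad():
    w -= learning_rate * w.grad  # take one step against the gradient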

complete linear regression

l.backward() backpropagates through the loss to compute the gradients
optimizer.step() updates the w parameter: it takes one step against the gradient (the partial derivative of the loss with respect to w)
optimizer.zero_grad() clears the stored gradients, so that gradients from successive iterations do not accumulate and corrupt the result
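The reason zero_grad() is needed: backward() adds to each parameter's .grad rather than overwriting it. A minimal demonstration (assuming a fresh tensor w):

w = torch.tensor(0.0, requires_grad=True)
l = (w * 2 - 4) ** 2
l.backward()
print(w.grad)   # tensor(-16.)
l = (w * 2 - 4) ** 2
l.backward()
print(w.grad)   # tensor(-32.), the second gradient was added to the first
w.grad.zero_()  # what optimizer.zero_grad() does for every registered parameter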

# Create x, y data and a custom w parameter
X = torch.tensor([1, 2, 3, 4], dtype=torch.float32)
Y = torch.tensor([2, 4, 6, 8], dtype=torch.float32)
w = torch.tensor(0.0, dtype=torch.float32, requires_grad=True)
# Define the learning rate and the number of training iterations
learning_rate = 0.001
n_iters = 1000
# Create the loss function
loss = torch.nn.MSELoss()
# Create the optimizer: pass in the w parameter (whose partial derivative of the loss will be computed) and the learning rate
optimizer = torch.optim.SGD([w], lr=learning_rate)

# Forward propagation function (the model)
def forward(x):
    """Forward propagation function"""
    return w * x

# Train the model
for epoch in range(n_iters):
    # Get the predicted values through forward propagation
    y_pred = forward(X)
    # Get the loss value through the loss function
    l = loss(y_pred, Y)
    # Backpropagation computes the gradients
    l.backward()
    # Update the w parameter via the optimizer: take one step against the gradient
    optimizer.step()
    # Clear the gradients so they do not accumulate and corrupt the result
    optimizer.zero_grad()

    if epoch % 100 == 0:
        # Print how w and the loss change over the epochs
        print(f'epoch: {epoch}, w: {w.item():.3f}, loss: {l:.8f}')
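Once the loop finishes, w is close to 2, so the model's prediction for a new input approaches y = 2x (a quick check, assuming the code above has run):

with torch.no_grad():
    print(forward(torch.tensor(5.0)))  # close to 10 once w ≈ 2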

model building

Building a model replaces the handwritten forward-propagation function.

torch.nn.Linear(input_size, output_size) implements the linear model function

  • input_size: the dimension (number of features) of the input data
  • output_size: the dimension of the output data

model.parameters() returns the trainable parameters of the model (here the weight w and the bias b)
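nn.Linear creates its own weight and bias tensors with requires_grad=True, so we no longer declare w by hand. A small illustration:

model = torch.nn.Linear(1, 1)
for p in model.parameters():
    print(p)  # the weight (shape [1, 1]) and the bias (shape [1]), both tracked by autograd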

# Create x, y data
X = torch.tensor([[1], [2], [3], [4]], dtype=torch.float32)
Y = torch.tensor([[2], [4], [6], [8]], dtype=torch.float32)
# Test set
X_test = torch.tensor([5], dtype=torch.float32)
# Define the model dimensions
n_samples, n_features = X.shape
print(n_features)
# The input and output dimensions are the same here, so both arguments are n_features
model = torch.nn.Linear(n_features, n_features)
# Define the learning rate and the number of training iterations
learning_rate = 0.01
n_iters = 1000
# Create the loss function
loss = torch.nn.MSELoss()
# Create the optimizer: pass in the parameters the model needs to update and the learning rate
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)


# Train the model
for epoch in range(n_iters):
    # Get the predicted values through forward propagation
    y_pred = model(X)
    # Get the loss value through the loss function
    l = loss(y_pred, Y)
    # Backpropagation computes the gradients
    l.backward()
    # Update the parameters via the optimizer: take one step against the gradient
    optimizer.step()
    # Clear the gradients so they do not accumulate and corrupt the result
    optimizer.zero_grad()

    if epoch % 100 == 0:
        # Unpack the w and b parameters from the model
        w, b = model.parameters()
        # Print how w and the loss change over the epochs
        print(f'epoch: {epoch}, w: {w[0,0].item():.3f}, loss: {l:.8f}')

After training, use the test set to check the learned parameters:

with torch.no_grad():
    print(model(X_test))


The learned model approximates y = 2x (with b close to 0), so the prediction for the input 5 is very close to 10.

Summary:

The PyTorch model-training process:

  1. Get the training set and determine the input and output dimensions
  2. Build an appropriate model based on those dimensions
  3. Create a loss function
  4. Create an optimizer
  5. Train the model: forward propagation ➡ backpropagation ➡ gradient descent ➡ loop
  6. Evaluate the model on the test set

Machine learning does not find the solution in a single step; it is a process of gradually approaching the solution.
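Put together, the steps above boil down to a reusable template (a sketch; the data, dimensions, and hyperparameters are placeholders):

import torch

# 1. Training set and its input/output dimensions (placeholder data: y = 2x)
X = torch.tensor([[1.], [2.], [3.], [4.]])
Y = torch.tensor([[2.], [4.], [6.], [8.]])

# 2. A model matching those dimensions
model = torch.nn.Linear(X.shape[1], Y.shape[1])

# 3. Loss function and 4. optimizer
loss = torch.nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# 5. Training loop: forward propagation -> backpropagation -> gradient descent -> repeat
for epoch in range(1000):
    l = loss(model(X), Y)
    l.backward()
    optimizer.step()
    optimizer.zero_grad()

# 6. Evaluate the model on unseen data
with torch.no_grad():
    print(model(torch.tensor([[5.]])))  # close to 10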
