Module 5 of 26 · Deep Learning with PyTorch · Intermediate

Loss Functions and Optimization

Duration: 8 min

This module covers loss functions and optimization in PyTorch. Together they drive neural network training: the loss function measures how far the model's predictions are from the targets, and the optimizer adjusts the model's parameters to reduce that error.

Understanding Loss Functions

A loss function (also called a cost function) measures how far a model's predictions are from the actual targets. It condenses that error into a single number, which the optimization process then tries to minimize. Common choices are Mean Squared Error (MSE) for regression tasks and Cross-Entropy Loss for classification tasks.

import torch
import torch.nn as nn

# Define a simple linear regression model
class LinearRegression(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(1, 1)

    def forward(self, x):
        out = self.linear(x)
        return out

# Initialize the model
model = LinearRegression()

# Define the loss function
criterion = nn.MSELoss()

# Example input and target (inputs are data, not parameters, so they need no gradients)
inputs = torch.tensor([[1.0], [2.0], [3.0]])
targets = torch.tensor([[2.0], [4.0], [6.0]])

# Forward pass
outputs = model(inputs)
loss = criterion(outputs, targets)
print(f'Loss: {loss.item()}')

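The printed value differs from run to run because nn.Linear starts from random weights. For this untrained model the loss will be well above zero, not 0.0.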
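
To make the two loss functions named above more concrete, the sketch below computes each one twice: once with PyTorch's built-in module and once by hand. The prediction, logit, and label values are made up for illustration and do not come from the lesson's model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# MSE: the mean of the squared differences between predictions and targets.
preds = torch.tensor([[2.5], [3.5], [7.0]])    # illustrative predictions (assumption)
targets = torch.tensor([[2.0], [4.0], [6.0]])
mse_builtin = nn.MSELoss()(preds, targets)
mse_manual = ((preds - targets) ** 2).mean()

# Cross-entropy: takes raw logits of shape (batch, classes) and integer class labels.
logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 1.5, 0.3]])        # illustrative logits (assumption)
labels = torch.tensor([0, 1])
ce_builtin = nn.CrossEntropyLoss()(logits, labels)

# By hand: negative log-probability of the correct class, averaged over the batch.
log_probs = F.log_softmax(logits, dim=1)
ce_manual = -log_probs[torch.arange(len(labels)), labels].mean()

print(f"MSE: built-in {mse_builtin.item():.4f}, manual {mse_manual.item():.4f}")
print(f"Cross-entropy: built-in {ce_builtin.item():.4f}, manual {ce_manual.item():.4f}")
```

Note that nn.CrossEntropyLoss applies log-softmax internally, so the model should output raw logits rather than probabilities.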

Optimization Techniques

Optimization algorithms update the model's parameters to minimize the loss function. The most common is Stochastic Gradient Descent (SGD), which moves each parameter a small step against the gradient of the loss, the direction in which the loss decreases. Other popular methods include Adam and RMSprop, which adapt the effective step size for each parameter during training.

import torch
import torch.nn as nn
import torch.optim as optim

# Reuses LinearRegression, inputs, and targets from the previous example

# Initialize the model and loss function
model = LinearRegression()
criterion = nn.MSELoss()

# Define the optimizer
optimizer = optim.SGD(model.parameters(), lr=0.01)

# Training loop
for epoch in range(100):
    # Forward pass
    outputs = model(inputs)
    loss = criterion(outputs, targets)

    # Backward pass and optimization
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    if (epoch+1) % 10 == 0:
        print(f'Epoch [{epoch+1}/100], Loss: {loss.item():.4f}')
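
To show what the optimizer step in the loop above amounts to, here is a minimal sketch that applies the plain gradient descent rule by hand instead of calling optim.SGD. It assumes no momentum or weight decay, uses nn.Linear directly in place of the LinearRegression class, and fixes a random seed only for repeatability; the `losses` list is added here for inspection and is not part of the lesson's code.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # assumption: fixed seed so repeated runs start from the same weights

inputs = torch.tensor([[1.0], [2.0], [3.0]])
targets = torch.tensor([[2.0], [4.0], [6.0]])

model = nn.Linear(1, 1)
criterion = nn.MSELoss()
lr = 0.01

losses = []
for epoch in range(100):
    # Forward pass
    loss = criterion(model(inputs), targets)
    losses.append(loss.item())

    # Clear old gradients, then compute new ones
    model.zero_grad()
    loss.backward()

    # Manual update: p <- p - lr * grad (plain SGD without momentum)
    with torch.no_grad():
        for p in model.parameters():
            p -= lr * p.grad

print(f"First loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

The torch.no_grad() block keeps the in-place parameter update out of the autograd graph, which is also what the built-in optimizers do internally.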

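💡 Tip: Adaptive methods such as Adam are still sensitive to the learning rate. A value that is too large can make the loss oscillate or diverge, while one that is too small makes convergence slow. PyTorch's default for Adam, lr=1e-3, is a reasonable starting point.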
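
One way to see this sensitivity is to train the same tiny model with Adam at several learning rates and compare the final loss. The sketch below is only illustrative: `final_loss_with_adam` is a hypothetical helper written for this example, and the learning rates and step count are arbitrary choices, not recommendations.

```python
import torch
import torch.nn as nn
import torch.optim as optim

inputs = torch.tensor([[1.0], [2.0], [3.0]])
targets = torch.tensor([[2.0], [4.0], [6.0]])

def final_loss_with_adam(lr, steps=200):
    """Hypothetical helper: train a fresh linear model with Adam, return the final loss."""
    torch.manual_seed(0)  # same starting weights for every learning rate
    model = nn.Linear(1, 1)
    criterion = nn.MSELoss()
    optimizer = optim.Adam(model.parameters(), lr=lr)

    for _ in range(steps):
        loss = criterion(model(inputs), targets)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    # Evaluate once more after the last update, without tracking gradients
    with torch.no_grad():
        return criterion(model(inputs), targets).item()

# Arbitrary learning rates chosen for comparison (assumption)
results = {lr: final_loss_with_adam(lr) for lr in (1e-4, 1e-2, 1.0)}
for lr, final_loss in results.items():
    print(f"lr={lr:g}: final loss {final_loss:.4f}")
```

With the smallest learning rate the parameters barely move in 200 steps, so the loss stays close to its starting value. Comparing runs like this on a small problem is a cheap way to pick a starting learning rate before a longer training run.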

❓ What is the primary purpose of a loss function in deep learning?

❓ Which optimization technique adapts the learning rate during training?
