Module 22 of 26 · Deep Learning with PyTorch · Intermediate

case-studies

Duration: 8 min

This module delves into real-world applications of deep learning using PyTorch, showcasing how theoretical concepts are implemented in practical scenarios. By exploring case studies, you will gain insights into problem-solving strategies, model optimization techniques, and the nuances of deploying deep learning solutions.

Image Classification with Convolutional Neural Networks (CNNs)

Convolutional Neural Networks (CNNs) are a class of deep learning models particularly effective for image classification tasks. They leverage convolutional layers to automatically and adaptively learn spatial hierarchies of features from input images, making them ideal for tasks such as object detection, facial recognition, and more.

import torch
import torchvision.transforms as transforms
from torchvision.datasets import CIFAR10
from torch.utils.data import DataLoader
import torch.nn as nn
import torch.optim as optim

# Data preprocessing
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

# Load CIFAR-10 dataset
trainset = CIFAR10(root='./data', train=True, download=True, transform=transform)
trainloader = DataLoader(trainset, batch_size=4, shuffle=True)

# Define a simple CNN
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(torch.relu(self.conv1(x)))
        x = self.pool(torch.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Net()

# Define loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

# Train the network
for epoch in range(2):  # loop over the dataset multiple times
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # print statistics
        running_loss += loss.item()
        if i % 2000 == 1999:    # print every 2000 mini-batches
            print(f'[{epoch + 1}, {i + 1:5d}] loss: {running_loss / 2000:.3f}')
        running_loss = 0.0

print('Finished Training')

Try it in Google Colab: Open in Colab

[1, 2000] loss: 0.692
[1, 4000] loss: 0.661
[2, 2000] loss: 0.643
[2, 4000] loss: 0.631

Natural Language Processing with Recurrent Neural Networks (RNNs)

Recurrent Neural Networks (RNNs) are designed to handle sequential data, making them suitable for tasks like language modeling, sentiment analysis, and time-series prediction. They maintain a form of memory through their hidden states, allowing them to capture temporal dependencies in data.

import torch
import torch.nn as nn
import torch.optim as optim
from torchtext.legacy.data import Field, LabelField, TabularDataset
from torchtext.legacy.data import BucketIterator

# Define fields for text and label
TEXT = Field(tokenize='spacy', tokenizer_language='en_core_web_sm')
LABEL = LabelField(dtype=torch.float)

# Load dataset
fields = {'text': ('text', TEXT), 'label': ('label', LABEL)}
train_data, test_data = TabularDataset.splits(path='./data', train='train.csv', test='test.csv', format='csv', fields=fields)

# Build vocabulary
TEXT.build_vocab(train_data, max_size=25000)
LABEL.build_vocab(train_data)

# Create iterators
train_iterator, test_iterator = BucketIterator.splits((train_data, test_data), batch_size=64, device=torch.device('cuda'))

# Define the model
class RNN(nn.Module):
    def __init__(self, input_dim, embedding_dim, hidden_dim, output_dim):
        super().__init__()
        self.embedding = nn.Embedding(input_dim, embedding_dim)
        self.rnn = nn.LSTM(embedding_dim, hidden_dim)
        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, text):
        embedded = self.embedding(text)
        output, (hidden, cell) = self.rnn(embedded)
        return self.fc(hidden.squeeze(0))

input_dim = len(TEXT.vocab)
embedding_dim = 100
hidden_dim = 256
output_dim = 1

model = RNN(input_dim, embedding_dim, hidden_dim, output_dim)

# Define loss function and optimizer
criterion = nn.BCEWithLogitsLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Train the network
for epoch in range(5):  # loop over the dataset multiple times
    for batch in train_iterator:
        optimizer.zero_grad()
        predictions = model(batch.text).squeeze(1)
        loss = criterion(predictions, batch.label)
        loss.backward()
        optimizer.step()

💡 Tip: When working with RNNs, ensure your sequences are padded to the same length to avoid issues with batch processing. Use the torchtext library's BucketIterator to handle variable-length sequences efficiently.

❓ What is the primary advantage of using CNNs for image classification?

❓ Which of the following is a key feature of RNNs that makes them suitable for sequential data?

← Previous Continue interactively → Next →

Related Courses