Module 8 of 26 · NLP & Transformers · Intermediate

Fine-tuning BERT for a Specific Task

Duration: 8 min

This module covers fine-tuning BERT for a specific natural language processing task. Fine-tuning starts from the pre-trained weights and continues training on labeled task data, so strong results are often achievable with relatively little additional training. This is particularly useful when labeled data is limited.

Loading and Preparing the BERT Model

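To fine-tune BERT, we first load the pre-trained model and its tokenizer with the Hugging Face Transformers library. This involves installing the library (for example with pip install transformers torch), loading the model and tokenizer, and converting the input text into the tensors the model expects. BertForSequenceClassification adds a classification head on top of BERT; with num_labels=2 that head has two outputs and is randomly initialized, which is why the model must be fine-tuned before its predictions are meaningful.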

from transformers import BertTokenizer, BertForSequenceClassification
from transformers import Trainer, TrainingArguments

# Load pre-trained BERT tokenizer and model
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)

# Example input text
text = ["This is a great product!", "I did not like this product at all."]

# Tokenize input text
inputs = tokenizer(text, padding=True, truncation=True, return_tensors='pt')

# Print tokenized inputs
print(inputs)


The printed result is a dictionary-like object holding three tensors of the same shape: input_ids, token_type_ids, and attention_mask. In input_ids, every row starts with 101 ([CLS]) and its real tokens end with 102 ([SEP]); the shorter sentence is then padded with 0 ([PAD]) up to the length of the longer one. attention_mask contains 1 for real tokens and 0 for padding, and token_type_ids is all zeros because each input is a single sentence.
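To make the effect of padding=True concrete, here is a minimal pure-Python sketch of padding a batch and building the matching attention mask. The helper pad_batch and the token IDs between the special tokens are made up for illustration and are not part of the Transformers API; the real tokenizer also performs WordPiece splitting and truncation.

```python
def pad_batch(sequences, pad_id=0):
    """Pad every sequence to the longest one and build an attention mask."""
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        input_ids.append(list(seq) + [pad_id] * n_pad)
        # 1 marks a real token, 0 marks padding the model should ignore
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return {"input_ids": input_ids, "attention_mask": attention_mask}


# Made-up IDs; 101 and 102 stand in for [CLS] and [SEP].
batch = pad_batch([[101, 7, 8, 102], [101, 7, 8, 9, 10, 102]])
print(batch["input_ids"])
print(batch["attention_mask"])
```

The shorter sequence gains trailing zeros in both lists, which mirrors how the tokenizer output lines up input_ids with attention_mask.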

Fine-tuning the BERT Model

Once the model and tokenizer are loaded, we can prepare the dataset and fine-tune the model. Fine-tuning means running a training loop that repeatedly updates the model weights on batches of training data. The Hugging Face Trainer class implements that loop for us, so we only need to supply the datasets and the training arguments.
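To show the kind of loop that Trainer automates, the sketch below runs mini-batch gradient descent on a toy one-feature logistic model in plain Python. It is a stand-in for illustration only, not BERT and not how Trainer is implemented; the data, the train function, and its hyperparameters are all hypothetical. Trainer additionally takes care of the optimizer (AdamW by default), learning-rate scheduling, device placement, and checkpointing.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(features, labels, epochs=50, batch_size=2, lr=0.5, seed=0):
    """Mini-batch gradient descent on a one-feature logistic model."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(features)))
    for _ in range(epochs):
        rng.shuffle(indices)  # new batch order every epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            grad_w = grad_b = 0.0
            for i in batch:
                pred = sigmoid(w * features[i] + b)  # forward pass
                error = pred - labels[i]  # gradient of cross-entropy w.r.t. the logit
                grad_w += error * features[i]
                grad_b += error
            w -= lr * grad_w / len(batch)  # weight update
            b -= lr * grad_b / len(batch)
    return w, b


# Toy data: positive feature values belong to class 1.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]
w, b = train(xs, ys)
predictions = [int(sigmoid(w * x + b) > 0.5) for x in xs]
print(predictions)
```

The structure (epochs, shuffled batches, forward pass, gradient, update) is the same one that runs during fine-tuning, only with far more parameters.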

from sklearn.model_selection import train_test_split
import torch

# Example labels
labels = [1, 0]

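# Split data into training and validation sets
# (two examples only demonstrate the API; use a real labeled dataset in practice)
train_texts, val_texts, train_labels, val_labels = train_test_split(
    text, labels, test_size=0.2, random_state=42  # fixed seed for a reproducible split
)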

# Tokenize training and validation texts
train_encodings = tokenizer(train_texts, truncation=True, padding=True)
val_encodings = tokenizer(val_texts, truncation=True, padding=True)

# Convert to PyTorch datasets
class CustomDataset(torch.utils.data.Dataset):
    def __init__(self, encodings, labels):
        self.encodings = encodings
        self.labels = labels

    def __getitem__(self, idx):
        item = {key: torch.tensor(val[idx]) for key, val in self.encodings.items()}
        item['labels'] = torch.tensor(self.labels[idx])
        return item

    def __len__(self):
        return len(self.labels)

train_dataset = CustomDataset(train_encodings, train_labels)
val_dataset = CustomDataset(val_encodings, val_labels)

# Define training arguments
training_args = TrainingArguments(
    output_dir='./results',
    evaluation_strategy='epoch',  # renamed to eval_strategy in recent Transformers releases
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=3,
    weight_decay=0.01,
)

# Initialize Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=val_dataset
)

# Train the model
trainer.train()
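With the setup above, the per-epoch evaluation reports the validation loss but no task metric, because no compute_metrics function was supplied. The following is a minimal sketch of such a function, written in plain Python so it can be checked on its own; it assumes the evaluation input unpacks into a (logits, labels) pair, and the logits and labels in the check are made up.

```python
def compute_metrics(eval_pred):
    """Return accuracy from a (logits, labels) pair."""
    logits, labels = eval_pred
    correct = 0
    for row, label in zip(logits, labels):
        # argmax over the class scores of one example
        predicted = max(range(len(row)), key=lambda i: row[i])
        correct += int(predicted == label)
    return {"accuracy": correct / len(labels)}


# Stand-alone check with made-up logits for three examples.
fake_logits = [[0.2, 1.5], [2.0, -1.0], [0.1, 0.3]]
fake_labels = [1, 0, 0]
result = compute_metrics((fake_logits, fake_labels))
print(result)
```

To use it during training, pass it when constructing the trainer: Trainer(..., compute_metrics=compute_metrics). The returned keys then appear in the evaluation logs alongside the loss.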

💡 Tip: Ensure that every dataset item is a dictionary of tensors with the keys the model expects (input_ids, attention_mask, and labels) before passing the dataset to the Trainer. Missing keys or mismatched lengths lead to errors during training.
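One way to catch format problems early is to inspect a few dataset items before training. The sketch below is a hypothetical helper, check_items, which is not part of any library; it assumes each item is a dictionary like the ones returned by CustomDataset and only checks for required keys and matching lengths.

```python
def check_items(items, required=("input_ids", "attention_mask", "labels")):
    """Return a list of human-readable problems found in dataset items."""
    problems = []
    for idx, item in enumerate(items):
        missing = [key for key in required if key not in item]
        if missing:
            problems.append(f"item {idx}: missing {missing}")
            continue  # length check needs all keys present
        if len(item["input_ids"]) != len(item["attention_mask"]):
            problems.append(
                f"item {idx}: input_ids and attention_mask differ in length"
            )
    return problems


# Made-up items: one valid, one without labels, one with a short mask.
good = {"input_ids": [101, 7, 102], "attention_mask": [1, 1, 1], "labels": 1}
bad_missing = {"input_ids": [101, 7, 102], "attention_mask": [1, 1, 1]}
bad_length = {"input_ids": [101, 7, 102], "attention_mask": [1, 1], "labels": 0}
problems = check_items([good, bad_missing, bad_length])
print(problems)
```

On a real dataset, the same check could be run on the first few items, for example [train_dataset[i] for i in range(len(train_dataset))], before calling trainer.train().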

❓ What is the primary purpose of fine-tuning a pre-trained BERT model?

❓ Which Hugging Face class simplifies the process of fine-tuning a BERT model?

