Fine-tuning & Custom Models
Duration: 55 min
Fine-tuning adapts foundation models to your specific domain or task. This module covers continued pre-training, supervised fine-tuning, provisioned throughput, and when to use each approach.
Fine-tuning vs Pre-training
Pre-training: Training from scratch on massive datasets (expensive, time-consuming)
Fine-tuning: Adapting a pre-trained model to your specific task (faster, cheaper)
Foundation Model (Pre-trained)
↓
Fine-tuning on Your Data
↓
Custom Model (Domain-specific)
When to Fine-tune
# ✅ Good candidates for fine-tuning:
# - Specialized domain (medical, legal, finance)
# - Specific writing style or tone
# - Consistent output format
# - Enough labeled data to learn from (roughly 100+ examples)
# ❌ Not good candidates:
# - General-purpose tasks (use prompt engineering)
# - Very small datasets (<50 examples)
# - Rapidly changing requirements
Preparing Training Data
import json
# Training data format for Claude fine-tuning
training_data = [
    {
        "messages": [
            {
                "role": "user",
                "content": "Classify this email: 'Buy cheap watches now!'"
            },
            {
                "role": "assistant",
                "content": "This is spam. It uses urgency and promotional language."
            }
        ]
    },
    {
        "messages": [
            {
                "role": "user",
                "content": "Classify this email: 'Your meeting is scheduled for 2 PM'"
            },
            {
                "role": "assistant",
                "content": "This is legitimate. It's a calendar notification."
            }
        ]
    }
]
# Save to JSONL format (one JSON object per line)
with open('training_data.jsonl', 'w') as f:
    for example in training_data:
        f.write(json.dumps(example) + '\n')
# Upload to S3
import boto3
s3 = boto3.client('s3')
s3.upload_file('training_data.jsonl', 'my-bucket', 'training/data.jsonl')
Creating a Fine-tuned Model
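Before submitting a job, it can help to check that the JSONL file is well formed and to hold out part of it for validation (Best Practices suggests 10-20%). The sketch below is one way to do that. `load_examples` and `split_examples` are hypothetical helper names, and the alternating user/assistant rule is an assumption based on the examples in this module, not a documented Bedrock requirement.

```python
import json
import random


def load_examples(path):
    """Read a JSONL file and return its records, raising on malformed rows."""
    examples = []
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            record = json.loads(line)  # raises ValueError on invalid JSON
            messages = record.get("messages", [])
            roles = [m.get("role") for m in messages]
            # Assumed rule: turns alternate user/assistant and end on assistant
            expected = ["user", "assistant"] * (len(roles) // 2)
            if not roles or roles != expected:
                raise ValueError(f"line {line_no}: unexpected roles {roles}")
            if any(not m.get("content") for m in messages):
                raise ValueError(f"line {line_no}: empty content")
            examples.append(record)
    return examples


def split_examples(examples, validation_fraction=0.2, seed=0):
    """Shuffle deterministically and hold out a fraction for validation."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * validation_fraction)
    # Returns (training examples, validation examples)
    return shuffled[n_val:], shuffled[:n_val]
```

The held-out examples could be written to a second JSONL file and uploaded alongside the training file. `create_model_customization_job` accepts a `validationDataConfig` argument for this, though whether validation data is supported depends on the base model.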
import boto3
# Control-plane client: manages customization jobs and provisioned throughput
client = boto3.client('bedrock', region_name='us-east-1')
# Create a fine-tuning job
# Note: not every base model supports customization in every region, and
# hyperparameter names and ranges differ per model (some use 'epochCount'
# rather than 'epochs'). Check the Bedrock documentation for your base model.
response = client.create_model_customization_job(
    jobName='email-classifier-ft',
    customModelName='email-classifier-v1',
    roleArn='arn:aws:iam::ACCOUNT:role/BedrockFineTuningRole',
    baseModelIdentifier='anthropic.claude-3-sonnet-20240229-v1:0',
    trainingDataConfig={
        's3Uri': 's3://my-bucket/training/data.jsonl'
    },
    outputDataConfig={
        's3OutputPath': 's3://my-bucket/output/'
    },
    hyperParameters={
        'epochs': '3',
        'batchSize': '8',
        'learningRate': '0.0001'
    }
)
job_id = response['jobArn']
print(f"Fine-tuning job started: {job_id}")
Monitoring Fine-tuning
# Check job status
response = client.get_model_customization_job(
    jobIdentifier=job_id
)
status = response['status']
print(f"Status: {status}")
if status == 'Completed':
    model_arn = response['outputModelArn']
    print(f"Custom model ARN: {model_arn}")
elif status == 'Failed':
    print(f"Error: {response['failureMessage']}")
Using Fine-tuned Models
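A job can run for a long time, so the one-off status check above is usually wrapped in a polling loop before the model is invoked. This is a minimal sketch, assuming the `client` and `job_id` from the earlier steps. `wait_for_job`, its default intervals, and the injectable `sleep` parameter are illustrative choices, not part of boto3.

```python
import time


def wait_for_job(client, job_id, poll_seconds=60, max_polls=120, sleep=time.sleep):
    """Poll a customization job until it reaches a terminal status.

    Returns the final get_model_customization_job response, or raises
    TimeoutError if the job is still running after max_polls checks.
    """
    terminal = {"Completed", "Failed", "Stopped"}
    for _ in range(max_polls):
        response = client.get_model_customization_job(jobIdentifier=job_id)
        if response["status"] in terminal:
            return response
        sleep(poll_seconds)  # wait before asking again
    raise TimeoutError(f"job {job_id} still running after {max_polls} polls")
```

Calling `wait_for_job(client, job_id)` returns the same response dictionary as the single check, so the `outputModelArn` and `failureMessage` handling shown earlier applies unchanged. Passing `sleep` as a parameter keeps the loop easy to test without real delays.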
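# Invoke the fine-tuned model
# invoke_model belongs to the 'bedrock-runtime' client, not the 'bedrock' control-plane client
runtime = boto3.client('bedrock-runtime', region_name='us-east-1')
# Note: custom models generally have to be served through Provisioned Throughput
# (see Provisioned Throughput); in that case pass the provisioned model ARN as modelId.
response = runtime.invoke_model(
    modelId='arn:aws:bedrock:us-east-1:ACCOUNT:custom-model/email-classifier-v1',
    body=json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 100,
        "messages": [
            {
                "role": "user",
                "content": "Classify: 'Limited time offer - 50% off today!'"
            }
        ]
    })
)
result = json.loads(response['body'].read())
print(result['content'][0]['text'])
Provisioned Throughput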
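Before reserving capacity, a back-of-the-envelope estimate of what a commitment would cost can be useful. The sketch below multiplies model units by an hourly rate and the length of the term. The rates are placeholders taken from the rough range quoted under Best Practices, `commitment_cost` is a hypothetical helper, and 730 hours per month is an averaging assumption. Actual pricing depends on the model and region.

```python
HOURS_PER_MONTH = 730  # assumption: average number of hours in a month


def commitment_cost(model_units, hourly_rate_per_unit, months):
    """Estimate the total cost in dollars of a provisioned throughput term."""
    return model_units * hourly_rate_per_unit * HOURS_PER_MONTH * months


# Placeholder rates: the low and high ends of the range quoted in this module
low = commitment_cost(model_units=1, hourly_rate_per_unit=0.50, months=1)
high = commitment_cost(model_units=1, hourly_rate_per_unit=2.00, months=1)
print(f"One model unit for one month: ${low:,.2f} to ${high:,.2f}")
```

Comparing this figure with the expected on-demand spend for the same traffic gives a first indication of whether a commitment is worthwhile.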
Reserve capacity for predictable workloads:
# Create provisioned throughput
response = client.create_provisioned_model_throughput(
    modelUnits=1,  # throughput per model unit depends on the model; check the Bedrock docs
    provisionedModelName='email-classifier-provisioned',
    modelId='arn:aws:bedrock:us-east-1:ACCOUNT:custom-model/email-classifier-v1',
    commitmentDuration='OneMonth'  # or 'SixMonths'
)
provisioned_arn = response['provisionedModelArn']
print(f"Provisioned throughput: {provisioned_arn}")
# Use provisioned model (invocation goes through the 'bedrock-runtime' client)
runtime = boto3.client('bedrock-runtime', region_name='us-east-1')
response = runtime.invoke_model(
    modelId=provisioned_arn,
    body=json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 100,
        "messages": [
            {"role": "user", "content": "Classify this email..."}
        ]
    })
)
Hyperparameter Tuning
# Common hyperparameters
# Names and valid ranges are model-specific; the values below are illustrative
hyperparameters = {
    'epochs': '3',             # Number of training passes (1-10)
    'batchSize': '8',          # Examples per batch (4-32)
    'learningRate': '0.0001',  # How fast to learn (0.00001-0.1)
    'warmupSteps': '100',      # Gradual learning rate increase
    'weightDecay': '0.01'      # Regularization to prevent overfitting
}
# Guidelines:
# - Start with defaults
# - Increase epochs if underfitting
# - Decrease learning rate if loss is unstable
# - Increase batch size for stability (if memory allows)
Continued Pre-training
Continued pre-training exposes the model to unlabeled domain text so that it picks up domain-specific language patterns:
# Continued pre-training data (unlabeled text)
pretraining_data = [
    "AWS Bedrock is a managed service for foundation models.",
    "You can invoke models using the boto3 SDK.",
    "Guardrails help control model behavior and filter content."
]
# Save as JSONL, one {"input": ...} record per line
# (the format Bedrock documents for continued pre-training; verify for your base model)
with open('pretraining_data.jsonl', 'w') as f:
    for text in pretraining_data:
        f.write(json.dumps({"input": text}) + '\n')
# Upload to S3
s3.upload_file('pretraining_data.jsonl', 'my-bucket', 'pretraining/data.jsonl')
# Create continued pre-training job
# Note: the base model must support continued pre-training; support varies by model
response = client.create_model_customization_job(
    jobName='bedrock-domain-pt',
    customModelName='bedrock-domain-v1',
    roleArn='arn:aws:iam::ACCOUNT:role/BedrockFineTuningRole',
    baseModelIdentifier='anthropic.claude-3-sonnet-20240229-v1:0',
    trainingDataConfig={
        's3Uri': 's3://my-bucket/pretraining/data.jsonl'
    },
    outputDataConfig={
        's3OutputPath': 's3://my-bucket/output/'
    },
    customizationType='CONTINUED_PRE_TRAINING'
)
Best Practices
# ✅ Good: Diverse, high-quality training data
training_data = [
    # Include edge cases
    {"messages": [{"role": "user", "content": "..."}, {"role": "assistant", "content": "..."}]},
    # Include variations
    {"messages": [{"role": "user", "content": "..."}, {"role": "assistant", "content": "..."}]},
    # Include corrections
    {"messages": [{"role": "user", "content": "..."}, {"role": "assistant", "content": "..."}]}
]
# ✅ Good: Validation set
# Reserve 10-20% of data for validation
# Monitor validation loss to detect overfitting
# ✅ Good: Iterative improvement
# Start with small fine-tuning job
# Evaluate results
# Collect more data if needed
# Re-fine-tune with improvements
# ✅ Good: Cost tracking
# Fine-tuning costs: ~$0.30-1.00 per 1M tokens (rough figure; check current pricing)
# Provisioned throughput: ~$0.50-2.00 per model unit/hour (rough figure; check current pricing)
# Calculate ROI before committing
❓ What is the main advantage of fine-tuning over pre-training?
❓ What is the minimum recommended training data size?
❓ What is provisioned throughput used for?
❓ What format should training data be in?