Available Models & Selection

Duration: 50 min

Bedrock provides access to foundation models from multiple providers. Each model has different strengths, speeds, and costs. This module covers the available models, their capabilities, and how to choose the right one for your use case.

Model Providers & Families

Anthropic Claude

Claude is a family of large language models known for safety, reasoning, and instruction-following.

Claude 3 Family:

Claude 3 Opus — Most capable, best for complex reasoning (slower, more expensive)
Claude 3 Sonnet — Balanced speed/capability, recommended for most tasks
Claude 3 Haiku — Fastest, cheapest, good for simple tasks

import boto3
import json

client = boto3.client('bedrock-runtime', region_name='us-east-1')

# Using Claude 3 Sonnet
response = client.invoke_model(
    modelId='anthropic.claude-3-sonnet-20240229-v1:0',
    body=json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 1024,
        "messages": [
            {"role": "user", "content": "Explain quantum computing"}
        ]
    })
)

print(json.loads(response['body'].read())['content'][0]['text'])
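The parsed response carries more than the generated text. The following is a minimal sketch, assuming the Anthropic Messages response shape on Bedrock (`content`, `stop_reason`, `usage`). The helper name `summarize_claude_response` and the sample payload are hypothetical, written only for illustration.

```python
def summarize_claude_response(payload: dict) -> dict:
    """Pull the text, truncation flag, and token counts out of a parsed Claude response."""
    # Concatenate every text block; non-text blocks are skipped.
    text = "".join(
        block.get("text", "")
        for block in payload.get("content", [])
        if block.get("type") == "text"
    )
    usage = payload.get("usage", {})
    return {
        "text": text,
        # A stop_reason of "max_tokens" means the answer hit the max_tokens limit.
        "truncated": payload.get("stop_reason") == "max_tokens",
        "input_tokens": usage.get("input_tokens", 0),
        "output_tokens": usage.get("output_tokens", 0),
    }


# Hand-written sample payload (an assumption for illustration, not real model output).
sample = {
    "content": [{"type": "text", "text": "Qubits can be in superposition."}],
    "stop_reason": "end_turn",
    "usage": {"input_tokens": 10, "output_tokens": 8},
}
print(summarize_claude_response(sample))
```

Checking the truncation flag is one way to decide whether to retry with a larger `max_tokens` value.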

Meta Llama

Open-weight models available in several sizes, so you can trade capability for speed and cost.

Llama 2 & Llama 3:

Llama 2 70B — Powerful, good for complex tasks
Llama 3 70B — Improved reasoning and instruction-following
Llama 3 8B — Lightweight, fast inference

# Using Llama 3 70B
response = client.invoke_model(
    modelId='meta.llama3-70b-instruct-v1:0',
    body=json.dumps({
        "prompt": "What is machine learning?",
        "max_gen_len": 512,
        "temperature": 0.7,
        "top_p": 0.9
    })
)

result = json.loads(response['body'].read())
print(result['generation'])

Mistral AI

Fast, efficient models with strong performance on reasoning tasks.

Mistral Models:

Mistral Large — Most capable, good for complex tasks
Mistral 7B — Lightweight, fast
Mixtral 8x7B — Mixture of experts, efficient

# Using Mistral Large
response = client.invoke_model(
    modelId='mistral.mistral-large-2402-v1:0',
    body=json.dumps({
        # Mistral instruct models expect the [INST] ... [/INST] prompt template
        "prompt": "<s>[INST] Explain RAG systems [/INST]",
        "max_tokens": 512,
        "temperature": 0.7
    })
)

result = json.loads(response['body'].read())
print(result['outputs'][0]['text'])

Amazon Titan

Amazon's own foundation models, covering text generation and embeddings.

Titan Models:

Titan Text Premier — High-quality text generation
Titan Text Express — Fast, cost-effective
Titan Embeddings — Generate vector embeddings for RAG

# Using Titan Text Express
response = client.invoke_model(
    modelId='amazon.titan-text-express-v1',
    body=json.dumps({
        "inputText": "Summarize AWS Bedrock",
        "textGenerationConfig": {
            "maxTokenCount": 512,
            "temperature": 0.7,
            "topP": 0.9
        }
    })
)

result = json.loads(response['body'].read())
print(result['results'][0]['outputText'])
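Each text provider above uses a different request body and response shape. The following is a sketch of one way to hide that behind two helpers, using only the fields shown in the examples above. The function names `build_body` and `extract_text` are hypothetical, and routing on the model ID prefix is an assumption that holds for the text models in this module only (Titan Embeddings, for example, needs a different body).

```python
import json


def _provider(model_id: str) -> str:
    """Return the provider prefix of a Bedrock model ID, e.g. 'anthropic'."""
    return model_id.split(".", 1)[0]


def build_body(model_id: str, prompt: str, max_tokens: int = 512) -> str:
    """Return the JSON request body that the given text model family expects."""
    provider = _provider(model_id)
    if provider == "anthropic":
        body = {
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": max_tokens,
            "messages": [{"role": "user", "content": prompt}],
        }
    elif provider == "meta":
        body = {"prompt": prompt, "max_gen_len": max_tokens}
    elif provider == "mistral":
        # Assumption: the model is an instruct variant using the [INST] template.
        body = {"prompt": f"<s>[INST] {prompt} [/INST]", "max_tokens": max_tokens}
    elif provider == "amazon":
        # Titan text models only.
        body = {
            "inputText": prompt,
            "textGenerationConfig": {"maxTokenCount": max_tokens},
        }
    else:
        raise ValueError(f"Unsupported provider: {provider}")
    return json.dumps(body)


def extract_text(model_id: str, payload: dict) -> str:
    """Return the generated text from a parsed response body."""
    provider = _provider(model_id)
    if provider == "anthropic":
        return payload["content"][0]["text"]
    if provider == "meta":
        return payload["generation"]
    if provider == "mistral":
        return payload["outputs"][0]["text"]
    if provider == "amazon":
        return payload["results"][0]["outputText"]
    raise ValueError(f"Unsupported provider: {provider}")
```

With the `client` created earlier, a call would look like `client.invoke_model(modelId=model_id, body=build_body(model_id, prompt))`, followed by `extract_text(model_id, json.loads(response['body'].read()))`.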

Stability AI

Image generation and manipulation models.

Stable Diffusion:

Stable Diffusion 3 — Latest, highest quality
Stable Diffusion XL — Fast, good quality
Stable Diffusion 1.6 — Lightweight

import base64

# Generate an image
response = client.invoke_model(
    modelId='stability.stable-diffusion-xl-v1',
    body=json.dumps({
        "text_prompts": [
            {"text": "A futuristic AI assistant", "weight": 1.0}
        ],
        "cfg_scale": 10,
        "steps": 50,
        "seed": 0
    })
)

result = json.loads(response['body'].read())
image_data = base64.b64decode(result['artifacts'][0]['base64'])
with open('generated_image.png', 'wb') as f:
    f.write(image_data)
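The example above writes the first artifact without checking whether generation succeeded. The following is a minimal sketch of a more defensive decode step. It assumes each artifact carries a `finishReason` field with values such as `SUCCESS`, `ERROR`, or `CONTENT_FILTERED`; the helper name `decode_artifacts` and the sample data are hypothetical.

```python
import base64


def decode_artifacts(result: dict) -> list[bytes]:
    """Return decoded image bytes for every artifact that finished successfully."""
    images = []
    for artifact in result.get("artifacts", []):
        # Assumption: a missing finishReason is treated as success.
        if artifact.get("finishReason", "SUCCESS") != "SUCCESS":
            continue
        images.append(base64.b64decode(artifact["base64"]))
    return images


# Hand-built sample response (placeholder bytes, not a real image).
sample = {
    "artifacts": [
        {"base64": base64.b64encode(b"fake-image").decode(), "finishReason": "SUCCESS"},
        {"base64": base64.b64encode(b"blurred").decode(), "finishReason": "CONTENT_FILTERED"},
    ]
}
print(len(decode_artifacts(sample)))  # prints 1
```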

Model Comparison Matrix

Model	Speed	Cost	Reasoning	Code	Creativity
Claude 3 Opus	Slow	High	⭐⭐⭐⭐⭐	⭐⭐⭐⭐	⭐⭐⭐⭐
Claude 3 Sonnet	Medium	Medium	⭐⭐⭐⭐	⭐⭐⭐⭐	⭐⭐⭐⭐
Claude 3 Haiku	Fast	Low	⭐⭐⭐	⭐⭐⭐	⭐⭐⭐
Llama 3 70B	Medium	Low	⭐⭐⭐⭐	⭐⭐⭐⭐	⭐⭐⭐
Mistral Large	Medium	Low	⭐⭐⭐⭐	⭐⭐⭐⭐	⭐⭐⭐
Titan Text Express	Fast	Low	⭐⭐⭐	⭐⭐⭐	⭐⭐⭐
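The matrix above can also be treated as data. The following sketch encodes the same rows and filters them by minimum reasoning rating and maximum cost tier. The names `ModelRating`, `COST_RANK`, and `shortlist` are hypothetical, and the star ratings are copied from the table rather than measured.

```python
from typing import NamedTuple


class ModelRating(NamedTuple):
    name: str
    speed: str
    cost: str
    reasoning: int   # stars, 1 to 5
    code: int
    creativity: int


MATRIX = [
    ModelRating("Claude 3 Opus", "Slow", "High", 5, 4, 4),
    ModelRating("Claude 3 Sonnet", "Medium", "Medium", 4, 4, 4),
    ModelRating("Claude 3 Haiku", "Fast", "Low", 3, 3, 3),
    ModelRating("Llama 3 70B", "Medium", "Low", 4, 4, 3),
    ModelRating("Mistral Large", "Medium", "Low", 4, 4, 3),
    ModelRating("Titan Text Express", "Fast", "Low", 3, 3, 3),
]

# Order the cost tiers so they can be compared numerically.
COST_RANK = {"Low": 0, "Medium": 1, "High": 2}


def shortlist(min_reasoning: int, max_cost: str) -> list[str]:
    """Return models with at least min_reasoning stars and cost no higher than max_cost."""
    return [
        m.name
        for m in MATRIX
        if m.reasoning >= min_reasoning and COST_RANK[m.cost] <= COST_RANK[max_cost]
    ]


print(shortlist(min_reasoning=4, max_cost="Low"))
# prints ['Llama 3 70B', 'Mistral Large']
```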

How to Choose a Model

For customer support chatbots:

Use Claude 3 Sonnet or Llama 3 70B
Fast response time, good instruction-following
Cost-effective at scale

For complex reasoning & analysis:

Use Claude 3 Opus or Mistral Large
Best for multi-step problem solving
Accept slower response times

For code generation:

Use Claude 3 Sonnet or Llama 3 70B
Both excel at code tasks
Sonnet slightly better for complex logic

For high-volume, cost-sensitive tasks:

Use Llama 3 8B or Mistral 7B
Fastest inference, lowest cost
Good for simple classification, summarization

For image generation:

Use Stable Diffusion 3 for quality
Use Stable Diffusion XL for speed/cost balance
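The guidance above can be captured as a small lookup table so that application code asks for a use case rather than hard-coding a model. This is a sketch only: the use-case keys and the `recommend` function are hypothetical labels, while the model names come from the recommendations above.

```python
# Ordered by preference, as listed in the guidance above.
RECOMMENDATIONS = {
    "support_chatbot": ["Claude 3 Sonnet", "Llama 3 70B"],
    "complex_reasoning": ["Claude 3 Opus", "Mistral Large"],
    "code_generation": ["Claude 3 Sonnet", "Llama 3 70B"],
    "high_volume": ["Llama 3 8B", "Mistral 7B"],
    "image_generation": ["Stable Diffusion 3", "Stable Diffusion XL"],
}


def recommend(use_case: str) -> list[str]:
    """Return the recommended models for a use case, most preferred first."""
    try:
        return RECOMMENDATIONS[use_case]
    except KeyError:
        known = ", ".join(sorted(RECOMMENDATIONS))
        raise ValueError(f"Unknown use case {use_case!r}; expected one of: {known}") from None


print(recommend("high_volume"))
# prints ['Llama 3 8B', 'Mistral 7B']
```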

Listing Available Models

# List all available models
aws bedrock list-foundation-models --region us-east-1

# Filter by provider
aws bedrock list-foundation-models \
  --by-provider anthropic \
  --region us-east-1

# Filter by customization type (models that support fine-tuning)
aws bedrock list-foundation-models \
  --by-customization-type FINE_TUNING \
  --region us-east-1
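The same listing is available from Python. The following is a minimal sketch, assuming the control-plane `bedrock` client (not `bedrock-runtime`) and its `list_foundation_models` call. The helper name `list_model_ids` is hypothetical, and the client is passed in so the function can be exercised without AWS credentials.

```python
def list_model_ids(bedrock_client, provider=None):
    """Return the model IDs visible to the account, optionally filtered by provider."""
    # byProvider mirrors the CLI's --by-provider flag.
    kwargs = {"byProvider": provider} if provider else {}
    response = bedrock_client.list_foundation_models(**kwargs)
    return sorted(summary["modelId"] for summary in response["modelSummaries"])


# Usage (requires AWS credentials and Bedrock access, so it is left commented out):
# import boto3
# bedrock = boto3.client("bedrock", region_name="us-east-1")
# print(list_model_ids(bedrock, provider="anthropic"))
```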

Model IDs Reference

{
  "claude": {
    "opus": "anthropic.claude-3-opus-20240229-v1:0",
    "sonnet": "anthropic.claude-3-sonnet-20240229-v1:0",
    "haiku": "anthropic.claude-3-haiku-20240307-v1:0"
  },
  "llama": {
    "llama3_70b": "meta.llama3-70b-instruct-v1:0",
    "llama3_8b": "meta.llama3-8b-instruct-v1:0",
    "llama2_70b": "meta.llama2-70b-chat-v1"
  },
  "mistral": {
    "large": "mistral.mistral-large-2402-v1:0",
    "7b": "mistral.mistral-7b-instruct-v0:2"
  },
  "titan": {
    "text_premier": "amazon.titan-text-premier-v1:0",
    "text_express": "amazon.titan-text-express-v1",
    "embeddings": "amazon.titan-embed-text-v2:0"
  }
}
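A reference like this is easiest to use through a small resolver. The sketch below loads a subset of the JSON above and resolves a dotted alias such as `claude.haiku` to a model ID. The alias format and the `resolve` function are assumptions for illustration, not a Bedrock feature.

```python
import json

# A subset of the reference above; in practice this could be loaded from a file.
MODEL_IDS = json.loads("""
{
  "claude": {
    "sonnet": "anthropic.claude-3-sonnet-20240229-v1:0",
    "haiku": "anthropic.claude-3-haiku-20240307-v1:0"
  },
  "llama": {
    "llama3_8b": "meta.llama3-8b-instruct-v1:0"
  },
  "mistral": {
    "7b": "mistral.mistral-7b-instruct-v0:2"
  }
}
""")


def resolve(alias: str) -> str:
    """Map an alias of the form 'family.name' to its Bedrock model ID."""
    family, _, name = alias.partition(".")
    try:
        return MODEL_IDS[family][name]
    except KeyError:
        raise KeyError(f"Unknown model alias: {alias}") from None


print(resolve("claude.haiku"))
# prints anthropic.claude-3-haiku-20240307-v1:0
```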

❓ Which Claude 3 model offers the best balance of speed and capability?

Claude 3 Opus
Claude 3 Sonnet
Claude 3 Haiku
All are equally balanced

❓ Which models are best suited to cost-sensitive, high-volume tasks?

Claude 3 Opus
Mistral Large
Llama 3 8B or Mistral 7B
Stable Diffusion 3

❓ What is Titan Embeddings used for?

Generating vector embeddings for RAG systems
Image generation
Code generation
Fine-tuning other models

❓ Which model would you choose for complex multi-step reasoning?

Llama 3 8B
Mistral 7B
Claude 3 Haiku
Claude 3 Opus