SageMaker Studio & Notebooks
Duration: 50 min
SageMaker Studio is the integrated development environment for ML workflows. This module covers Studio setup, the JupyterLab interface, kernel management, and instance type selection for different workloads.
SageMaker Studio Overview
Studio provides a unified web interface with built-in Git integration, experiment tracking, and model registry access. It removes the need to manage separate Jupyter servers and gives direct access to SageMaker services from one place.
Creating a Studio Domain
# Create a SageMaker Studio domain
aws sagemaker create-domain \
    --domain-name ml-studio \
    --auth-mode IAM \
    --default-user-settings \
        ExecutionRole=arn:aws:iam::123456789012:role/SageMakerRole \
    --region us-east-1
Launching Studio and User Profiles
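Domain creation is asynchronous: the call returns while the domain is still in the Pending state. A minimal sketch of waiting for the domain to become usable before creating user profiles is shown here. It assumes a boto3 SageMaker client is passed in; the helper name `wait_for_domain` and its default intervals are illustrative and not part of any SDK.

```python
import time


def wait_for_domain(client, domain_id, poll_seconds=15, timeout_seconds=900):
    """Poll describe_domain until the domain reports InService.

    `client` is assumed to be boto3.client("sagemaker"). This helper is
    hypothetical and relies only on the `Status` field of DescribeDomain.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = client.describe_domain(DomainId=domain_id)["Status"]
        if status == "InService":
            return status
        # Terminal failure states: stop polling and surface the problem.
        if status in ("Failed", "Delete_Failed", "Update_Failed"):
            raise RuntimeError(f"Domain {domain_id} ended in status {status}")
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"Domain {domain_id} still {status} after {timeout_seconds}s"
            )
        time.sleep(poll_seconds)
```

In a notebook this might be called as `wait_for_domain(boto3.client('sagemaker', region_name='us-east-1'), 'd-xxxxx')`, where `d-xxxxx` stands in for the real domain ID.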
import boto3
sagemaker_client = boto3.client('sagemaker', region_name='us-east-1')
# Create user profile
response = sagemaker_client.create_user_profile(
    DomainId='d-xxxxx',
    UserProfileName='data-scientist-1',
    UserSettings={
        'ExecutionRole': 'arn:aws:iam::123456789012:role/SageMakerRole',
        'SharingSettings': {
            'NotebookOutputOption': 'Allowed',
            'S3OutputPath': 's3://my-bucket/studio-output'
        }
    }
)
print(f"User profile created: {response['UserProfileArn']}")
JupyterLab Interface
Studio uses JupyterLab as the notebook interface. The left sidebar shows the file browser, running kernels, and extensions. The main editor can hold multiple notebooks, terminals, and text editors open at the same time.
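Outside the console, one way to open this interface for a specific user profile is a presigned URL. The sketch below assumes a boto3 SageMaker client is passed in. `studio_url` is a hypothetical helper name, and the domain ID and profile name are the placeholders used elsewhere in this module.

```python
def studio_url(client, domain_id, user_profile_name, session_seconds=43200):
    """Return a short-lived URL that opens Studio for one user profile.

    `client` is assumed to be boto3.client("sagemaker"). The URL must be
    opened within ExpiresInSeconds; once opened, the session is requested
    to last `session_seconds`.
    """
    response = client.create_presigned_domain_url(
        DomainId=domain_id,
        UserProfileName=user_profile_name,
        SessionExpirationDurationInSeconds=session_seconds,
        ExpiresInSeconds=300,
    )
    return response["AuthorizedUrl"]
```

A call such as `studio_url(sagemaker_client, 'd-xxxxx', 'data-scientist-1')` would return a link that can be pasted into a browser. The execution role or caller needs permission for `sagemaker:CreatePresignedDomainUrl`.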
# Access SageMaker session from Studio notebook
import sagemaker
from sagemaker import get_execution_role
session = sagemaker.Session()
role = get_execution_role()
bucket = session.default_bucket()
# List notebook instances through the session's underlying boto3 client
notebooks = session.sagemaker_client.list_notebook_instances()
print(f"Available notebook instances: {len(notebooks['NotebookInstances'])}")
Kernel Management
Studio supports multiple kernel types, including Python 3, R, and custom kernels built from your own images. Each kernel runs in an isolated environment with its own dependencies.
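Because each kernel has its own environment, a package installed in one kernel may be missing in another. A minimal sketch for checking what the current kernel actually sees, using only the standard library, is shown here. The helper name `kernel_report` and the package list are illustrative.

```python
import sys
from importlib import metadata


def kernel_report(packages):
    """Map each package name to its installed version, or None if absent."""
    report = {}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


# The interpreter path shows which environment this kernel runs in.
print(f"Interpreter: {sys.executable}")
for name, version in kernel_report(["sagemaker", "boto3", "pandas"]).items():
    print(f"{name}: {version or 'not installed'}")
```

Running the same cell after switching kernels makes differences between environments visible.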
# Install packages into the current kernel's environment (run in a notebook cell)
%pip install sagemaker boto3 pandas scikit-learn
# Check the installed SageMaker SDK version
%pip show sagemaker
Instance Types for Studio
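Choosing among the instance types listed in this section usually means finding the cheapest one that satisfies memory and GPU needs. A minimal sketch follows, with memory, GPU count, and hourly cost copied from this section's figures (actual prices vary by region and over time). `pick_instance` is a hypothetical helper, not a SageMaker API.

```python
# Figures copied from this section; treat the prices as illustrative.
INSTANCE_TYPES = {
    "ml.t3.medium": {"memory_gb": 4, "gpu": 0, "cost_per_hour": 0.05},
    "ml.m5.large": {"memory_gb": 8, "gpu": 0, "cost_per_hour": 0.10},
    "ml.p3.2xlarge": {"memory_gb": 61, "gpu": 1, "cost_per_hour": 3.06},
}


def pick_instance(min_memory_gb, needs_gpu=False, catalog=INSTANCE_TYPES):
    """Return the cheapest instance type meeting the requirements, or None."""
    candidates = [
        (spec["cost_per_hour"], name)
        for name, spec in catalog.items()
        if spec["memory_gb"] >= min_memory_gb
        and (spec["gpu"] > 0 or not needs_gpu)
    ]
    return min(candidates)[1] if candidates else None


print(pick_instance(min_memory_gb=2))                  # ml.t3.medium
print(pick_instance(min_memory_gb=6))                  # ml.m5.large
print(pick_instance(min_memory_gb=6, needs_gpu=True))  # ml.p3.2xlarge
```

Starting on the smallest instance that fits and moving up only when a workload needs it keeps development costs low.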
{
  "instance_types": {
    "ml.t3.medium": {
      "use_case": "Development, light workloads",
      "cpu": 2,
      "memory_gb": 4,
      "cost_per_hour": 0.05
    },
    "ml.m5.large": {
      "use_case": "General purpose, data exploration",
      "cpu": 2,
      "memory_gb": 8,
      "cost_per_hour": 0.10
    },
    "ml.p3.2xlarge": {
      "use_case": "GPU-intensive, deep learning",
      "cpu": 8,
      "memory_gb": 61,
      "gpu": 1,
      "cost_per_hour": 3.06
    }
  }
}
Working with Notebooks in Studio
# Create and run a simple notebook
import pandas as pd
import numpy as np
# Load data (reading s3:// paths with pandas requires the s3fs package)
data = pd.read_csv('s3://my-bucket/data.csv')
print(f"Data shape: {data.shape}")
# Basic exploration
print(data.describe())
print(data.isnull().sum())
# Save processed data
data.to_csv('s3://my-bucket/processed_data.csv', index=False)
Git Integration
# Clone a repository in Studio terminal
git clone https://github.com/aws/amazon-sagemaker-examples.git
# Commit changes
git add .
git commit -m "Update notebook"
git push origin main
Quiz 1
❓ What is SageMaker Studio?
Quiz 2
❓ Which notebook interface does Studio use?
Quiz 3
❓ Which instance type is best for GPU-intensive deep learning?
Quiz 4
❓ How do you install packages in a Studio kernel?
Quiz 5
❓ What is the primary benefit of Studio's Git integration?