Module 1 of 9 · Real Datasets & Pre-trained Models · Beginner

Where to Find Real Datasets

Duration: 5 min

Before you can train a model, you need data. The good news: there are millions of free, real-world datasets available online. This module covers the main sources and how to access them.

The main dataset sources

Downloading from Kaggle

# Install the Kaggle CLI
pip install kaggle

# Create an API token at kaggle.com > Account > API > Create New Token
mkdir -p ~/.kaggle
# Move the downloaded kaggle.json into ~/.kaggle/, then restrict its permissions
chmod 600 ~/.kaggle/kaggle.json

# Download a dataset
kaggle datasets download -d camnugent/california-housing-prices
unzip california-housing-prices.zip
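
Once the archive is downloaded, it helps to look at the column names and a few rows before doing anything else. The following is a minimal sketch using only the Python standard library. It assumes the archive contains at least one CSV file encoded as UTF-8; the helper name peek_csv_in_zip is hypothetical, and the file name inside the archive is not assumed, so the first CSV member is used.

```python
import csv
import io
import itertools
import os
import zipfile


def peek_csv_in_zip(zip_path, n_rows=5):
    """Return (member name, header, first n_rows rows) of the first CSV in a zip."""
    with zipfile.ZipFile(zip_path) as archive:
        # Collect every member whose name ends in .csv
        csv_names = [n for n in archive.namelist() if n.lower().endswith(".csv")]
        if not csv_names:
            raise ValueError(f"no CSV file found in {zip_path}")
        member = csv_names[0]
        with archive.open(member) as raw:
            # Wrap the binary stream so csv.reader receives text
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            reader = csv.reader(text)
            header = next(reader, [])
            rows = list(itertools.islice(reader, n_rows))
    return member, header, rows


if __name__ == "__main__":
    # Archive name taken from the download command above
    path = "california-housing-prices.zip"
    if os.path.exists(path):
        member, header, rows = peek_csv_in_zip(path)
        print("file:", member)
        print("columns:", header)
        for row in rows:
            print(row)
```

Reading straight from the zip avoids leaving extracted files around while you are only exploring. For real analysis you would more likely load the extracted CSV with a dataframe library.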

Classic datasets worth knowing

💡 Tip: Start with a dataset that has a clear target variable (what you're predicting) and under 100,000 rows. You can iterate fast and see results quickly.
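
The two criteria in the tip can be checked mechanically before you commit to a dataset. The sketch below is one possible way to do it with the standard library. The function name check_starter_dataset is hypothetical, and the file and column names in the usage section are assumptions about the dataset's contents, not something this lesson states.

```python
import csv
import os

MAX_ROWS = 100_000  # threshold taken from the tip above


def check_starter_dataset(csv_path, target_column, max_rows=MAX_ROWS):
    """Report whether a CSV has the target column and fewer than max_rows data rows."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader, [])
        # Count non-empty data rows; the header is already consumed
        n_rows = sum(1 for row in reader if row)
    return {
        "has_target": target_column in header,
        "n_rows": n_rows,
        "small_enough": n_rows < max_rows,
    }


if __name__ == "__main__":
    # Assumed names: adjust to the CSV and target column you actually have
    path = "housing.csv"
    if os.path.exists(path):
        print(check_starter_dataset(path, target_column="median_house_value"))
```

Counting rows through csv.reader rather than counting lines keeps the result correct when a quoted field contains a line break.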

❓ Which platform hosts the largest community of ML datasets and competitions?
