Introduction to Supervised Learning

Duration: 5 min

This module provides an introduction to supervised learning, a fundamental area of machine learning where algorithms learn from labeled data to make predictions. We'll cover key algorithms including Linear Regression, Logistic Regression, Decision Trees, Random Forests, Support Vector Machines (SVM), and Gradient Boosting. Understanding these algorithms is crucial for building predictive models in various applications.

Supervised Learning Pipeline

Linear Regression

Linear Regression is a statistical method for modeling the relationship between a dependent variable and one or more independent variables. It assumes a linear relationship between the input variables (x) and the single output variable (y); that is, y can be calculated as a linear combination of the input variables (x) plus an intercept. When there is a single input variable, the technique is known as Simple Linear Regression.

import numpy as np
from sklearn.linear_model import LinearRegression

# Sample data
x = np.array([[1], [2], [3], [4], [5]])
y = np.array([1, 3, 2, 5, 4])

# Create and train the model
model = LinearRegression()
model.fit(x, y)

# Make a prediction
print(model.predict([[6]])[0])

Output (approximately):

5.4
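For a single input variable, the least-squares line can also be computed by hand, which is a useful check on what the model is doing. The following is a minimal sketch using the same sample data; the variable names (slope, intercept, prediction) are illustrative and not part of the original example.

```python
import numpy as np

# Same sample data as in the scikit-learn example
x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([1, 3, 2, 5, 4], dtype=float)

# Closed-form least-squares estimates for simple linear regression:
# slope = covariance(x, y) / variance(x), intercept = mean(y) - slope * mean(x)
x_mean, y_mean = x.mean(), y.mean()
slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
intercept = y_mean - slope * x_mean

# Prediction for x = 6 using y = intercept + slope * x
prediction = intercept + slope * 6
print(slope, intercept, prediction)
```

With this data the slope is about 0.8 and the intercept about 0.6, so the prediction for x = 6 is about 5.4.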

Logistic Regression

Logistic Regression is a popular algorithm used for binary classification problems. Despite its name, it is a classification technique, not a regression technique. It models the probability that a given input point belongs to a certain class. The algorithm passes a linear combination of the input features through the logistic (sigmoid) function to produce a probability between 0 and 1, which is then compared with a threshold (commonly 0.5) to assign a class.

from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification

# Generate a binary classification dataset
X, y = make_classification(n_samples=100, n_features=2, n_informative=2, n_redundant=0, random_state=1)

# Create and train the model
model = LogisticRegression(max_iter=1000)
model.fit(X, y)

# Make a prediction
print(model.predict([[0, 0]])[0])
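To see the probability behind the predicted class, the sketch below applies the logistic function to the model's linear score and compares the result with scikit-learn's own probability estimate. It rebuilds the same dataset and model so that it runs on its own; the names point, z, p_manual, and p_sklearn are illustrative, and the 0.5 threshold is the usual default rather than something the module specifies.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Same dataset and model as in the example above
X, y = make_classification(n_samples=100, n_features=2, n_informative=2,
                           n_redundant=0, random_state=1)
model = LogisticRegression(max_iter=1000)
model.fit(X, y)

point = np.array([[0.0, 0.0]])

# Linear score w . x + b for the input point
z = model.decision_function(point)

# Logistic (sigmoid) function turns the score into a probability for class 1
p_manual = 1.0 / (1.0 + np.exp(-z))

# scikit-learn's probability estimate for class 1
p_sklearn = model.predict_proba(point)[:, 1]

# Thresholding at 0.5 gives the predicted class (assumed default threshold)
label = (p_manual >= 0.5).astype(int)
print(p_manual[0], p_sklearn[0], label[0])
```

The two probabilities should agree up to floating-point error, and the thresholded label should match model.predict for the same point.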

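💡 Tip: When using Logistic Regression, scale your features (for example, standardize them to zero mean and unit variance). scikit-learn's LogisticRegression applies L2 regularization by default, so features on very different scales are penalized unevenly, and the solver may also converge more slowly.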
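One common way to apply scaling is to chain a scaler and the classifier in a pipeline, so that the same transformation is used at training and prediction time. This is a minimal sketch, not the module's own code: the choice of StandardScaler and the artificial rescaling of the first feature (X_unscaled) are assumptions made purely for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=100, n_features=2, n_informative=2,
                           n_redundant=0, random_state=1)

# Hypothetical: exaggerate the scale of the first feature to mimic
# features measured in very different units
X_unscaled = X.copy()
X_unscaled[:, 0] *= 1000

# The scaler is fitted on the training data, then the classifier is
# trained on the standardized features
pipeline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
pipeline.fit(X_unscaled, y)

# New inputs go through the same scaling before prediction
predictions = pipeline.predict(X_unscaled[:5])
print(predictions)
```

Wrapping both steps in one object also avoids fitting the scaler on data that is later used for evaluation.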

❓ What type of problems is Linear Regression best suited for?

Classification problems
Regression problems
Clustering problems
Dimensionality reduction problems

❓ What is the primary use of Logistic Regression?

Predicting continuous values
Classifying data into categories
Reducing the number of features
Grouping similar data points

Key Concepts

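Concept	Description
Labels	The known target values paired with each training example
Training	Fitting a model's parameters to the labeled examples
Validation	Measuring the fitted model's performance on held-out labeled data
Prediction	Applying the fitted model to new inputs to estimate their outputs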
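The four concepts in the table can be seen together in one short workflow. The sketch below is illustrative only: the 75/25 split, the random_state values, and the variable names are assumptions, and Logistic Regression is reused simply because it is the classifier introduced earlier in this module.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Labels: y holds the known class (0 or 1) for every row of X
X, y = make_classification(n_samples=100, n_features=2, n_informative=2,
                           n_redundant=0, random_state=1)

# Hold out 25% of the labeled data for validation (assumed split ratio)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Training: fit the model on the training portion only
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Validation: accuracy on data the model did not see during training
val_accuracy = model.score(X_val, y_val)

# Prediction: estimate the class of a new, unlabeled input
predicted_label = model.predict([[0.0, 0.0]])[0]
print(val_accuracy, predicted_label)
```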

Check Your Understanding

❓ What is the main purpose of supervised learning?

To classify data
To predict values
To understand patterns
To reduce dimensions

❓ Which learning paradigm does this module cover?

Supervised
Unsupervised
Semi-supervised
Reinforcement
