Introduction to Unsupervised Learning
Duration: 5 min
This module provides an introduction to unsupervised learning, a type of machine learning that identifies patterns in data without labeled responses. We will cover key algorithms such as K-Means, DBSCAN, Hierarchical Clustering, PCA, t-SNE, and Autoencoders. Understanding these techniques is crucial for data analysis and preprocessing, as they help uncover hidden structures and reduce dimensionality.
K-Means Clustering
K-Means is a popular clustering algorithm that partitions data into K clusters by minimizing the within-cluster variance, that is, the sum of squared distances between each point and the centroid of its cluster. It starts from K initial centroids (chosen at random in the basic algorithm), assigns each data point to the nearest centroid, and then recomputes each centroid as the mean of its assigned points. These two steps repeat until the centroids stop moving.
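The assign-and-update loop described above can be sketched in plain Python. This is a minimal illustration rather than scikit-learn's implementation: the function name kmeans, its parameters, and the optional initial_centroids argument are assumptions introduced here, and the starting centroids are passed explicitly only so that the run is reproducible.

```python
import math
import random


def kmeans(points, k, initial_centroids=None, max_iter=100, seed=0):
    """Hypothetical helper: assign to nearest centroid, then recompute means."""
    if initial_centroids is None:
        # Random initialization: pick k distinct data points as starting centroids.
        initial_centroids = random.Random(seed).sample(points, k)
    centroids = [tuple(map(float, c)) for c in initial_centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for each point.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its assigned points.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                mean = tuple(sum(coord) / len(members) for coord in zip(*members))
                new_centroids.append(mean)
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break  # centroids have stabilized
        centroids = new_centroids
    return labels, centroids


points = [(1, 2), (1, 4), (1, 0), (10, 2), (10, 4), (10, 0)]
labels, centroids = kmeans(points, k=2, initial_centroids=[(1, 2), (10, 4)])
print(labels)
print(centroids)
```

The result depends on where the centroids start: a different initialization can converge to a different, sometimes worse, partition, which is one reason library implementations usually offer smarter or repeated initialization.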
import numpy as np
from sklearn.cluster import KMeans
# Generate sample data
X = np.array([[1, 2], [1, 4], [1, 0],
              [10, 2], [10, 4], [10, 0]])
# Apply K-Means clustering
kmeans = KMeans(n_clusters=2, random_state=0).fit(X)
# Print cluster labels
print(kmeans.labels_)
Output: [1 1 1 0 0 0]
DBSCAN Clustering
DBSCAN (Density-Based Spatial Clustering of Applications with Noise) groups together points that are closely packed, marking points that lie alone in low-density regions as outliers. It requires two parameters: eps (the maximum distance between two points to be considered in the same neighborhood) and min_samples (the minimum number of points in a neighborhood for a point to be considered a core point).
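How eps and min_samples interact can be sketched in plain Python. The example below is only an illustration under stated assumptions: the function classify_points and the five sample points are hypothetical, a neighborhood is taken to include the point itself and every point at distance less than or equal to eps, and the "border" role (a non-core point that lies within eps of a core point) is standard DBSCAN terminology that the description above does not name.

```python
import math


def classify_points(points, eps, min_samples):
    """Hypothetical helper: label each point 'core', 'border', or 'noise'."""
    # Neighborhood of a point: indices of all points within eps, itself included.
    neighborhoods = [
        [j for j, q in enumerate(points) if math.dist(p, q) <= eps]
        for p in points
    ]
    # A core point has at least min_samples points in its neighborhood.
    is_core = [len(n) >= min_samples for n in neighborhoods]
    roles = []
    for i, neighbors in enumerate(neighborhoods):
        if is_core[i]:
            roles.append("core")
        elif any(is_core[j] for j in neighbors):
            # Not dense itself, but within eps of a core point.
            roles.append("border")
        else:
            # Neither core nor near a core point: treated as an outlier.
            roles.append("noise")
    return roles


points = [(0, 0), (1, 0), (2, 0), (4, 0), (10, 10)]  # made-up sample data
roles = classify_points(points, eps=2, min_samples=3)
print(roles)
```

Here the first three points are core points, (4, 0) is a border point because it sits within eps of the core point (2, 0), and (10, 10) is noise.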
import numpy as np
from sklearn.cluster import DBSCAN
# Generate sample data
X = np.array([[1, 2], [2, 2], [2, 3],
              [8, 7], [8, 8], [25, 80]])
# Apply DBSCAN clustering
dbscan = DBSCAN(eps=3, min_samples=2).fit(X)
# Print cluster labels
print(dbscan.labels_)
💡 Tip: When using DBSCAN, carefully choose the eps and min_samples parameters, as they significantly affect the resulting clusters.
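To make the tip concrete, the sketch below reruns DBSCAN on the same six points with three eps values while holding min_samples at 2. The particular values 0.5, 3, and 10 are arbitrary choices for illustration, not recommendations, and the results dictionary is just a convenience introduced here.

```python
import numpy as np
from sklearn.cluster import DBSCAN

X = np.array([[1, 2], [2, 2], [2, 3],
              [8, 7], [8, 8], [25, 80]])

results = {}
for eps in (0.5, 3, 10):
    # Fit with the same min_samples each time so that only eps changes.
    labels = DBSCAN(eps=eps, min_samples=2).fit(X).labels_.tolist()
    results[eps] = labels
    n_clusters = len(set(labels) - {-1})  # label -1 marks noise, not a cluster
    n_noise = labels.count(-1)
    print(f"eps={eps}: labels={labels}, clusters={n_clusters}, noise={n_noise}")
```

With the smallest eps every point is marked as noise, with eps=3 the two tight groups form separate clusters, and with eps=10 those groups merge into one cluster while the distant point [25, 80] remains noise.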
❓ What is the primary goal of K-Means clustering?
❓ Which parameter in DBSCAN determines the maximum distance between two points to be considered in the same neighborhood?