Linear Regression Basics
Duration: 5 min
This module introduces the fundamentals of Linear Regression, a foundational algorithm in supervised machine learning. We will cover the theory behind Linear Regression, how to implement it in Python, and how to evaluate the fitted model. Understanding Linear Regression is crucial because it serves as the basis for more complex algorithms and is widely used for predictive analysis.
Understanding Linear Regression
Linear Regression is a statistical method for modeling the relationship between a dependent variable and one or more independent variables. The goal is to find the line of best fit that minimizes the sum of squared residuals (differences between observed and predicted values). This line can then be used to make predictions on new data.
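To make the idea of minimizing squared residuals concrete, here is a minimal sketch of the closed-form least-squares solution for a single independent variable. It uses the same five data points as this module's scikit-learn example. The names `slope`, `intercept`, and `predict` are illustrative choices, not part of any library.

```python
import numpy as np

# The five sample points used in this module
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 3.0, 5.0, 7.0, 11.0])

# Closed-form least-squares estimates for one feature:
#   slope     = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x)^2)
#   intercept = mean_y - slope * mean_x
x_mean, y_mean = x.mean(), y.mean()
slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
intercept = y_mean - slope * x_mean


def predict(x_new):
    """Evaluate the fitted line at x_new (illustrative helper)."""
    return slope * x_new + intercept


print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"prediction at x = 6: {predict(6.0):.2f}")
```

For these points the fitted line is y = 2.2x - 1, so the prediction at x = 6 is 12.2. A library model fitted by ordinary least squares on the same data should arrive at the same line.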
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
# Sample data
x = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)
y = np.array([2, 3, 5, 7, 11])
# Create and fit the model
model = LinearRegression()
model.fit(x, y)
# Make predictions
x_pred = np.array([6]).reshape(-1, 1)
y_pred = model.predict(x_pred)
# Plotting
plt.scatter(x, y, color='blue')
plt.plot(x, model.predict(x), color='red')
plt.scatter(x_pred, y_pred, color='green')
plt.show()
A scatter plot with a red line representing the line of best fit and a green point showing the prediction for x=6.
Evaluating the Model
After fitting a Linear Regression model, it's important to evaluate its performance. Common metrics include the coefficient of determination (R^2 score) and Mean Squared Error (MSE). The R^2 score indicates how well the model explains the variance in the target variable, while MSE measures the average squared difference between actual and predicted values.
from sklearn.metrics import r2_score, mean_squared_error
# Actual and predicted values (reuses np, x, and model from the previous example)
y_true = np.array([2, 3, 5, 7, 11])
y_pred = model.predict(x)
# Calculate metrics
r2 = r2_score(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)
print(f'R^2 Score: {r2}')
print(f'Mean Squared Error: {mse}')
💡 Tip: Always check the assumptions of Linear Regression, such as linearity, independence, homoscedasticity, and normality of residuals, to ensure the model's validity.
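The two metrics described in this section can also be computed directly from the residuals, which shows what each number means. The sketch below is a hand-rolled illustration rather than a replacement for the library functions. The helper name `r2_and_mse` is hypothetical, and the predictions are taken from the line y = 2.2x - 1, which is the least-squares line for this module's five sample points.

```python
import numpy as np


def r2_and_mse(y_true, y_pred):
    """Return (R^2, MSE) computed from the residuals (illustrative helper)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred                      # observed minus predicted
    ss_res = np.sum(residuals ** 2)                  # sum of squared residuals
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total variation around the mean
    mse = ss_res / y_true.size                       # average squared residual
    r2 = 1.0 - ss_res / ss_tot                       # share of variance explained
    return r2, mse


y_true = [2, 3, 5, 7, 11]
# Predictions of the line y = 2.2x - 1 at x = 1..5
y_pred = [1.2, 3.4, 5.6, 7.8, 10.0]

r2, mse = r2_and_mse(y_true, y_pred)
print(f"R^2 Score: {r2:.4f}")
print(f"Mean Squared Error: {mse:.2f}")
```

Here the squared residuals sum to 2.8, giving an MSE of 0.56 and an R^2 of about 0.945. An R^2 close to 1 means the line accounts for most of the variation in y, although it does not by itself confirm that the assumptions listed in the tip hold.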
❓ What is the primary goal of Linear Regression?
❓ Which metric is used to evaluate how well the Linear Regression model explains the variance in the target variable?
Key Concepts
| Concept | Description |
|---|---|
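| Slope & Intercept | The two parameters of the fitted line: the slope is the change in the predicted value per unit change in the independent variable, and the intercept is the predicted value when the independent variable is 0 |
| Least Squares | The fitting criterion: choose the line that minimizes the sum of squared residuals |
| R² Score | The coefficient of determination, which indicates how well the model explains the variance in the target variable |
| Residuals | The differences between observed and predicted values |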
Check Your Understanding
❓ What is the main purpose of Linear Regression?
❓ Which of these is a key characteristic of Linear Regression?