Bias and Variance in Machine Learning
Duration: 5 min
Bias and variance are fundamental concepts in machine learning that describe two distinct sources of prediction error. Bias is the error introduced by approximating a complex real-world problem with a simplified model. Variance is the error introduced by a model's sensitivity to small fluctuations in the training set.
Understanding Bias
Bias is the error introduced by approximating a complex real-world problem with a simplified model. A high-bias model makes strong assumptions about the shape of the target function, for example that the relationship between inputs and outputs is linear. When those assumptions do not hold, the model underfits: it is too simple to capture the underlying patterns in the data, so it performs poorly on both the training data and unseen data.
from sklearn.linear_model import LinearRegression
# Create a simple linear regression model
# (X_train, y_train, and X_test are assumed to be defined already)
model = LinearRegression()
model.fit(X_train, y_train)
# Predict on the test set
y_pred = model.predict(X_test)
array([1.2, 2.3, 3.4,...])
Understanding Variance
Variance is the error introduced by a model's sensitivity to small fluctuations in the training set. A high-variance model fits the random noise in the training data rather than the underlying relationship. This leads to overfitting, where the model performs well on the training data but poorly on unseen data.
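The contrast between underfitting and overfitting can be sketched by comparing training and test error for a rigid model and a very flexible one. The example below is a minimal illustration, not a benchmark: the synthetic sine-plus-noise data, the `make_data` helper, and the choice of an unpruned decision tree as the high-variance model are all assumptions made for the sake of the demonstration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.tree import DecisionTreeRegressor

# Hypothetical synthetic data: a sine curve plus Gaussian noise
rng = np.random.default_rng(0)

def make_data(n_samples):
    """Draw n_samples noisy points from y = sin(2*pi*x) on [0, 1]."""
    X = rng.uniform(0, 1, size=(n_samples, 1))
    y = np.sin(2 * np.pi * X[:, 0]) + rng.normal(0, 0.2, size=n_samples)
    return X, y

X_train, y_train = make_data(100)
X_test, y_test = make_data(1000)

models = {
    "linear regression (high bias)": LinearRegression(),
    "unpruned decision tree (high variance)": DecisionTreeRegressor(random_state=0),
}

errors = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # Compare error on the data the model saw with error on fresh data
    train_mse = mean_squared_error(y_train, model.predict(X_train))
    test_mse = mean_squared_error(y_test, model.predict(X_test))
    errors[name] = (train_mse, test_mse)
    print(f"{name}: train MSE = {train_mse:.3f}, test MSE = {test_mse:.3f}")
```

On data like this, the linear model typically shows a high error on both sets, which is the signature of underfitting. The unpruned tree drives its training error to roughly zero by memorizing the noise, yet its test error is clearly higher, which is the signature of overfitting.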
💡 Tip: To reduce variance, consider regularization or ensemble methods, and use cross-validation to check how well the model generalizes to unseen data.
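As a rough sketch of how two of these techniques fit together, the example below uses 5-fold cross-validation to estimate the error on unseen data for a single decision tree and for a random forest, which averages many trees. The synthetic data, the noise level, and the forest size of 100 trees are assumptions chosen only for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold, cross_val_score
from sklearn.tree import DecisionTreeRegressor

# Hypothetical synthetic data: a sine curve plus Gaussian noise
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(200, 1))
y = np.sin(2 * np.pi * X[:, 0]) + rng.normal(0, 0.3, size=200)

# Shuffled 5-fold split, reused for both models so the comparison is paired
cv = KFold(n_splits=5, shuffle=True, random_state=0)

models = {
    "single decision tree": DecisionTreeRegressor(random_state=0),
    "random forest (100 trees)": RandomForestRegressor(n_estimators=100, random_state=0),
}

cv_mse = {}
for name, model in models.items():
    # scikit-learn reports negated MSE so that higher scores are always better
    scores = cross_val_score(model, X, y, cv=cv, scoring="neg_mean_squared_error")
    cv_mse[name] = -scores.mean()
    print(f"{name}: cross-validated MSE = {cv_mse[name]:.3f}")
```

Averaging the predictions of many trees tends to cancel out the noise that each individual tree has memorized, so the forest usually reaches a lower cross-validated error than the single tree on noisy data like this.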
❓ What is the primary cause of high bias in a model?