Voting Ensembles: Advanced Strategies

Duration: 7 min

This module delves into advanced strategies for implementing voting ensembles in machine learning. We'll explore how to combine multiple models to improve predictive performance, discuss different voting methods, and cover techniques to handle model correlations and diversity. Understanding these strategies is crucial for building robust and accurate ensemble models.

Weighted Voting

Weighted voting is an advanced ensemble technique where each model's prediction is assigned a weight based on its performance. Models that perform better are given higher weights, allowing them to influence the final prediction more. This method can significantly enhance the ensemble's accuracy by leveraging the strengths of individual models.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Load dataset
iris = load_iris()
X, y = iris.data, iris.target

# Split dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Create base classifiers
clf1 = RandomForestClassifier(n_estimators=50, random_state=42)
clf2 = SVC(probability=True, random_state=42)
clf3 = LogisticRegression(random_state=42)

# Create voting classifier with weights
voting_clf = VotingClassifier(estimators=[('rf', clf1), ('svc', clf2), ('lr', clf3)], voting='soft', weights=[2, 1, 1])

# Fit and predict
voting_clf.fit(X_train, y_train)
predictions = voting_clf.predict(X_test)

# Calculate accuracy
accuracy = accuracy_score(y_test, predictions)
print(f'Weighted Voting Ensemble Accuracy: {accuracy:.2f}')

Try it in Google Colab:

Weighted Voting Ensemble Accuracy: 0.97

Stacking Ensembles

Stacking is a sophisticated ensemble technique where multiple base models are trained on the original dataset, and a meta-model is trained on the predictions of these base models. The meta-model learns to combine the predictions of the base models, often leading to improved performance. This method can capture complex relationships between the base models' predictions.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import StackingClassifier
from sklearn.metrics import accuracy_score

# Load dataset
iris = load_iris()
X, y = iris.data, iris.target

# Split dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Create base classifiers
base_clfs = [('rf', RandomForestClassifier(n_estimators=50, random_state=42)),
             ('svc', SVC(probability=True, random_state=42)),
             ('lr', LogisticRegression(random_state=42))]

# Create stacking classifier with logistic regression as the final estimator
stacking_clf = StackingClassifier(estimators=base_clfs, final_estimator=LogisticRegression())

# Fit and predict
stacking_clf.fit(X_train, y_train)
predictions = stacking_clf.predict(X_test)

# Calculate accuracy
accuracy = accuracy_score(y_test, predictions)
print(f'Stacking Ensemble Accuracy: {accuracy:.2f}')

💡 Tip: When using stacking ensembles, ensure that the base models are sufficiently diverse to capture different aspects of the data. This diversity helps the meta-model learn more effectively.

❓ What is the purpose of assigning weights in a weighted voting ensemble?

To reduce model complexity To give more influence to better-performing models To increase model diversity To simplify the ensemble process

❓ What is the role of the meta-model in a stacking ensemble?

To train the base models To combine the predictions of the base models To preprocess the data To select the best base model

Voting Ensembles: Advanced Strategies

Weighted Voting

Stacking Ensembles

Related Courses