Your First Machine Learning Model
Duration: 5 min
What is Machine Learning?
Machine learning is a way for computers to learn from data. Instead of writing explicit rules, you provide labeled examples and let an algorithm find the patterns. We'll build a simple classifier that learns to identify the species of an iris flower from four measurements.
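To make the contrast between explicit rules and learned patterns concrete, here is a minimal sketch. It assumes a simplified task (setosa versus everything else, using petal length only), and the 2.5 cm hand-written threshold and the helper name rule_based_is_setosa are illustrative choices, not part of the lesson's own code.

```python
import numpy as np
from sklearn import datasets
from sklearn.tree import DecisionTreeClassifier

iris = datasets.load_iris()
petal_length = iris.data[:, [2]]             # petal length only, shape (150, 1)
is_setosa = (iris.target == 0).astype(int)   # 1 for setosa, 0 for the other species

# Explicit rule: a person picks the threshold (2.5 cm is an illustrative guess).
def rule_based_is_setosa(lengths_cm):
    return (lengths_cm < 2.5).astype(int)

# Learned rule: a depth-1 decision tree finds its own threshold from the examples.
stump = DecisionTreeClassifier(max_depth=1, random_state=0)
stump.fit(petal_length, is_setosa)
learned_threshold = stump.tree_.threshold[0]  # split value at the root node

rule_pred = rule_based_is_setosa(petal_length[:, 0])
learned_pred = stump.predict(petal_length)

print('Hand-written threshold: 2.50 cm')
print(f'Learned threshold: {learned_threshold:.2f} cm')
print(f'Predictions agree on all samples: {np.array_equal(rule_pred, learned_pred)}')
```

The point is not the specific number but where it comes from: in the first case a person chose it, in the second the algorithm derived it from the labeled examples.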
Loading Data
from sklearn import datasets
import pandas as pd
# Load the iris dataset
iris = datasets.load_iris()
X = iris.data # Features (measurements)
y = iris.target # Labels (flower types)
print(f'Number of samples: {len(X)}')
print(f'Number of features: {X.shape[1]}')
print(f'Feature names: {iris.feature_names}')
print(f'Target names: {iris.target_names}')

Output:
Number of samples: 150
Number of features: 4
Feature names: ['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']
Target names: ['setosa' 'versicolor' 'virginica']

Splitting Data
We split data into training (80%) and testing (20%) sets. The model learns from training data and we test it on unseen data to see how well it generalizes.
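As a rough illustration of what an 80/20 split does, the sketch below shuffles the row indices with NumPy and slices them by hand. It is a simplified stand-in, not how scikit-learn implements its helper, and the names rng, test_idx, and train_idx are assumptions made for this example.

```python
import numpy as np
from sklearn import datasets

iris = datasets.load_iris()
X, y = iris.data, iris.target

# Shuffle the row indices so the split does not depend on the dataset's order.
rng = np.random.default_rng(42)
indices = rng.permutation(len(X))

# Reserve 20% of the rows for testing and keep the rest for training.
n_test = round(len(X) * 0.2)
test_idx, train_idx = indices[:n_test], indices[n_test:]

X_train, X_test = X[train_idx], X[test_idx]
y_train, y_test = y[train_idx], y[test_idx]

print(f'Training samples: {len(X_train)}')
print(f'Testing samples: {len(X_test)}')
print(f'Rows shared by both sets: {len(set(train_idx) & set(test_idx))}')
```

The shuffle matters here because the iris rows are stored grouped by species, so slicing without shuffling would leave some species out of one of the sets. In practice the library function used next is the more convenient route.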
from sklearn.model_selection import train_test_split
# Split data: 80% training, 20% testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print(f'Training samples: {len(X_train)}')
print(f'Testing samples: {len(X_test)}')

Output:
Training samples: 120
Testing samples: 30

Training a Model
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
# Create a model
model = RandomForestClassifier(n_estimators=100, random_state=42)
# Train it on the training data
model.fit(X_train, y_train)
# Make predictions on test data
y_pred = model.predict(X_test)
# Calculate accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f'Model Accuracy: {accuracy:.2%}')

Output:
Model Accuracy: 100.00%

Making Predictions
# Make a prediction on new data
new_flower = [[5.1, 3.5, 1.4, 0.2]]  # sepal length, sepal width, petal length, petal width (cm)
prediction = model.predict(new_flower)
flower_name = iris.target_names[prediction[0]]
print(f'This flower is: {flower_name}')

Output:
This flower is: setosa

💡 Tip: Always split your data before training. Never evaluate on data the model saw during training; that score won't tell you how well the model generalizes.
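A single test set of 30 samples is a small basis for judging generalization, so a score from one split can be optimistic or pessimistic by chance. One common complement, which goes beyond the walkthrough above, is k-fold cross-validation: the data is split several ways and every sample is used for testing exactly once. The sketch below assumes 5 folds and the same model settings as before.

```python
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

iris = datasets.load_iris()
X, y = iris.data, iris.target

# Same model configuration as in the training step.
model = RandomForestClassifier(n_estimators=100, random_state=42)

# cv=5: train on four fifths of the data, test on the remaining fifth, five times.
scores = cross_val_score(model, X, y, cv=5)

print(f'Accuracy per fold: {scores}')
print(f'Mean accuracy: {scores.mean():.2%}')
```

Each fold still follows the rule in the tip: the model is always scored on rows it did not see while training for that fold.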
Learn More Machine Learning
You've built your first model! Continue learning:
- AI Fundamentals - Core ML concepts and algorithms
- Deep Learning - Neural networks and advanced models
- RAG Systems - Build AI applications
❓ Why do we split data into training and testing sets?