Mejbah Ahammad

Day 4 – Model Evaluation and Selection

📑 Table of Contents

  1. 🌟 Welcome to Day 4
  2. 🔍 Review of Day 3
  3. 📈 Introduction to Model Evaluation and Selection
  4. 🔄 Model Validation Techniques
  5. ⚙️ Hyperparameter Tuning
  6. 🛠️ Implementing Model Evaluation and Selection with Scikit-Learn
  7. 📈 Comparing Models
  8. 🛠️📈 Example Project: Comparing Models with Cross-Validation and Hyperparameter Tuning
  9. 🚀🎓 Conclusion and Next Steps
  10. 📜 Summary of Day 4 📜

1. 🌟 Welcome to Day 4

Welcome to Day 4 of "Becoming a Scikit-Learn Boss in 90 Days"! Today, we'll focus on Model Evaluation and Selection, essential steps to ensure your machine learning models are both accurate and generalizable. You'll learn about various evaluation techniques, validation strategies, and hyperparameter tuning methods to optimize your models effectively.


2. 🔍 Review of Day 3

Before diving into today's topics, let's briefly recap what we covered yesterday:

  • Supervised Learning: Regression Algorithms: Explored Linear Regression, Ridge Regression, Lasso Regression, and Elastic Net.
  • Implementing Regression Algorithms with Scikit-Learn: Built, trained, and evaluated different regression models.
  • Model Evaluation for Regression: Learned about Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and R-squared (R²).
  • Example Project: Housing Price Prediction: Developed a regression pipeline to predict housing prices and compared model performances.

With this foundation, we're ready to enhance our models through robust evaluation and selection techniques.


3. 📈 Introduction to Model Evaluation and Selection

๐Ÿ” Why Model Evaluation Matters

Model evaluation is crucial to determine how well your machine learning model performs on unseen data. It helps in:

  • Assessing Performance: Understanding the strengths and weaknesses of your model.
  • Preventing Overfitting: Ensuring the model generalizes well to new data.
  • Comparing Models: Selecting the best model among various candidates.
  • Optimizing Hyperparameters: Fine-tuning model parameters for optimal performance.

🧠 Bias-Variance Tradeoff

Understanding the bias-variance tradeoff is fundamental in model evaluation (a short code sketch follows the figure below):

  • Bias: Error due to overly simplistic assumptions in the learning algorithm. High bias can cause underfitting.
  • Variance: Error due to too much complexity in the learning algorithm. High variance can cause overfitting.
  • Tradeoff: Striking a balance between bias and variance leads to better generalization.

Figure: Bias-Variance Tradeoff (image source: Medium)
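One way to see the tradeoff in practice is to plot training and cross-validation scores against model complexity. Below is a minimal sketch using scikit-learn's validation_curve with a decision tree, where max_depth stands in for complexity; the depth range is just an illustrative choice, and X and y are assumed to be defined as in the later examples.

from sklearn.model_selection import validation_curve
from sklearn.tree import DecisionTreeRegressor
import numpy as np
import matplotlib.pyplot as plt

depths = np.arange(1, 15)
train_scores, val_scores = validation_curve(
    DecisionTreeRegressor(random_state=42), X, y,
    param_name='max_depth', param_range=depths,
    cv=5, scoring='r2'
)

# Shallow trees score poorly on both sets (high bias); very deep trees score
# well on training data but worse on validation data (high variance).
plt.plot(depths, train_scores.mean(axis=1), label='Training R²')
plt.plot(depths, val_scores.mean(axis=1), label='Cross-Validation R²')
plt.xlabel('max_depth')
plt.ylabel('R² Score')
plt.legend()
plt.show()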


4. 🔄 Model Validation Techniques

🔄 Train-Test Split

The simplest form of model validation: the dataset is split once into a training set and a testing set.

from sklearn.model_selection import train_test_split

# Assume X and y are already defined
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

Pros:

  • Simple and fast.

Cons:

  • Can lead to high variance in performance metrics depending on the split.

🔄 Cross-Validation

A more robust method that involves partitioning the data into multiple subsets to ensure the model's performance is consistent across different data splits.

🧮 K-Fold Cross-Validation

Divides the dataset into K equal-sized folds. The model is trained on K-1 folds and tested on the remaining fold. This process is repeated K times, each time with a different fold as the test set.

from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LinearRegression

model = LinearRegression()
scores = cross_val_score(model, X, y, cv=5, scoring='r2')
print(f"Cross-Validation R² Scores: {scores}")
print(f"Average R² Score: {scores.mean():.2f}")

📈 Stratified K-Fold

Ensures that each fold has the same proportion of classes as the entire dataset, useful for imbalanced datasets.

from sklearn.model_selection import StratifiedKFold

# Stratification requires discrete class labels in y (i.e., a classification task)
skf = StratifiedKFold(n_splits=5)
for train_index, test_index in skf.split(X, y):
    # X and y are NumPy arrays here; use .iloc[...] for pandas objects
    X_train_fold, X_test_fold = X[train_index], X[test_index]
    y_train_fold, y_test_fold = y[train_index], y[test_index]
    # Train and evaluate your model on each fold

🔄 Leave-One-Out Cross-Validation (LOOCV)

Each observation is used once as the test set while the remaining observations form the training set. Because the model must be refit once per observation, LOOCV is practical only for small datasets.

from sklearn.model_selection import LeaveOneOut

loo = LeaveOneOut()
for train_index, test_index in loo.split(X):
    # X and y are NumPy arrays here; use .iloc[...] for pandas objects
    X_train_fold, X_test_fold = X[train_index], X[test_index]
    y_train_fold, y_test_fold = y[train_index], y[test_index]
    # Train and evaluate your model on each held-out observation

5. ⚙️ Hyperparameter Tuning

🔧 Importance of Hyperparameters

Hyperparameters are parameters set before the learning process begins. They control the behavior of the training algorithm and can significantly impact model performance.
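As a quick illustration, hyperparameters are passed to an estimator's constructor, while model parameters are learned during fit. The values below are arbitrary example choices, just to show where hyperparameters go:

from sklearn.ensemble import RandomForestRegressor

# Hyperparameters: chosen before training (arbitrary example values)
model = RandomForestRegressor(n_estimators=200, max_depth=10,
                              min_samples_split=5, random_state=42)

# Inspect the current hyperparameter settings
print(model.get_params())

# Model parameters (the fitted trees) only exist after calling model.fit(...)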

๐Ÿ” Grid Search

An exhaustive search over a specified parameter grid. It evaluates all possible combinations of hyperparameters.

from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import RandomForestRegressor

param_grid = {
    'n_estimators': [100, 200],
    'max_depth': [None, 10, 20],
    'min_samples_split': [2, 5]
}

grid_search = GridSearchCV(estimator=RandomForestRegressor(random_state=42),
                           param_grid=param_grid,
                           cv=5,
                           scoring='r2',
                           n_jobs=-1)

grid_search.fit(X_train_scaled, y_train)
print(f"Best Parameters: {grid_search.best_params_}")
print(f"Best R² Score: {grid_search.best_score_:.2f}")

๐Ÿ” Randomized Search

Searches a random subset of the hyperparameter space, making it faster than Grid Search, especially with large datasets or many parameters.

from sklearn.model_selection import RandomizedSearchCV
from sklearn.ensemble import RandomForestRegressor
from scipy.stats import randint

param_dist = {
    'n_estimators': randint(100, 500),
    'max_depth': [None, 10, 20, 30],
    'min_samples_split': randint(2, 11)
}

random_search = RandomizedSearchCV(estimator=RandomForestRegressor(random_state=42),
                                   param_distributions=param_dist,
                                   n_iter=50,
                                   cv=5,
                                   scoring='r2',
                                   random_state=42,
                                   n_jobs=-1)

random_search.fit(X_train_scaled, y_train)
print(f"Best Parameters: {random_search.best_params_}")
print(f"Best R² Score: {random_search.best_score_:.2f}")

✨ Bayesian Optimization

A more efficient method that builds a probabilistic model of the objective function and uses it to select the most promising hyperparameters to evaluate.

Note: Requires additional libraries like scikit-optimize.

from skopt import BayesSearchCV
from sklearn.ensemble import RandomForestRegressor

bayes_search = BayesSearchCV(
    estimator=RandomForestRegressor(random_state=42),
    search_spaces={
        'n_estimators': (100, 500),
        'max_depth': (10, 50),
        'min_samples_split': (2, 20)
    },
    n_iter=32,
    cv=5,
    scoring='r2',
    random_state=42,
    n_jobs=-1
)

bayes_search.fit(X_train_scaled, y_train)
print(f"Best Parameters: {bayes_search.best_params_}")
print(f"Best R² Score: {bayes_search.best_score_:.2f}")

6. 🛠️ Implementing Model Evaluation and Selection with Scikit-Learn

🔄 Performing K-Fold Cross-Validation

from sklearn.model_selection import cross_val_score
from sklearn.ensemble import RandomForestRegressor

model = RandomForestRegressor(random_state=42)
scores = cross_val_score(model, X, y, cv=5, scoring='r2')
print(f"Cross-Validation R² Scores: {scores}")
print(f"Average R² Score: {scores.mean():.2f}")

🔧 Hyperparameter Tuning with GridSearchCV

from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import RandomForestRegressor

param_grid = {
    'n_estimators': [100, 200, 300],
    'max_depth': [None, 10, 20],
    'min_samples_split': [2, 5, 10]
}

grid_search = GridSearchCV(estimator=RandomForestRegressor(random_state=42),
                           param_grid=param_grid,
                           cv=5,
                           scoring='r2',
                           n_jobs=-1)

grid_search.fit(X_train_scaled, y_train)
print(f"Best Parameters: {grid_search.best_params_}")
print(f"Best R² Score: {grid_search.best_score_:.2f}")

🔧 Hyperparameter Tuning with RandomizedSearchCV

from sklearn.model_selection import RandomizedSearchCV
from sklearn.ensemble import RandomForestRegressor
from scipy.stats import randint

param_dist = {
    'n_estimators': randint(100, 500),
    'max_depth': [None, 10, 20, 30],
    'min_samples_split': randint(2, 11)
}

random_search = RandomizedSearchCV(estimator=RandomForestRegressor(random_state=42),
                                   param_distributions=param_dist,
                                   n_iter=50,
                                   cv=5,
                                   scoring='r2',
                                   random_state=42,
                                   n_jobs=-1)

random_search.fit(X_train_scaled, y_train)
print(f"Best Parameters: {random_search.best_params_}")
print(f"Best R² Score: {random_search.best_score_:.2f}")

7. 📈 Comparing Models

📊 Performance Metrics Comparison

After training multiple models, it's essential to compare their performance metrics to select the best one.

import pandas as pd
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

# Models fitted earlier (see the example project below for how they are trained)
models = {
    'Linear Regression': linear_reg,
    'Ridge Regression': ridge_reg,
    'Lasso Regression': lasso_reg,
    'Elastic Net': elastic_net
}

results = []

for name, model in models.items():
    y_pred = model.predict(X_test_scaled)
    mse = mean_squared_error(y_test, y_pred)
    rmse = np.sqrt(mse)
    mae = mean_absolute_error(y_test, y_pred)
    r2 = r2_score(y_test, y_pred)
    results.append({'Model': name, 'MSE': mse, 'RMSE': rmse, 'MAE': mae, 'R²': r2})

evaluation_df = pd.DataFrame(results)
print(evaluation_df)

📉 Visualizing Model Performance

Visual representations can help in comparing the performance of different models effectively.

import seaborn as sns
import matplotlib.pyplot as plt

# Plot R² Scores
plt.figure(figsize=(10, 6))
sns.barplot(x='R²', y='Model', data=evaluation_df, palette='viridis')
plt.title('R² Score Comparison of Regression Models')
plt.xlabel('R² Score')
plt.ylabel('Model')
plt.xlim(0, 1)
plt.show()

# Plot RMSE
plt.figure(figsize=(10, 6))
sns.barplot(x='RMSE', y='Model', data=evaluation_df, palette='magma')
plt.title('RMSE Comparison of Regression Models')
plt.xlabel('RMSE')
plt.ylabel('Model')
plt.show()

8. 📈 Example Project: Comparing Models with Cross-Validation and Hyperparameter Tuning

Let's apply today's concepts by developing a comprehensive regression pipeline to predict housing prices using the California Housing Dataset. We'll compare multiple regression algorithms, perform cross-validation, and tune hyperparameters to optimize model performance.

📋 Project Overview

Objective: Develop and compare different regression models to predict median housing prices based on various features. Implement cross-validation and hyperparameter tuning to enhance model accuracy and generalizability.

Tools: Python, Scikit-Learn, pandas, Matplotlib, Seaborn

๐Ÿ“ Step-by-Step Guide

1. Load and Explore the Dataset

from sklearn.datasets import fetch_california_housing
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Load California Housing dataset
housing = fetch_california_housing()
X = pd.DataFrame(housing.data, columns=housing.feature_names)
y = pd.Series(housing.target, name='MedHouseVal')

# Combine features and target
df = pd.concat([X, y], axis=1)
print(df.head())

# Visualize distribution of target variable
sns.histplot(df['MedHouseVal'], bins=50, kde=True)
plt.title('Distribution of Median House Values')
plt.xlabel('Median House Value')
plt.ylabel('Frequency')
plt.show()

2. Data Preprocessing

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Feature Scaling
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

3. Building and Training the Models

from sklearn.linear_model import LinearRegression, Ridge, Lasso, ElasticNet
from sklearn.ensemble import RandomForestRegressor
from sklearn.svm import SVR

# Initialize models
linear_reg = LinearRegression()
ridge_reg = Ridge(alpha=1.0)
lasso_reg = Lasso(alpha=0.1)
elastic_net = ElasticNet(alpha=0.1, l1_ratio=0.5)
random_forest = RandomForestRegressor(random_state=42)
svm_reg = SVR(kernel='linear')  # note: SVR can be slow on ~16,000 training samples

# Train models
linear_reg.fit(X_train_scaled, y_train)
ridge_reg.fit(X_train_scaled, y_train)
lasso_reg.fit(X_train_scaled, y_train)
elastic_net.fit(X_train_scaled, y_train)
random_forest.fit(X_train_scaled, y_train)
svm_reg.fit(X_train_scaled, y_train)

4. Making Predictions and Evaluating the Models

from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
import numpy as np

models = {
    'Linear Regression': linear_reg,
    'Ridge Regression': ridge_reg,
    'Lasso Regression': lasso_reg,
    'Elastic Net': elastic_net,
    'Random Forest': random_forest,
    'Support Vector Machine': svm_reg
}

results = []

for name, model in models.items():
    y_pred = model.predict(X_test_scaled)
    mse = mean_squared_error(y_test, y_pred)
    rmse = np.sqrt(mse)
    mae = mean_absolute_error(y_test, y_pred)
    r2 = r2_score(y_test, y_pred)
    results.append({'Model': name, 'MSE': mse, 'RMSE': rmse, 'MAE': mae, 'R²': r2})

evaluation_df = pd.DataFrame(results)
print(evaluation_df)

5. Performing K-Fold Cross-Validation

from sklearn.model_selection import cross_val_score

# Perform cross-validation for Random Forest
cv_scores = cross_val_score(random_forest, X, y, cv=5, scoring='r2')
print(f"Random Forest Cross-Validation R² Scores: {cv_scores}")
print(f"Average Cross-Validation R² Score: {cv_scores.mean():.2f}")

6. Hyperparameter Tuning with GridSearchCV

from sklearn.model_selection import GridSearchCV

# Define parameter grid for Random Forest
param_grid = {
    'n_estimators': [100, 200, 300],
    'max_depth': [None, 10, 20, 30],
    'min_samples_split': [2, 5, 10]
}

grid_search = GridSearchCV(estimator=RandomForestRegressor(random_state=42),
                           param_grid=param_grid,
                           cv=5,
                           scoring='r2',
                           n_jobs=-1)

grid_search.fit(X_train_scaled, y_train)
print(f"Best Parameters: {grid_search.best_params_}")
print(f"Best Cross-Validation R² Score: {grid_search.best_score_:.2f}")

7. Hyperparameter Tuning with RandomizedSearchCV

from sklearn.model_selection import RandomizedSearchCV
from scipy.stats import randint

param_dist = {
    'n_estimators': randint(100, 500),
    'max_depth': [None, 10, 20, 30, 40],
    'min_samples_split': randint(2, 21)
}

random_search = RandomizedSearchCV(estimator=RandomForestRegressor(random_state=42),
                                   param_distributions=param_dist,
                                   n_iter=50,
                                   cv=5,
                                   scoring='r2',
                                   random_state=42,
                                   n_jobs=-1)

random_search.fit(X_train_scaled, y_train)
print(f"Best Parameters: {random_search.best_params_}")
print(f"Best Cross-Validation R² Score: {random_search.best_score_:.2f}")
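Before plotting the comparison, it is also worth checking how the tuned model does on the held-out test set. A minimal sketch, assuming the grid_search (or random_search) object from the previous steps:

from sklearn.metrics import mean_squared_error, r2_score
import numpy as np

# GridSearchCV refits the best configuration on the full training set by default
best_rf = grid_search.best_estimator_
y_pred_best = best_rf.predict(X_test_scaled)

print(f"Tuned Random Forest Test R²: {r2_score(y_test, y_pred_best):.2f}")
print(f"Tuned Random Forest Test RMSE: {np.sqrt(mean_squared_error(y_test, y_pred_best)):.2f}")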

8. Comparing Model Performance

import seaborn as sns
import matplotlib.pyplot as plt

# Plot R² Scores
plt.figure(figsize=(10, 6))
sns.barplot(x='R²', y='Model', data=evaluation_df, palette='viridis')
plt.title('R² Score Comparison of Regression Models')
plt.xlabel('R² Score')
plt.ylabel('Model')
plt.xlim(0, 1)
plt.show()

# Plot RMSE
plt.figure(figsize=(10, 6))
sns.barplot(x='RMSE', y='Model', data=evaluation_df, palette='magma')
plt.title('RMSE Comparison of Regression Models')
plt.xlabel('RMSE')
plt.ylabel('Model')
plt.show()

9. 🚀 Conclusion and Next Steps

Congratulations on completing Day 4 of "Becoming a Scikit-Learn Boss in 90 Days"! Today, you mastered Model Evaluation and Selection, learning how to validate your models effectively through cross-validation, perform hyperparameter tuning with Grid Search and Randomized Search, and compare multiple regression models to select the best performer.

🔮 What's Next?

  • Day 5: Unsupervised Learning – Clustering and Dimensionality Reduction: Understand clustering algorithms like K-Means and techniques like PCA.
  • Day 6: Advanced Feature Engineering: Master techniques to create and select features that enhance model performance.
  • Day 7: Ensemble Methods: Explore ensemble techniques like Bagging, Boosting, and Stacking.
  • Day 8: Model Deployment with Scikit-Learn: Learn how to deploy your models into production environments.
  • Days 9-90: Specialized Topics and Projects: Engage in specialized topics and comprehensive projects to solidify your expertise.

๐Ÿ“ Tips for Success

  • Practice Regularly: Apply the concepts through exercises and real-world projects.
  • Engage with the Community: Join forums, attend webinars, and collaborate with peers.
  • Stay Curious: Continuously explore new features and updates in Scikit-Learn.
  • Document Your Work: Keep a detailed journal of your learning progress and projects.

Keep up the great work, and stay motivated as you continue your journey to mastering Scikit-Learn and machine learning!


📜 Summary of Day 4

  • 📈 Introduction to Model Evaluation and Selection: Learned the importance of model evaluation and the bias-variance tradeoff.
  • 🔄 Model Validation Techniques: Explored Train-Test Split, K-Fold Cross-Validation, Stratified K-Fold, and Leave-One-Out Cross-Validation.
  • ⚙️ Hyperparameter Tuning: Mastered Grid Search, Randomized Search, and Bayesian Optimization for tuning model parameters.
  • 🛠️ Implementing Model Evaluation and Selection with Scikit-Learn: Practiced cross-validation and hyperparameter tuning using GridSearchCV and RandomizedSearchCV.
  • 📈 Comparing Models: Compared different regression models using performance metrics and visualizations.
  • 🛠️📈 Example Project: Comparing Models with Cross-Validation and Hyperparameter Tuning: Developed a comprehensive regression pipeline to predict housing prices, evaluated multiple models, and optimized their performance through cross-validation and hyperparameter tuning.
