Multiple Hypothesis Testing
Duration: 5 min
This module covers multiple hypothesis testing, a critical aspect of statistical analysis in machine learning. It explains why the chance of a false positive grows when many statistical tests are performed together and introduces methods to control the family-wise error rate and the false discovery rate. Understanding multiple hypothesis testing is essential for ensuring the reliability and validity of machine learning experiments.
Family-Wise Error Rate (FWER)
The Family-Wise Error Rate (FWER) is the probability of making at least one false discovery (type I error) across a family of hypothesis tests. Controlling the FWER keeps that probability at or below a chosen level α, no matter how many tests are run. The Bonferroni correction is a common, conservative way to do this: with m tests, each individual test is run at significance level α/m.
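To make the need for a correction concrete, the sketch below assumes m independent tests in which every null hypothesis is true. Under those assumptions the FWER at per-test level α is 1 − (1 − α)^m. The function names are illustrative and do not come from any library.

```python
def fwer_uncorrected(alpha: float, m: int) -> float:
    """FWER for m independent tests at level alpha when every null is true."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_alpha(alpha: float, m: int) -> float:
    """Per-test significance level under the Bonferroni correction."""
    return alpha / m


alpha = 0.05
for m in (1, 10, 100):
    per_test = bonferroni_alpha(alpha, m)
    print(
        f"m={m:>3}  uncorrected FWER={fwer_uncorrected(alpha, m):.3f}  "
        f"Bonferroni per-test alpha={per_test:.4f}  "
        f"corrected FWER={fwer_uncorrected(per_test, m):.3f}"
    )
```

With α = 0.05 and 10 tests, the uncorrected FWER is roughly 0.40, while testing each hypothesis at α/m keeps it just under 0.05.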
import numpy as np
from statsmodels.stats.multitest import multipletests
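# Simulate 10 p-values; uniform draws mimic tests in which every null hypothesis is true
p_values = np.random.rand(10)
# Apply the Bonferroni correction at a family-wise level of 0.05 (the default alpha)
reject, corrected_p_values, alpha_sidak, alpha_bonf = multipletests(p_values, alpha=0.05, method='bonferroni')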
print('Original p-values:', p_values)
print('Reject null hypothesis:', reject)
print('Corrected p-values:', corrected_p_values)

Original p-values: [0.37454687 0.95391295 0.42334769 0.256338 0.4929622 0.18393391 0.98190047 0.72781178 0.03582775 0.77781746]
Reject null hypothesis: [False False False False False False False False False False]
Corrected p-values: [1. 1. 1. 1. 1. 1. 1. 1. 0.35827745 1. ]

False Discovery Rate (FDR)
The False Discovery Rate (FDR) is the expected proportion of false positives among the total number of positive results. Unlike FWER, which controls the probability of making any false discoveries, FDR allows for some false positives but aims to keep their proportion low. The Benjamini-Hochberg procedure is a popular method for controlling the FDR.
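The Benjamini-Hochberg rule itself is short enough to write by hand. The sketch below is a minimal pure-Python version, assuming the standard step-up formulation: sort the p-values, find the largest rank k with p_(k) ≤ (k/m)·α, and reject the k smallest. The function name and the p-values are illustrative, not taken from a real experiment.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected."""
    m = len(p_values)
    # Indices of the p-values sorted in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Reject the hypotheses with the max_k smallest p-values.
    reject = [False] * m
    for idx in order[:max_k]:
        reject[idx] = True
    return reject


# Hand-picked illustrative p-values (hypothetical data).
p_values = [0.001, 0.012, 0.014, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(p_values, alpha=0.05))
```

In this example the second-smallest p-value (0.012) is above its own threshold of 0.010, yet it is still rejected because the third-smallest (0.014) falls below its threshold of 0.015. That step-up behaviour is what separates the procedure from a simple per-test cutoff.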
import numpy as np
from statsmodels.stats.multitest import multipletests
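# Simulate 10 p-values; uniform draws mimic tests in which every null hypothesis is true
p_values = np.random.rand(10)
# Apply the Benjamini-Hochberg correction at an FDR level of 0.05 (the default alpha)
reject, corrected_p_values, alpha_sidak, alpha_bonf = multipletests(p_values, alpha=0.05, method='fdr_bh')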
print('Original p-values:', p_values)
print('Reject null hypothesis:', reject)
print('Corrected p-values:', corrected_p_values)

💡 Tip: When performing multiple hypothesis tests, always consider the trade-off between FWER and FDR. Use FWER methods like Bonferroni for stringent control of type I errors and FDR methods like Benjamini-Hochberg when you can tolerate some false positives to increase power.
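One way to see this trade-off is to compare the adjusted p-values that each method produces on the same inputs. The sketch below is a pure-Python approximation of that calculation, assuming the usual definitions: Bonferroni multiplies each p-value by m and caps it at 1, and Benjamini-Hochberg takes a running minimum of p_(k)·m/k from the largest p-value downward. The function names and the p-values are hypothetical.

```python
def bonferroni_adjust(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is the
    # minimum of p_(k) * m / k over all ranks k at or above the current one.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hand-picked illustrative p-values, not results from a real experiment.
p_values = [0.001, 0.008, 0.012, 0.019, 0.030, 0.200, 0.400, 0.600, 0.800, 0.900]
alpha = 0.05

bonf = bonferroni_adjust(p_values)
bh = bh_adjust(p_values)

print("Bonferroni rejections:", sum(p <= alpha for p in bonf))
print("Benjamini-Hochberg rejections:", sum(p <= alpha for p in bh))
```

On these values Bonferroni rejects one hypothesis and Benjamini-Hochberg rejects four, which shows the extra power gained by tolerating a controlled share of false positives.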
❓ What does FWER stand for?
❓ Which method is commonly used to control the FDR?
Key Concepts
| Concept | Description |
|---|---|
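| Null Hypothesis | The default assumption of no effect or no difference; each test asks whether the data give enough evidence to reject it |
| P-value | The probability, assuming the null hypothesis is true, of observing a result at least as extreme as the one obtained |
| Significance | The threshold α on the type I error rate; a null hypothesis is rejected when its p-value is at or below α |
| Power | The probability of correctly rejecting a null hypothesis that is false; stricter corrections such as Bonferroni reduce it |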
Check Your Understanding
❓ How does the Bonferroni correction change the per-test significance level when m tests are run?
❓ Why does controlling the FDR usually give more power than controlling the FWER?
❓ When would you prefer FWER control over FDR control?