Module 6 of 26 · Statistics for Machine Learning — Probability, Distributions, Hypothesis Testing, Bayesian Inference, A/B Testing · Intermediate

Parametric vs Non-parametric Tests

Duration: 5 min

This module covers the differences between parametric and non-parametric tests. Knowing which assumptions each family of tests makes helps you choose a statistical method that gives valid, reliable results for your data.

Parametric Tests

Parametric tests are statistical methods that assume the data follow a specific distribution, most often the normal distribution. When those assumptions hold, they typically have more statistical power than non-parametric alternatives, meaning they are more likely to detect a real effect at a given sample size. Common examples include t-tests and ANOVA.

import scipy.stats as stats

# Sample data
data1 = [5, 7, 8, 6, 7]
data2 = [6, 8, 9, 7, 8]

# Perform an independent two-sample t-test (assumes equal variances by default)
t_stat, p_value = stats.ttest_ind(data1, data2)

print(f'T-statistic: {t_stat:.3f}')
print(f'P-value: {p_value:.3f}')

Output:

T-statistic: -1.387
P-value: 0.203
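ANOVA, the other parametric example named above, extends the comparison to three or more groups. The following is a minimal sketch using `scipy.stats.f_oneway`; the three groups are hypothetical data made up for illustration, not results from this course.

```python
from scipy import stats

# Hypothetical scores for three groups (illustrative data only)
group_a = [5, 7, 8, 6, 7]
group_b = [6, 8, 9, 7, 8]
group_c = [9, 10, 11, 9, 10]

# One-way ANOVA: tests whether all group means are equal.
# Assumes roughly normal data with similar variance in each group.
f_stat, p_value = stats.f_oneway(group_a, group_b, group_c)

print(f'F-statistic: {f_stat:.3f}')
print(f'P-value: {p_value:.4f}')
```

A small p-value here only says that at least one group mean differs; it does not say which one.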

Non-parametric Tests

Non-parametric tests do not assume that the data follow a specific distribution. Many of them, such as the Mann-Whitney U test (two groups) and the Kruskal-Wallis test (three or more groups), operate on the ranks of the observations rather than the raw values, which makes them useful when the data violate parametric assumptions such as normality.

import scipy.stats as stats

# Sample data
data1 = [5, 7, 8, 6, 7]
data2 = [6, 8, 9, 7, 8]

# Perform a two-sided Mann-Whitney U test (rank-based, no normality assumption)
u_stat, p_value = stats.mannwhitneyu(data1, data2, alternative='two-sided')

print(f'U-statistic: {u_stat}')
print(f'P-value: {p_value}')
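The Kruskal-Wallis test mentioned above plays the same role for three or more groups. This is a minimal sketch using `scipy.stats.kruskal`; the group data are hypothetical values chosen for illustration.

```python
from scipy import stats

# Hypothetical scores for three groups (illustrative data only)
group_a = [5, 7, 8, 6, 7]
group_b = [6, 8, 9, 7, 8]
group_c = [9, 10, 11, 9, 10]

# Kruskal-Wallis H test: compares the groups using ranks,
# so it does not require normally distributed data.
h_stat, p_value = stats.kruskal(group_a, group_b, group_c)

print(f'H-statistic: {h_stat:.3f}')
print(f'P-value: {p_value:.4f}')
```

Because the test works on ranks, replacing the largest value with an extreme outlier would leave the H-statistic unchanged.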

💡 Tip: Always check whether your data meet a test's assumptions before choosing between parametric and non-parametric tests. Applying a test whose assumptions are violated can lead to incorrect conclusions.
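One way to put this tip into practice is sketched below. The helper `compare_groups` and its `alpha` threshold are hypothetical names introduced for illustration; the sketch assumes a Shapiro-Wilk check for normality and a Levene check for equal variances, which is one common approach rather than the only one. Normality tests have little power on very small samples, so treat the result as a rough guide.

```python
from scipy import stats

def compare_groups(sample_a, sample_b, alpha=0.05):
    """Hypothetical helper: choose a two-sample test from simple assumption checks."""
    # Shapiro-Wilk: a small p-value suggests the sample is not normal
    _, p_norm_a = stats.shapiro(sample_a)
    _, p_norm_b = stats.shapiro(sample_b)
    looks_normal = p_norm_a > alpha and p_norm_b > alpha

    if looks_normal:
        # Levene: a small p-value suggests unequal variances
        _, p_var = stats.levene(sample_a, sample_b)
        equal_var = p_var > alpha
        _, p_value = stats.ttest_ind(sample_a, sample_b, equal_var=equal_var)
        test_name = "Student's t-test" if equal_var else "Welch's t-test"
    else:
        _, p_value = stats.mannwhitneyu(sample_a, sample_b, alternative='two-sided')
        test_name = 'Mann-Whitney U test'
    return test_name, p_value

data1 = [5, 7, 8, 6, 7]
data2 = [6, 8, 9, 7, 8]

test_name, p_value = compare_groups(data1, data2)
print(f'Selected test: {test_name}')
print(f'P-value: {p_value:.3f}')
```

Plotting the data (for example, a histogram or Q-Q plot) is a useful complement to these checks, especially when samples are small.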

❓ Which test assumes the data follows a specific distribution?

❓ Which test does not assume a specific distribution for the data?

Key Concepts

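Concept | Description
Distribution | The probability model the data are assumed to follow (for example, normal); parametric tests depend on it.
Hypothesis | The claim being tested; the null hypothesis typically states that there is no difference between groups.
P-value | The probability of observing a result at least as extreme as the one obtained, assuming the null hypothesis is true.
Confidence | The confidence level (for example, 95%), equal to 1 minus the significance level used as the cutoff for the p-value.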

Check Your Understanding

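❓ How do parametric tests behave on edge cases such as very small samples, heavy skew, or outliers?

❓ How does the computational cost of a parametric test such as the t-test compare with a rank-based test such as Mann-Whitney U?

❓ Which assumption is most critical for a parametric test such as the t-test?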
