
Introduction
Dunn’s Test is a non-parametric post-hoc test used to perform multiple comparisons between groups after obtaining a significant result from a Kruskal-Wallis Test. It is especially useful when dealing with data that does not meet the assumptions of normality and homogeneity of variances required for ANOVA. In this article, we’ll guide you through the process of performing Dunn’s Test in Python, from setting up your environment to interpreting the results.
Table of Contents
- Background and Use Cases
- Understanding Dunn’s Test
- Setting Up the Python Environment
- Preparing the Data
- Performing the Kruskal-Wallis Test
- Performing Dunn’s Test
- Interpreting the Results
- Visualizing the Results
- Conclusion
1. Background and Use Cases
1.1. Non-Parametric Tests
Non-parametric tests are used when the data does not meet the assumptions of their parametric counterparts, such as normally distributed values.
1.2. Use Cases
Dunn’s Test is typically used in experimental research to compare three or more independent groups. It works on ranks, so it is often described informally as comparing medians. For instance, you might compare the effects of different medications on a health parameter that is not normally distributed.
2. Understanding Dunn’s Test
Dunn’s Test is used after a Kruskal-Wallis Test has indicated a significant difference among groups. Kruskal-Wallis tells you only that at least one group differs from the others; Dunn’s Test compares the groups pair by pair to determine which ones differ.
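To make the pairwise comparison concrete, here is a minimal sketch of the usual textbook formulation: all observations are ranked together, and the difference in mean ranks between two groups is scaled into a z statistic with a correction for ties. The function name dunn_pair_z and the sample values are hypothetical, and the p-value it returns is unadjusted; a library implementation may differ in detail.

```python
import numpy as np
from scipy.stats import norm, rankdata


def dunn_pair_z(groups, i, j):
    """Unadjusted Dunn z statistic and two-sided p-value for groups i and j."""
    pooled = np.concatenate(groups)
    n_total = len(pooled)
    sizes = [len(g) for g in groups]

    # Rank all observations together; tied values share their average rank
    ranks = rankdata(pooled)
    rank_groups = np.split(ranks, np.cumsum(sizes)[:-1])
    mean_diff = rank_groups[i].mean() - rank_groups[j].mean()

    # Tie correction: sum of (t^3 - t) over each set of tied values
    _, counts = np.unique(pooled, return_counts=True)
    tie_term = (counts**3 - counts).sum() / (12.0 * (n_total - 1))

    variance = (n_total * (n_total + 1) / 12.0 - tie_term) * (
        1.0 / sizes[i] + 1.0 / sizes[j]
    )
    z = mean_diff / np.sqrt(variance)
    p = 2 * norm.sf(abs(z))  # two-sided p-value from the normal distribution
    return z, p


# Made-up values for illustration only
samples = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
z, p = dunn_pair_z(samples, 0, 2)
print(f"z={z:.3f}, unadjusted p={p:.4f}")
```

In practice you would run this for every pair of groups and then adjust the resulting p-values for multiple comparisons.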
3. Setting Up the Python Environment
To perform Dunn’s Test, you will need Python along with the following libraries:
- pandas
- numpy
- scipy
- scikit-posthocs
- seaborn and matplotlib (for the plots in section 8)
You can install them using pip:
pip install pandas numpy scipy scikit-posthocs seaborn matplotlib
4. Preparing the Data
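For Dunn’s Test, data should be in long format: one row per observation, with one column holding the group label and another holding the measured value. Here’s an example dataset in CSV format: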
Group,Value
A,5
A,6
B,7
B,8
C,6
C,7
...
Load the data into Python:
import pandas as pd
data = pd.read_csv('data.csv')
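If you do not have a data.csv file at hand, a small in-memory dataset with the same two columns is enough to follow along. The sketch below uses made-up values (they are not from any real study) and runs two quick checks: the number of observations per group and whether any values are missing.

```python
import io

import pandas as pd

# Made-up illustrative values; replace with your own data
csv_text = """Group,Value
A,5
A,6
A,4
A,5
B,7
B,8
B,9
B,7
C,6
C,7
C,8
C,6
"""

data = pd.read_csv(io.StringIO(csv_text))

# Sanity checks before running any test
group_sizes = data.groupby("Group")["Value"].count()
has_missing = bool(data["Value"].isna().any())

print(group_sizes)
print("Missing values:", has_missing)
```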
5. Performing the Kruskal-Wallis Test
Before performing Dunn’s Test, it’s crucial to perform the Kruskal-Wallis Test to see if there are any significant differences among groups.
from scipy.stats import kruskal
group_a = data[data['Group'] == 'A']['Value']
group_b = data[data['Group'] == 'B']['Value']
group_c = data[data['Group'] == 'C']['Value']
stat, p = kruskal(group_a, group_b, group_c)
print('Statistics=%.3f, p=%.3f' % (stat, p))
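Selecting each group by hand becomes tedious once there are more than a few groups. As a sketch, the same test can be run for any number of groups by letting pandas split the values and unpacking them into kruskal. The dataset and the 0.05 threshold here are illustrative assumptions.

```python
import pandas as pd
from scipy.stats import kruskal

# Made-up values for illustration only
data = pd.DataFrame({
    "Group": ["A"] * 3 + ["B"] * 3 + ["C"] * 3,
    "Value": [1, 2, 3, 4, 5, 6, 7, 8, 9],
})

# One array of values per group, however many groups there are
samples = [values.to_numpy() for _, values in data.groupby("Group")["Value"]]
stat, p = kruskal(*samples)

alpha = 0.05  # assumed significance level
run_posthoc = p < alpha
print(f"H={stat:.3f}, p={p:.4f}, proceed to Dunn's Test: {run_posthoc}")
```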
6. Performing Dunn’s Test
If the p-value from the Kruskal-Wallis Test is below 0.05, you can perform Dunn’s Test for multiple comparisons.
import scikit_posthocs as sp
# The data is already in long format (one row per observation),
# so the column names can be passed to posthoc_dunn directly
# Perform Dunn's Test with Holm-adjusted p-values
dunn_results = sp.posthoc_dunn(data, val_col='Value', group_col='Group', p_adjust='holm')
print(dunn_results)
7. Interpreting the Results
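Dunn’s Test outputs a symmetric matrix of p-values, with one row and one column per group; each cell holds the p-value for the comparison between that pair of groups, and the diagonal (a group compared with itself) is 1. Because p_adjust='holm' was passed, these p-values are already adjusted for multiple comparisons. Adjusted p-values below 0.05 typically indicate a statistically significant difference between the two groups.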
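With more than a handful of groups, the matrix gets hard to read. One way to summarize it, sketched below, is to list each pair once with its p-value and a significance flag. The helper significant_pairs is hypothetical, and the matrix uses made-up placeholder p-values shaped like the output of the test; pass your own result matrix instead.

```python
import pandas as pd


def significant_pairs(p_matrix, alpha=0.05):
    """List each pair of groups once with its p-value and a significance flag."""
    labels = list(p_matrix.index)
    rows = []
    for a in range(len(labels)):
        # Only the upper triangle, so each pair appears once
        for b in range(a + 1, len(labels)):
            p_value = float(p_matrix.iloc[a, b])
            rows.append({
                "group_1": labels[a],
                "group_2": labels[b],
                "p_adjusted": p_value,
                "significant": p_value < alpha,
            })
    return pd.DataFrame(rows)


# Placeholder matrix with made-up p-values, not real test output
example = pd.DataFrame(
    [[1.00, 0.30, 0.01],
     [0.30, 1.00, 0.20],
     [0.01, 0.20, 1.00]],
    index=["A", "B", "C"],
    columns=["A", "B", "C"],
)

pairs = significant_pairs(example)
print(pairs)
```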
8. Visualizing the Results
Visualizations make the analysis easier to understand. A box plot shows how the values are distributed within each group, which helps put the pairwise p-values in context.
import seaborn as sns
import matplotlib.pyplot as plt
sns.boxplot(x='Group', y='Value', data=data)
plt.title('Group Comparisons')
plt.xlabel('Group')
plt.ylabel('Value')
plt.show()
9. Conclusion
Dunn’s Test is an essential non-parametric post-hoc test for comparing the medians of three or more independent groups. Python offers an excellent suite of libraries for performing and visualizing Dunn’s Test. It’s important to understand the assumptions and limitations of this test and to interpret the results with caution. Dunn’s Test should be used as part of a broader statistical analysis plan, which should be carefully devised based on research questions and the nature of the data.