
Introduction
In statistical analysis, it is often important to compare multiple groups or samples to draw inferences. The Friedman Test is a non-parametric statistical test for detecting differences across three or more paired (repeated-measures) groups. Because it works on ranks within each subject, it compares the groups' rank distributions rather than their means. It is the non-parametric alternative to the one-way ANOVA with repeated measures and is used when the assumptions of the latter are not met, particularly when the data do not follow a normal distribution, are ordinal, or when the sample sizes are small.
In this article, we will guide you through the steps to perform the Friedman Test in Python, including preparing your data, performing the test, and interpreting the results. We will also give you an overview of the underlying concepts.
Table of Contents
- Background and Use Cases
- Understanding the Friedman Test
- Setting Up the Python Environment
- Preparing the Data
- Performing the Friedman Test
- Post-Hoc Analysis
- Interpreting the Results
- Conclusion
1. Background and Use Cases
1.1. Non-Parametric Tests
Non-parametric tests make fewer assumptions about the distribution of the data. They are often used when the data are ordinal or when the assumptions of parametric tests, such as normality, are violated.
1.2. Use Cases
The Friedman Test is used in scenarios where you have three or more paired groups. It is commonly used in clinical experiments, where subjects are measured under different conditions over time. For example, it can be used to compare the effect of three different medications on the same group of patients.
2. Understanding the Friedman Test
The Friedman Test is based on ranks. The basic steps involved in the Friedman Test are:
- Convert the raw data to ranks within each block (a set of measurements that are inherently paired).
- Calculate the test statistic.
- Compare the test statistic to a chi-square distribution to determine the p-value.
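The three steps above can be sketched by hand and checked against SciPy. This is a minimal illustration that assumes no tied values within a block (SciPy additionally applies a tie correction); the array values and the variable names are illustrative, with one row per block and one column per group.

```python
import numpy as np
from scipy.stats import chi2, friedmanchisquare, rankdata

# Rows are blocks (subjects), columns are groups (treatments); illustrative values.
scores = np.array([
    [23, 30, 29],
    [30, 28, 35],
    [35, 36, 38],
    [36, 35, 40],
])
n_blocks, k_groups = scores.shape

# Step 1: rank the values within each block (row).
ranks = rankdata(scores, axis=1)
rank_sums = ranks.sum(axis=0)

# Step 2: Friedman statistic without tie correction.
q_stat = (
    12.0 / (n_blocks * k_groups * (k_groups + 1)) * np.sum(rank_sums ** 2)
    - 3 * n_blocks * (k_groups + 1)
)

# Step 3: compare against a chi-square distribution with k - 1 degrees of freedom.
p_value = chi2.sf(q_stat, k_groups - 1)

# Cross-check against SciPy, which expects one sequence per group.
scipy_stat, scipy_p = friedmanchisquare(*scores.T)
print(q_stat, p_value, scipy_stat, scipy_p)
```

Ranking within rows is what makes the test insensitive to differences between subjects: only the ordering of the treatments inside each block contributes to the statistic.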
3. Setting Up the Python Environment
Before you can perform the Friedman Test, you need to make sure that Python is installed on your system. You will also need some libraries, including scipy, numpy, and pandas.
pip install scipy numpy pandas
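To confirm the installation worked, a quick check along these lines can help. It is only a sketch: it imports the three libraries and prints their versions, and an ImportError on any of the import lines would mean that package is missing.

```python
import numpy as np
import pandas as pd
import scipy

# Collect and print the installed version of each required library.
versions = {
    "numpy": np.__version__,
    "pandas": pd.__version__,
    "scipy": scipy.__version__,
}
for name, version in versions.items():
    print(f"{name}: {version}")
```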
4. Preparing the Data
For the Friedman Test, the data should consist of three or more paired groups, with one row per subject (block) and one column per condition. The data could be in a CSV file, Excel sheet, or any other format. Let’s assume you have it in a CSV file named data.csv:
Subject,Drug A,Drug B,Drug C
1,23,30,29
2,30,28,35
3,35,36,38
4,36,35,40
Load the data into Python:
import pandas as pd
data = pd.read_csv('data.csv')
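Before running the test it can be worth checking that every subject has a value for every treatment, since the test works on complete blocks. The sketch below assumes the same rows as the sample CSV, held in an in-memory string so it runs without a file; the names csv_text, treatment_cols, and complete are illustrative.

```python
import io

import pandas as pd

# Same rows as the sample CSV, kept in memory so the sketch is self-contained.
csv_text = """Subject,Drug A,Drug B,Drug C
1,23,30,29
2,30,28,35
3,35,36,38
4,36,35,40
"""
data = pd.read_csv(io.StringIO(csv_text))

# Every column except the subject identifier is treated as a treatment.
treatment_cols = [c for c in data.columns if c != "Subject"]

# Keep only complete blocks: subjects measured under every treatment.
complete = data.dropna(subset=treatment_cols)
n_dropped = len(data) - len(complete)
print(f"{len(complete)} complete subjects, {len(treatment_cols)} treatments, {n_dropped} dropped")
```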
5. Performing the Friedman Test
We will use the friedmanchisquare function from the scipy.stats module.
from scipy.stats import friedmanchisquare
# Extract the data
drug_a = data['Drug A']
drug_b = data['Drug B']
drug_c = data['Drug C']
# Perform the Friedman test
stat, p = friedmanchisquare(drug_a, drug_b, drug_c)
print('Statistics=%.3f, p=%.3f' % (stat, p))
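When there are many treatment columns, naming each one becomes tedious. As a sketch, the columns can be unpacked into the function instead; the DataFrame below simply rebuilds the sample data in memory so the example is self-contained, and treatment_cols is an illustrative name.

```python
import pandas as pd
from scipy.stats import friedmanchisquare

# The sample data from the CSV, rebuilt in memory.
data = pd.DataFrame({
    "Subject": [1, 2, 3, 4],
    "Drug A": [23, 30, 35, 36],
    "Drug B": [30, 28, 36, 35],
    "Drug C": [29, 35, 38, 40],
})

# Pass every non-identifier column as one group.
treatment_cols = [c for c in data.columns if c != "Subject"]
stat, p = friedmanchisquare(*(data[c] for c in treatment_cols))
print(f"Statistics={stat:.3f}, p={p:.3f}")  # Statistics=3.500, p=0.174
```

With only four subjects, the p-value of about 0.174 is above 0.05, so this sample alone gives no evidence of a difference between the three drugs.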
6. Post-Hoc Analysis
If the Friedman Test is significant, it only tells you that at least two groups are different. You need post-hoc tests to find out which groups are different. The Nemenyi test is one such post-hoc test. The scikit-posthocs library provides an implementation of this test.
pip install scikit-posthocs
import scikit_posthocs as sp
# Use the subject as the index so that only the treatment columns remain;
# stacking the full frame would wrongly treat 'Subject' as a treatment.
treatments = data.set_index('Subject')
# Perform the Nemenyi post-hoc test on the block-design table
# (rows are blocks, columns are groups)
results = sp.posthoc_nemenyi_friedman(treatments)
print(results)
7. Interpreting the Results
Interpreting the results involves examining the test statistic and the p-value.
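- If p < 0.05 (the conventional significance level), reject the null hypothesis that all groups have the same distribution: at least one group differs from the others.
- If p ≥ 0.05, you fail to reject the null hypothesis: the data do not provide sufficient evidence of a difference between the groups, which is not the same as showing that the groups are equal.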
In the case of a significant Friedman test, examine the results of the post-hoc test to understand which specific groups are different.
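Reading a pairwise p-value table by eye gets error-prone as the number of groups grows. The helper below is a hypothetical sketch, not part of any library: it assumes a square pandas DataFrame of pairwise p-values labelled by group name on both axes, like the table the post-hoc step prints, and lists the pairs below a chosen threshold. The p-values in the example are made up for illustration and are not computed from the sample data.

```python
from itertools import combinations

import pandas as pd


def significant_pairs(p_values, alpha=0.05):
    """Return (group_a, group_b, p) for each pair of groups with p below alpha."""
    pairs = []
    for a, b in combinations(p_values.columns, 2):
        p = p_values.loc[a, b]
        if p < alpha:
            pairs.append((a, b, float(p)))
    return pairs


# Made-up p-values for illustration only; not results from the sample data.
groups = ["Drug A", "Drug B", "Drug C"]
example = pd.DataFrame(
    [[1.00, 0.90, 0.03],
     [0.90, 1.00, 0.20],
     [0.03, 0.20, 1.00]],
    index=groups,
    columns=groups,
)
print(significant_pairs(example))  # [('Drug A', 'Drug C', 0.03)]
```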
8. Conclusion
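The Friedman Test is an essential non-parametric test for comparing three or more paired groups. Python, with its rich ecosystem of libraries, provides an excellent environment for performing this test. It is crucial to understand the assumptions and limitations of the Friedman Test and interpret the results carefully. Remember that the p-value relies on a chi-square approximation that becomes unreliable for small samples (the SciPy documentation describes it as reliable only for more than 10 subjects and more than 6 repeated measurements), so with a data set as small as the four-subject example used here, treat the result with caution and use it in conjunction with other methods of analysis to draw meaningful conclusions.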