How to Conduct a Mann-Whitney U Test in Python

The Mann-Whitney U test, also known as the Wilcoxon rank-sum test, is a nonparametric statistical test used to assess whether two independent groups differ, without assuming that the data are normally distributed. It is often used in place of the independent-samples t-test when that test’s assumptions, normality in particular, are violated. The Mann-Whitney U test works by ranking all the values from both groups together and then evaluating whether the ranks in one group tend to be higher than the ranks in the other.
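
To make the ranking step concrete, here is a minimal sketch that computes the U statistics by hand from the pooled ranks. It uses the same sample values as the example later in this article and SciPy's rankdata function; the variable names are chosen for illustration only.

```python
from scipy.stats import rankdata

# Same sample values as the example used later in this article
groupA = [4.2, 3.8, 4.0, 4.3, 4.1, 4.0, 4.2]
groupB = [5.2, 5.3, 5.0, 5.3, 5.1, 5.2, 5.3]

n1, n2 = len(groupA), len(groupB)

# Rank all values together; tied values share the average of their ranks
ranks = rankdata(groupA + groupB)

# Sum of the ranks that belong to the first group
rank_sum_a = ranks[:n1].sum()

# U statistic for each group; the two always add up to n1 * n2
u_a = rank_sum_a - n1 * (n1 + 1) / 2
u_b = n1 * n2 - u_a

print('U for Group A:', u_a)
print('U for Group B:', u_b)
```

A U value close to 0 for one group means that its values occupy almost all of the lowest ranks, which is what a systematic difference between the groups looks like in rank terms.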

Python’s scientific computing library, SciPy, includes the Mann-Whitney U test in its suite of statistical functions. This article will walk you through how to conduct a Mann-Whitney U test in Python using SciPy.

Hypothetical Scenario

For illustration, consider a hypothetical scenario: You are a researcher investigating a new medication designed to reduce the duration of common cold symptoms. You have a test group that has received the medication (Group A) and a control group that received a placebo (Group B). Both groups had the common cold, and you have recorded the duration (in days) of their cold symptoms.

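The null hypothesis is that symptom durations in Group A and Group B come from the same distribution. If the two distributions have a similar shape, this amounts to saying there is no difference in median symptom duration. The alternative hypothesis is that durations in one group tend to be longer than in the other.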

Performing the Mann-Whitney U Test

First, import the mannwhitneyu function from SciPy. It is the only import this example needs.

from scipy.stats import mannwhitneyu

Next, let’s assume you have the symptom duration data stored in two Python lists:

groupA = [4.2, 3.8, 4.0, 4.3, 4.1, 4.0, 4.2]
groupB = [5.2, 5.3, 5.0, 5.3, 5.1, 5.2, 5.3]

We can now perform the Mann-Whitney U test using the mannwhitneyu function from scipy.stats:

u_stat, p_value = mannwhitneyu(groupA, groupB, alternative='two-sided')

The mannwhitneyu function takes the two samples you want to compare as its first two arguments and returns the U statistic and the corresponding p-value. In SciPy 1.7 and later the test is two-sided by default, while older releases defaulted to a one-sided p-value, so passing alternative='two-sided' explicitly keeps the behavior the same across versions.
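
If the research question is directional, for example whether the medication shortens symptoms, the alternative argument also accepts 'less' and 'greater'. The following is a small sketch comparing the two-sided and one-sided p-values on the same data. It assumes the direction was chosen before looking at the data.

```python
from scipy.stats import mannwhitneyu

groupA = [4.2, 3.8, 4.0, 4.3, 4.1, 4.0, 4.2]
groupB = [5.2, 5.3, 5.0, 5.3, 5.1, 5.2, 5.3]

# Two-sided: the groups differ in either direction
_, p_two_sided = mannwhitneyu(groupA, groupB, alternative='two-sided')

# One-sided: values in groupA tend to be smaller than values in groupB
_, p_less = mannwhitneyu(groupA, groupB, alternative='less')

print('Two-sided p-value:', p_two_sided)
print('One-sided p-value (A less than B):', p_less)
```

The one-sided p-value is smaller here because all of the evidence points in the hypothesized direction.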

Interpreting Results

Finally, we can print out the results:

print('U-statistic:', u_stat)
print('P-value:', p_value)

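The U statistic summarizes how the ranks are split between the two groups. In recent SciPy versions it is reported for the first sample and counts the pairs in which a Group A value exceeds a Group B value, with ties counted as one half. In this data set every value in Group A is lower than every value in Group B, so U is 0, the smallest possible value. U on its own is hard to interpret without the sample sizes, so the p-value is what we’re usually interested in.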

The p-value is the probability of observing a result at least as extreme as the one in our data, assuming that the null hypothesis is true. If the p-value is below our significance level (typically 0.05), we can reject the null hypothesis and infer that there is a statistically significant difference in symptom duration between the two groups.
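
The decision rule described above can be sketched in a few lines. The significance level alpha and the wording of the decision strings are illustrative choices, not part of SciPy.

```python
from scipy.stats import mannwhitneyu

groupA = [4.2, 3.8, 4.0, 4.3, 4.1, 4.0, 4.2]
groupB = [5.2, 5.3, 5.0, 5.3, 5.1, 5.2, 5.3]

# Significance level, chosen before running the test
alpha = 0.05

u_stat, p_value = mannwhitneyu(groupA, groupB, alternative='two-sided')

# Compare the p-value with the significance level
if p_value < alpha:
    decision = 'reject the null hypothesis'
else:
    decision = 'fail to reject the null hypothesis'

print(f'p = {p_value:.4f}: {decision} at alpha = {alpha}')
```

For this data set the p-value falls below 0.05, so the null hypothesis is rejected.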


The Mann-Whitney U test is a powerful tool for statistical hypothesis testing when the assumptions of a t-test can’t be met. It provides a nonparametric alternative that can offer meaningful insights from your data.

One point to note is that while the Mann-Whitney U test determines whether two distributions are different, it does not tell us where the distributions differ or by how much they differ. Furthermore, like all tests, it does not ‘prove’ anything but merely provides evidence to support or refute a hypothesis. The strength of this evidence depends on the quality of your data and the appropriateness of the test for your specific situation.
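
To get a rough sense of how much the groups differ, the test can be paired with simple descriptive measures. The sketch below shows one possible approach, not something the test itself reports: the share of (A, B) pairs in which the Group A value is lower, and the difference between the sample medians. The variable names are illustrative.

```python
from statistics import median

groupA = [4.2, 3.8, 4.0, 4.3, 4.1, 4.0, 4.2]
groupB = [5.2, 5.3, 5.0, 5.3, 5.1, 5.2, 5.3]

# Every possible pairing of one Group A value with one Group B value
pairs = [(a, b) for a in groupA for b in groupB]

# Share of pairs in which the Group A value is lower; ties count as one half
prob_a_lower = sum(
    1.0 if a < b else 0.5 if a == b else 0.0 for a, b in pairs
) / len(pairs)

# Difference between the sample medians, in days
median_diff = median(groupB) - median(groupA)

print('Share of pairs with A lower than B:', prob_a_lower)
print('Median difference (B - A) in days:', round(median_diff, 2))
```

A share near 0.5 would indicate heavily overlapping groups, while values near 0 or 1 indicate almost complete separation.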

Also, while the Python code snippets provided here serve to illustrate how one might conduct a Mann-Whitney U test, remember to conduct appropriate exploratory data analyses and check assumptions before conducting any statistical tests.
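
As one example of such a check, the sketch below prints the median of each group and runs a Shapiro-Wilk normality test with scipy.stats.shapiro. It is only an illustration: with samples this small the normality test has little power, so plots and subject knowledge should also inform the choice of test.

```python
from statistics import median
from scipy.stats import shapiro

samples = {
    'Group A': [4.2, 3.8, 4.0, 4.3, 4.1, 4.0, 4.2],
    'Group B': [5.2, 5.3, 5.0, 5.3, 5.1, 5.2, 5.3],
}

# Shapiro-Wilk p-value for each group, keyed by group name
normality_p = {}

for name, values in samples.items():
    stat, p = shapiro(values)  # null hypothesis: the data are normally distributed
    normality_p[name] = p
    print(f'{name}: n = {len(values)}, median = {median(values)}, Shapiro-Wilk p = {p:.3f}')
```

A small Shapiro-Wilk p-value suggests a departure from normality, which is one reason to prefer a rank-based test over the t-test.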

Python’s statistical and data handling capabilities, combined with good scientific practice, provide a robust framework for analyzing and drawing meaningful insights from your data, regardless of whether they adhere to the assumptions of parametric statistics.
