Mann-Whitney U Test
Definition
The Mann-Whitney U test (also known as the Wilcoxon rank-sum test) is a non-parametric alternative to the independent-samples t-test. It compares two independent groups using the ranks of the pooled observations rather than the raw values.
Formula
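U Statistic:
$$U = n_1 n_2 + \frac{n_1(n_1+1)}{2} - R_1$$
Where:
- $n_1, n_2$ = sample sizes
- $R_1$ = sum of ranks for group 1
Standardized Test Statistic:
$$Z = \frac{U - \frac{n_1 n_2}{2}}{\sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}}$$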
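The two statistics can be sketched in plain Python. This is a minimal illustration, assuming the convention $U = n_1 n_2 + n_1(n_1+1)/2 - R_1$ and the usual normal approximation with no tie or continuity correction; the helper names `midranks` and `mann_whitney_u` are hypothetical and not part of any library.

```python
from math import sqrt


def midranks(values):
    """Rank values from 1..N; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(group1, group2):
    """Return (U, Z) for group1 versus group2 (assumed convention, see lead-in)."""
    n1, n2 = len(group1), len(group2)
    ranks = midranks(list(group1) + list(group2))
    r1 = sum(ranks[:n1])  # rank sum of group 1
    u = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    z = (u - n1 * n2 / 2) / sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return u, z
```

Statistical software may report the complementary statistic $n_1 n_2 - U$ instead; the magnitude of $Z$ is the same either way.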
Key Assumptions
Independent Samples: Observations must be independent between and within groups
Ordinal Scale: Data must be at least ordinal (can be ranked)
Similar Shapes: Distributions should have similar shapes (for comparing medians)
Common Pitfalls
- Using with paired/dependent samples (use Wilcoxon signed-rank instead)
- Interpreting results as comparing means rather than distributions
- Not checking for ties in the data when using exact calculations
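The ties pitfall can be checked programmatically. The sketch below uses made-up data containing one tie and a hypothetical helper `has_ties`; it falls back to SciPy's asymptotic method when ties are present, because SciPy's exact method makes no correction for ties.

```python
from scipy.stats import mannwhitneyu, rankdata


def has_ties(x, y):
    """True if any value occurs more than once in the pooled data."""
    pooled = list(x) + list(y)
    return len(set(pooled)) < len(pooled)


# Hypothetical data: 47 appears in both groups
a = [45, 47, 43, 44]
b = [52, 47, 54, 50]

# rankdata defaults to method='average', so tied values get midranks
ranks = rankdata(a + b)

# Use the exact distribution only when there are no ties
method = "asymptotic" if has_ties(a, b) else "exact"
result = mannwhitneyu(a, b, alternative="two-sided", method=method)

print(f"Ranks: {ranks}")
print(f"Method: {method}, p-value: {result.pvalue:.4f}")
```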
Practical Example
Testing if a treatment affects test scores:
Step 1: State the Data
- Control group: 45, 47, 43, 44
- Treatment group: 52, 48, 54, 50
- Sample sizes: $n_1 = n_2 = 4$
Step 2: State Hypotheses
- $H_0$: The distributions are identical
- $H_1$: The distributions differ
Step 3: Calculate Rankings
Value | Group | Rank |
---|---|---|
43 | Control | 1 |
44 | Control | 2 |
45 | Control | 3 |
47 | Control | 4 |
48 | Treatment | 5 |
50 | Treatment | 6 |
52 | Treatment | 7 |
54 | Treatment | 8 |
Sum of Control ranks: $R_1 = 1 + 2 + 3 + 4 = 10$
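The ranking table can be reproduced with a few lines of Python. This is a sketch that relies on the example data having no ties, so each rank is simply the position in the sorted pooled sample; the variable names are illustrative.

```python
control = [45, 47, 43, 44]
treatment = [52, 48, 54, 50]

# Pool the observations, remembering which group each one came from
labelled = [(v, "Control") for v in control] + [(v, "Treatment") for v in treatment]
labelled.sort(key=lambda pair: pair[0])

# No ties here, so rank = 1-based position in the sorted pooled data
table = [(value, group, rank) for rank, (value, group) in enumerate(labelled, start=1)]

for value, group, rank in table:
    print(f"{value} | {group} | {rank}")

rank_sum_control = sum(rank for _, group, rank in table if group == "Control")
rank_sum_treatment = sum(rank for _, group, rank in table if group == "Treatment")
print(f"Sum of Control ranks: {rank_sum_control}")
print(f"Sum of Treatment ranks: {rank_sum_treatment}")
```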
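Step 4: Calculate U Statistic
Using the formula $U = n_1 n_2 + \frac{n_1(n_1+1)}{2} - R_1$:
$$U = (4)(4) + \frac{4(4+1)}{2} - 10 = 16 + 10 - 10 = 16$$
Step 5: Calculate Standardized Statistic
$$Z = \frac{U - \frac{n_1 n_2}{2}}{\sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}} = \frac{16 - 8}{\sqrt{12}} \approx 2.31$$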
Step 6: Draw Conclusion
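The two-sided $p$-value from the normal approximation is 0.021 (the exact test gives $2/70 \approx 0.029$ for samples this small). Since the $p$-value $< \alpha = 0.05$, we reject $H_0$. There is sufficient evidence to conclude that the treatment and control groups have different distributions of scores.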
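With only eight observations, the exact null distribution of U can be enumerated directly and compared with the normal approximation. The sketch below uses only the standard library; `pair_count_u` is a hypothetical helper that counts the pairs in which the second group's value exceeds the first group's, which equals U under the rank formula when there are no ties.

```python
from itertools import combinations
from math import erfc, sqrt

control = [45, 47, 43, 44]
treatment = [52, 48, 54, 50]
n1, n2 = len(control), len(treatment)
pooled = control + treatment

mean_u = n1 * n2 / 2
sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)


def pair_count_u(group1, group2):
    """Number of (a, b) pairs with b > a; assumes no ties."""
    return sum(1 for a in group1 for b in group2 if b > a)


u_obs = pair_count_u(control, treatment)

# Exact test: try every way of labelling n1 of the pooled values as "group 1"
extreme = 0
total = 0
for idx in combinations(range(len(pooled)), n1):
    g1 = [pooled[i] for i in idx]
    g2 = [pooled[i] for i in range(len(pooled)) if i not in idx]
    total += 1
    if abs(pair_count_u(g1, g2) - mean_u) >= abs(u_obs - mean_u):
        extreme += 1
p_exact = extreme / total

# Normal approximation (no continuity correction): p = 2 * (1 - Phi(|z|))
z = (u_obs - mean_u) / sd_u
p_normal = erfc(abs(z) / sqrt(2))

print(f"U = {u_obs}, exact p = {p_exact:.4f}, normal-approximation p = {p_normal:.4f}")
```

Both p-values fall below 0.05 here, so the conclusion does not depend on which one is used.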
Effect Size
Effect size $r$ for the Mann-Whitney U test:
$$r = \frac{|Z|}{\sqrt{N}}$$
Where:
- $Z$ = standardized test statistic
- $N$ = total sample size ($n_1 + n_2$)
Interpretation (Cohen's conventional benchmarks):
- Small effect: $0.1 \le r < 0.3$
- Medium effect: $0.3 \le r < 0.5$
- Large effect: $r \ge 0.5$
In the example above, the effect size is $r = \frac{2.31}{\sqrt{8}} \approx 0.82$, which indicates a large effect.
Code Examples
R
# Effect size calculation for Mann-Whitney U test
library(tidyverse)  # provides str_glue() via stringr

wilcoxonR <- function(x, y) {
  n1 <- length(x)
  n2 <- length(y)

  is_small_sample <- (n1 + n2) <= 30
  has_ties <- anyDuplicated(c(x, y)) > 0

  # Best practices for parameter 'exact'
  # - If the sample size is small (n1 + n2 <= 30) and there are no ties, set 'exact = TRUE'
  # - Otherwise, set 'exact = FALSE'
  test <- wilcox.test(x, y, exact = is_small_sample && !has_ties)

  # wilcox.test reports W = R1 - n1 * (n1 + 1) / 2 for the first sample (x);
  # |z| is unchanged if the complementary statistic n1 * n2 - W is used
  W <- as.numeric(test$statistic)

  # Calculate Z-score manually (normal approximation, no tie or continuity correction)
  mean_U <- n1 * n2 / 2
  sd_U <- sqrt((n1 * n2 * (n1 + n2 + 1)) / 12)
  z <- (W - mean_U) / sd_U

  # Calculate effect size (r)
  r <- abs(z) / sqrt(n1 + n2)

  return(list(effect_size = r, test_details = test))
}

# Example data
control <- c(45, 47, 43, 44)
treatment <- c(52, 48, 54, 50)

# Calculate effect size
result <- wilcoxonR(control, treatment)
print(str_glue("Effect size (r): {round(result$effect_size, 4)}"))
print(result$test_details)
Python
from scipy.stats import mannwhitneyu
import numpy as np

control = [45, 47, 43, 44]
treatment = [52, 48, 54, 50]

# Perform Mann-Whitney U test
stat, pvalue = mannwhitneyu(
    control,
    treatment,
    alternative='two-sided',
    method='auto'
)

print(f'U-statistic: {stat}')
print(f'p-value: {pvalue:.4f}')

# Effect size (r = |Z| / sqrt(N))
n1, n2 = len(control), len(treatment)
z_score = (stat - (n1*n2/2)) / np.sqrt((n1*n2*(n1+n2+1))/12)
effect_size = abs(z_score) / np.sqrt(n1 + n2)
print(f'Effect size (r): {effect_size:.4f}')
Alternative Tests
Consider these alternatives:
- Independent t-test: When the data are approximately normal and measured on an interval or ratio scale
- Mood's Median Test: When only differences in medians are of interest
- Kolmogorov-Smirnov Test: When any difference between the two distributions (location, spread, or shape) is of interest
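For comparison, the three alternatives can be run on the example data with SciPy. This is only a sketch: with four observations per group the assumptions of each test are hard to check, so the output illustrates the calls, not a recommended analysis.

```python
from scipy.stats import ks_2samp, median_test, ttest_ind

control = [45, 47, 43, 44]
treatment = [52, 48, 54, 50]

# Independent t-test (SciPy assumes equal variances by default)
t_res = ttest_ind(control, treatment)

# Mood's median test: compares counts above and below the grand median
med_stat, med_p, grand_median, counts = median_test(control, treatment)

# Two-sample Kolmogorov-Smirnov test: sensitive to any distributional difference
ks_res = ks_2samp(control, treatment)

print(f"t-test:  t = {t_res.statistic:.3f}, p = {t_res.pvalue:.4f}")
print(f"Mood:    grand median = {grand_median}, p = {med_p:.4f}")
print(f"KS test: D = {ks_res.statistic:.3f}, p = {ks_res.pvalue:.4f}")
```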