EZ Statistics

Quartiles, Percentiles, and Interquartile Range (IQR)

Welcome to the wonderful world of quartiles and percentiles! Whether you're a student, a data enthusiast, or just someone who loves to make sense of numbers, this guide is for you. We'll break it all down with simple examples, friendly language, and helpful visualizations.

What Are Quartiles and Percentiles?

At their core, quartiles and percentiles are tools that help us understand how data is distributed. They divide data into smaller, manageable chunks, making it easier to analyze and interpret.

Quartiles

  • Split data into four equal parts
  • Each part represents 25% of the data
  • Used to summarize how the data is distributed

Percentiles

  • Split data into 100 equal parts
  • Each part represents 1% of the data
  • Provide a more granular view of the data

Understanding Quartiles

Box Plot Visualization


A box plot shows the five-number summary: minimum, Q1, median (Q2), Q3, and maximum. The box contains the middle 50% of the data.

Quartiles divide a dataset into four equal parts. These parts help you understand where a particular data point lies relative to the rest of the data.

Quartile                Description                 Interpretation
Q1 (First Quartile)     25th percentile             25% of data falls below this point
Q2 (Second Quartile)    50th percentile (Median)    50% of data falls below this point
Q3 (Third Quartile)     75th percentile             75% of data falls below this point
Q4                      100th percentile            Maximum value
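
As a rough illustration, the five-number summary that a box plot draws can be computed with Python's standard library. This is a minimal sketch: the `scores` list is hypothetical data invented for the example, and `statistics.quantiles` with its default "exclusive" method is only one of several quartile conventions, so other tools may report slightly different Q1 and Q3 values.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    ordered = sorted(values)
    # With n=4, statistics.quantiles returns the three quartile cut points.
    q1, q2, q3 = statistics.quantiles(ordered, n=4)
    return ordered[0], q1, q2, q3, ordered[-1]

# Hypothetical data, used only for illustration.
scores = [2, 4, 4, 5, 7, 8, 9, 10, 12, 13, 15]
print(five_number_summary(scores))  # (2, 4.0, 8.0, 12.0, 15)
```

The box of the plot would run from the second to the fourth entry of this tuple, with the median marked inside it.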

Interquartile Range (IQR)

The Interquartile Range (IQR) is a measure of statistical dispersion that tells us about the spread of the middle 50% of our data. It's calculated as the difference between the third quartile (Q3) and the first quartile (Q1):

$$\text{IQR} = Q_3 - Q_1$$

The IQR is particularly useful because it:

  • Is resistant to extreme values and outliers, because it depends only on the middle 50% of the data
  • Gives us a sense of how spread out the "typical" values are
  • Can be used to identify outliers in a dataset
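
The outlier use mentioned in the list above can be sketched as follows. The guide does not say which rule to use, so this example assumes the common convention of flagging values more than 1.5 × IQR below Q1 or above Q3; the multiplier `k`, the function name `iqr_outliers`, and the sample data are all assumptions made for illustration.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return (iqr, outliers) using fences at Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    outliers = [v for v in values if v < lower or v > upper]
    return iqr, outliers

# Hypothetical scores with one extreme value appended.
scores = [40, 50, 55, 60, 65, 70, 75, 80, 200]
iqr, outliers = iqr_outliers(scores)
print(iqr, outliers)  # 25.0 [200]
```

Replacing 200 with 2000 leaves the IQR at 25.0, which illustrates why the IQR is described as resistant to extreme values.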

Understanding Percentiles

Percentile Distribution


The curve shows a typical score distribution (a normal distribution) with selected percentiles marked. Notice how the 25th, 50th, and 75th percentiles divide the data into quarters.

Percentiles divide data into 100 equal parts. Each percentile tells you what percentage of the data falls below a certain value. For example, if your test score is at the 90th percentile:

  • You scored higher than 90% of other students
  • Only 10% of students scored higher than you

Calculating Quartiles and Percentiles

How to Calculate Percentiles

1. Order the Data

Arrange the values in ascending order:

40, 50, 55, 60, 65, 70, 75, 80

2. Calculate the Position

For the Pth percentile, use:

$$\text{Position} = \frac{P(n+1)}{100}$$

where P is the desired percentile and n is the number of values

3. Find the Value

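If the position is a whole number k, the percentile is simply the kth value in the ordered data. If the position is not a whole number, interpolate between the two neighboring values:

$$\text{Value} = x_k + d(x_{k+1} - x_k)$$

where:

  • k is the integer part of the position
  • d is the decimal part of the position
  • x_k is the kth value in the ordered data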

Example: 70th Percentile

Using our data with n = 8:

$$\text{Position} = \frac{70(8+1)}{100} = 6.3$$

Therefore, with k = 6 and d = 0.3:

$$P_{70} = x_6 + 0.3(x_7 - x_6) = 70 + 0.3(75 - 70) = 71.5$$
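
The three steps above can be sketched in plain Python. This is an illustrative implementation of the position formula rather than a library routine; the function name `percentile` is hypothetical, and clamping positions that fall outside the data to the smallest or largest value is an assumption the guide does not address.

```python
def percentile(values, p):
    """Pth percentile using position = P(n + 1) / 100 with linear interpolation."""
    ordered = sorted(values)                 # Step 1: order the data
    n = len(ordered)
    k, remainder = divmod(p * (n + 1), 100)  # Step 2: position = k + d
    k = int(k)                               # integer part of the position
    d = remainder / 100                      # decimal part of the position
    # Assumption: positions outside 1..n are clamped to the end values.
    if k < 1:
        return ordered[0]
    if k >= n:
        return ordered[-1]
    # Step 3: x_k is ordered[k - 1] because Python lists are 0-indexed.
    return ordered[k - 1] + d * (ordered[k] - ordered[k - 1])

data = [40, 50, 55, 60, 65, 70, 75, 80]
print(percentile(data, 70))  # 71.5
```

Here `percentile(data, 70)` reproduces the worked example: position 6.3, so the result lies 30% of the way from the 6th value (70) to the 7th value (75).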

How to Calculate Quartiles

Quartiles are special percentiles that divide your data into four equal parts. They are calculated using the same method as percentiles, where:

  • First quartile is the 25th percentile: position 2.25, so Q1 = 50 + 0.25(55 - 50) = 51.25
  • Second quartile is the 50th percentile (median): position 4.5, so Q2 = 60 + 0.5(65 - 60) = 62.5
  • Third quartile is the 75th percentile: position 6.75, so Q3 = 70 + 0.75(75 - 70) = 73.75
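
One way to check that quartiles really are just the 25th, 50th, and 75th percentiles is to compute both and compare. The sketch below uses `statistics.quantiles` from the Python standard library (Python 3.8 or later), whose "exclusive" method is based on n + 1 positions like the formula in this guide. Treat it as a cross-check on hand calculations, not as the only valid convention.

```python
import statistics

data = [40, 50, 55, 60, 65, 70, 75, 80]

# n=4 returns the 3 cut points that split the data into quarters.
quartiles = statistics.quantiles(data, n=4, method="exclusive")

# n=100 returns the 99 cut points P1..P99, so P25 sits at index 24, and so on.
percentiles = statistics.quantiles(data, n=100, method="exclusive")
as_percentiles = [percentiles[24], percentiles[49], percentiles[74]]

print(quartiles == as_percentiles)  # True
```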

Calculating Percentile Rank

What is Percentile Rank?

Percentile rank tells you the percentage of scores that fall below a particular value in a dataset.

The Formula

$$\text{Percentile Rank} = \frac{\text{Number of values below } X + 0.5}{\text{Total number of values}} \times 100$$

where X is the value whose percentile rank we want to find

Example Calculation

Using our previous dataset:

40, 50, 55, 60, 65, 70, 75, 80

To find the percentile rank of 65:

  • Values below 65: 40, 50, 55, 60 (4 values)
  • Total values: 8
  • $\text{Percentile Rank} = \frac{4 + 0.5}{8} \times 100 = 56.25$

Therefore, 65 is at the 56.25th percentile. Half of the values (4 of 8) fall strictly below it, and the 0.5 in the formula counts 65 itself as half a value, which raises the rank from 50 to 56.25.
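
The formula above adds 0.5 on the assumption that X appears exactly once in the data. A common generalization, which is not part of this guide and is shown here only as a sketch, counts half of all values equal to X, so that repeated scores are handled consistently. The function below is a hypothetical helper written for this example.

```python
def percentile_rank(x, values):
    """Percentile rank of x: values below x plus half of the values equal to x."""
    below = sum(1 for v in values if v < x)
    equal = sum(1 for v in values if v == x)
    return (below + 0.5 * equal) / len(values) * 100

data = [40, 50, 55, 60, 65, 70, 75, 80]
print(percentile_rank(65, data))  # 56.25
```

When X occurs once, this gives the same result as the formula in this section. Note that for a value that does not appear in the data at all, no half-count is added, which differs from applying the +0.5 formula literally.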

Implementation Examples

Python Implementation:

Python
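import numpy as np

# Sample data
data = np.array([40, 50, 55, 60, 65, 70, 75, 80])

# Percentiles. NumPy's default method ("linear") places the Pth percentile at
# position (n - 1) * P / 100 + 1, so results differ from the (n + 1) formula above.
p75 = np.percentile(data, 75)                   # 71.25 with the default method
p_multiple = np.percentile(data, [25, 50, 75])  # Multiple percentiles at once

# Quartiles using the (n + 1) position formula from this guide
# (method="weibull" requires NumPy 1.22 or later)
q1, q2, q3 = np.quantile(data, [0.25, 0.5, 0.75], method="weibull")  # 51.25, 62.5, 73.75

# Interquartile range
iqr = q3 - q1  # 22.5

# Percentile rank
def percentile_rank(x, values):
    """Percentage of values below x, counting x itself as half a value."""
    values = np.asarray(values)
    return (np.sum(values < x) + 0.5) / len(values) * 100

rank_65 = percentile_rank(65, data)  # 56.25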

R Implementation:

R
# Sample data
data <- c(40, 50, 55, 60, 65, 70, 75, 80)

# Percentiles. quantile() is part of base R, so no packages are needed.
# R's default (type = 7) differs from the (n + 1) formula above;
# type = 6 uses position P(n + 1) / 100.
p75 <- quantile(data, 0.75)                                  # 71.25 with the default type
p_multiple <- quantile(data, c(0.25, 0.5, 0.75), type = 6)   # 51.25, 62.5, 73.75

# Quartiles (same as above, just different interpretation)
q1 <- quantile(data, 0.25, type = 6)
q2 <- quantile(data, 0.50, type = 6)  # median
q3 <- quantile(data, 0.75, type = 6)

# Percentile rank
percentile_rank <- function(x, data) {
  (sum(data < x) + 0.5) / length(data) * 100
}

rank_65 <- percentile_rank(65, data)  # 56.25

In Conclusion

Quartiles and percentiles are your trusty tools for understanding and analyzing data. Whether you're ranking test scores, measuring income, or analyzing trends, they provide valuable insights into data distribution.

Next time you hear someone say, "I'm in the 90th percentile," you'll know exactly what they mean!
