Correlation Coefficient Calculator
Calculate Pearson's correlation coefficient (r) to measure the linear relationship between two variables.
Formula
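r = [n·Σx_i y_i − (Σx_i)(Σy_i)] / √{[n·Σx_i² − (Σx_i)²] · [n·Σy_i² − (Σy_i)²]}
Here r is Pearson's correlation coefficient, n is the number of data pairs, and x_i and y_i are the individual data values for variables X and Y respectively. Each sum runs over all n pairs.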
Source: Pearson, K. (1895). Note on regression and inheritance in the case of two parents. Proceedings of the Royal Society of London, 58, 240–242.
How it works
The calculator takes two equally sized sets of paired values and applies the Pearson formula. It computes the sum of each variable, the sum of their pairwise products, and the sum of their squares across all n pairs, then combines these into the numerator (measuring co-variation) and the denominator (measuring individual variation). The result always lies between −1 and +1.
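The steps described above can be sketched in a few lines of Python. This is a minimal illustration and not the calculator's actual code; the function name `pearson_r` and its input checks are assumptions made for the example.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from raw sums (hypothetical helper, for illustration only)."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same number of values")
    n = len(xs)
    if n < 2:
        raise ValueError("at least two data pairs are required")

    # Sums of each variable, their products, and their squares.
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    # Numerator: co-variation. Denominator: individual variation.
    numerator = n * sum_xy - sum_x * sum_y
    spread_x = n * sum_x2 - sum_x ** 2
    spread_y = n * sum_y2 - sum_y ** 2
    if spread_x == 0 or spread_y == 0:
        raise ValueError("r is undefined when a variable has zero variance")
    return numerator / math.sqrt(spread_x * spread_y)


# A perfectly linear data set should give r = 1.
print(pearson_r([1, 2, 3, 4], [10, 20, 30, 40]))
```

The zero-variance check matters in practice: if every X (or every Y) is identical, the denominator is zero and r has no defined value.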
The output r² (coefficient of determination) tells you the proportion of variance in Y that is explained by X. For example, r = 0.90 gives r² = 0.81, meaning 81% of the variability in Y is accounted for by the linear relationship with X.
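One way to see why r² is read as "variance explained" is to fit a least-squares line and compare. The sketch below uses made-up data and a hypothetical helper, `r_squared_two_ways`; it computes r² directly and also as 1 − (residual sum of squares) / (total sum of squares) for the fitted line. For simple linear regression with an intercept, the two should agree up to rounding.

```python
def r_squared_two_ways(xs, ys):
    """Return (r squared, share of Y's variance explained by the fitted line)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sums of squared deviations and cross-deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    r_squared = sxy ** 2 / (sxx * syy)

    # Least-squares line y = intercept + slope * x.
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    explained = 1 - ss_res / syy
    return r_squared, explained


# Illustrative data only (not taken from the calculator).
xs = [1, 2, 3, 4, 5, 6]
ys = [1.5, 3.1, 4.4, 6.2, 7.4, 9.1]
r2, explained = r_squared_two_ways(xs, ys)
print(r2, explained)
```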
Worked example
Suppose X = {1, 2, 3, 4, 5} and Y = {2, 4, 5, 4, 5}. We have n = 5.
Compute: ΣX = 15, ΣY = 20, ΣXY = 2 + 8 + 15 + 16 + 25 = 66, ΣX² = 55, ΣY² = 86.
Numerator: 5(66) − (15)(20) = 330 − 300 = 30.
Denominator: √[(5·55 − 225)(5·86 − 400)] = √[(50)(30)] = √1500 ≈ 38.73.
Result: r = 30 / 38.73 ≈ 0.7746, indicating a strong positive linear relationship.
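The arithmetic for the X and Y values above can be re-run step by step as a check. The snippet below is a plain transcription of the formula; the variable names are chosen for the example and are not part of the calculator.

```python
import math

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)

# Column sums used by the formula.
sum_x = sum(xs)
sum_y = sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x * x for x in xs)
sum_y2 = sum(y * y for y in ys)

numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print("sum X =", sum_x, "| sum Y =", sum_y, "| sum XY =", sum_xy)
print("sum X^2 =", sum_x2, "| sum Y^2 =", sum_y2)
print("numerator =", numerator, "| denominator =", round(denominator, 2))
print("r =", round(r, 4))
```

Printing each intermediate sum makes it easy to spot which step a hand calculation went wrong in.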
Limitations & notes
Pearson's r only measures linear relationships — two variables can be strongly related in a non-linear way yet produce r ≈ 0. It is also sensitive to outliers, which can inflate or deflate r substantially. The coefficient does not imply causation; a high r simply indicates association. For non-normal or ordinal data, Spearman's rank correlation may be more appropriate.
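Two of these limitations are easy to demonstrate with small, made-up data sets. In the sketch below (the helper `pearson_r` and all data values are illustrative assumptions), Y = X² on symmetric X values is perfectly determined by X yet gives r = 0, and a single extreme point turns a perfect negative correlation into a strongly positive one.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r in mean-centred form (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# 1) A strong non-linear relationship that r does not detect.
xs = [-2, -1, 0, 1, 2]
ys = [x * x for x in xs]          # Y is fully determined by X
r_parabola = pearson_r(xs, ys)

# 2) One outlier reversing the sign of r.
base_x = [1, 2, 3, 4, 5]
base_y = [5, 4, 3, 2, 1]          # perfectly decreasing
r_base = pearson_r(base_x, base_y)
r_with_outlier = pearson_r(base_x + [20], base_y + [20])

print("parabola:", r_parabola)
print("without outlier:", r_base)
print("with outlier:", round(r_with_outlier, 4))
```

Plotting the data before trusting r is the usual safeguard against both effects.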
Frequently asked questions
What does a correlation coefficient of 0 mean?
An r of 0 indicates no linear relationship between the two variables. The variables may still be related in a non-linear way.
What is a 'strong' correlation?
Conventionally, |r| ≥ 0.7 is considered strong, 0.4 ≤ |r| < 0.7 is moderate, and |r| < 0.4 is weak, though these thresholds vary by field.
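The thresholds above translate directly into a small lookup. The function below is a sketch with a hypothetical name, `correlation_strength`; it hard-codes the conventional cut-offs quoted here, which a given field may well replace with its own.

```python
def correlation_strength(r):
    """Label |r| using the conventional 0.4 / 0.7 cut-offs (field-dependent)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    magnitude = abs(r)
    if magnitude >= 0.7:
        return "strong"
    if magnitude >= 0.4:
        return "moderate"
    return "weak"


for value in (0.85, -0.55, 0.10):
    print(value, correlation_strength(value))
```

The label depends only on |r|, so the sign (direction) has to be reported separately.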
Does correlation imply causation?
No — correlation only measures the tendency of two variables to move together. Establishing causation requires controlled experiments or additional causal reasoning.
What is the difference between r and r²?
r measures the strength and direction of the linear relationship, while r² (the coefficient of determination) represents the proportion of variance in Y explained by X.
When should I use Spearman's correlation instead?
Use Spearman's rank correlation when your data are ordinal, contain significant outliers, or follow a monotonic but non-linear relationship.
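Spearman's coefficient can be computed as Pearson's r applied to the ranks of the data, with tied values sharing the average of their ranks. The sketch below follows that definition; the helper names `average_ranks` and `spearman_rho` and the sample data are assumptions for illustration. With Y = X³ the relationship is monotonic but not linear, so Spearman's value reaches 1 while Pearson's stays below it.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r in mean-centred form (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks; tied values receive the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman's rank correlation: Pearson's r on the ranked data."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]         # monotonic, non-linear
print("Pearson:", round(pearson_r(xs, ys), 4))
print("Spearman:", spearman_rho(xs, ys))
```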
Last updated: 2025-01-15 · Formula verified against primary sources.