Correlation Coefficient Calculator

Calculate Pearson's correlation coefficient (r) to measure the linear relationship between two variables.

Formula

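$$ r = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{\sqrt{\left[n\sum x_i^2 - \left(\sum x_i\right)^2\right]\left[n\sum y_i^2 - \left(\sum y_i\right)^2\right]}} $$

Here r is Pearson's correlation coefficient, n is the number of data pairs, and x_i and y_i are the individual data values for variables X and Y respectively. All sums run over i = 1, ..., n.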

Source: Pearson, K. (1895). Note on regression and inheritance in the case of two parents. Proceedings of the Royal Society of London, 58, 240–242.

How it works

The calculator takes two equally-sized sets of paired values and applies the Pearson formula. It computes the sums of each variable, their products, and their squares across all n pairs, then combines these into the numerator (measuring co-variation) and denominator (measuring individual variation).
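The steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual implementation; the function name `pearson_r` and its error handling are assumptions made for this example.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r computed from running sums (hypothetical helper)."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two data pairs")

    # Sums of each variable, their products, and their squares.
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    # Numerator: co-variation of X and Y.
    numerator = n * sum_xy - sum_x * sum_y
    # Denominator terms: individual variation of X and of Y.
    var_x = n * sum_x2 - sum_x ** 2
    var_y = n * sum_y2 - sum_y ** 2
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when one variable is constant")

    return numerator / math.sqrt(var_x * var_y)
```

The explicit check for a constant variable avoids a division by zero, since r is undefined when either variable has no variation.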

The calculator also reports r² (the coefficient of determination), which is the proportion of variance in Y explained by a straight-line fit on X. For example, r = 0.90 gives r² = 0.81, meaning 81% of the variability in Y is accounted for by the linear relationship with X.
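The "proportion of variance explained" reading can be checked numerically: for a least-squares line fitted to the data, 1 − SS_res/SS_tot equals r². The sketch below uses made-up data and hypothetical variable names purely for illustration.

```python
import math

# Illustrative data (assumed for this example only).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.5, 3.1, 4.4, 6.2, 7.1, 9.0]
n = len(xs)

mean_x = sum(xs) / n
mean_y = sum(ys) / n
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
s_xx = sum((x - mean_x) ** 2 for x in xs)
s_yy = sum((y - mean_y) ** 2 for y in ys)  # total sum of squares of Y

# Pearson's r in its mean-centred form.
r = s_xy / math.sqrt(s_xx * s_yy)

# Least-squares line y = intercept + slope * x.
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

# Residual sum of squares around the fitted line.
ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
explained = 1 - ss_res / s_yy

print(f"r^2 = {r ** 2:.6f}, 1 - SS_res/SS_tot = {explained:.6f}")
```

The two printed values agree up to floating-point rounding, which is what the interpretation of r² relies on.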

Worked example

Suppose X = {1, 2, 3, 4, 5} and Y = {2, 4, 5, 4, 5}. We have n = 5.

Compute: ΣX = 15, ΣY = 20, ΣXY = 2 + 8 + 15 + 16 + 25 = 66, ΣX² = 55, ΣY² = 86.

Numerator: 5(66) − (15)(20) = 330 − 300 = 30.

Denominator: √[(5·55 − 225)(5·86 − 400)] = √[(50)(30)] = √1500 ≈ 38.73.

Result: r = 30 / 38.73 ≈ 0.7746, indicating a strong positive linear relationship.
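One way to check the arithmetic in this example is to recompute each sum directly from the raw data. The snippet below is a small sketch for that purpose; the variable names are chosen for this illustration.

```python
import math

X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
n = len(X)

# Recompute every intermediate sum from the raw pairs.
sum_x = sum(X)
sum_y = sum(Y)
sum_xy = sum(x * y for x, y in zip(X, Y))
sum_x2 = sum(x * x for x in X)
sum_y2 = sum(y * y for y in Y)

numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print("sums:", sum_x, sum_y, sum_xy, sum_x2, sum_y2)
print("numerator:", numerator)
print("denominator:", round(denominator, 2))
print("r:", round(r, 4))
```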

Limitations & notes

Pearson's r only measures linear relationships — two variables can be strongly related in a non-linear way yet produce r ≈ 0. It is also sensitive to outliers, which can inflate or deflate r substantially. The coefficient does not imply causation; a high r simply indicates association. For non-normal or ordinal data, Spearman's rank correlation may be more appropriate.
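Two of these limitations are easy to demonstrate. The sketch below uses invented data and a hypothetical `pearson_r` helper: a perfect parabola gives r = 0, and a single extreme point turns a weak correlation into a very strong one.

```python
import math

def pearson_r(xs, ys):
    """Mean-centred form of Pearson's r (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# 1) Non-linear relationship: y is fully determined by x, yet r is 0.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x * x for x in xs]
r_parabola = pearson_r(xs, ys)

# 2) Outlier sensitivity: one extreme pair dominates the result.
base_x = [1, 2, 3, 4, 5]
base_y = [3, 1, 4, 2, 3]
r_base = pearson_r(base_x, base_y)
r_with_outlier = pearson_r(base_x + [20], base_y + [20])

print(f"parabola: r = {r_parabola:.4f}")
print(f"without outlier: r = {r_base:.4f}")
print(f"with outlier:    r = {r_with_outlier:.4f}")
```

With this data, r rises from roughly 0.14 to roughly 0.97 once the point (20, 20) is added, even though the other five pairs are unchanged.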

Frequently asked questions

What does a correlation coefficient of 0 mean?

An r of 0 indicates no linear relationship between the two variables. The variables may still be related in a non-linear way.

What is a 'strong' correlation?

Conventionally, |r| ≥ 0.7 is considered strong, 0.4 ≤ |r| < 0.7 is moderate, and |r| < 0.4 is weak, though these thresholds vary by field.
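As an illustration, these conventional cut-offs can be written as a small helper. The function name `describe_strength` is hypothetical, and the thresholds are parameters because, as noted above, they vary by field.

```python
def describe_strength(r, strong=0.7, moderate=0.4):
    """Label |r| using conventional (field-dependent) thresholds."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")

    magnitude = abs(r)
    if magnitude >= strong:
        label = "strong"
    elif magnitude >= moderate:
        label = "moderate"
    else:
        label = "weak"

    # The sign of r gives the direction of the relationship.
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    return label, direction


print(describe_strength(0.85))
print(describe_strength(-0.5))
print(describe_strength(0.1))
```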

Does correlation imply causation?

No — correlation only measures the tendency of two variables to move together. Establishing causation requires controlled experiments or additional causal reasoning.

What is the difference between r and r²?

r measures the strength and direction of the linear relationship, while r² (the coefficient of determination) represents the proportion of variance in Y explained by X.

When should I use Spearman's correlation instead?

Use Spearman's rank correlation when your data are ordinal, contain significant outliers, or follow a monotonic but non-linear relationship.
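Spearman's coefficient can be computed as Pearson's r applied to the ranks of the data, with tied values sharing their average rank. The sketch below follows that approach; the helper names `average_ranks`, `pearson_r`, and `spearman_rho` are assumptions for this example, and the data is invented.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values receive the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

def spearman_rho(xs, ys):
    """Spearman's rank correlation: Pearson's r on the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Monotonic but non-linear: y = x cubed.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
print(f"Pearson r    = {pearson_r(xs, ys):.4f}")
print(f"Spearman rho = {spearman_rho(xs, ys):.4f}")
```

Because the relationship is perfectly monotonic, Spearman's coefficient is 1 for this data, while Pearson's r falls below 1 because the points do not lie on a straight line.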

Last updated: 2025-01-15 · Formula verified against primary sources.