Linear Regression Calculator
Calculate the least-squares linear regression line, slope, intercept, and R² coefficient of determination from paired data points.
Formula
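\hat{y} = mx + b
m = \frac{n \sum xy - \sum x \sum y}{n \sum x^2 - (\sum x)^2}
b = \bar{y} - m\bar{x}
Here \hat{y} is the predicted value, m is the slope, b is the y-intercept, n is the number of data points, and \bar{x} and \bar{y} are the means of x and y respectively.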
Source: Ordinary Least Squares method — standard statistical regression as defined in most introductory statistics textbooks.
How it works
Ordinary least-squares (OLS) regression minimises the sum of squared vertical distances between each observed data point and the fitted line. The slope m and intercept b are computed directly from the sums of x, y, xy, and x² across all n points using closed-form formulas — no iteration required.
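The closed-form calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual code: the function name `fit_line` is hypothetical, and the sketch assumes the inputs are equal-length lists containing at least two distinct x values.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through paired data.

    Assumes len(xs) == len(ys) and that the x values are not all identical.
    """
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    # Slope from the closed-form OLS formula; no iteration is involved.
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Intercept: the fitted line always passes through (mean x, mean y).
    intercept = sum_y / n - slope * sum_x / n
    return slope, intercept


# Points lying exactly on y = 2x + 1 (illustrative data).
print(fit_line([0, 1, 2], [1, 3, 5]))  # (2.0, 1.0)
```

Working from the four running sums keeps the computation to a single pass over the data.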
R² ranges from 0 to 1 and represents the proportion of variance in y explained by x. An R² of 1 indicates a perfect linear fit, while 0 indicates the line explains none of the variability. In simple linear regression, the Pearson r is the square root of R² with the same sign as the slope m, so it indicates both the strength and the direction of the linear relationship.
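The relationship between r and R² can be checked numerically. The sketch below computes the Pearson r from deviations about the means and squares it to get R². The function name `pearson_r_and_r2` is hypothetical, and the code assumes that neither x nor y is constant, since otherwise the denominator would be zero.

```python
import math


def pearson_r_and_r2(xs, ys):
    """Return (r, R²) for paired data in a simple linear regression."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross-products about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sxy / math.sqrt(sxx * syy)
    return r, r * r


# A perfectly linear but decreasing relationship (illustrative data).
r, r2 = pearson_r_and_r2([1, 2, 3], [6, 4, 2])
print(r, r2)  # -1.0 1.0
```

Here r is negative because y falls as x rises, while R² is still 1 because the points lie exactly on a line.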
Worked example
Suppose you have five data points: (1, 2), (2, 4), (3, 5), (4, 4), (5, 5).
Compute the sums: n = 5, Σx = 15, Σy = 20, Σxy = 66, Σx² = 55. Then: m = (5·66 − 15·20) / (5·55 − 15²) = (330 − 300) / (275 − 225) = 30 / 50 = 0.6. The intercept: b = (20/5) − 0.6·(15/5) = 4 − 1.8 = 2.2. So the regression line is ŷ = 0.6x + 2.2. The residual sum of squares is 2.4 and the total sum of squares about the mean is 6, so R² = 1 − 2.4/6 = 0.6, meaning 60% of the variance in y is explained by x.
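To check the arithmetic for these five points by machine, a short script along the following lines prints each intermediate quantity. It is a sketch only: the helper name `regression_summary` and the dictionary keys are hypothetical, and R² is computed as 1 minus the ratio of the residual sum of squares to the total sum of squares.

```python
points = [(1, 2), (2, 4), (3, 5), (4, 4), (5, 5)]


def regression_summary(points):
    """Return the sums, coefficients, and R² for a list of (x, y) pairs."""
    n = len(points)
    sum_x = sum(x for x, _ in points)
    sum_y = sum(y for _, y in points)
    sum_xy = sum(x * y for x, y in points)
    sum_x2 = sum(x * x for x, _ in points)

    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    intercept = sum_y / n - slope * sum_x / n

    mean_y = sum_y / n
    # Residual and total sums of squares.
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in points)
    ss_tot = sum((y - mean_y) ** 2 for _, y in points)

    return {
        "n": n,
        "sum_x": sum_x,
        "sum_y": sum_y,
        "sum_xy": sum_xy,
        "sum_x2": sum_x2,
        "slope": slope,
        "intercept": intercept,
        "r_squared": 1 - ss_res / ss_tot,
    }


summary = regression_summary(points)
for name, value in summary.items():
    print(f"{name}: {value:.4g}")
```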
Limitations & notes
Linear regression assumes a linear relationship between x and y; if the true relationship is curved, the model will be a poor fit regardless of how much data you have. It is also sensitive to outliers, which can dramatically shift the slope and intercept. This calculator performs simple (one-predictor) linear regression only — for multiple predictors, a dedicated multiple regression tool is needed. Always inspect a scatter plot before interpreting results.
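The sensitivity to outliers is easy to demonstrate. In the sketch below, five made-up points lie exactly on y = 2x, and then a single y value is replaced by an extreme one. The data and the helper `fit_line` are illustrative assumptions, not part of the calculator.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line (assumes varying x)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    intercept = sum_y / n - slope * sum_x / n
    return slope, intercept


xs = [1, 2, 3, 4, 5]
clean = [2, 4, 6, 8, 10]         # exactly y = 2x
with_outlier = [2, 4, 6, 8, 30]  # last point replaced by an extreme value

print(fit_line(xs, clean))         # (2.0, 0.0)
print(fit_line(xs, with_outlier))  # (6.0, -8.0)
```

One changed point out of five triples the slope, which is why a scatter plot is worth checking before trusting the fitted coefficients.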
Frequently asked questions
What does R² tell me?
R² is the proportion of variance in y that is explained by the linear relationship with x, on a scale from 0 to 1. An R² of 0.85 means 85% of the variation in y is captured by the regression line.
What is the difference between R² and Pearson r?
Pearson r measures the strength and direction (positive or negative) of the linear correlation, ranging from −1 to 1. R² is r squared, so it is always non-negative and tells you the proportion of explained variance without indicating direction.
How many data points do I need?
You need at least 2 points with different x values to fit a line, but 2 points always give a perfect fit (R² = 1), which is statistically meaningless. As a rule of thumb, aim for at least 10–20 points for a reliable regression.
Can I use this for time-series data?
Yes — enter time (e.g., year or day number) as x and your measured variable as y to fit a linear trend. Be aware that time-series data often violates the independence assumption of OLS, so treat results as exploratory.
What if all my x values are the same?
If all x values are identical, the denominator of the slope formula equals zero and the regression is undefined — the calculator will return NaN. You need variation in x to compute a meaningful regression line.
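One way to handle this degenerate case in code is to test for it before dividing. The sketch below is an assumption about how such a guard might look, not the calculator's actual implementation, and the function name `safe_slope` is hypothetical. It compares the smallest and largest x instead of testing the denominator against zero, because floating-point rounding can leave a computed denominator slightly off zero.

```python
import math


def safe_slope(xs, ys):
    """Return the OLS slope, or NaN when every x value is the same."""
    if min(xs) == max(xs):
        # No variation in x: the slope formula would divide by zero.
        return math.nan
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    return (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)


print(safe_slope([3, 3, 3], [1, 2, 3]))  # nan
print(safe_slope([0, 1, 2], [1, 3, 5]))  # 2.0
```

Without the guard, plain Python division by zero raises `ZeroDivisionError` instead of producing NaN, so the explicit check is what yields the NaN result.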
Last updated: 2025-01-15 · Formula verified against primary sources.