
ANOVA Calculator

Performs a one-way ANOVA to test whether the means of three or more independent groups are equal, against the alternative that at least one mean differs.


Formula

F = MS_between / MS_within, where MS_between = SS_between / df_between and MS_within = SS_within / df_within. SS_between is the sum of squares between groups, measuring variability due to treatment; SS_within is the sum of squares within groups, measuring residual error. df_between = k − 1, where k is the number of groups; df_within = N − k, where N is the total number of observations. MS denotes a mean square (an SS divided by its degrees of freedom).

Source: Montgomery, D.C. (2017). Design and Analysis of Experiments, 9th ed. Wiley.

How it works

One-way ANOVA partitions the total variability in a dataset into two components: variability attributable to differences between group means (treatment effect) and variability attributable to random fluctuation within groups (error). If group means truly differ, the between-group variance will be substantially larger than the within-group variance, producing a large F-statistic. The null hypothesis H₀ states that all population means are equal; the alternative H₁ states that at least one mean differs.

The F-statistic is the ratio of the Mean Square Between (MS_between) to the Mean Square Within (MS_within). MS_between = SS_between / (k − 1), where k is the number of groups and k − 1 is the between-group degrees of freedom. MS_within = SS_within / (N − k), where N is the total sample size and N − k is the within-group degrees of freedom. SS_between measures how much each group mean deviates from the grand mean, weighted by group size. SS_within measures the pooled variability of individual observations around their own group mean. The total SS is the sum of both components.

Effect size is quantified by eta-squared (η²), defined as SS_between / SS_total. Values around 0.01 are considered small, 0.06 medium, and 0.14 large by Cohen's conventions. To determine statistical significance, compare the computed F-value against a critical value from the F-distribution table at the desired alpha level (commonly 0.05), with degrees of freedom df₁ = k − 1 and df₂ = N − k. A statistically significant result only indicates that at least one group mean differs; post-hoc tests such as Tukey's HSD or Bonferroni correction are needed to identify which specific pairs differ.
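The computation described above can be sketched in a few lines of Python using only the standard library. The function name `one_way_anova` and its return signature are our own choices for illustration, not a fixed API:

```python
# Sketch of the one-way ANOVA computation described above (stdlib only).
from statistics import mean

def one_way_anova(groups):
    """Return (F, df_between, df_within, eta_squared) for a list of groups."""
    k = len(groups)                          # number of groups
    N = sum(len(g) for g in groups)          # total observations
    grand_mean = mean(x for g in groups for x in g)

    # SS_between: squared deviation of each group mean from the grand mean,
    # weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # SS_within: pooled squared deviations of observations from their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between, df_within = k - 1, N - k
    ms_between = ss_between / df_between     # mean squares
    ms_within = ss_within / df_within
    f_stat = ms_between / ms_within
    eta_sq = ss_between / (ss_between + ss_within)
    return f_stat, df_between, df_within, eta_sq
```

Calling `one_way_anova([[12, 15, 14, 13], [20, 22, 19, 21], [30, 28, 31, 29]])` reproduces the worked example below. For significance, the returned F is compared against an F-table at (df_between, df_within).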

Worked example

Suppose a plant biologist measures root lengths (cm) for seedlings under three light conditions:

  • Group 1 (Red light): 12, 15, 14, 13 — mean = 13.5
  • Group 2 (Blue light): 20, 22, 19, 21 — mean = 20.5
  • Group 3 (White light): 30, 28, 31, 29 — mean = 29.5

Step 1 — Grand mean: All 12 values sum to 254, so the grand mean = 254 / 12 ≈ 21.167.

Step 2 — SS Between: Each group has n = 4 observations.
SS_between = 4 × (13.5 − 21.167)² + 4 × (20.5 − 21.167)² + 4 × (29.5 − 21.167)²
= 4 × 58.778 + 4 × 0.444 + 4 × 69.444 = 514.667

Step 3 — SS Within:
Group 1: (12−13.5)² + (15−13.5)² + (14−13.5)² + (13−13.5)² = 2.25 + 2.25 + 0.25 + 0.25 = 5.0
Group 2: (20−20.5)² + (22−20.5)² + (19−20.5)² + (21−20.5)² = 0.25 + 2.25 + 2.25 + 0.25 = 5.0
Group 3: (30−29.5)² + (28−29.5)² + (31−29.5)² + (29−29.5)² = 0.25 + 2.25 + 2.25 + 0.25 = 5.0
SS_within = 15.0

Step 4 — Degrees of freedom: df_between = 3 − 1 = 2; df_within = 12 − 3 = 9.

Step 5 — Mean squares: MS_between = 514.667 / 2 = 257.333; MS_within = 15.0 / 9 = 1.667.

Step 6 — F-statistic: F = 257.333 / 1.667 ≈ 154.4. The critical F-value at α = 0.05 with df(2, 9) is approximately 4.26. Since 154.4 ≫ 4.26, we reject H₀ and conclude light color significantly affects root length.

Eta-squared: η² = 514.667 / (514.667 + 15.0) ≈ 0.972, indicating that 97.2% of total variance is explained by the light treatment — a very large effect.
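The six steps above can be recomputed directly (stdlib-only Python; the variable names are illustrative):

```python
# Recomputing the worked example step by step.
red, blue, white = [12, 15, 14, 13], [20, 22, 19, 21], [30, 28, 31, 29]
data = red + blue + white

grand_mean = sum(data) / len(data)                           # 254 / 12 ≈ 21.167
group_means = [sum(g) / len(g) for g in (red, blue, white)]  # 13.5, 20.5, 29.5

ss_between = sum(len(g) * (m - grand_mean) ** 2
                 for g, m in zip((red, blue, white), group_means))
ss_within = sum((x - m) ** 2
                for g, m in zip((red, blue, white), group_means) for x in g)

f_stat = (ss_between / 2) / (ss_within / 9)                  # df = (2, 9)
eta_sq = ss_between / (ss_between + ss_within)
print(round(f_stat, 1), round(eta_sq, 3))                    # expect ≈ 154.4, 0.972
```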

Limitations & notes

One-way ANOVA rests on three key assumptions that must be verified before trusting results:

  • Normality: observations within each group should be approximately normally distributed. With small samples this can be checked using a Shapiro-Wilk test; with larger samples (n > 30 per group) the Central Limit Theorem provides robustness.
  • Homogeneity of variance (homoscedasticity): population variances across groups should be equal. Levene's test or Bartlett's test can check this; if violated, Welch's ANOVA is a better alternative.
  • Independence: observations must be independent within and across groups.

ANOVA does not indicate which group means differ; a significant F-statistic only signals that at least one pair differs, requiring follow-up post-hoc testing. This calculator performs a one-way (single-factor) ANOVA only; two-way or repeated-measures designs require different procedures. Results can also be sensitive to extreme outliers, which inflate within-group variance and reduce power.
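The variance-equality check can be sketched with the Brown-Forsythe variant of Levene's test: absolute deviations from each group median are fed into the same one-way F computation. This stdlib-only version is an illustration under that assumption, not a substitute for a statistics library:

```python
# Brown–Forsythe (median-based Levene) test sketch: transform each observation
# to its absolute deviation from the group median, then run one-way ANOVA on
# the transformed values. A large W (vs. F(k-1, N-k)) flags unequal spread.
from statistics import mean, median

def levene_w(groups):
    z = [[abs(x - median(g)) for x in g] for g in groups]   # abs deviations
    k = len(z)
    N = sum(len(g) for g in z)
    grand = mean(v for g in z for v in g)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in z)
    ss_within = sum((v - mean(g)) ** 2 for g in z for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (N - k))

# For the root-length data every group has identical spread, so W comes out 0:
w = levene_w([[12, 15, 14, 13], [20, 22, 19, 21], [30, 28, 31, 29]])
```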

Frequently asked questions

What does a significant F-statistic tell me in ANOVA?

A significant F-statistic (p < alpha) means you reject the null hypothesis that all group means are equal. It tells you that at least one group mean differs from the others, but it does not specify which pair. You need post-hoc tests like Tukey's HSD, Bonferroni, or Scheffé to identify the specific differing groups.

How many groups can one-way ANOVA handle?

One-way ANOVA can technically handle any number of groups (k ≥ 2), but it is most useful when comparing three or more groups. For exactly two groups, a two-sample t-test gives an equivalent result — the F-statistic will equal t². Practically, the power of the test decreases as the number of groups grows unless sample sizes are large.
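The F = t² identity for two groups can be verified numerically (stdlib-only sketch; the two sample groups are arbitrary):

```python
# For exactly two groups, the one-way ANOVA F-statistic equals the square of
# the pooled two-sample t-statistic.
from math import sqrt
from statistics import mean

a, b = [12, 15, 14, 13], [20, 22, 19, 21]
na, nb = len(a), len(b)

# Pooled two-sample t-statistic
ssa = sum((x - mean(a)) ** 2 for x in a)
ssb = sum((x - mean(b)) ** 2 for x in b)
sp2 = (ssa + ssb) / (na + nb - 2)                 # pooled variance
t = (mean(a) - mean(b)) / sqrt(sp2 * (1 / na + 1 / nb))

# One-way ANOVA F on the same two groups
grand = mean(a + b)
ss_between = na * (mean(a) - grand) ** 2 + nb * (mean(b) - grand) ** 2
ss_within = ssa + ssb
f_stat = (ss_between / 1) / (ss_within / (na + nb - 2))

print(round(f_stat, 3), round(t ** 2, 3))         # the two values coincide
```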

What is eta-squared and how do I interpret it?

Eta-squared (η²) is the proportion of total variance in the outcome explained by the group factor, computed as SS_between / SS_total. Values near 0.01 indicate a small effect, around 0.06 a medium effect, and 0.14 or above a large effect, according to Cohen's (1988) guidelines. It is a descriptive measure of practical significance, independent of sample size.

What should I do if the ANOVA assumptions are violated?

If normality is violated and sample sizes are small, consider a non-parametric alternative such as the Kruskal-Wallis test. If variances are unequal across groups (heteroscedasticity), use Welch's ANOVA, which does not assume equal variances. Always visualize your data with boxplots and run diagnostic tests before interpreting ANOVA results.
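The Kruskal-Wallis H statistic mentioned above can be sketched as follows (stdlib only; this simplified version assumes no tied values, and a full implementation would add a tie correction and a chi-squared p-value):

```python
# Kruskal–Wallis H sketch: rank the pooled data, sum ranks per group, and
# compare H against a chi-squared critical value with df = k - 1.
def kruskal_h(groups):
    pooled = sorted(x for g in groups for x in g)
    rank = {v: i + 1 for i, v in enumerate(pooled)}   # ranks 1..N (no ties)
    N = len(pooled)
    # H = 12 / (N(N+1)) * sum(R_j^2 / n_j) - 3(N+1), R_j = rank sum of group j
    r = sum(sum(rank[x] for x in g) ** 2 / len(g) for g in groups)
    return 12 / (N * (N + 1)) * r - 3 * (N + 1)

h = kruskal_h([[12, 15, 14, 13], [20, 22, 19, 21], [30, 28, 31, 29]])
# Compare h against the chi-squared critical value for df = 2 (≈ 5.99 at α = 0.05)
```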

What is the difference between one-way and two-way ANOVA?

One-way ANOVA tests the effect of a single categorical factor on a continuous outcome. Two-way ANOVA examines the effects of two independent factors simultaneously, including the possibility of an interaction effect between them. For example, one-way ANOVA compares test scores across teaching methods, while two-way ANOVA could simultaneously examine teaching method and student gender as factors.

Last updated: 2025-01-15 · Formula verified against primary sources.