Many interpret error bars to mean that if they do not overlap, the difference is statistically significant. This overlap rule is really an overlap myth: it does not hold for any conventional type of error bar. Rules of thumb exist for estimating P values from overlap, but it would be better to show error bars for which the overlap rule actually holds. Here I explain how to calculate what I call comparative confidence intervals, which, when plotted as error bars, let us judge significance by overlap or separation. Others have published on these intervals (the mathematical basis goes back to John Tukey), but here I advertise comparative confidence intervals in the hope that more people will use them. Judging statistical significance by eye is most useful when making multiple comparisons, so I show how comparative confidence intervals can illustrate the results of Tukey tests or Dunnett’s test. I also explain how to use comparative confidence intervals to explore the effects of multiple independent variables, and I address the problems posed by heterogeneity of variance and repeated measures. When families of comparative confidence intervals are plotted around means, box-and-whiskers plots make it easy to judge which intervals overlap with which. Comparative confidence intervals have the potential to be used in a wide variety of circumstances, so I describe an easy way to confirm the intervals’ validity. When sample means are compared to each other, they should be plotted with error bars that indicate comparative confidence intervals, either along with or instead of conventional error bars. This paper is based on a submission that was rejected by Psychological Methods; the original submission, along with the reviewers’ comments and my responses, is available at the bottom of this page.
Corotto, Frank, "Making the Error Bar Overlap Myth a Reality: Comparative Confidence Intervals" (2020). Faculty Publications. 1.