Confidence Interval of differences
and Forest Plots


Table of Contents

  1. Introduction
  2. Difference as an estimated mean
  3. Confidence interval of the difference
  4. Forest Plot

Introduction

Two related topics are covered in this web page.  Firstly, the statistical concepts underlying the confidence interval of differences between two groups are explained.  Secondly, procedures to create a Forest plot are given.  The forest plot is one of the most common methods of presenting differences between two groups.

Explanations in this page presume that you have read the probability and the normal distribution sections which are covered in the probability page.  An understanding of these topics is strongly recommended before you proceed.

Historically, confidence intervals became accepted in the 1980s and gradually replaced Type I and Type II errors as the basis for statistical decisions.  Although the confidence interval can be considered independently of Type I and II errors, it may be useful for you to cover the contents in the statistical significance page before you proceed.

Difference as an estimated mean

A quick revision of topics covered in the probability page:

Difference between groups

Early in the last century, R.A. Fisher devised sampling theory and defined the Standard Error of the mean which formed the basis of the confidence interval of the mean.

Fisher also developed the mathematics for the Analysis of Variance (ANOVA), by which the variation between measurements can be partitioned into the variation between two groups and the residual (error), which represents background variation.  The variation between groups is related to the Standard Error of the mean difference.  The mean difference observed between two samples is considered an estimate of the mean difference of the population.  The confidence interval of the difference can be defined through these two parameters: the estimated mean difference and its Standard Error.

Although the initial partition of variation was carried out on Normally distributed measurements, other types of measurements, such as counts of events, survival time, and proportions of observations, can also be partitioned, provided they are transformed into data types that can be considered Normally distributed.  Unique calculations exist for each data type to transform it so that the mean difference and its Standard Error can be estimated.  Confidence intervals are useful for statistical decision making.
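How a mean difference and its Standard Error define a confidence interval can be sketched in a few lines of Python.  This is only an illustration under stated assumptions: the two groups are independent, the Standard Error is the unpooled form, and a Normal (large-sample) multiplier is used rather than a t value.  The function name diff_confidence_interval and its parameters are hypothetical and are not taken from this page.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def diff_confidence_interval(group_a, group_b, level=0.95):
    """Return (difference, standard error, lower, upper) for mean(a) - mean(b)."""
    diff = mean(group_a) - mean(group_b)
    # Assumption: unpooled Standard Error of the difference between
    # two independent sample means.
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    # Assumption: Normal approximation; the multiplier is about 1.96 for 95%.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return diff, se, diff - z * se, diff + z * se


diff, se, lower, upper = diff_confidence_interval([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(f"difference {diff:.2f}, SE {se:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

With small samples a t multiplier would give a somewhat wider interval than the Normal multiplier used here.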

Confidence interval of the difference

Type I and Type II errors have been the basis of statistical decisions for most of the last century, but their use is associated with difficulties explained in the statistical significance page.

Statistical decisions since the 1980s have become predominantly linked to confidence intervals.  The mean difference, for example, can be stated, at a chosen level of confidence (commonly 95%), to lie within a defined range (the confidence interval).  Two types of statistical decisions are appropriate.

1. Statistically significant difference
Two groups are considered statistically different if the confidence interval of the mean difference does not include the null value.  The null value is commonly set to zero (0) for a difference between means and one (1) for the ratio of means.

2. Statistically significant equivalence
The confidence interval of the mean difference defines a range, and thus the values that the difference is unlikely to exceed or fall below.  Equivalence can be concluded if every value in that range is trivial or of no practical importance.  In other words, the confidence interval lies entirely within the equivalence interval, which is the range of differences smaller than a nominated tolerance (a level that is of no practical importance).

Significant equivalence can be divided into three situations.  Significantly not greater than the tolerance is when the whole confidence interval is on the null side of the positive tolerance value.  Significantly not less than the tolerance is when the whole interval is on the null side of the negative tolerance value (also known as non-inferiority).  True equivalence is when the whole confidence interval is between the positive and negative tolerance values.
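The decisions described above can be sketched as a small Python function.  It is a minimal illustration, assuming a difference measured on an additive scale, a tolerance that is symmetric about the null value, and strict inequalities at the boundaries.  The name classify_interval and its parameters are hypothetical.

```python
def classify_interval(lower, upper, tolerance, null=0.0):
    """Describe a confidence interval relative to the null and tolerance values."""
    return {
        # Significant difference: the interval does not include the null value.
        "significant_difference": not (lower <= null <= upper),
        # The interval is on the null side of the positive tolerance value.
        "not_greater_than_tolerance": upper < null + tolerance,
        # The interval is on the null side of the negative tolerance value.
        "not_less_than_tolerance": lower > null - tolerance,
        # True equivalence: the interval is between both tolerance values.
        "equivalent": lower > null - tolerance and upper < null + tolerance,
    }


# Hypothetical interval of 0.3 to 0.7 with a tolerance of 2.
print(classify_interval(0.3, 0.7, tolerance=2.0))
```

For a ratio of means, one option is to take logarithms of the interval limits and of the tolerance before calling the function, so that the null value of one (1) becomes zero (0).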

An additional advantage of using the confidence interval is that, when the sample size is very large so that the confidence interval is very narrow, the confidence interval may not overlap the null value, yet still lie within the tolerance values.  In this situation, although the difference is statistically significant, it is of no practical importance.
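The effect of sample size can be illustrated with made-up numbers.  The sketch below assumes two groups of equal size n with a common Standard Deviation, and uses a Normal multiplier.  The observed difference (0.5), Standard Deviation (10) and tolerance (2) are invented for illustration only, and interval_for_n is a hypothetical helper.

```python
from math import sqrt
from statistics import NormalDist


def interval_for_n(diff, sd, n, level=0.95):
    """Confidence interval of a difference for two groups of size n each."""
    # Assumption: equal group sizes and a common Standard Deviation.
    se = sd * sqrt(2 / n)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return diff - z * se, diff + z * se


TOLERANCE = 2.0  # invented tolerance, for illustration only
for n in (20, 200, 20000):
    lower, upper = interval_for_n(diff=0.5, sd=10.0, n=n)
    excludes_null = lower > 0 or upper < 0
    within_tolerance = -TOLERANCE < lower and upper < TOLERANCE
    print(n, round(lower, 2), round(upper, 2), excludes_null, within_tolerance)
```

As n grows, the interval narrows around the same observed difference until it excludes zero while remaining inside the tolerance values.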

Equivalence is not the same as finding no difference. A conclusion of both a significant difference and a significant equivalence at the same time is possible.  Also, a conclusion of non-equivalence and no difference is possible.  Tests for both equivalence and differences should be seen as separate tests.

The relationship between confidence interval, null, and tolerance values is described in this Forest plot.

Forest Plot

Forest plots are a convenient and effective way to display confidence intervals, particularly when describing results of multiple comparisons.  The horizontal x-axis is scaled while the vertical y-axis has no scale.  The y-axis sits on the null value of the x-axis.  Remember, the null value for the difference in means and for the ratio of means is usually zero (0) and one (1) respectively.  Note that the x-axis has a logarithmic scale when the ratio of means is used.
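The layout described above can be sketched as a plain-text plot.  This is a rough illustration only, not the program offered on this page: the function text_forest_plot, its parameters and the study values are all hypothetical.  Each row is drawn as a line of dashes for the confidence interval, an asterisk for the estimate, and a vertical bar where the y-axis sits on the null value.

```python
def text_forest_plot(rows, null=0.0, width=41):
    """Return a plain-text forest plot for (label, estimate, lower, upper) rows."""
    lo = min(min(r[2] for r in rows), null)
    hi = max(max(r[3] for r in rows), null)
    span = hi - lo  # assumes the plotted range has non-zero width

    def column(x):
        # Map a value on the x-axis to a character column.
        return round((x - lo) / span * (width - 1))

    lines = []
    for label, estimate, lower, upper in rows:
        cells = [" "] * width
        for i in range(column(lower), column(upper) + 1):
            cells[i] = "-"                 # the confidence interval
        cells[column(null)] = "|"          # the y-axis sits on the null value
        cells[column(estimate)] = "*"      # the point estimate
        lines.append(f"{label:<10}" + "".join(cells))
    return "\n".join(lines)


# Invented values for two comparisons of mean differences.
studies = [("Study A", 1.0, -1.0, 3.0), ("Study B", 2.0, 1.0, 3.0)]
print(text_forest_plot(studies))
```

For ratios of means, one option is to pass the logarithms of the estimate and its limits with null=0.0, which corresponds to the logarithmic x-axis.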

Only graph paper, a pen and a ruler are needed to produce a Forest plot.  However, a graphics utility (e.g., Excel) is necessary to quickly generate plots that look professional.  Excel seems the most commonly available program.  Therefore, a Java program has been written to convert arrays of means and confidence intervals into coordinates, which can be copied and pasted into Excel.  A forest plot is then produced via the chart wizard icon.  The program and an example of its use follow.

Version 1.1  Last change 24th August 2006