Confidence Intervals

The confidence interval may be thought of as the range of probable true values for a statistic. When public health practitioners use health statistics, sometimes they are interested in the actual number of health events, but more often they use the statistics to assess the underlying risk in the community. Observed measures of health events, that is, counts, rates or percentages that are calculated from health surveys, vital statistics data or other health surveillance systems, are not a perfect reflection of the true underlying risk in the population. Observed rates can vary from sample to sample or year to year, even when the true underlying risk is identical. Rates that fluctuate widely in the absence of changes in the true underlying risk are called unstable, imprecise, or unreliable.

Use of Confidence Intervals

The confidence interval is an indication of the stability of the statistical estimate. In general, as the population (or sample) size increases, the confidence interval gets narrower, indicating that the estimate is more stable. Conversely, wider confidence intervals indicate less stable estimates. Estimates calculated from small numbers will have wider confidence intervals.
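The relationship between sample size and interval width can be illustrated with a minimal sketch. It assumes a simple random sample and the normal-approximation interval for a proportion; the function name and the 20% estimate are hypothetical values chosen only for illustration.

```python
import math

def proportion_ci_half_width(p, n, z=1.96):
    """Half-width of the normal-approximation 95% CI for a proportion.

    Assumes a simple random sample of size n (an illustrative assumption).
    """
    return z * math.sqrt(p * (1 - p) / n)

# Same point estimate (20%), increasingly large samples.
for n in (100, 1_000, 10_000):
    half_width = proportion_ci_half_width(0.20, n)
    print(f"n={n:>6}: 20% +/- {100 * half_width:.1f} percentage points")
```

With these inputs the half-width shrinks from about 7.8 percentage points at n=100 to about 0.8 at n=10,000: a hundredfold increase in sample size narrows the interval by a factor of ten.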

Confidence intervals are often displayed as in the graph below. The height of each bar indicates the value of the point estimate, while the fine vertical lines atop each bar show the width of each confidence interval, that is, the stability of each estimate. Notice that the confidence interval for persons aged 85 and over is much wider than the confidence intervals for the other age groups. This is primarily because there are fewer persons in the population aged 85 and over than in the other groups.

Box and Whisker Bar Graph With Confidence Intervals

The confidence interval tells you more than just the possible range around the estimate. It also tells you about the stability of the estimate. A stable estimate is one that would be close to the same value if the observation were repeated. An unstable estimate is one that would vary from one observation to another. Wider confidence intervals in relation to the estimate itself indicate instability.

For instance, because the 85+ population in the above graph is much smaller, a difference of only a few deaths from one year to the next could cause the observed rate to vary considerably. In this case, the confidence interval indicates that the estimate for the 85+ group is less stable, and a greater amount of random variation (e.g., from year to year) is expected to occur. Such random variation will obscure our view of the true underlying risk for that age group. Another term for stability is "reliability."

Even for complete count datasets, such as birth and death certificate datasets, random fluctuations over time can yield estimates that are not reliable. For instance, the death rate for a short time period in a small population may not reflect the true underlying death risk for that population.
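The effect of small numbers on a rate can be sketched as follows. This is an illustration under stated assumptions, not the method used for any particular dataset: the event count is treated as a Poisson variable, so the standard error of the rate is approximated as the rate divided by the square root of the count, and all counts and populations are hypothetical. The normal approximation is itself rough for very small counts.

```python
import math

def rate_ci(events, population, per=100_000, z=1.96):
    """Crude rate per `per` persons with an approximate 95% CI.

    Assumption: the event count is Poisson, so SE(rate) = rate / sqrt(events).
    """
    rate = events / population * per
    se = rate / math.sqrt(events)
    return rate, rate - z * se, rate + z * se

# Two hypothetical populations with the same observed rate (1,600 per 100,000).
small = rate_ci(events=16, population=1_000)
large = rate_ci(events=1_600, population=100_000)

for label, (rate, low, high) in (("small", small), ("large", large)):
    print(f"{label}: rate={rate:.0f}, 95% CI {low:.0f} to {high:.0f}")
```

Both populations have the same observed rate, but the interval for the small population runs from roughly 816 to 2,384 per 100,000, while the interval for the large population runs from roughly 1,522 to 1,678.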

Technical Definition

The 95% confidence interval is calculated as the point estimate plus or minus 1.96 times the standard error of the statistic. If the researcher were to draw an infinite number of samples of the same size from the same base population and calculate a 95% confidence interval from each one, 95% of those intervals would contain the true population value. Unless otherwise stated, a confidence interval will be the "95% confidence interval." The 90% confidence interval is also commonly used. It is calculated as the point estimate plus or minus 1.645 (often rounded to 1.65) times the standard error of the estimate.
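As a minimal sketch of the arithmetic, the interval bounds are usually formed by subtracting and adding the multiplier times the standard error to the point estimate. The function names and the example estimate (25.0 with a standard error of 2.0) are hypothetical; the multipliers come from the standard normal distribution in Python's standard library.

```python
from statistics import NormalDist

def z_multiplier(confidence):
    """Two-sided standard normal multiplier for a confidence level such as 0.95."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)

def confidence_interval(estimate, standard_error, confidence=0.95):
    """Symmetric interval: estimate plus or minus multiplier times standard error."""
    z = z_multiplier(confidence)
    return estimate - z * standard_error, estimate + z * standard_error

print(round(z_multiplier(0.95), 3))  # 1.96
print(round(z_multiplier(0.90), 3))  # 1.645

# Hypothetical estimate of 25.0 with a standard error of 2.0.
low, high = confidence_interval(25.0, 2.0)
print(f"95% CI: {low:.2f} to {high:.2f}")
```

For the hypothetical estimate, the 95% interval is about 21.08 to 28.92. Passing confidence=0.90 gives a narrower interval, because the multiplier is smaller.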

A confidence interval is typically expressed as a symmetric range around the point estimate (e.g., "plus or minus 5%"). But for percentages, when the point estimate is close to 0% or 100%, the confidence interval becomes asymmetric. That is, when the point estimate is close to 100%, the distance from the estimate to the upper confidence bound will be smaller than the distance to the lower bound (so the upper bound will not exceed 100%), and for point estimates close to 0%, the distance to the lower confidence bound will be smaller (so the lower bound will not fall below 0%). The formula that produces asymmetric confidence intervals has been applied to all the survey estimates in the IBIS query system.
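The text does not say which asymmetric formula IBIS uses. As an illustration only, the sketch below uses the Wilson score interval, one common method that keeps the bounds between 0% and 100%, and compares it with the simple symmetric interval. It assumes a simple random sample; the counts (97 of 100) and function names are hypothetical.

```python
import math

def symmetric_interval(successes, n, z=1.96):
    """Simple symmetric interval: p plus or minus z times the standard error."""
    p = successes / n
    half = z * math.sqrt(p * (1 - p) / n)
    return p - half, p + half

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval (an assumed stand-in, not necessarily the IBIS formula)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# Hypothetical survey result: 97 of 100 respondents, a point estimate of 97%.
sym_low, sym_high = symmetric_interval(97, 100)
wil_low, wil_high = wilson_interval(97, 100)
print(f"symmetric: {100 * sym_low:.1f}% to {100 * sym_high:.1f}%")
print(f"Wilson:    {100 * wil_low:.1f}% to {100 * wil_high:.1f}%")
```

In this example the symmetric interval has an upper bound slightly above 100%, which is impossible for a percentage. The Wilson interval stays below 100% and extends further below the 97% estimate than above it.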

Calculation

For more information on confidence intervals, including formulae for calculating them for various types of rates, a PDF document is available.

