10. Comparing Two Population Means
10.3 Difference between Two Variances – the F Distributions
Here we have to assume that the two populations (as opposed to sample mean distributions) have a distribution that is almost normal as shown in Figure 10.2.
Figure 10.2: Two normal populations lead to two distributions that represent distributions of sample variances. The $F$ distribution results when you build up a distribution of the ratio of the two sample $s^2$ values.
The ratio $F = s_1^2/s_2^2$ follows an $F$-distribution if $\sigma_1^2 = \sigma_2^2$. That $F$ distribution has two degrees of freedom: one for the numerator (d.f.N. or $\nu_1$) and one for the denominator (d.f.D. or $\nu_2$). So we denote the distribution more specifically as $F_{\nu_1, \nu_2}$. For the case of Figure 10.2, $\nu_1 = n_1 - 1$ and $\nu_2 = n_2 - 1$. The $F$ ratio, in general, is the result of the following stochastic process. Let $X_1$ be a random variable produced by a stochastic process with a $\chi^2_{\nu_1}$ distribution and let $X_2$ be a random variable produced by a stochastic process with a $\chi^2_{\nu_2}$ distribution. Then the random variable $F = \frac{X_1/\nu_1}{X_2/\nu_2}$ will, by definition, have an $F_{\nu_1, \nu_2}$ distribution.
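The stochastic process just described can be tried out numerically. The following is a minimal sketch, assuming the degrees of freedom 25 and 17 (chosen only for illustration) and using the standard fact that a $\chi^2$ variable with $\nu$ degrees of freedom is a gamma variable with shape $\nu/2$ and scale 2. The helper names `chi_square` and `f_ratio` are made up for this sketch. It compares the simulated mean with the known mean of an $F$ distribution, $\nu_2/(\nu_2 - 2)$.

```python
import random

def chi_square(df, rng):
    # A chi-square variable with df degrees of freedom is Gamma(shape=df/2, scale=2).
    return rng.gammavariate(df / 2, 2)

def f_ratio(dfn, dfd, rng):
    # Ratio of two independent chi-square variables, each divided by its
    # own degrees of freedom.
    return (chi_square(dfn, rng) / dfn) / (chi_square(dfd, rng) / dfd)

rng = random.Random(0)   # fixed seed so the run is repeatable
dfn, dfd = 25, 17        # illustrative degrees of freedom (an assumption)
draws = [f_ratio(dfn, dfd, rng) for _ in range(100_000)]

sample_mean = sum(draws) / len(draws)
theory_mean = dfd / (dfd - 2)   # mean of an F distribution, valid for dfd > 2
print(sample_mean, theory_mean)
```

A histogram of `draws` should show the right-skewed shape typical of the $F$ distribution.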
The exact shape of the $F_{\nu_1, \nu_2}$ distribution depends on the choice of $\nu_1$ and $\nu_2$, but it roughly looks like a $\chi^2$ distribution as shown in Figure 10.3.
Figure 10.3: The general shape of an $F$ distribution.
$F$ and $t$ are related: $F_{1, \nu} = t_{\nu}^2$, so the $t$ statistic can be viewed as a special case of the $F$ statistic.
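One way to see the link between $t$ and $F$ is to square the usual construction of a $t$ variable. The short derivation below is a sketch that assumes the standard definition of $t_\nu$ in terms of a standard normal variable $Z$ and an independent $\chi^2_\nu$ variable $X$; these symbols are introduced here for illustration only.

```latex
t_\nu = \frac{Z}{\sqrt{X/\nu}}, \qquad Z \sim N(0,1), \quad X \sim \chi^2_\nu
\quad\Longrightarrow\quad
t_\nu^2 = \frac{Z^2/1}{X/\nu}, \qquad Z^2 \sim \chi^2_1
\quad\Longrightarrow\quad
t_\nu^2 \sim F_{1,\nu}
```

The squared $t$ variable is a ratio of two independent $\chi^2$ variables, each divided by its degrees of freedom, with $\nu_1 = 1$ in the numerator and $\nu_2 = \nu$ in the denominator.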
For comparing variances, we are interested in the following hypothesis pairs:
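Right-tailed | Left-tailed | Two-tailed |
$H_0: \sigma_1^2 \leq \sigma_2^2$ | $H_0: \sigma_1^2 \geq \sigma_2^2$ | $H_0: \sigma_1^2 = \sigma_2^2$ |
$H_1: \sigma_1^2 > \sigma_2^2$ | $H_1: \sigma_1^2 < \sigma_2^2$ | $H_1: \sigma_1^2 \neq \sigma_2^2$ |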
We’ll always compare variances ($\sigma^2$) and not standard deviations ($\sigma$) to keep life simple.
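The test statistic is
$F_{\text{test}} = \frac{s_1^2}{s_2^2}$
where (for finding the critical statistic) $\nu_1 = n_1 - 1$ and $\nu_2 = n_2 - 1$.
Note that $F_{\text{test}} = 1$ when $s_1^2 = s_2^2$, a fact you can use to get a feel for the meaning of this test statistic.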
Values for the various $F$ critical values are given in the F Distribution Table in the Appendix. We will denote a critical value of $F$ with the notation:
$F_{\text{crit}} = F_{\alpha, \nu_1, \nu_2}$
Where:
$\alpha$ = Type I error rate
$\nu_1$ = d.f.N.
$\nu_2$ = d.f.D.
The F Distribution Table gives critical values for small right tail areas only. This means that they are useless for a left-tailed test. But that does not mean we cannot do a left-tailed test. A left-tailed test is easily converted into a right-tailed test by switching the assignments of populations 1 and 2. To get the assignments correct in the first place then, always define populations 1 and 2 so that $s_1^2 > s_2^2$. That is, assign population 1 so that it has the larger sample variance. Do this even for a two-tailed test because we will have no idea what $F_{\text{crit}}$ on the left side of the distribution is.
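The assignment rule above is easy to mechanize. Here is a minimal sketch, assuming a hypothetical helper named `f_statistic` and made-up sample values. It swaps the two samples when needed so that the larger sample variance ends up in the numerator, then returns the test statistic together with d.f.N. and d.f.D.

```python
def f_statistic(var_a, n_a, var_b, n_b):
    """Return (F_test, dfn, dfd) with the larger sample variance on top."""
    if var_a < var_b:
        # Swap so that "population 1" is the one with the larger sample variance.
        var_a, n_a, var_b, n_b = var_b, n_b, var_a, n_a
    return var_a / var_b, n_a - 1, n_b - 1

# Made-up numbers: the second sample has the larger variance, so it becomes
# population 1 and its sample size sets d.f.N.
result = f_statistic(4.0, 10, 9.0, 16)
print(result)
```

Because of the swap, the returned statistic is never below 1, so only right tail critical values are ever needed.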
Example 10.3 : Given the following data for smokers and non-smokers (maybe it’s about some sort of disease occurrence, who cares, let’s focus on dealing with the numbers), test if the population variances are equal or not at $\alpha = 0.05$.
Smokers | Nonsmokers |
$n_1 = 26$ | $n_2 = 18$ |
$s_1^2 = 36$ | $s_2^2 = 10$ |
Note that $s_1^2 > s_2^2$ so we’re good to go.
Solution :
1. Hypothesis.
$H_0: \sigma_1^2 = \sigma_2^2$, $H_1: \sigma_1^2 \neq \sigma_2^2$
2. Critical statistic.
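Use the F Distribution Table; it is a set of tables labeled by “$\alpha$”, which we will designate as $\alpha_T$, the table values that signify right tail areas. Since this is a two-tailed test, we need $\alpha_T = \alpha/2 = 0.025$. Next we need the degrees of freedom:
$\nu_1 = \text{d.f.N.} = n_1 - 1 = 25$, $\nu_2 = \text{d.f.D.} = n_2 - 1 = 17$
So the critical statistic is
$F_{\text{crit}} = F_{0.025, 25, 17} = 2.56$ (using the closest table entry, d.f.N. = 24).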
3. Test statistic.
$F_{\text{test}} = \frac{s_1^2}{s_2^2} = 3.6$
With this test statistic, we can estimate the $p$-value using the F Distribution Table. To find $p$, look up all the numbers with d.f.N. = 25 and d.f.D. = 17 (24 and 17 are the closest in the tables so use those) in all of the F Distribution Tables and form your own table. For each column in your table record $\alpha_T$ and the $F$ value corresponding to the degrees of freedom of interest. Again, $\alpha_T$ corresponds to $\alpha/2$ for a two-tailed test. So make a row above the $\alpha_T$ row with $\alpha = 2\alpha_T$. (For a one-tailed test, we would put $\alpha = \alpha_T$.)
$\alpha$ | 0.20 | 0.10 | 0.05 | 0.02 | 0.01 |
$\alpha_T$ | 0.10 | 0.05 | 0.025 | 0.01 | 0.005 |
$F$ | 1.84 | 2.19 | 2.56 | 3.08 | 3.51 |
$F_{\text{test}} = 3.6$ lies somewhere to the right of the last column, so $p < 0.01$.
Notice how we put an upper limit on $p$ because $F_{\text{test}}$ was larger than all the $F$ values in our little table.
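The table only brackets the $p$-value. As a cross-check, the right tail area can also be computed numerically. The sketch below is one possible approach, not the method of the F Distribution Table: it integrates the $F$ density with Simpson's rule using only the Python standard library. The function names `f_pdf` and `f_right_tail` are invented for this sketch, and the integration assumes d.f.N. is greater than 2 so that the density is zero at the origin.

```python
import math

def f_pdf(x, dfn, dfd):
    """Density of the F distribution with (dfn, dfd) degrees of freedom."""
    if x <= 0:
        return 0.0  # correct at x = 0 only when dfn > 2 (assumed here)
    log_c = (math.lgamma((dfn + dfd) / 2) - math.lgamma(dfn / 2)
             - math.lgamma(dfd / 2) + (dfn / 2) * math.log(dfn / dfd))
    return math.exp(log_c + (dfn / 2 - 1) * math.log(x)
                    - ((dfn + dfd) / 2) * math.log1p(dfn * x / dfd))

def f_right_tail(x, dfn, dfd, steps=2000):
    """Right tail area P(F > x), from Simpson's rule on [0, x] (steps is even)."""
    h = x / steps
    total = f_pdf(0.0, dfn, dfd) + f_pdf(x, dfn, dfd)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_pdf(i * h, dfn, dfd)
    return 1.0 - total * h / 3

f_test = 3.6          # test statistic from the example
dfn, dfd = 25, 17     # d.f.N. and d.f.D. from the example
p_two_tailed = 2 * f_right_tail(f_test, dfn, dfd)  # double the right tail area
print(p_two_tailed)

# Sanity check against the little table: the right tail area beyond 2.56
# with d.f.N. = 24 and d.f.D. = 17 should be close to 0.025.
print(f_right_tail(2.56, 24, 17))
```

Doubling the right tail area mirrors the $\alpha = 2\alpha_T$ row of the little table.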
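Let’s take a graphical look at why we use $\alpha = 2\alpha_T$ in the little table and $\alpha_T = \alpha/2$ for finding $F_{\text{crit}}$ for two-tailed tests. The table value $\alpha_T$ is the area in the right tail only.
But in a two-tailed test we want $\alpha$ split on both sides: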
4. Decision.
Reject $H_0$ because $F_{\text{test}} = 3.6 > F_{\text{crit}} = 2.56$. The $p$-value estimate supports this: $(p < 0.01) < (\alpha = 0.05)$.
5. Interpretation.
There is enough evidence to conclude, at $\alpha = 0.05$ with an $F$-test, that the variance of the smoker population is different from that of the non-smoker population.
▢