This week, you have learned about t tests and ANOVA and will apply what you have learned to the evaluation of several research scenarios (hyperlink and the three attached files). For this task, read each research scenario and answer the questions about it.

Research Scenario 1

A researcher is interested in the effects of a new weight loss supplement. Participants in this double-blind study are randomly assigned to either a group that receives the new supplement or a group that receives a placebo. Participants are weighed before starting the program. After 6 weeks of taking either the new supplement or the placebo, participants return to the lab to be weighed.


- Provide the appropriate null and alternative hypotheses.
- Determine which type of analysis would be appropriate to answer this research question. Be specific and support your answer using the textbook or other course materials.
- Name the independent and dependent variables used in the analysis. What are the levels of the independent variable?
- Indicate the levels of measurement for each variable.
- Describe the Type I error for this study.
- Describe the Type II error for this study.

Research Scenario 2

A researcher is interested in whether certain memory strategies help people to remember information. This researcher employs students from a local college, and then randomly assigns them to one of three groups—the visualization group, the mnemonic technique group, and the rote repetition group. Participants in each group receive an hour of instruction regarding how to use the particular technique to remember lists of words. After the instruction, all participants are presented with a list of 60 words that they are instructed to remember. The words are presented one at a time on a computer screen. After the last word is presented, all participants are instructed to recall as many words as possible by writing them on a blank sheet of paper. All participants are given 10 minutes to recall the words.

- Provide the appropriate null and alternative hypotheses.
- Indicate which type of analysis would be appropriate to answer this research question. Be specific and support your answer using the textbook or other course materials.

- Name the independent and dependent variables used in the analysis. What are the levels of the independent variable?
- Indicate the levels of measurement for each variable.
- Describe the Type I error for this study.
- Describe the Type II error for this study.

Research Scenario 3

A local manufacturing company is interested in determining whether their employees are as happy with their jobs as other employees. The manufacturing company asked the workers, who volunteered to participate, to rate their happiness at work on a scale from 1 to 10, where 1 was not at all happy and 10 was extremely happy. The manufacturing company found that the mean happiness rating for their employees is 7.3. In the general population of workers in the United States, the mean happiness rating is 6.

- Provide the appropriate null and alternative hypotheses.
- Determine which type of analysis would be appropriate to answer this research question. Be specific. Please support your answer using the textbook or other course materials.
- Name the variables used in the analysis.
- What are the levels of measurement for each variable?

- Describe the Type I error for this study.
- Describe the Type II error for this study.

Length: 1-2 pages

Your paper should demonstrate thoughtful consideration of the ideas and concepts presented in the course by providing new thoughts and insights relating directly to this topic. Your response should reflect scholarly writing and current APA standards.

https://conjointly.com/kb/statistical-student-t-te…

Probability Distributions and One-Sample z and t Tests

In: Statistics for the Social Sciences

By: R. Mark Sirkin

Pub. Date: 2011

Access Date: February 28, 2022

Publishing Company: SAGE Publications, Inc.

City: Thousand Oaks

Print ISBN: 9781412905466

Online ISBN: 9781412985987

DOI: https://dx.doi.org/10.4135/9781412985987

Print pages: 225-270

© 2006 SAGE Publications, Inc. All Rights Reserved.

This PDF has been generated from SAGE Research Methods. Please note that the pagination of the

online version will vary from the pagination of the print book.


Probability Distributions and One-Sample z and t Tests

PROLOGUE

This chapter is essentially an extension of the previous chapter, except that in addition to presenting several new topics, we go back to explain the theory that underlies the z formula. What are we actually doing when we calculate z?

It isn't absolutely necessary to understand the underlying theory simply to work these formulas, any more than it is to understand the physics and chemistry of the cooking process in order to cook a meal. Nevertheless, it can be useful to understand what you are doing and why it works. Now that you are hungry, let's get back to statistics. You (or your computer) can always calculate z, t, F, and so on. Still, it is useful to be able to visualize and understand what is actually happening when you do these tests.

Introduction

In the previous chapter, we discussed tests of significance and used the one-sample z formula to illustrate the entire procedure.

In this chapter, we turn our attention to the origin of that formula and explain what is taking place when we use it. It is possible to use any statistical formula without such an understanding and simply plug in the numbers as we did in Chapter 7. But if you can visualize what is going on, your understanding will be enhanced, since essentially the same process takes place no matter what test of significance is being performed.

The z test of significance is based on a frequency distribution known as a normal distribution and is applied to a specific normal curve called the sampling distribution of sample means. When we perform this test, we are actually taking the given sample statistics and population parameters and locating them on the sampling distribution of sample means. In fact, all tests of significance do the same thing, even though their sampling distributions differ from one another.

At the end of this chapter, we will discuss the one-sample t test, and in subsequent chapters, the other commonly used tests of significance will be presented. We will begin by discussing an even simpler z formula than the one in the previous chapter and introducing the concept of a normal distribution.

Normal Distributions


Normal distributions are a family of frequency distributions that, when graphed, often resemble bells. Generally, they are represented in the form found in Figure 8.1. Such a curve has three major characteristics: (a) it is unimodal, (b) it is symmetric, and (c) it is asymptotic to the x-axis. This last characteristic, which becomes very important later on, means that the tails of the curve get closer and closer to the x-axis but never reach it. Consequently, no matter how far you get from the mean on the x-axis, there will always be a tail continuing beyond that point. The tail never ends, at least not in the mathematical model.

Normal distributions A family of frequency distributions that, when graphed, often resemble bells.

The reason for using the term normal distribution is that certain characteristics such as human height, weight, or intelligence graph in frequency distributions approximating this bell-shaped pattern. The term normal is a bit misleading, however. Not everything in nature is normally distributed, so nonnormal distributions are not really abnormal.

Figure 8.1 The Mathematical Model of the Normal Distribution


Figure 8.2 IQ as a Normal Distribution

Let us use measures of intelligence—intelligence quotients or IQs—to illustrate the normal distribution (see Figure 8.2). IQs are designed to range from 0 to 200 with a mean of 100. Standard deviations vary by the age of the subjects but are usually around 13 or 14; for computational ease we will use 10. Note that, unlike the mathematical model in Figure 8.1, in Figure 8.2 the tails do end—at IQs of 0 and 200, respectively. This indicates that there are natural upper and lower limits in the actual measurement tool being utilized. For example, say it is a test with 200 questions and someone's IQ is the number of correct answers. Two geniuses each score 200; yet, if one of the two is really twice as smart as the other and the IQ test had 400 questions instead of 200, the one genius would score 400 whereas the other would only get 200.

Suppose Sandra takes this IQ test, and her score, which we designate as x, is 115. We would like to know what proportion of people would likely have higher IQs than Sandra's and what proportion would have lower IQs. (Sorry, this procedure cannot tell us how many will have exactly Sandra's IQ.) It turns out that the area under the curve corresponds to the proportion of people with a particular characteristic. The total area under the curve (1.00 proportion) accounts for all (100%) people. Since a normal curve is symmetric, .50 proportion (50%) of the area under the curve falls below the mean, and .50 proportion falls above the mean. Thus, half of all people should have IQs below 100, and half should have IQs above it. This proportion of the area also pertains to the probability of randomly selecting a person with a particular characteristic. Since .50 proportion of the area of the curve is below the mean, there is also a .50 probability of randomly selecting a person whose IQ is below 100. Likewise, there is a .50 probability of randomly selecting someone whose IQ is greater than 100.

In Figure 8.3, we have added Sandra's IQ, x = 115. The proportion of people with an IQ greater than 115 is the shaded area under the curve in the right tail, from x = 115 to x = 200. The proportion of people with IQs below 115 is represented by the remaining unshaded area under the curve, from the left of x = 115 to x = 0. Note that the unshaded area has two components, the .50 proportion of IQs less than 100 plus the area under the curve from 100 to 115.

To find these areas, we use a table of areas under the normal curve that applies to all normal distributions.


To use the table, we begin by calculating what is called a standard score, which is universally designated by the letter z. (Its relationship to our z test of significance will be explained later.) To calculate z, we recast the distance from the mean (100) to the value of x we are studying (Sandra's IQ of 115), expressed as standard deviation units. The distance from the mean of Sandra's IQ is x − µ = 115 − 100 = 15; her IQ is 15 points greater than the mean. To convert that distance into standard deviation units, we divide by the size of the standard deviation, σ = 10, which was given to us. Since 15/10 = 1.5, we know that Sandra's IQ is 1.5 standard deviation units from the mean. Expressing the whole process in a single equation, we get

z = (x − µ)/σ = (115 − 100)/10 = 1.5

Figure 8.3

Standard score A score universally designated by the letter z, in which that score is expressed in standard deviation units from the mean.
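The arithmetic above is easy to sketch in code. The following is a minimal illustration (not part of the text; the function names are our own) that computes Sandra's standard score and reproduces the column C tail area of Table 8.1 from the error function in Python's standard library, using the identity Φ(z) = ½(1 + erf(z/√2)):

```python
from math import erf, sqrt

def z_score(x, mu, sigma):
    """Distance of x from the mean, expressed in standard deviation units."""
    return (x - mu) / sigma

def upper_tail(z):
    """Area under the standard normal curve beyond z (column C of Table 8.1)."""
    # Phi(z) = 0.5 * (1 + erf(z / sqrt(2))) is the cumulative area below z.
    return 1.0 - 0.5 * (1.0 + erf(z / sqrt(2.0)))

z_sandra = z_score(115, 100, 10)   # 1.5
above = upper_tail(z_sandra)       # about .0668, matching column C
below = 1.0 - above                # about .9332
print(z_sandra, round(above, 4), round(below, 4))
```

The same two functions reproduce George's result later in the chapter: `upper_tail(2.0)` gives about .0228.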

We now go to Table 8.1 and see that each page has three blocks of figures, and, in turn, each block has three columns: A, B, C. Column A lists a value of z. Column B shows the area under the curve from the mean out to that specified value of z (note the graphs above each column). Column C shows the area in the tail beyond z. Note that the area in column B plus the area in column C always add to .5000. Also note that as z gets larger, the area in column B gets larger, and the area in column C gets smaller. The bigger the z, the smaller the tail.


We find our z of 1.5 at the bottom of the center block on the second page of Table 8.1. Note that at z = 1.5, the number in column B is .4332, and the number in column C is .0668. The area in the tail, corresponding to the proportion of people with IQs greater than 115, is the number in column C, .0668—only 6.68% have an IQ higher than Sandra's. To find the proportion with IQs less than 115, we take the area in column B and add to it .50, the proportion with IQs below the mean: .4332 + .5000 = .9332. Thus, .9332 proportion of people have IQs below 115. Sandra's pretty bright!

If Sandra is bright, George is not; his IQ is only 80. Let us find the proportions of area above and below 80 (see Figure 8.4). First we find z:

z = (x − µ)/σ = (80 − 100)/10 = −2.00

Here, z is negative since x is less than µ. In Table 8.1, we will look for the absolute value of our z (2.00) but use the graphs at the bottom of the page to see that now the shaded areas are to the left of the mean. In the center of the left-hand block of the third page of the table, you will find z = 2.00. The column B area is .4772, and the column C area is .0228. Accordingly, only .0228 proportion of people have IQs below George's. To find the other proportion, add the column B figure to the .50 whose IQs exceed the mean: .4772 + .5000 = .9772 proportion. Poor George! Nearly 98% of all IQs exceed his.


Table 8.1 Proportions of Area Under Standard Normal Curve


Now we can solve a mystery: the source of the critical values of z used in the previous chapter. Although a much more detailed table of areas under the normal curve is needed to find all the critical values of z presented in Chapter 7, we can find approximate values using Table 8.1.


Figure 8.4

We simply specify a particular tail area, find it in column C, and read the corresponding z score from column A.

For example, to find the z value for the one-tailed .05 level, we see that in column C the two closest approximations are .0505 (z = 1.64) and .0495 (z = 1.65). Actually, the mean of the two z values, 1.645, is the true critical value, but when we round to two decimal places, we get 1.65. Likewise, the closest tail area to .01 is actually .0099, and its z is 2.33. For the .001 level, we find .0010 occurring three times, where z is 3.08, 3.09, and 3.10. If this table had used more decimal places, we would see that 3.09 would be our best-fitting value of z.

For a two-tailed test, we need two tails whose areas, added together, equal the probability level desired. We take one half of the probability level as the area to locate in column C. For instance, at p = .05, we need two equal tails whose areas add to .05, so .05/2 = .025. Finding .0250 in column C, we see that z = 1.96. (Unfortunately, Table 8.1 is not complete enough for us to find the other two-tailed values of z.)

Figures 8.5 and 8.6 summarize the relationships between z values and probabilities for one and two tails, respectively.
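The table-lookup procedure just described can be inverted numerically. This sketch (our own, not the book's) finds the z whose tail area matches a chosen significance level by bisection on the normal CDF, recovering the familiar critical values:

```python
from math import erf, sqrt

def normal_cdf(z):
    """Cumulative area under the standard normal curve below z."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def z_critical(alpha, two_tailed=False):
    """z whose upper-tail area equals alpha (alpha/2 per tail if two-tailed)."""
    tail = alpha / 2 if two_tailed else alpha
    lo, hi = 0.0, 10.0
    for _ in range(100):              # bisection: shrink the bracket around z
        mid = (lo + hi) / 2
        if 1.0 - normal_cdf(mid) > tail:
            lo = mid                  # tail still too big; z must be larger
        else:
            hi = mid
    return (lo + hi) / 2

print(round(z_critical(0.05), 3))                   # 1.645 (one-tailed .05)
print(round(z_critical(0.05, two_tailed=True), 2))  # 1.96  (two-tailed .05)
print(round(z_critical(0.01), 2))                   # 2.33  (one-tailed .01)
```

This is exactly the "specify a tail area, read off z" procedure, automated.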

THE ONE-SAMPLE z TEST FOR STATISTICAL SIGNIFICANCE

The formula we used in the previous chapter to test for statistical significance,

z = (x̄ − µ)/(σ/√n),

is really a reworking of the formula we have been using in this chapter, except that it is applied to a specific type of frequency distribution known as the sampling distribution of sample means. This sampling distribution is the frequency distribution that would be obtained from calculating the means of all theoretically possible samples of a designated size that could be drawn from a given population. To illustrate this definition, let us imagine a population of 5 people with scores ranging from 1 to 5. Using a sample size of n = 3, what are the different combinations of scores that we would obtain in selecting all possible samples of 3 people out of the original 5? If we consider the order of selection of the people (e.g., 5-4-3 is one sample, 4-5-3 is another, and 3-4-5 yet another), there are actually 60 possible samples that could be drawn from this population. If we disregard the order of selection, there are only 10 possible combinations of scores.


Figure 8.5 Critical Values of z—One Tail


Figure 8.6 Critical Values of z—Two Tails


Sampling distribution of sample means The frequency distribution that would be obtained from calculating the means of all theoretically possible samples of a designated size that could be drawn from a given population.

We will work with these 10 samples of 3 people with the 3 scores shown in the columns. For each of these samples, we can calculate the mean.

The frequency distribution of all these possible sample means is as follows:

We graph this frequency distribution, which will approximate the sampling distribution of sample means, as a histogram, as shown in Figure 8.7.

Note that the histogram in Figure 8.7 has a pattern that begins to resemble a normal curve in the sense that it is unimodal and symmetric. In fact, if either the population from which the samples are drawn is itself normally distributed along the variable x and/or the samples drawn from that population are sufficiently large, the sampling distribution of sample means will also be a normal distribution. This characteristic will prove to be very useful to us.
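The counts in the illustration can be verified directly. This short sketch (ours, not the book's) enumerates every possible sample of 3 scores from the population {1, 2, 3, 4, 5} and builds the distribution of sample means:

```python
from itertools import combinations, permutations
from statistics import mean

population = [1, 2, 3, 4, 5]          # the 5 scores in the illustration

# Counting order of selection gives 60 samples; ignoring order gives 10.
ordered = list(permutations(population, 3))
unordered = list(combinations(population, 3))
print(len(ordered), len(unordered))   # 60 10

# The sampling distribution of sample means for the 10 combinations.
sample_means = sorted(round(mean(s), 2) for s in unordered)
print(sample_means)

# The mean of all the sample means equals the population mean.
print(round(mean(mean(s) for s in unordered), 2))  # 3.0
```

The printed list of means is unimodal and symmetric around 3.0, just as the histogram in Figure 8.7 suggests.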


Figure 8.7 The Sampling Distribution of Sample Means Obtained From 10 Samples of Size n = 3 From a Specified Population

The formal statement of these characteristics comes from what is known as the central limit theorem and a related theorem known as the law of large numbers.

The Central Limit Theorem

According to the central limit theorem, if repeated random samples of size n are drawn from a population that is normally distributed along some variable x, having a mean µ and a standard deviation σ, then the sampling distribution of all theoretically possible sample means will be a normal distribution having a mean µ and a standard deviation σ/√n.¹

Central limit theorem If repeated random samples of size n are drawn from a population that is normally distributed along some variable x, having a mean µ and a standard deviation σ, then the sampling distribution of all theoretically possible sample means will be a normal distribution having a mean µ and a standard deviation σ/√n.

If, for a particular population, some variable (x) is normally distributed and we draw a series of samples of a predetermined size (n) from that population, the central limit theorem tells us that

The sampling distribution of sample means will be a normal distribution.

The mean of the sampling distribution of sample means, the mean of all the sample means (designated µx̄), will be equal to µ, the mean of the population from which the samples were originally drawn.

The standard deviation of the sampling distribution of sample means will be equal to σ/√n, the standard deviation of the population from which the samples were drawn divided by the square root of the size of the samples that we were drawing. This standard deviation of our sampling distribution is also called the standard error of the mean or, more often, just the standard error, and is sometimes designated with the symbol σx̄.

Standard error of the mean or the standard error The standard deviation of the sampling distribution, designated with the symbol σx̄.

To illustrate: Suppose your state or province mandates a series of competency examinations in reading, math, and so on, to be taken by all schoolchildren in selected grades. Assume that on the math competency exam given to all ninth-graders, the mean, µ, is 70, and the standard deviation, σ, is 20. We wish to compare these results to those we would have found if we had studied random samples of ninth-graders rather than the whole population. How would the sample means be distributed? Assume we select random samples of size n = 100.

According to the central limit theorem, the sampling distribution of sample means would be a normal curve with a mean µx̄ = µ = 70 and a standard deviation (the standard error of the mean) σx̄ = σ/√n = 20/√100 = 20/10 = 2.0. This is graphed in Figure 8.8.

The implications of Figure 8.8 are immense since they suggest that the overwhelming majority of theoretically possible sample means are going to fall very close to the original population's mean. There are tails to the curve in Figure 8.8 above and below the mean, and they are asymptotic to the x-axis, but they are so tiny that they are barely perceivable.

In fact, we know the following about normal distributions: 68.27% of the area under the normal curve (thus 68.27% of all sample means) falls between µ − 1σx̄ and µ + 1σx̄, that is, one standard deviation above and below the mean. Since in this particular example, the standard deviation of the sampling distribution (the standard error of the mean) is 2.0, we are observing that 68.27% of all sample means will lie between 68 and 72. We also know that 95.45% of the area under the curve (95.45% of the sample means) falls between µ − 2σx̄ and µ + 2σx̄, and 99.73% of the area falls between µ − 3σx̄ and µ + 3σx̄.

In this particular example:

68.27% of sample means fall between 70 − 2 = 68 and 70 + 2 = 72;
95.45% fall between 70 − 4 = 66 and 70 + 4 = 74;
99.73% fall between 70 − 6 = 64 and 70 + 6 = 76.
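These claims can be checked by simulation. The sketch below (ours, not the text's) repeatedly draws samples of n = 100 from a normal population with µ = 70 and σ = 20 and confirms that the sample means cluster around µ with a standard deviation close to the predicted standard error, σ/√n = 2.0:

```python
import random
from statistics import mean, pstdev

random.seed(42)
mu, sigma, n = 70, 20, 100            # the competency-exam population

# Draw many random samples of size n and record each sample's mean.
sample_means = [mean(random.gauss(mu, sigma) for _ in range(n))
                for _ in range(5000)]

print(round(mean(sample_means), 1))   # close to mu = 70
print(round(pstdev(sample_means), 1)) # close to sigma / sqrt(n) = 2.0

# Proportion of sample means within one standard error of mu: roughly .6827.
within_1se = sum(abs(m - mu) <= 2.0 for m in sample_means) / len(sample_means)
print(within_1se)
```

With more simulated samples, the observed proportions converge on the theoretical 68.27%, 95.45%, and 99.73% figures.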


Figure 8.8 The Actual Appearance of a Sampling Distribution of Sample Means for Samples n = 100 Drawn From a Population Where µ = 70 and σ = 20

With 99.73% of sample means falling between 64 and 76, the remaining 0.27% of all sample means must fall below 64 or above 76. Out of 1000 samples, only 2 or 3 would have means below 64 or above 76—an extremely improbable, but statistically possible, event.

The central limit theorem becomes most useful to us when we are given, as before, µ and σ for a population and data about one random sample presumably drawn from that population. In that case, we are asking how likely it is that, from the given population, we could draw a random sample whose x̄ differs from µ by as much as we observe. If the likelihood is low, we might better conclude that the x̄ reflects a population with a mean other than the µ of the population from which we initially assumed that the sample was drawn.

This brings us to the kind of problem presented here and in the previous chapter. Suppose in our competency exam example, we have a random sample of 100 ninth-graders who had been enrolled in a 6-week-long course to prepare for this examination. This sample's mean score is 73. Thus, H0: µall = µcourse. Assuming advance data on which to make a directionality assumption, we could write H1: µcourse > µall.

We calculate z using the formula from Chapter 7 and compare zobtained to the critical values of z:

z = (x̄ − µ)/(σ/√n) = (73 − 70)/(20/√100) = 3/2.0 = 1.50

Since 1.50 < 1.65, we cannot reject H0. The course appears to have been unsuccessful.
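The computation just performed can be packaged as a small function. This is an illustrative sketch (the function name is ours), applied to the competency-exam numbers:

```python
from math import sqrt

def one_sample_z(xbar, mu, sigma, n):
    """z for a sample mean, located on the sampling distribution of sample means."""
    standard_error = sigma / sqrt(n)   # the central limit theorem's sigma / sqrt(n)
    return (xbar - mu) / standard_error

# Competency-exam example: mu = 70, sigma = 20, course sample mean 73, n = 100.
z = one_sample_z(73, 70, 20, 100)
print(z)             # 1.5
print(z >= 1.65)     # False: cannot reject H0 at the one-tailed .05 level
```

Note that only the sample mean, the sample size, and the two population parameters are needed; the sampling distribution itself is never observed.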
With this formula, we are finding our sample's x̄ on the x-axis of the sampling distribution of sample means, finding the distance from that x̄ to the mean of the sampling distribution, and converting that distance into standard deviation units (standard scores) based on the standard deviation of the sampling distribution. To see how this works, let us start by converting our simple z formula from symbols to words.

Remember that the frequency distribution is the sampling distribution of sample means. Now note the following:

The value of the variable whose distance from the mean (of the sampling distribution) we wish to find is the x̄ for those taking the preparatory course.

The mean of our frequency distribution (the sampling distribution), µx̄, according to the central limit theorem, equals µ for all the ninth-graders.
The standard deviation of our frequency distribution, which for a sampling distribution is called the standard error, according to the central limit theorem equals σ for all the ninth-graders divided by the square root of our sample size: σ/√n.

Substituting this information from the central limit theorem for the words in our equation, we get

z = (x̄ − µx̄)/σx̄

Simplifying, since µx̄ = µ and σx̄ = σ/√n,

z = (x̄ − µ)/(σ/√n)

Thus, the formula for the one-sample z test of significance is really a recasting of the basic formula z = (x − µ)/σ to apply to the sampling distribution. The central limit theorem enables us to find a z value on the sampling distribution from data pertaining to the population and the sample. This is illustrated in Figure 8.9, which shows our sampling distribution (not drawn to actual scale) and its components.
We now see why a directional H1 is dubbed a one-tailed H1—we only make use of one tail on the sampling distribution. Without a directionality assumption, we would move out from the mean of the sampling distribution toward both the left and the right, examine the size of both of the tails by comparing the absolute value of zobtained to zcritical, and pay the price of needing a larger zobtained than is needed when using only one tail.
Figure 8.9 The Sampling Distribution of Sample Means for Samples n = 100 Based on Competency Exam Data (Hypothetical)
Review
Before we proceed, let us review the fact that in using the central limit theorem, we are working with three separate frequency distributions: the population, the sample, and the sampling distribution of sample means. We are given information about the first two distributions. The central limit theorem then enables us to take data from those two distributions and make use of the properties of the sampling distribution. We know the following:

The frequency distribution for variable x for some population. We assume that this distribution is normal. We know its mean µ and its standard deviation σ.

The frequency distribution of a particular random sample that we have drawn. We know its size n and its mean x̄. The variable x in our sample is the same variable x in our population.

The sampling distribution of sample means. (You never see this distribution; you just make use of it!) There exists a separate sampling distribution of sample means for each possible sample size (each n). For any given n, this represents the frequency distribution of all possible sample means from all possible samples drawn randomly from that population whose mean and standard deviation along variable x are µ and σ, respectively.

For the specific sample that we have drawn, our sample mean x̄ will be one point (one value of x̄) on that sampling distribution. The central limit theorem enables us to find the distance from the sample's mean to the population's mean, expressed as standard errors or standard deviations of the sampling distribution. Since the sampling distribution is a normal curve, we may determine the probability of our sample's x̄ reflecting a population whose mean is µ and, based on that probability, either retain or reject our null hypothesis.
The Normality Assumption
Note that the central limit theorem assumes that the population we are studying is normally distributed along
variable x. This is called the normality assumption. If it is true, the sampling distribution of sample means
will be a normal distribution, and we may make use of the z formula to test for statistical significance. (Note
that nothing requires that our sample be normally distributed.) What if we know that the population is not
normally distributed, or more realistically, what if we have no basis for making a normality assumption about
the population in the first place? Even in such cases, if our sample's size is large enough, we may still be able
to make use of the central limit theorem due to the law of large numbers.
Normality assumption The assumption that that the population being studied is normally
distributed along variable x.
The law of large numbers states that if the size of the sample, n, is sufficiently large (no less than 30;
preferably no less than 50), then the central limit theorem will apply even if the population is not normally
distributed along variable x. Thus, if n is large enough, the population distribution need not be normal and
could, in fact, be anything: skewed, bimodal, trimodal, anything. When n is large enough, we relax the
normality assumption for our population, but the sampling distribution of sample means will still be a normal
curve, and the central limit theorem will still apply.
Law of large numbers A law that states that if the size of the sample, n, is sufficiently
large (no less than 30; preferably no less than 50), then the central limit theorem will apply
even if the population is not normally distributed along variable x.
How large must n be to relax the normality assumption? The figures given in the above theorem are rather
arbitrary; other sources give other cutoffs. In fact, in some texts of statistics for psychology (which often only
requires small samples or small experimental and control groups), the minimum sample size is as low as 15,
but that is probably too low Perhaps we ought to put it this way:
Page 23 of 47
Probability Distributions and One-Sample z and t Tests
SAGE
SAGE Research Methods
2006 SAGE Publications, Ltd. All Rights Reserved.
In social science survey research, our sample sizes are generally large enough to make use of the law of
large numbers. This is particularly fortunate, since in actual research all too often, the issue of the normality
assumption is not adequately addressed.
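The claim above can be illustrated with a short simulation. This is a sketch of my own (the function name `sampling_distribution_of_means` is not from the text): we draw repeated samples of n = 50 from a heavily skewed, decidedly non-normal population and check that the sampling distribution of sample means still centers on the population mean.

```python
import random
import statistics

def sampling_distribution_of_means(population_draw, n, trials, seed=42):
    """Draw `trials` random samples of size n and return their sample means."""
    rng = random.Random(seed)
    return [statistics.mean(population_draw(rng) for _ in range(n))
            for _ in range(trials)]

# A heavily right-skewed (exponential) population with mean 1.0 --
# clearly not normal, so the normality assumption does not hold.
draw = lambda rng: rng.expovariate(1.0)

means = sampling_distribution_of_means(draw, n=50, trials=2000)

# Per the law of large numbers (n >= 50), the mean of the sampling
# distribution of sample means should still be close to the population mean.
grand_mean = statistics.mean(means)
```

With n = 50 the grand mean lands very close to the population mean of 1.0, even though no single observation comes from a normal curve.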
Let us look at an example. At a small liberal arts college, an index of support for civil liberties, ranging from
0 (least supportive) to 10 (most supportive), was pilot tested on the entire student body, yielding a mean of
7.5 and a standard deviation of 1.5. A random sample of 100 students who had been the direct victims or
close relatives of victims of serious crimes was also given the test, and their mean score was 7.2. May we
conclude that for all similar victims, the support score for civil liberties differs in general from the population of
all students at that college?
Our hypotheses are
H0: μ = 7.5
H1: μ ≠ 7.5
Since n = 100, we may relax the normality assumption for the population. We have all necessary data for a
one-sample z test:
z = (x̄ − μ) / (σ/√n) = (7.2 − 7.5) / (1.5/√100) = −0.30/0.15 = −2.00
We compare the absolute value of z to the two-tailed zcritical values: 2.00 exceeds 1.96 (.05 level) but not 2.58 (.01 level).
We conclude, therefore, that the civil liberties support score for all serious crime victims at this college is lower
than the average for the college as a whole (p < .05). (The sampling distribution is shown in Figure 8.10.)
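The arithmetic of this example can be checked with a short sketch (the helper name `one_sample_z` is my own, not the text's):

```python
import math

def one_sample_z(sample_mean, mu, sigma, n):
    """One-sample z test: how many standard errors is x-bar from mu?"""
    standard_error = sigma / math.sqrt(n)
    return (sample_mean - mu) / standard_error

# Civil liberties example: mu = 7.5, sigma = 1.5, x-bar = 7.2, n = 100
z = one_sample_z(7.2, mu=7.5, sigma=1.5, n=100)   # ≈ -2.0

# Two-tailed test at the .05 level: compare |z| to zcritical = 1.96
reject_h0 = abs(z) > 1.96   # True, so p < .05
```

Since |−2.00| exceeds 1.96 but not 2.58, we reject H0 at the .05 level only, matching the conclusion above.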
THE ONE-SAMPLE t TEST
We know that to do the one-sample z test, we need to know or be able to hypothesize two population
parameters, μ and σ. What could we do in the unlikely event that we know μ but not σ? Initially, the sample
standard deviation s was assumed to be a good estimate of σ, so s was substituted when σ was unknown.
Once the sample mean x̄ had been calculated, s was generated using the definitional formula
s = √(Σ(x − x̄)² / n)
or one of several possible computational formulas.
Figure 8.10 The Sampling Distribution of Sample Means for Civil Liberties Support Scores
However, it was discovered that, particularly when the sample size n was small, calculating z with s produced
inaccurate conclusions. A British quality control expert2 working for a Dublin brewery discovered that by
calculating a different estimate of σ from sample data, a better test of significance could be developed. This
new best “unbiased” estimate of σ, which we designate σ̂ (read as “sigma-hat,” because sigma is wearing a
hat), is created when we substitute n − 1 for n in the standard deviation formula:
σ̂ = √(Σ(x − x̄)² / (n − 1))
Sigma-hat (σ̂) An estimate of sigma.
This new test of significance is called the t test to differentiate it from the z test; note that the formulas are the
same except that σ̂ is substituted for σ:
t = (x̄ − μ) / (σ̂/√n)
When n is large, the substitution of σ̂ for s makes very little difference, but as n gets smaller, σ̂ and s diverge,
causing a likewise divergence between t (using σ̂) and z (using s to estimate σ).
t test A test of significance similar to the z test but used when the population's standard
deviation is unknown.
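The difference between s (an n denominator) and σ̂ (an n − 1 denominator) can be seen directly. This sketch uses my own helper names; the key point, from the passage above, is that the two estimates diverge for small n and converge as n grows:

```python
import math

def s_stdev(xs):
    """Sample standard deviation with n in the denominator (s)."""
    m = sum(xs) / len(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))

def sigma_hat(xs):
    """Estimate of sigma with n - 1 in the denominator (sigma-hat)."""
    m = sum(xs) / len(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))

small = [4, 6, 7, 9]       # n = 4: s and sigma-hat diverge noticeably
big = list(range(100))     # n = 100: the two are nearly identical

# sigma-hat is always the larger of the two, and the gap shrinks as n grows
gap_small = sigma_hat(small) - s_stdev(small)
gap_big = sigma_hat(big) - s_stdev(big)
```

In Python's standard library, `statistics.pstdev` and `statistics.stdev` implement the same n and n − 1 versions, respectively.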
The sampling distributions of t and z also differ. In the case of the z test, the sampling distribution of sample
means is a normal curve. Since the value of each sample mean can be expressed as a z score (indicating the
distance x is from µ in terms of standard errors), the sampling distribution of sample means is the same as
the distribution of all the z scores from all the theoretically possible sample means that make up the sampling
distribution. Thus, the sampling distribution of z (all the zs from those sample means) is also a normal curve.
If we take the same means in our sampling distribution and calculate t scores instead, the sampling
distribution of t (all the ts from those sample means) is a normal distribution only when the sample sizes are
above 120. As the sample sizes fall below 120 (give or take), the sampling distribution begins to be flatter
than a normal curve (say platykurtic, if you want to impress your friends). When the curve is flatter than a
normal curve at its peak, the tails are also larger than those of a normal curve. (The effect is similar to pushing
a balloon down from its top, thus displacing the air to the sides as we press.) As n gets smaller, the peak
of the sampling distribution gets flatter, and its tails get larger. The important consequence is that as n gets
smaller, we must go ever-greater distances away from the mean to get a tail area equal to .05 proportion of
the area under the curve.
Figure 8.11 shows the changes in the critical values of t (.05 level, one-tailed) as n decreases. At n = 121,
the sampling distribution is nearly a normal curve, and tcritical is 1.658, only slightly larger than zcritical (.05
level, one-tailed), which is 1.65 (actually 1.645 before rounding). In fact, as n increases above 121, the critical
values of z and t get ever closer to each other. As n gets extremely large, approaching infinity as a limit, the
critical values of z and t become the same. However, as n drops below 121, the tcritical value gets larger. In
other words, we have to go farther out to get a tail with .05 of the area under the curve in it. By the time n =
21, tcritical has gone from 1.658 to 1.725, and at n = 6, tcritical has risen to 2.015.
Figure 8.11 Changes in the Sampling Distribution of t as Sample Size Decreases
In comparing the critical values of t to those of z, bear in mind that the sampling distribution of z is always
a normal distribution, and its critical values remain constant, independent of sample size. By contrast, the
critical values of t depend on sample size. At best, when n is large, the critical values of t are almost as
small as those of z. But as n decreases, the critical values of t get larger, making it harder to reject the null
hypothesis. Thus, if we know σ and can therefore do a one-sample z test, we always do the z test, not the
t test. We do the one-sample t test only if σ is unknown and we must estimate it with σ̂. In fact, when n
gets large (say 30 or more), many statisticians advocate the use of the z test, with s substituting for σ in the
formula, rather than the use of the t test. But with a smaller n where σ is unknown, we must always do the t
test and retain the normality assumption for the population.
Degrees of Freedom
Note that in Figure 8.11 under each of the three reported ns—121, 21, and 6—is another number labeled df,
which is one less than n—120, 20, 5. As we learned in Chapter 7, the df stands for degrees of freedom, a
number we generate to make use of a table of critical t values. In the case of the one-sample t test,
df = n − 1
Degrees of freedom A number that is generated to make use of a table of critical values.
We need to find the degrees of freedom in order to find the critical values of t against which we compare our
obtained t. As noted, the sampling distribution of t changes from a normal curve as n decreases, and thus the
critical values change as well. As we see in Figure 8.11, at 120 degrees of freedom (n = 121) we need a t of
1.658 to have one tail on the sampling distribution with a .05 area. By the time degrees of freedom drops to
5, we need a t of 2.015.
Tables of critical values for all tests of significance beyond the z test require that we first calculate a degrees-of-freedom figure to make use of the table. Why find df? Why not base the tables on n as we did the sampling
distributions in Figure 8.11? The simplest answer to the question is that there are several formulas that
generate t scores, not just the one presented in this chapter. Likewise, for each of the different t formulas,
there is a separate degrees-of-freedom formula. The formula df = n − 1 is used only for the one-sample t
test presented here. In the next chapter, we will discuss some of the other t formulas, each having its own
degrees-of-freedom formula, but all making use of a common table of critical values of t. Without degrees of
freedom, we would need a separate table of critical values for each separate formula.
There is a mathematical meaning to the concept of degrees of freedom, having to do with how many numbers
are free to vary in a formula. For instance, if x1 + x2 + x3 = 10 and you let any two of the scores vary (say
we make x1 = 2 and x2 = 5), then the remaining value of x is fixed. Since 2 + 5 = 7 and 7 + x3 = 10, once x1
and x2 are determined, x3 can take on only one value. In this case, x3 = 3. So three unknowns adding up to
a fixed sum has two degrees of freedom. Only two of the unknowns are free to vary. At the level of applied
statistics that we cover in this book, it is not really necessary to know the definition of degrees of freedom
to make use of the concept. So we will simply move on, referring the curious to more advanced texts. For
our purposes, degrees of freedom are simply numbers that we must calculate to make use of critical values
tables for t and the other tests of significance to be encountered later.
THE t TABLE
The table of critical values of t, found in Table 8.2 and also in the Appendix, is simple to use. At the top are
levels of significance for a one-tailed test (a directional H1), and below it are the corresponding levels for a
two-tailed test. Thus, tcritical one-tailed at the .10 level is the same as tcritical two-tailed at the .20 level. The
one-tailed probability levels are always one half of the corresponding two-tailed levels.
Since we always begin by comparing tobtained to tcritical at the .05 level, we first isolate the appropriate .05
column for whichever H1 (one-tailed or two-tailed) we are using. Then we go down the df column on the far
left until we come to the number that we found in the df formula. Noting the values highlighted earlier in Figure
8.11, if df is 120, we go all the way down the df column until we find 120. We then move across the row until
we are under the .05 level for a one-tailed test. At the intersection of the 120 row and the .05 column, we find
the critical value of t, 1.658. Likewise, in the same .05 column, we find the tcritical of 1.725 in the row for 20
degrees of freedom and 2.015 in the row for 5 degrees of freedom.
Table 8.2 Distribution of t
Under the df = 120 row, we note the symbol for infinity (an eight that has gone down for the count). In this
case, “infinity” is any df above 120. Here, the sampling distribution has become (or is in the act of becoming)
a more perfect normal curve. Note that at this point, there is no difference between the critical values of t and
those of z.
If you cannot find the df that you need in the table, go to the nearest critical value that makes it harder to
reject H0. In the case of Table 8.2, move up to the next lower df. For instance, if the df is 35, a number not
presented in the table, go up to 30 df and use those critical values. Thus, if the t obtained in a one-tailed test
at 35 df were 1.7, you would compare it to the .05 critical value at 30 degrees of freedom, 1.697. Since 1.7 is
greater than 1.697, you would reject H0. What if the obtained t were 1.690? That would be less (barely) than
1.697, and you could not reject H0 using this table. However, you would be right in assuming that had you
known tcritical at 35 degrees of freedom, there would be a good chance that it would be equal to or less than
your t of 1.690. In this case, consult a book of tables for statisticians, which would have a more complete t
table than the one used here.3
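The "next lower df" fallback rule can be sketched with a small lookup. The critical values hardcoded here are only the one-tailed .05 values this chapter itself quotes (df = 5, 20, 30, 120); the helper name `t_critical` is my own:

```python
# One-tailed .05 critical values of t quoted in this chapter
T_CRITICAL_05_ONE_TAILED = {5: 2.015, 20: 1.725, 30: 1.697, 120: 1.658}

def t_critical(df, table=T_CRITICAL_05_ONE_TAILED):
    """Look up tcritical; if df is absent, fall back to the next LOWER df
    in the table, which makes it harder (conservative) to reject H0."""
    usable = [d for d in table if d <= df]
    if not usable:
        raise ValueError("df too small for this table")
    return table[max(usable)]

# df = 35 is not in the table, so we fall back to the df = 30 row: 1.697
crit = t_critical(35)           # 1.697
rejects = 1.7 > crit            # True: an obtained t of 1.7 rejects H0
```

This reproduces the worked example above: an obtained t of 1.7 at 35 df is compared against 1.697 and rejects H0, while 1.690 would not.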
If the obtained t exceeds tcritical at the .05 level, you then compare it to the critical values to the right of
the .05 column. Following the same procedure used for the z test, you make your probability statement by
seeing how many critical values are less than the obtained t. The only difference is that in the t table, there
are critical values for levels other than .05, .01, and .001. Suppose at 60 df, we obtain a t value of 3.0 using a
nondirectional H1. Going down the .05 level column, for the two-tailed test, we see at the 60 df row a critical
value of 2.000. We can reject H0. We then compare our 3.0 obtained t to the critical values to the right of the
2.000 we exceeded. We exceed the 2.390 (.02 level) and the 2.660 (.01 level) but not the 3.460 critical value
at the .001 level. Thus, we report p < .01. Had this been a one-tailed test, we would be reporting p < .005.
AN ALTERNATIVE t FORMULA
We have been using the following formulas:
t = (x̄ − μ) / (σ̂/√n)
where
σ̂ = √(Σ(x − x̄)² / (n − 1))
Suppose you did not have access to σ̂ but did know the original sample standard deviation s.
Rather than recalculating, you may make use of the s in a modified t formula:
t = (x̄ − μ) / (s/√(n − 1))
Again, remember that if you get the standard deviation from either a computer printout or a calculator with
a standard deviation function built in, consult the appropriate manual to find out how that standard deviation
was calculated to determine whether you have an s or a σ. Then pick the appropriate t formula to use.
A z TEST FOR PROPORTIONS
The formula for the z test for sample means may be modified to test the difference in proportions in a sample
compared to the equivalent difference in proportions in a population.
ztest for proportions A z test designed to test whether the difference between
proportions in a sample reflects the difference in the population.
For instance, suppose that in some small community, the proportions of minorities (people of African or
Hispanic origin) make up 20% (.20 proportion) of the population. The new school superintendent suspects
that minorities are underrepresented among the 100 teachers in her public school system since there are only
15 minority faculty, 15% or a .15 proportion. For such a problem,
z = (Ps − Pp) / √(PpQp/n)
where
Ps = the proportion of minorities in the sample = .15,
Pp = the proportion of minorities in the population = .20,
Qp = the proportion of nonminorities in the population = 1 – Pp = 1 – .20 = .80,
n = the size of the sample or group being studied = 100.
Here,
z = (.15 − .20) / √((.20)(.80)/100) = −.05/.04 = −1.25
Using the directional zcritical at the .05 level of 1.65, we cannot reject H0 since 1.25 < 1.65. We cannot
conclude that minorities are underrepresented among the teachers.
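The teacher example works out as follows in code (the helper name `z_for_proportions` is my own):

```python
import math

def z_for_proportions(p_sample, p_pop, n):
    """z test for a sample proportion against a population proportion."""
    q_pop = 1 - p_pop
    standard_error = math.sqrt(p_pop * q_pop / n)
    return (p_sample - p_pop) / standard_error

# Teacher example: Ps = .15, Pp = .20, n = 100
z = z_for_proportions(0.15, 0.20, 100)   # ≈ -1.25

# One-tailed test at the .05 level: |z| must exceed 1.65 to reject H0
reject_h0 = abs(z) > 1.65                # False: cannot reject H0
```

Since 1.25 falls short of 1.65, we retain H0, as the passage concludes.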
Interval Estimation
We have already discussed the fact that if we did not know σ, our best estimate of it from sample data would
be σ̂. Likewise, our best estimate of µ would be x̄. Suppose we wanted to estimate μ from x̄. We know from
the sampling distribution of sample means that not all sample means will be exactly equal to μ, even though
our one x̄ is the best estimate of that parameter. With interval estimation, we establish an interval of scores
called a confidence interval, and we state with a certain level of confidence that the μ will fall within the limits
of the interval we created.
Interval estimation An interval of scores that is established, within which a population's
mean (or another parameter) is likely to fall, when that parameter is being estimated from
sample data.
Confidence interval (for means and proportions) An estimated interval within which we
are “confident”—based on sampling theory—that the parameter we are trying to estimate
will fall.
For instance, we can see from our sampling distribution that with no directionality assumption, 95% of all
sample means lie between µ and ± 1.96 standard errors. Likewise, 99% lie between µ and ±2.58 standard
errors. The number of standard errors corresponds to the two-tailed zcriticals at the .05 and .01 levels,
respectively. Also, 99.9% of all sample means lie between µ and ± 3.29 standard errors, and 3.29 is the
critical z at the .001 level. Suppose we would be satisfied to find the interval within which 95% of all sample
means would fall. We build an interval around the x̄ and assume that µ will fall within that interval. We
call this the 95% confidence interval, our level of confidence corresponding to the percentage of all means
falling within the interval. Thus, we are 95% confident that µ will lie in the interval between x̄ − 1.96σx̄ and
x̄ + 1.96σx̄.
Remembering that we already know that σx̄ = σ/√n, we find our confidence interval by the following
formula:
c.i. = x̄ ± z(σ/√n)
Suppose x̄ = 55, σ = 10, and n = 64. The upper limit of our interval would be
55 + 1.96(10/√64) = 55 + 1.96(1.25) = 55 + 2.45 = 57.45
Our lower limit would be
55 − 1.96(10/√64) = 55 − 2.45 = 52.55
Thus, the 95% confidence interval for estimating µ is 52.55 to 57.45. We know that 95% of all sample means
fall within the interval, so we are 95% confident that µ will be between 52.55 and 57.45.
Suppose we wanted a greater level of confidence, 99%. The price we would pay for it would be a wider
confidence interval. For our upper limit,
55 + 2.58(10/√64) = 55 + 3.23 = 58.23
and for our lower limit,
55 − 2.58(10/√64) = 55 − 3.23 = 51.77
We are 99% confident that µ falls between 51.77 and 58.23.
If σ is unknown, which is generally the case, we may do exactly the same procedure with the t test using
either of the following formulas:
c.i. = x̄ ± t(σ̂/√n)    or    c.i. = x̄ ± t(s/√(n − 1))
The nondirectional tcritical at df = n − 1 at the .05 level would be used for a 95% confidence interval, the
tcritical at the .01 level would be used for a 99% confidence interval, and so on.
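The worked example for x̄ = 55, σ = 10, n = 64 can be reproduced directly (the helper name `mean_confidence_interval` is my own):

```python
import math

def mean_confidence_interval(x_bar, sigma, n, z_crit=1.96):
    """c.i. = x-bar ± z(sigma / sqrt(n)), for known sigma."""
    margin = z_crit * sigma / math.sqrt(n)
    return (x_bar - margin, x_bar + margin)

# Chapter example: x-bar = 55, sigma = 10, n = 64
low95, high95 = mean_confidence_interval(55, 10, 64)               # 52.55, 57.45
low99, high99 = mean_confidence_interval(55, 10, 64, z_crit=2.58)  # 51.775, 58.225
```

The 99% limits come out as 51.775 and 58.225 before rounding; the chapter rounds the margin (3.225 → 3.23) first and reports 51.77 and 58.23. Note how the 99% interval is wider than the 95% interval, the "price" described above.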
Confidence Intervals for Proportions
Imagine that you are a campaign manager of a presidential candidate in a two-person race. A telephone
survey of 900 voters gives your candidate a 53% lead over the opponent. How well does that percentage
lead reflect the actual electorate? You seek to construct a 95% confidence interval around the .53 proportion that
your candidate received in the sample. The formula we use is
c.i. = Ps ± z√(PpQp/n)
The 1.96 is the appropriate critical value of z—in this case, at the .05 level since we chose a 95% confidence
interval.
Ps = your candidate's proportion of support in the sample = .53.
Pp = your candidate's proportion of support in the population, which we estimate with Ps. Thus,
Ps = Pp = .53.
Qp = the opponent's proportion of support in the population, which we estimate from the sample
by subtracting Ps from 1. Thus Qp = 1 – Pp = 1 – Ps = 1 – .53 = .47.
n = the number of cases, which must equal or exceed 5/min(Ps, 1 – Ps), that is, 5 divided by
whichever is smaller, Ps or 1 – Ps.
Thus, the upper limit of our 95% confidence interval is
.53 + 1.96√((.53)(.47)/900) = .53 + .03 = .56
The lower limit would be
.53 − 1.96√((.53)(.47)/900) = .53 − .03 = .50
So our confidence interval ranges from .50 to .56. Since in percentages, this is 50% to 56%, a range of 6
percentage points, we report that according to our poll, our candidate has a 53% lead, but our margin of error
is plus or minus 3 percentage points. Our candidate could receive as little as 50% or as much as 56%. If we
had chosen a 99% confidence interval, our confidence interval would be larger and so would the margin of
error reported.
When n is small or if we want to be particularly sure of our estimate, it is safer to make a more conservative
estimation of Pp than to use Ps. Here, we assume that each candidate has half of the vote. Thus, Pp = .50
and Qp = .50. This will yield a larger confidence interval than any other estimate of Pp would generate. By
widening the interval, we minimize the risk in making our estimate. Suppose Ps were .53, but n = 150 instead
of 900. We estimate Pp and Qp as .50, respectively. For the upper limit,
.53 + 1.96√((.50)(.50)/150) = .53 + .08 = .61
And our lower limit would be
.53 − 1.96√((.50)(.50)/150) = .53 − .08 = .45
Here, our 95% confidence interval ranges from .45 to .61, and we have an 8 percentage point margin of error.
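Both polling examples can be checked with one helper (the name `proportion_confidence_interval` and its `conservative` flag are my own framing of the chapter's two estimation strategies):

```python
import math

def proportion_confidence_interval(p_s, n, z_crit=1.96, conservative=False):
    """c.i. = Ps ± z * sqrt(Pp*Qp/n), estimating Pp with Ps, or with .50
    when `conservative` is set (the safer choice for small n)."""
    p_p = 0.50 if conservative else p_s
    margin = z_crit * math.sqrt(p_p * (1 - p_p) / n)
    return (p_s - margin, p_s + margin)

# n = 900, estimating Pp with Ps = .53: interval ≈ (.50, .56)
low, high = proportion_confidence_interval(0.53, 900)

# n = 150 with the conservative Pp = .50: interval ≈ (.45, .61)
low_c, high_c = proportion_confidence_interval(0.53, 150, conservative=True)
```

Shrinking n from 900 to 150 widens the margin of error from about 3 points to about 8, exactly as the passage reports.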
More on Probability
Suppose we have developed a scale to be used in a survey. This scale measures the extent to which the
respondent is aware of and knowledgeable about HIV and AIDS. Assume that the scale ranges from a low of
0 to a high of 100 and is a normal distribution with a mean of 50 and standard deviation of 15. We may thus
apply the z formula to this distribution to determine the proportion of cases falling within a specified range of
scores. Let us use this scale to extend our discussion of probability.
We begin by outlining some new notations and defining them.
P(A) = the probability of outcome A occurring.
P(A or B) = the probability of either outcome A or outcome B occurring.
P(A and B) = the probability of both outcomes A and B occurring jointly.
P(A | B) = the probability of outcome A occurring given that outcome B has already occurred
(conditional probability).
Conditional probability The probability of outcome A occurring given that outcome B has
already occurred.
Let us illustrate using our AIDS awareness scale. Suppose outcome A is the probability of selecting an
individual with an AIDS awareness score of 70 or above. Since x = 70, µ = 50, and σ = 15, we apply the z
formula and find z = 1.33. Looking at Table 8.1, column C, we find a probability of .0918. Thus, P(A) = .0918.
Let outcome B be the probability of selecting someone with an AIDS awareness score of 40 or below.
Plugging into the z formula, we obtain a z of −0.66 and find a probability of .2546 from Table 8.1. Thus, P(B)
= .2546.
The Addition Rule
Suppose we would like to know the probability of selecting someone whose AIDS awareness score is either
70 or above or 40 or below, P(A or B). Now outcomes A and B are known as mutually exclusive outcomes.
If one has an AIDS awareness score above 70, one cannot also have an AIDS awareness score below 40.
When outcomes are mutually exclusive, a rule known as the addition rule tells us that
P(A or B) = P(A) + P(B)
In this case,
P(A or B) = .0918 + .2546 = .3464
Addition rule A rule by which when outcomes are mutually exclusive, the probability of
either outcome occurring is the sum of the probabilities of each outcome occurring.
If we had included a third outcome, outcome C, such as awareness between 50 and 55, we would calculate
a z of 0.33 and, looking this time at column B of Table 8.1, find a probability of .1293.
Therefore,
P(A or B or C) = .0918 + .2546 + .1293 = .4757
When our events are not mutually exclusive but overlap, we must apply a more complex addition rule.
Suppose outcome A remains a score of 70 and above, and we add another outcome, outcome D. If outcome
D is the probability of selecting a respondent with AIDS awareness between 50 and 75, z will be 1.66, and
column B of Table 8.1 will yield a probability of .4515. This time, however, we cannot simply add P(A) to P(D)
to find P(A or D) since our outcomes are no longer mutually exclusive. Anyone with a score between 70 and
75 will belong jointly to both outcomes. To account for this, we must expand the addition rule as follows:
P(A or D) = P(A) + P(D) − P(A and D)
We will see in a moment how P(A and D) is determined, but for now assume that we are told that it is .0414.
Therefore,
P(A or D) = .0918 + .4515 − .0414 = .5019
Note that in the first example, P(A or B), A and B had no overlap, so P(A and B) = 0. Applying the longer
addition rule,
P(A or B) = .0918 + .2546 − 0 = .3464
This matches the result found earlier.
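The general addition rule covers both cases above in one function (the helper name `p_or` is my own; the probabilities are the chapter's):

```python
def p_or(p_a, p_b, p_a_and_b=0.0):
    """General addition rule; p_a_and_b is 0 for mutually exclusive outcomes."""
    return p_a + p_b - p_a_and_b

# Mutually exclusive: score 70+ (A) or score 40 or below (B)
p_a, p_b = 0.0918, 0.2546
p_a_or_b = p_or(p_a, p_b)             # .3464

# Overlapping: score 70+ (A) or score between 50 and 75 (D)
p_d, p_a_and_d = 0.4515, 0.0414
p_a_or_d = p_or(p_a, p_d, p_a_and_d)  # .5019
```

When the outcomes cannot overlap, the joint term defaults to zero and the function reduces to the simple addition rule.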
The Multiplication Rule
As with the addition rule, there are two forms of the multiplication rule, the rule that we use to find P(A and
D). The simple form of this rule applies when the outcomes or events are independent of one another—when
neither event influences the probability of the other event occurring. Symbolically,
P(A | B) = P(A) and P(B | A) = P(B)
Multiplication rule A rule that is used to find P(A and D).
In these two events, A and D are independent; that is, neither event will affect the probability of the other
event's occurrence. In our example, determining the probability of selecting someone whose AIDS awareness
is 70 or more has no impact on determining the probability of selecting someone with an awareness score
between 50 and 75. Two z scores are calculated independently of one another.
In the case of independent events, the multiplication rule becomes
P(A and D) = P(A) × P(D)
In our example,
P(A and D) = .0918 × .4515 = .0414
That was how the value of P(A and D) used in the addition rule above was determined. Like the addition rule,
the multiplication rule can be extended to more than two independent events.
What about nonindependent events? Let us assume that anyone with a score of 65 or greater has high AIDS
awareness. Here z = 1.00, and column C of Table 8.1 shows that the probability of selecting a high-awareness
person is .1587. If we have a finite group of 13 individuals, we would expect to find .1587 × 13 or 2.06 high-awareness scores.
Assume, therefore, that we have 13 people, 2 of whom have high AIDS awareness. What is the probability
of making two selections from the group and selecting the 2 high-awareness individuals? The events
are nonindependent since the outcome of the first selection has an impact on the second selection. The
probability of getting a high-awareness scorer on the first draw would be 2/13 or .1538.
If we select a high scorer on the first draw, the probabilities change on the second draw: There are now 12
people, 1 of whom is a high-awareness scorer. The probability of selecting a high scorer on the second draw
after having selected a high scorer on the first draw, P(H2 | H1), is 1/12 or .0833. (Here H stands for high
scorer; 1 and 2 for the first and second draws, respectively.) So
P(H1 and H2) = P(H1) × P(H2 | H1) = (.1538)(.0833) = .0128
In more general notation,
P(A and B) = P(A) × P(B | A)
If two events are statistically independent events, then P(B | A) = P(B), and
P(A and B) = P(A) × P(B)
Thus, we return to the simpler formula for the multiplication rule.
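The two-draws-without-replacement example is easy to check with exact fractions (using Python's standard `fractions` module; the variable names are my own):

```python
from fractions import Fraction

# 13 people, 2 of whom have high AIDS awareness; select two without replacement.
p_h1 = Fraction(2, 13)            # first draw: 2 of 13  (≈ .1538)
p_h2_given_h1 = Fraction(1, 12)   # second draw: 1 of 12 (≈ .0833)

# Multiplication rule for nonindependent events:
# P(H1 and H2) = P(H1) * P(H2 | H1)
p_both = p_h1 * p_h2_given_h1     # 1/78 ≈ .0128
```

The exact answer is 1/78, which agrees with the decimal product .1538 × .0833 ≈ .0128 in the passage.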
Permutations and Combinations
Earlier in this chapter, we discussed how many different samples of n = 3 we could draw from a population
where N = 5. Recall that there were 60 possible samples when the order of selection was considered. When
we drew first a 5 then a 4 then a 3, we considered it a different sample than when we drew first the 4 then the
5 then the 3. The total possible samples that can be drawn from a population when the order of selection is
a factor is called a permutation. The total possible samples when the order of selection is ignored is called
a combination. In the earlier example, there were 10 different possible samples when the order of selection
was ignored.
Permutation The total possible samples that can be drawn from a population when the
order of selection is a factor.
Combination The total possible samples when the order of selection is ignored.
The formulas for finding permutations and combinations are presented below, where N is the number of items
in the population, K is the size of the sample to be drawn, and n! (read n factorial) is a number times each
number lower than itself down to 1. (For example, 5! = 5 × 4 × 3 × 2 × 1 = 120; 4! = 4 × 3 × 2 × 1 = 24; 3! =
3 × 2 × 1 = 6; and so on. Note that we never include zero in the multiplication or our answer would always be
zero!)
For permutations, indicated by the letter P,
NPK = N! / (N − K)!
In our example, N = 5 and K = 3, so
5P3 = 5! / (5 − 3)! = 120 / 2 = 60
For combinations, indicated by the letter C,
NCK = N! / (K!(N − K)!)
In our example,
5C3 = 5! / (3!(5 − 3)!) = 120 / 12 = 10
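The permutation and combination formulas translate directly into code using factorials (the function names are my own):

```python
import math

def permutations(n_items, k):
    """NPK = N! / (N - K)! -- order of selection counts."""
    return math.factorial(n_items) // math.factorial(n_items - k)

def combinations(n_items, k):
    """NCK = N! / (K!(N - K)!) -- order of selection ignored."""
    return math.factorial(n_items) // (
        math.factorial(k) * math.factorial(n_items - k))

# Chapter example: samples of size 3 from a population of 5
n_perms = permutations(5, 3)   # 60
n_combs = combinations(5, 3)   # 10
```

Python 3.8+ also ships `math.perm` and `math.comb`, which compute the same quantities directly.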
Conclusion
In Chapters 7 and 8, all the basic elements of tests of statistical significance have been presented in a
time-honored sequence, moving from the normal distribution in its basic form to the one-sample z test and
then to the one-sample t test. As stated earlier, every test that follows in this text also follows the same
logical assumptions and basic procedures, starting with the formulation of H0 and H1, calculating the df (if
appropriate), comparing the obtained value to critical values of that statistic, reaching a decision as to whether
or not to reject H0, and, if H0 is rejected, formulating the appropriate probability statement.
However, the tests presented so far have only limited value: since they are one-sample tests, we are
comparing data from that one sample to data from a population. Rarely do we know population parameters
such as µ and σ, although it might be possible to estimate them, and rarely do we know if it is valid to
assume that these populations, in fact, are normally distributed for the variable in question. More often, we are
comparing the means of two or more samples, and we know no population parameters at all. Often, we have
problems involving nominal or ordinal levels of measurement when a comparison of means is inappropriate.
We cover tests for these purposes in the following chapters.
Chapter 8: Summary of Major Formulas
Exercises
Exercise 8.1
An index of cognitive awareness is normally distributed with a mean of µ = 8.9 and a standard deviation of σ
= 3.1.
What proportion of people would be expected to have awareness scores of 14 and above?
What proportion would have scores between 8.9 and 14?
What proportion would have scores below 14?
What proportion would have scores of 8.5 and below?
What proportion would have scores between the mean and 8.5?
What proportion would have scores ranging from 8.5 to 14?
What proportion would have scores either less than 8.5 or greater than 14?
Remembering that the areas under the normal curve are also the probabilities of randomly
selecting someone with a particular characteristic, what is the probability of randomly selecting a
person with an awareness score of 12.5 or more?
What is the probability of randomly selecting a person with an awareness score between 6 and
8.9?
What is the probability of randomly selecting someone whose awareness level is between 6 and
12.5?
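For checking answers to Exercise 8.1, the areas under the normal curve can be computed with the standard normal cumulative function instead of Table 8.1. This sketch uses `math.erf` from the standard library (the helper names `phi` and `proportion_above` are my own):

```python
import math

def phi(z):
    """Standard normal cumulative probability, Phi(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def proportion_above(x, mu, sigma):
    """Proportion of a normal distribution at or above score x."""
    return 1.0 - phi((x - mu) / sigma)

# First question of Exercise 8.1: mu = 8.9, sigma = 3.1, scores of 14 and above
z = (14 - 8.9) / 3.1                 # ≈ 1.65
p = proportion_above(14, 8.9, 3.1)   # ≈ .05
```

Table lookups round z to two places, so answers from this function may differ from the table in the fourth decimal.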
Exercise 8.2
Each of the following problems requires either a one-sample z or a one-sample t test. Select the appropriate
test and perform it. Assume a nondirectional H1 unless the wording of the problem suggests otherwise. For
each test, indicate whether or not the normality assumption may be relaxed for the population. In doing the t
test, make sure you are using the appropriate formula; that is, are you given σ̂ or s?
Suppose you know that for the entire United States, the mean age of the population is 32, with
a standard deviation of 14.5 years. Since many retired people move to Florida, you believe that
the mean age of all Florida residents is greater than that for the United States as a whole. You
randomly select a sample of 144 Floridians and obtain a mean sample age of 34.
For the Miami metropolitan area, the mean age of a random sample of 25 residents is 36.5, with
a standard deviation of s = 16 years. Compared to the United States (data given in Part 1), what
may we conclude about Miami residents?
Suppose the sample size in Part 2 had been n = 64. What would your conclusion be?
A scale designed to measure support for gun control legislation has been developed. It ranges
from 0 to 10, with 10 meaning strongest support for such actions as outlawing “Saturday night
specials” and semi-automatic weapons. Suppose it has been determined that for the entire
population of the state of Maryland, the mean support score is 6.0. A random sample of 100
residents of Maryland's Eastern Shore yields a sample support score of 4.8 with σ̂ = 4.0. What
do you conclude?
For Baltimore County, a random sample of n = 81 has a mean of 7.0 and a standard deviation of
σ̂ = 4.5. (For this problem and the ones that follow, use the population figures given in Part 4.)
What do you conclude for each one?
For Baltimore City, a random sample of 31 residents produces a mean of 8.0 and a standard
Page 43 of 47
Probability Distributions and One-Sample z and t Tests
SAGE
SAGE Research Methods
2006 SAGE Publications, Ltd. All Rights Reserved.
deviation of s = 3.0.
A random sample of 170 members of the National Rifle Association who live in Maryland yields a
mean of 1.5 and an s = 1.25.
To ascertain the attitudes of all residents of the city of Cumberland, Maryland, a random sample
of 9 members of that city's police department was interviewed. The sample's mean was 8.5, and
its σ̂ was 3.0.
Exercise 8.3
At a state's maximum-security penitentiary, all inmates have taken a battery of psychological tests. Following
are the means and standard deviations for several selected indices developed from those tests.
A random sample of 50 inmates at low- and medium-security institutions in the same state yields the following:
Using one-sample z tests (two-tailed), test for significant differences between these two groups for
What are your conclusions?
Exercise 8.4
Following are the maximum-security penitentiary population means for three other indices.
For the low- and medium-security sample, n = 50, the statistics are as follows:
Using a nondirectional one-sample t test, test for significance and state your conclusions.
Exercise 8.5
Suppose that for the population, it is known that 51% are women and 49% are men. Suppose random
samples of 50 individuals each are drawn from the following occupations, and the proportion of women in
each sample is ascertained to be as follows:
For each of the six samples, test for significance (nondirectional) the null hypothesis that the proportion of
women in each sample equals the proportion of women in the population.
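The six sample proportions for Exercise 8.5 appear in a table not reproduced here, so the sketch below uses a hypothetical sample proportion (p = .30) purely to show the test's form:

```python
# One-sample z test for a proportion (Exercise 8.5 pattern).
P, Q, n = 0.51, 0.49, 50      # population proportion of women, and n
p_s = 0.30                    # hypothetical sample proportion (stand-in)

se = (P * Q / n) ** 0.5       # standard error of a proportion
z = (p_s - P) / se

print(round(z, 2))
# For this illustrative sample |z| exceeds 1.96, so the nondirectional
# H0 (sample proportion equals .51) would be rejected at the .05 level.
```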
Exercise 8.6
A mental health assessment instrument designed to measure a person's mental health level on a 30 to 70
scale is known to have a population standard deviation of σ = 12. A random sample of n = 25 yields a mean
of x̄ = 50.
Generate a 95% confidence interval for estimating μ.
Generate a 99% confidence interval.
Suppose σ is unknown, but the sample yields a σ̂ = 11. Generate a 95% confidence interval.
Suppose σ is unknown, but the sample's standard deviation is s = 9. Generate a 99% confidence
interval.
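Parts 1 and 2 of Exercise 8.6, where σ is known, can be checked with z-based intervals (a sketch, not part of the text):

```python
from statistics import NormalDist

sigma, n, xbar = 12.0, 25, 50.0
se = sigma / n ** 0.5                        # 12 / 5 = 2.4

for conf in (0.95, 0.99):
    z = NormalDist().inv_cdf((1 + conf) / 2)     # 1.96 or 2.576
    lo, hi = xbar - z * se, xbar + z * se
    print(f"{conf:.0%} CI: ({lo:.2f}, {hi:.2f})")
# 95% CI: (45.30, 54.70); 99% CI: (43.82, 56.18)
```

Parts 3 and 4, where σ is unknown and σ̂ or s is supplied, require t rather than z multipliers, taken from the t table at df = n − 1.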
Exercise 8.7
A telephone survey of 250 voters shows a local school tax levy passing with 55% of the vote.
Construct a 95% confidence interval.
Do the same confidence interval, but assume Pp = Qp = .50. What are your conclusions?
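A sketch of Exercise 8.7's proportion interval (not from the text):

```python
# Confidence interval around a sample proportion.
p_s, n = 0.55, 250
z95 = 1.96

se = (p_s * (1 - p_s) / n) ** 0.5          # using the sample p and q
print(f"({p_s - z95 * se:.3f}, {p_s + z95 * se:.3f})")

# Conservative version with Pp = Qp = .50:
se50 = (0.25 / n) ** 0.5
print(f"({p_s - z95 * se50:.3f}, {p_s + z95 * se50:.3f})")
# Both intervals dip below .50, so the poll cannot
# guarantee that the levy passes.
```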
Exercise 8.8
In a training session, a group of managers will be asked to fill out an inventory designed to evaluate their
ability to solve common management problems. The creators of this inventory have estimated that for the
population of all managers, the mean is 10 and the standard deviation is 3. The possible scores on the
inventory range from 0 to 20. Assume normality.
What is the probability of randomly selecting an individual with an inventory score of 15 or above?
(outcome A)
What is the probability of selecting someone whose score is 7 or below? (outcome B)
What is the probability of randomly selecting someone with a score either 15 and above or 7 and
below?
What is the probability of selecting someone with a score between 10 and 11? (outcome C)
What is the probability of selecting a person whose scores are either 15 and above, 7 or below,
or between 10 and 11?
If outcome A remains a score of 15 or above and outcome D is the probability of selecting
someone with a score between 10 and 12, what is the probability of randomly selecting a person
whose score is either 15 and above or between 10 and 12?
Assume that a score of 15 or above is considered an indicator of a very good manager (outcome
A above). If 43 managers filled out the inventory, what is the probability of making 2 random
selections from this group and obtaining 2 very good managers?
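The outcome probabilities in Exercise 8.8 follow from the normal curve, and mutually exclusive outcomes add. The final part can be read as sampling without replacement; the rounding of 43 × P(A) to a whole number of "very good" managers below is one plausible reading, not the book's stated method:

```python
from statistics import NormalDist

# Exercise 8.8: mu = 10, sigma = 3, scores assumed normal.
d = NormalDist(10, 3)

p_A = 1 - d.cdf(15)           # score 15 or above  (z = 1.67)
p_B = d.cdf(7)                # score 7 or below   (z = -1)
p_C = d.cdf(11) - d.cdf(10)   # between 10 and 11
p_A_or_B = p_A + p_B          # mutually exclusive outcomes add

print(round(p_A, 4), round(p_B, 4), round(p_A_or_B, 4))

# Two draws without replacement from the 43 managers, assuming
# round(43 * p_A) of them are "very good":
n_good = round(43 * p_A)
p_two_good = (n_good / 43) * ((n_good - 1) / 42)
print(n_good, round(p_two_good, 4))
```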
Exercise 8.9
How many different samples of size 3 can be drawn from a population of 6? If we disregard the
order of selection, how many different samples can be drawn?
For a population N = 8 and sample size K = 3, calculate the possible number of permutations and
combinations.
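Exercise 8.9 distinguishes ordered samples (permutations) from unordered samples (combinations); Python's standard library computes both directly:

```python
import math

# Ordered samples are permutations; disregarding order gives combinations.
print(math.perm(6, 3), math.comb(6, 3))   # N = 6, K = 3
print(math.perm(8, 3), math.comb(8, 3))   # N = 8, K = 3
```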
Notes
Be aware that there are several other ways of wording the central limit theorem and the law of
large numbers. In addition, these two are sometimes combined into a single theorem.
W. S. Gosset, the expert, published his findings using the pen name Student. Thus, this test is
often called Student's t.
Unfortunately, it is hard to find more complete t tables that are relatively simple to read. Try H.
Arkin and R. Colton, eds., Tables for Statisticians (College Outline Series) (New York: Barnes
& Noble, 1963), p. 121, or H. R. Neave, Statistical Tables for Mathematicians, Engineers,
Economists and the Behavioural and Managerial Sciences (London: Allen & Unwin, 1978), p. 41.
KEY CONCEPTS
http://dx.doi.org/10.4135/9781412985987.n8
Two-Sample t Tests
In: Statistics for the Social Sciences
By: R. Mark Sirkin
Pub. Date: 2011
Access Date: February 28, 2022
Publishing Company: SAGE Publications, Inc.
City: Thousand Oaks
Print ISBN: 9781412905466
Online ISBN: 9781412985987
DOI: https://dx.doi.org/10.4135/9781412985987
Print pages: 271-316
© 2006 SAGE Publications, Inc. All Rights Reserved.
This PDF has been generated from SAGE Research Methods. Please note that the pagination of the
online version will vary from the pagination of the print book.
Two-Sample t Tests
PROLOGUE
With this chapter, we come back from the theoretical and study a family of tests with widespread research
applications. Recall that in Chapter 7's prologue, we wanted to study juvenile crime but we couldn't study
every juvenile criminal. We now know that we can use random samples (which are small enough for us to
study) in place of populations (which are too large for us to study). So maybe now we have two samples. One
is of juvenile offenders who did time in a detention facility, and the other is a sample of similar offenders who
received probation instead of detention. You as a researcher have developed an alienation index, which you
administer to everyone in each sample. You then calculate a mean alienation score for each sample: those
in detention and those on probation. Are the sample differences large enough to conclude differences in the
populations?
Another example from Chapter 7's prologue was the study of married couples. Suppose you have a group
of couples who are having problems in their relationships and you want to test the efficacy of a particular
marriage counseling technique. You take your couples and randomly assign each couple to one of two
groups. One group gets the counseling, and the other one (the control group) doesn't. When done, you may
compare a variety of variables to see if there are differences between the two groups, with (hopefully) the
group getting counseling showing improvement in their interpersonal relationships, as compared to the control
group.
Introduction
Like the one-sample t test, the two-sample t test is a comparison of two means, except that both means
are sample means. We no longer know population parameters, though, as before, we must assume that the
populations are normally distributed along the variable of interest, unless both samples are large enough
to relax these normality assumptions. We compare the two sample means to generalize about a difference
between the two respective population means. The null and alternative hypotheses are identical to those in
the one-sample t test, and H1 may be either nondirectional or directional. Since we often do not know or have
no basis for estimating the population parameters necessary for the one-sample test, the two-sample t test is
far more commonly used in actual research situations, where sample statistics alone compose the available
data.
The two-sample t test is more complicated than the one-sample variety. For one thing, this is really a family
of tests, and the researcher must select from this family the formula most appropriate to the data. As we will
see, this selection is based on a variety of factors, such as the way in which the samples were selected and
whether we may assume that the variances of the populations from which the samples were drawn are equal
in magnitude. Second, in most instances, the two-sample t formulas are more complex than the one-sample
formula and take longer to calculate. Third, there is one instance in which the degrees-of-freedom formula
is so onerous that many textbooks leave it out altogether or, as in this book, provide the formula but also
give an easier-to-calculate approximation of it. Despite these difficulties, the calculations may still be made
in a reasonable period of time, and because of its widespread usage, it is a particularly important test to
understand.
Independent Samples Versus Dependent Samples
The first kind of two-sample t test we discuss assumes that there are two samples (or groups) being compared
and that the samples are independent; that is, the composition of one sample is in no way matched or paired
to the composition of the other sample. Thus, the two samples reflect two separate populations. For example,
we select a random sample of 50 men and another random sample of 50 women to investigate gender-determined views on social issues. Each sample is selected independently of the other. For each sample, we
know its size (n), its mean (x̄), and its variance (s²) or, alternatively, its σ̂². (For now, we assume we are
using s² rather than σ̂².)
Independent samples The composition of one sample is in no way matched or paired
to the composition of the other sample. Thus, the two samples reflect two separate
populations.
Our H0 is μ1 = μ2, and our H1 nondirectional is μ1 ≠ μ2 (or, if H1 is directional, either μ1 > μ2 or μ1 < μ2,
depending on prior knowledge).
In experimental research, the procedure would be to take a pool of subjects (or, as they are often called today,
participants) and randomly assign some of the subjects to the experimental group, which will receive some
experimental treatment. The remaining participants will compose the control group, which will not receive any
treatment. The procedure for selecting the experimental group is identical to drawing a random sample from
the available pool of participants. The random assignment of people to the experimental group implies that
the control group will also have been randomly assigned. Suppose that out of a pool of 11 people, 6 are
chosen to be in an experimental group that will watch a one-minute television commercial for a well-known
product. The format of this commercial has been used previously with similar products and has a history of
raising the viewers' levels of preference for them. Will it work here? The experimental group watches the ad
and then rates the product, assigning a favorability score that ranges from 0 to 10. The five-member control
group rates the product without seeing the advertisement.
The random assignment of the two groups should ensure that only viewing versus not viewing the commercial
would account for any difference in favorability scores between the groups. A mean is calculated for each
group, and the two means are compared. If the means differ from one another, is it due to a real difference
(as the result of the commercial) or is it due to random error or chance? In other words, if the experimental
group's mean is higher than the other, there is a possibility that, by chance, more participants favorable to the
advertised product were assigned to the experimental group than the control group. Thus, the mean of the
experimental group could be higher due to factors other than the commercial they watched.
We expect to find, in any sample that we draw or random assignment that we make, a certain degree of
deviation from the population parameters due to sampling error. Our test of significance is designed to tell us
whether the differences between the two sample means that we are comparing reflect a difference between
their respective population means—a statistically significant difference—or merely reflect the expected
sampling variation. In this case, the population means reflect the hypothetical means that would be generated
if the experiment were to be repeated infinitely. If the latter case is correct, the observed difference between
the two sample means is not statistically significant, and we do not have enough evidence to conclude
anything other than equality of the two respective population means.
Our experiment is represented as follows:
Our H0 is µ1 = µ2, and since we have prior evidence of the success of other commercials using the same
format, our H1 is directional: µ1 > µ2.

There is another type of research design, used mainly in experimental research, where the samples are
dependent. Members of one sample are not selected independently but are instead determined by the
makeup of the other sample. We call this a matched pairs situation, and another type of t test, a dependent
samples t test, is used. To understand what we mean by a matched pair, imagine a situation where we start
with a set of pairs of identical twins. One of each pair of twins is assigned to the experimental group, and the
other member of that pair goes to the control group.

Thus, being included in the control group is dependent on one's twin being included in the experimental
group. Then, presumably, each group would share identical inherited traits, making any difference between
the groups a function of environmental (as opposed to hereditary) differences, notably the effects of the
experiment. Sometimes, the pairs are not twins but are related in other ways (for example, wives and their
respective husbands). Or the pairs could be based on other factors such as age and race (for example, if
one group includes a 35-year-old Caucasian female, the second group would include another 35-year-old
Caucasian female).

Dependent samples or matched pairs Situation in which members of one sample are
not selected independently but are instead determined by the makeup of the other sample.

Dependent samples t test The t test used when the two samples are dependent samples.
A very common matched pairs situation is a before-after or repeated-measures experiment. Imagine that
each member of a panel was asked to rate each of two candidates contending for the same elective office.
Later, the panel watches a 2-hour debate between the contenders. At the end of the debate, each panel
member rates the candidates again. The matched pair is each person's "before" score with that person's
"after" score. Any difference would presumably be due to the debate watched. We will return to this subject
later when the dependent samples t test is discussed.

THE TWO-SAMPLE t TEST FOR INDEPENDENTLY DRAWN SAMPLES

In this instance, there is no matching but rather two independently drawn samples or randomly selected
groups. The sampling distribution of which we make use is the t distribution, except it is derived from the
sampling distribution of the differences between all theoretically possible pairs of sample means. (Consult a
more advanced text for more details.) As with the one-sample t test, the difference between our two sample
means is divided by the standard deviation of the sampling distribution, the standard error, to find t. Now,
however, our standard error is the standard deviation of sample mean differences, a different standard error
than the sx̄ or the σ̂x̄ used in the one-sample t test.

This raises a new complication: The standard error we seek is an estimate based on the variances of our two
samples. If the two sample variances are close in magnitude, we may assume that both parent populations
from which the samples were drawn have the same variance, that is, σ1² = σ2². If these population variances
are the same, we can calculate the standard error of our t formula using what we call a pooled estimate
of common variance. This is based on a weighted average of our two sample variances being used to
estimate the population variance in finding the standard error. If we can make use of this pooled estimate,
we generally will have a greater opportunity to reject the null hypothesis than would be the case when equal
population variances cannot be assumed. If, however, we cannot assume equal population variances, that is,
σ1² ≠ σ2², we must use a different formula for finding the standard error and, thus, a different t formula. To
determine which t formula is most appropriate, we can use the F test for homogeneity of variances. (The
test is shorter than its title!) Computer programs that run t tests will usually run both the equal and unequal
population variance t formulas and also do an F test to help the reader ascertain which obtained t value is
more accurate.

Pooled estimate of common variance Estimate based on a weighted average of two
sample variances being used to estimate the population variance in finding the standard
error.

F test for homogeneity of variances A test, based on the sample variances, used to
determine the most appropriate t test formula to use.

To avoid confusion, let us stop here and lay out a set of steps for the two-sample t test (independent samples)
and then illustrate these steps with an example. At the appropriate points, the formulas will be given and
explained for the F test for homogeneity of variances, the equal population variance t test, and the unequal
population variance t test. First, the steps:

1. Write out H0 and H1 for the original problem, the comparison of the two sample means.
2. For each sample, determine its n, its x̄, and its variance, s².
3. To determine which t formula to use (that for equal population variances or that for unequal
   population variances), do the F test for homogeneity of variances.
   a. Write out H0 and H1 for the F test (these are not the same as the ones for the t test).
   b. Calculate F and its two degrees of freedom.
   c. Compare the obtained F to Fcritical, .05 level, taken from a table of critical values of F.
   d. If Fobtained ≥ Fcritical, assume unequal population variances. If Fobtained < Fcritical,
      assume equal population variances.
4. Perform the appropriate t test as determined by the F test.
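Not part of Sirkin's text: the four steps can be sketched as a small Python function working from summary statistics. The s² values in the usage line are back-computed from the σ̂² figures quoted later in the chapter (2.166 and 3.700), so treat them as illustrative; the F-critical value must still come from a table such as Table 9.1.

```python
def two_sample_t(n1, xbar1, var1, n2, xbar2, var2, f_critical):
    """Independent-samples t following the s^2-based steps above."""
    # Step 3: F test for homogeneity of variances
    f_obtained = max(var1, var2) / min(var1, var2)
    equal_var = f_obtained < f_critical

    if equal_var:
        # Step 4, equal variances: pooled estimate of common variance
        pooled = (n1 * var1 + n2 * var2) / (n1 + n2 - 2)
        se = (pooled * (1 / n1 + 1 / n2)) ** 0.5
        df = n1 + n2 - 2
    else:
        # Unequal variances: separate-variances standard error,
        # df approximated by the smaller of n1 and n2 (see the text)
        se = (var1 / (n1 - 1) + var2 / (n2 - 1)) ** 0.5
        df = min(n1, n2)

    return (xbar1 - xbar2) / se, df, equal_var

# Example 1's summary statistics (variances inferred, as noted above):
t, df, equal = two_sample_t(6, 7.83, 1.805, 5, 5.80, 2.96, 5.19)
print(round(t, 2), df, equal)   # roughly 1.99, df = 9, equal variances
```

The obtained t is then compared to the critical t from Table 9.2 at the resulting df.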
Example 1. Let us work through these steps, using the example of the TV commercial. Recall that 6 people
(the experimental group) will see the commercial and then evaluate the product's favorability. The other 5
(the control group) will evaluate favorability without seeing the commercial. Remember also that because of
previous success with the same format for the ad, we expect favorability to rise once the viewing is complete,
so our H1 is directional. Note also that because of the smallness of our groups (n1 = 6 and n2 = 5), we must
assume that favorability is normally distributed in our two populations.
Write out H0 and H1:
(Group 1 is the experimental group.)
Determine n, x̄, and s² for each sample. Following are the data:
To get the sample variances using the computational formula, we must square our two sets of
scores. (Later on, we will ease your burden by providing you with the variances in advance.)
Summarizing our results:
We perform the F test for homogeneity of variances.
a. In the F test we are doing, the null hypothesis is that the population variances are
equal, and the alternative hypothesis is that they are not.
(In the F test, H1 is always nondirectional.)
b. To calculate F, divide the larger of the two sample variances by the smaller one.
In this case, the larger s2 is the one for Group 2, the control group. Thus,
The F has two degrees of freedom, one associated with the numerator and one associated with the
denominator. In each case, we subtract one from the sample size. The numerator degrees of freedom is one
less than the size of the sample having the larger variance. The denominator degrees of freedom is one less
than the size of the sample having the smaller variance. In this case,
c.
Our obtained F is then compared to Fcritical, .05 level, df = 4 and 5. Table 9.1, a portion of a fuller
F table to be presented later, gives Fcriticals for the .05 level only. In the table, n1 means the
numerator degrees of freedom. (n1 and n2 here mean degrees of freedom, not sample sizes!) We
move along the row until we find the column for the appropriate degrees of freedom. In this case,
n1 = 4. We go down the n2 column until we find the row for our degrees of freedom, n2 = 5. We
move along the n2 = 5 row until it intersects the n1 = 4 column. The number at that intersection,
5.19, is our Fcritical.1
d.
To reject the H0 for the F test, Fobtained must equal or exceed this Fcritical. But Fobtained =
1.64 and Fcritical = 5.19: Fobtained = 1.64 < Fcritical, .05 level (df = 4 and 5) = 5.19. We cannot

reject H0, so we now perform the t test designed for equal population variances, to be presented

in Step 4.

It should be noted here that many computer routines, including two of those to be discussed shortly, calculate
F from σ̂² rather than s². Had we done that here, our obtained F would have been 3.700/2.166 = 1.708,
still less than Fcritical. The difference between 1.708 and the 1.64 above would have been smaller if our
sample sizes had been larger, which is almost always the case in nonexperimental research. Rarely would
the discrepancy between the two ways of obtaining F change our decision as to which t test to use.

Since equal population variances may be assumed, we use the following formula in which the

denominator is the pooled variance estimate.


Table 9.1 Critical Values of F (.05 level only) for the F Test for Homogeneity of Variances


It’s easiest to first do the components of the t formula and then put them together.

Thus,

The df = n1 + n2 − 2 = 6 + 5 − 2 = 11 − 2 = 9. We use the critical values of t for a directional H1. See Table 9.2.

Note that if H1 had been nondirectional, tcritical, .05 level (df = 9) = 2.262 would have exceeded our obtained t of 1.971,
and we would not have been able to reject H0.

Since H0 has been rejected, we conclude our alternative hypothesis of μ1 > μ2. As in previous instances,
this particular commercial resulted in increased favorability ratings for the product featured in the commercial.
Since H0 has been rejected, we continue to assume that if the entire consumer population had viewed the
advertisement, their mean support score μ1 would also increase.

Remember that there are risks in using directional alternative hypotheses. Not only do we need prior evidence
of the assumed direction, as was the case here, but we must always make sure that our findings are
consistent with the direction assumed. Suppose x̄1 had been 3.77 instead of 7.83, but we retained the μ1 > μ2
alternative hypothesis. The numerator of the t formula would be 3.77 − 5.80 = −2.03, and t would be −1.971.
Using its absolute value of 1.971, we would reject H0 in the same way as just done. Yet, if anything, our two
sample means suggest an H1 of μ1 < μ2, and we have findings inconsistent with the original H1. Even though
we reject H0, we cannot conclude μ1 > μ2. In this case, our only option would have been a two-tailed H1, but
we have already seen that, in such a case, we cannot reject H0.


Table 9.2 Distribution of t


Note, finally, that it is conceivable, albeit improbable, that really μ1 > μ2 but, due to sampling error,
x̄1 < x̄2. However, we have no basis for knowing that fact when we do our study. Accordingly, if
x̄1 < x̄2 but H1 said μ1 > μ2, or the reverse, x̄1 > x̄2 but H1 stated μ1 < μ2, do not proceed with a one-tailed test.
To summarize to this point, we established H0 and H1 for our data; found n, x̄, and s² for each sample;
and did the F test for homogeneity of variances to determine the appropriate t formula to use. In this case,
the F test led us to use the t formula where equal population variances are assumed. Using the appropriate
t test, we were able to reject H0 with a probability of p < .05 and conclude that in the population, viewing
the commercial enhances support for the product featured. Now, let us see what would happen if the F test
concluded unequal population variances.
Example 2. Suppose Sample 1 remained the same, but the scores for Sample 2 were as follows:
The sample size and sample mean stay the same as before, but the sample variance is now larger.
Redoing the F test for homogeneity of variances, F = 11.76/1.805 = 6.52. At df = 4 and 5,
Fobtained = 6.52 > Fcritical = 5.19. We may not assume equal population variances.

Once again, if Fobtained were generated using the σ̂²s, we would find that F = 14.700/2.166 = 6.786. Just as
above, we may not assume equal population variances.
Where population variances are assumed to be unequal, we use the following formula:

t = (x̄1 − x̄2) / √( s1²/(n1 − 1) + s2²/(n2 − 1) )

As before, we first calculate the components.
Using as df the smaller value of n1 or n2, df = 5.
For a directional H1, tcritical, .05 level (df = 5) = 2.015 > 1.115. We cannot reject H0.

The degrees of freedom for this problem, the lesser of n1 or n2, is only a substitute for a complex degrees-of-freedom formula used by packaged statistical computer programs. This larger formula is a more accurate
approximation but is often too cumbersome for noncomputer applications.


If we did use it in the above t test, we would also work it in stages. A computer would list df as 4.97.

If we were doing this by hand, we would round down (not up) and use df = 4 in our t table. Again, this makes
it harder to reject H0 than it would have been had we been able to round up to 5 degrees of freedom. We
get essentially the same results as before: tcritical, one-tailed, .05 level (df = 4) = 2.132 > 1.115, and we
cannot reject H0. The ad has no effect on favorability scores (in this example, as modified for the assumption
of unequal population variances).
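As a check (not from the text), Example 2 can be recomputed with the σ̂² values the chapter quotes (2.166 and 14.700); the "longer" df formula is the Welch approximation that packaged programs use:

```python
n1, n2 = 6, 5
xbar1, xbar2 = 7.83, 5.80
v1, v2 = 2.166 / n1, 14.700 / n2      # each sigma-hat^2 divided by n

se = (v1 + v2) ** 0.5                 # separate-variances standard error
t = (xbar1 - xbar2) / se
df_long = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

print(round(t, 2), round(df_long, 2))
# t comes out near 1.12 and df near 4.98, close to the text's 4.97;
# rounding df down to 4 gives tcritical = 2.132 > t, so H0 stands.
```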

Page 19 of 53

Two-Sample t Tests

SAGE

SAGE Research Methods

2006 SAGE Publications, Ltd. All Rights Reserved.

ADJUSTMENTS FOR SIGMA-HAT SQUARED (σ̂²)

As was the case with the one-sample t test, it is possible that instead of s² for each sample, n − 1 replaced
n in the denominator of the formula, and consequently, what was calculated was σ̂², not s². We then would
need to modify our t formulas accordingly.

In the first example, where population variances were assumed equal, σ̂1² = 2.166 and σ̂2² = 3.700. We
would make the following modification in the t formula for equal population variances:

t = (x̄1 − x̄2) / √[ ((n1 − 1)σ̂1² + (n2 − 1)σ̂2²)/(n1 + n2 − 2) × (1/n1 + 1/n2) ]

As before, x̄1 − x̄2 = 2.03 and 1/n1 + 1/n2 = 0.37. Recalculating the remaining expression to adjust for σ̂²,
t is just as it was in the original formula using s².

In the example where unequal population variances were assumed, we would find σ̂1² = 2.166 and
σ̂2² = 14.700. We modify the t formula for unequal population variances as follows:

t = (x̄1 − x̄2) / √( σ̂1²/n1 + σ̂2²/n2 )


x̄1 − x̄2 is still 2.03. Recalculating the denominator to adjust for σ̂², we again obtain 1.82.
So, t = 2.03/1.82 = 1.115, just as it was when we used s1² and s2².

Finally, the alternative longer degrees-of-freedom formula for unequal variance t tests would become

df = (σ̂1²/n1 + σ̂2²/n2)² / [ (σ̂1²/n1)²/(n1 − 1) + (σ̂2²/n2)²/(n2 − 1) ]
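Since n·s² equals (n − 1)·σ̂², the s² and σ̂² versions of the pooled formula are algebraically identical, which can be checked numerically. A sketch using Example 1's figures (σ̂² quoted in the text; s² derived from it), not part of the book:

```python
n1, n2, diff = 6, 5, 2.03
sig1, sig2 = 2.166, 3.700                             # sigma-hat squared
s1, s2 = sig1 * (n1 - 1) / n1, sig2 * (n2 - 1) / n2   # s squared

pooled_s = (n1 * s1 + n2 * s2) / (n1 + n2 - 2)
pooled_sig = ((n1 - 1) * sig1 + (n2 - 1) * sig2) / (n1 + n2 - 2)
assert abs(pooled_s - pooled_sig) < 1e-12   # identical pooled variance

se = (pooled_s * (1 / n1 + 1 / n2)) ** 0.5
print(round(diff / se, 2))                  # same t either way
```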

INTERPRETING A COMPUTER-GENERATED t TEST

Figure 9.1 presents the SPSS T-TEST printout for Example 1 in this chapter. Starting on the left of Figure
9.1, some general statistical information is presented. The dependent variable has been coded VAR00002 by
the researcher. For each group (category of VAR00001), the printout lists its size, mean, standard deviation
(SPSS uses the σ̂² formula and not s²), and standard error. Note the box below labeled "Independent
Samples Test."


Figure 9.1 SPSS Printout for Example 1

Instead of using the F test for the homogeneity of variances presented in this chapter, SPSS uses Levene's
test for equality of variances, which is less dependent on the normality assumption than the F test for
homogeneity of variances presented earlier in this chapter. We interpret the Levene findings exactly the same
way as the F test. Since Levene's F of .258 generated a probability of .623 (Sig.), greater than .05, we cannot
reject a null hypothesis of equal population variances.

Accordingly, we will use the t test, with equal variances assumed. Under the title "t Test for Equality of Means,"
note the t value of 1.990. That is the one we want. (Below it is 1.938, which is t when equal variances are not


assumed.) To the right of the t values are the degrees of freedom. To the right of degrees of freedom, under

Sig. (two-tailed) is the probability—the exact p value (not a < .05 statement but the actual probability). The t,
df, and p values on this line are all based on the formula for equal population variances. On the line below,
we find the t, df, and p values from the formula that assumes unequal population variances. The degrees of
freedom for the unequal population variances t test is based on the long formula. Recall that for Example 1,
we calculated the t only for equal population variances. Our calculated t of 1.971 differs from SPSS's 1.990
since we used fewer decimal places in calculating t. Our df and SPSS's are the same, 9. The probability on the
printout is the two-tailed probability .078, not the one-tailed probability that we used in our hand calculations.
To get the one-tailed probability, divide .078 by 2. Thus, the exact directional probability is .039—less than .05
and greater than 01.
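The quantities on this line of the printout can be reproduced by hand. The sketch below uses hypothetical favorability scores (this excerpt does not list Example 1's raw data), so its numbers will not match Figure 9.1; the formulas, though, are the two the text describes: the pooled t with df = n1 + n2 − 2, and the Welch-Satterthwaite "long formula" df for the unequal-variances line.

```python
import math
import statistics

# Hypothetical ratings (Example 1's raw scores are not listed in this
# excerpt): six respondents who saw the commercial, five who did not.
saw = [8.0, 7.0, 9.0, 6.0, 8.0, 7.0]
not_saw = [6.0, 5.0, 7.0, 4.0, 6.0]

n1, n2 = len(saw), len(not_saw)
m1, m2 = statistics.mean(saw), statistics.mean(not_saw)
# statistics.variance uses n - 1 in the denominator, as SPSS does.
v1, v2 = statistics.variance(saw), statistics.variance(not_saw)

# t assuming equal population variances: pool the variances, df = n1 + n2 - 2.
pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
t_equal = (m1 - m2) / math.sqrt(pooled * (1 / n1 + 1 / n2))
df_equal = n1 + n2 - 2

# t with equal variances not assumed; df comes from the "long"
# (Welch-Satterthwaite) formula.
se2 = v1 / n1 + v2 / n2
t_unequal = (m1 - m2) / math.sqrt(se2)
df_unequal = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
```

The two-tailed probability a program reports for either t is halved for a directional H1, exactly as the text's .078 / 2 = .039.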
1. Examine the probability of F.
a. If the probability of F exceeds .05, we cannot reject a null hypothesis (for the F test) of equal population variances. Use the t test results to the right of "equal variances assumed."
b. If the probability of F is less than or equal to .05, we reject the F test's H0 and assume unequal population variances. Use the t test results to the right of "equal variances not assumed."
2. Examine the appropriate (EQUAL or UNEQUAL) probability of t.
a. If the probability (Sig.) of |t| exceeds .05, we cannot reject the initial null hypothesis (the one for the t test). The difference is not statistically significant: μ1 = μ2.
b. If the probability of |t| is less than or equal to .05, we reject the t test's null hypothesis and conclude H1. (Note: If H1 is directional, divide the significance by 2 before working Step 2.)
For this specific problem, the probability of F, .623, exceeds .05. We cannot reject the null hypothesis of equal population variances, so we will use the t test for equal population variances. Looking along the "Equal Variances Assumed" row, we find a probability (significance) of .078. Since our original H1 was one-tailed, divide the .078 probability by 2. The result is .039 (as we previously demonstrated). Since .039 is less than .05, we reject our original H0 of μ1 = μ2 and conclude H1: μ1 > μ2.
COMPUTER APPLICATIONS: INDEPENDENT SAMPLES t TESTS
Let us take the data from Example 1 upon which Figure 9.1 was based, set it up, and run the two-sample t
test using SPSS. We will then do the same for SAS and Excel. Before starting, you may want to review the
setup instructions for SPSS and SAS presented in Chapter 6. (Excel was not presented then because it has
no current routine for crosstabs.) We will also compare the outputs from the three programs.
The SPSS
Variable 00001 will be whether or not the respondents saw the commercial, coding 1 if they saw it and coding
2 if they did not see it. Variable 00002 will be the favorability rating. Table 9.3 shows the data list. We then
click on the following menu options:
Table 9.3
In the Dialog box, highlight VAR00002 and use the top button with the pointer to click it over to the Test
Variable box. We then highlight VAR00001 and move it into the Grouping Variable box, using the lower button
with the pointer. We then click the define groups button and type 1 to the right of “Group 1” and 2 to the right
of “Group 2.” Then click continue, bringing us back to the first Dialog box. Then click ok to get the output
presented in Figure 9.1.
The SAS
At the top of the screen, click as follows:
While it is possible to list the data the way it was done for SPSS and run the same SAS t test routine, it
is easier to code both SAS and Excel differently. In column A, enter the six scores for Sample 1 (saw the
commercial), and in column B, enter the five scores of the Sample 2 members (didn't see the commercial)
(see Figure 9.2).
Figure 9.2 SAS Data and Printouts for Example 1
With SAS, the F test for homogeneity of variances is run separately from the t tests. The F value it reports is similar to the one we calculated earlier in this chapter. In the upper left portion of the Dialog box, you will see Groups are in and, below it, the options one variable and two variables. Click on two variables. Below,
just above the word remove, you will see a box with A and B in it. You will need to highlight (left click) each
letter and move it to either group 1 or group 2 by clicking on the appropriate button. Here you must decide
whether A should be moved to group 1 or group 2. Remember, we must divide the larger variance by the
smaller! Now we know from our earlier calculation that Sample 2 (B) had a variance of 3.700, and Sample 1
(A) had a variance of 2.166, so B, with the larger variance, should go into group 1, and A, with the smaller
variance, should go to group 2. What if we didn't know the variances? We could use the descriptive statistics
SAS routine, but there is an easier way. Click B into group 2 and A into group 1. Then click the ok button.
The first output reproduced in Figure 9.2 appears. Note under Sample Statistics that B, with the larger
variance (3.7), appears first and A, with the smaller variance, appears under B. A's variance, 2.166667, appears directly under B's variance. If the larger variance is not above the smaller one, redo the test,
reversing the letters in group 1 and group 2! Or simply examine F. If it is less than one, reverse the groups
and redo the run, or recalculate F by hand from the variances on the printout.
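That hand recalculation of F is a one-liner. The sketch below (in Python; the variable names are my own) uses the variances quoted above, 3.700 for B and 2.166667 for A, and recovers the printout's F of 1.71 with 4 and 5 degrees of freedom.

```python
# Sample variances and sizes for Example 1, as quoted in the text.
var_b, n_b = 3.700, 5      # Sample 2 (didn't see the commercial)
var_a, n_a = 2.166667, 6   # Sample 1 (saw the commercial)

# Always divide the larger variance by the smaller, so F >= 1;
# the degrees of freedom follow the same order.
if var_b >= var_a:
    f_ratio, df_num, df_den = var_b / var_a, n_b - 1, n_a - 1
else:
    f_ratio, df_num, df_den = var_a / var_b, n_a - 1, n_b - 1
```

A ratio below 1 on a printout is the signal, as the text says, that the groups were entered in the wrong order.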
Below, where it says Hypothesis Test, you will note that F = 1.71, degrees of freedom are 4 and 5, and
the exact probability is 0.5674, way above .05, so we cannot reject the null hypothesis of equal population
variances. Now we run the t test:
The Dialog box resembles the one you just had for the F test. Set Groups are in to the two variables option. Move A to group 1 and B to group 2. (If you reversed the groups for A and B, all it would do is
change the sign of t from positive to negative.) Click ok.
Figure 9.2 also displays the t test run. (Don't be concerned about the way the null and alternative hypotheses
are stated; they mean the same as what we have been using.)
You see that the t statistic for equal population variances is 1.990, df = 9, and the probability (two-tailed) of
0.0778 must be divided by two for our directional alternative hypothesis, yielding 0.0389, less than .05. (If you don't feel like dividing, you could have set the alternative hypothesis in the Dialog box to mean 1 – mean 2 > 0.) So we reject the null hypothesis for the t test. The commercial had the desired effect.
Excel
Make sure your Excel program has the Data Analysis add-in (the Analysis ToolPak) installed. When you open the
program, a spreadsheet appears. Note at the bottom left-hand side that it says sheet 1. The rows are
numbered, and the columns are labeled with letters of the alphabet, just as with SAS. The data are entered
exactly as they were with SAS, resembling the way the data for Example 1 first appeared in this chapter,
with each column on the spreadsheet being a different group. Column A contains the scores for Sample 1,
those who saw the commercial, and column B contains the scores for Sample 2, those who did not see the
commercial. The data entry pattern and the output subsequently generated...
