Inferential Statistics
The statistics we’ve considered
so far allow us to summarize the data we’ve collected, but we often want to do
more than this. We want to reach beyond our data and make broader
claims—usually claims about the correctness (or the falsity) of our hypothesis.
For this purpose, we need to make inferences based on our data—and so we turn
to inferential statistics.
In the example we’ve been
discussing, we began with a question: Do boys and girls differ in how
aggressive they are? This comparison is easy for our sample because we know
that in the (fictional) data we’ve collected, the boys’ mean level of
aggression is 17, and the girls’ mean is 12. Obviously, the boys’ scores are
higher—providing a clear answer to our question for this sample. Ultimately, though, the sample isn’t our concern.
Instead, we want to make claims about a broader population; and so we still want
to ask: Are boys in general (i.e., boys outside of our sample) more aggressive
than girls in general?
To answer this question, we need
to make some estimates based on our sample. Researchers use specific
calculations for this purpose, but leaving the details aside, these
calculations center on three points. First, how big a difference is found in
the sample? In our example, the effect
size—the difference between the groups—is 5 units (this is the difference
between the boys’ average, 17, and the girls’ average, 12). In gen-eral, the
larger the effect size, the more likely it is that the result can be taken at
face value.
Second, how great is the variability of the data? The logic here
is straightforward: If the variability in our sample is low, essentially this
means that we kept seeing the same scores again and again. In that case, we’re
likely to continue getting the same scores if we look beyond our sample—and so
we should be more comfortable in extrapolating from our sample. If, on the
other hand, the data in our sample are highly variable, then we know that the
broader population is diverse—so if we looked beyond this sample, we might find
much higher scores or much lower ones. In this case, we’d have to be much more
cautious in drawing conclusions from our sample.
Third, how many observations does
the data set include? All things being equal, the greater the number of
observations, the more likely the sample is to reflect the properties of the
broader population. Thus, the greater the number of observations, the more
trust we can put in our results.
Let’s be clear, though, that
these three factors—effect size, variability, and number of observations—can trade
off against each other. If, for example, the effect size is small, then we will
draw conclusions from our data only if we're certain that our measurements are
quite accurate—and so we will insist on a low level of variability and a large
number of observations. Similarly, if the variability is very low, then there’s
no reason to collect a huge amount of data—this would simply give us the same
observations, over and over. Therefore, with low variability, we need
relatively few data points.
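To see how the three quantities interact, here is a minimal sketch in Python. The scores are invented for illustration (chosen so the group means come out to 17 and 12, as in the example), and the calculation shown is a standard two-sample t statistic: the effect size sits in the numerator, while the variability and the number of observations together determine the denominator.

```python
import math

# Invented aggression scores, chosen so the means come out to 17 (boys) and 12 (girls)
boys  = [19, 15, 18, 16, 17, 20, 14, 17]
girls = [11, 13, 12, 10, 14, 12, 13, 11]

def mean(xs):
    return sum(xs) / len(xs)

def sample_variance(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

# Effect size: the difference between the group means (larger -> easier to trust)
effect_size = mean(boys) - mean(girls)

# Standard error: grows with variability, shrinks as the number of observations grows
standard_error = math.sqrt(sample_variance(boys) / len(boys) +
                           sample_variance(girls) / len(girls))

# The t statistic puts all three factors together: a bigger effect or more data
# pushes t up; more variability pushes it down
t_statistic = effect_size / standard_error
print(f"effect size = {effect_size:.1f}, t = {t_statistic:.2f}")
```

With these particular numbers the effect size is 5 and t is roughly 5.9; keeping the means fixed but making the scores more spread out, or collecting fewer of them, lowers t, which is exactly the trade-off just described.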
These trade-offs are, in fact,
built directly into the calculations researchers use to evaluate the statistical significance of a result.
This calculation takes as its input the effect size, the variability, and the
number of observations; and ultimately, its output is something called a p-value. This value is the
probability that a data pattern at least as extreme as the one we observed would arise purely by accident. After all,
we know that peculiar accidents can happen: Sometimes, when you toss a coin,
you get five “heads” in a row, entirely by chance. When you’re rolling dice,
you might—just by luck—go 30 or 40 rolls without rolling a seven. Our
calculations tell us whether the result in our experiment is a similar sort of
fluke—a chance occurrence only.
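In practice, researchers rarely do this arithmetic by hand. As one common choice (a sketch, not the only possible calculation), SciPy's independent-samples t-test takes the same invented scores used above and returns the p-value directly:

```python
from scipy import stats

# The same invented scores as in the previous sketch
boys  = [19, 15, 18, 16, 17, 20, 14, 17]
girls = [11, 13, 12, 10, 14, 12, 13, 11]

# An independent-samples t-test takes the effect size, the variability, and the
# number of observations as input, and its output includes the p-value
result = stats.ttest_ind(boys, girls, equal_var=False)
print(f"t = {result.statistic:.2f}, p = {result.pvalue:.4f}")
```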
What could “chance” mean in the
context of our comparison between boys and girls? Bear in mind that some boys
are simply more aggressive than others, and likewise for girls. So it’s
possible that—just by luck—our sample of boys happens to include a couple of
particularly aggressive individuals who are driving up the group’s average.
It’s also possible that—again, just by luck—our sample of girls includes
several especially well-behaved individuals who are driving the group’s average
down. This is the sort of possibility that our statistics evaluate—by asking
how consistent the results are (measured by the variability) as well as how
likely it is that just a few chance observations (e.g., a few misbehaved boys
or a few especially docile girls) could be causing the difference we've
observed between the groups (i.e., the effect size).
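One way to make this notion of chance concrete is a permutation test, sketched below with the same invented scores (this is one common approach, not necessarily the calculation any particular researcher would use): shuffle the scores, deal them back into two arbitrary "groups" over and over, and count how often luck alone produces a gap as large as the one observed.

```python
import random

# The same invented scores as above
boys  = [19, 15, 18, 16, 17, 20, 14, 17]
girls = [11, 13, 12, 10, 14, 12, 13, 11]

observed_diff = sum(boys) / len(boys) - sum(girls) / len(girls)   # 5.0

scores = boys + girls
n_boys = len(boys)
hits = 0
trials = 10_000

for _ in range(trials):
    random.shuffle(scores)                      # re-deal the scores at random
    fake_boys = scores[:n_boys]
    fake_girls = scores[n_boys:]
    fake_diff = sum(fake_boys) / n_boys - sum(fake_girls) / len(fake_girls)
    if abs(fake_diff) >= observed_diff:         # a gap this large, by luck alone?
        hits += 1

print(f"proportion of shuffles with a gap this large: {hits / trials:.4f}")
```

The proportion printed at the end is itself an estimate of how often chance alone could produce the observed difference.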
If the p-value we calculate is high, then we’ve observed an outcome that could easily occur by chance. In that case, we would cautiously conclude that our result might have been a random fluke, and so we should draw no conclusions from our data. If the p-value is low (i.e., we observe an outcome that would rarely occur by chance), we conclude the opposite—the outcome is likely not the product of chance and should be taken seriously.
In most cases, psychologists use
a 5% rule for making this determination—usually expressed with the cutoff of p
< .05. In other words, if the probability of getting a particular outcome
just by chance is less than 5%, then we conclude that the particular result we
observed is unlikely to be an accident. Let’s note, though, that there’s
nothing sacred about this 5% rule. We use other, stricter rules when
evaluating especially important results or when considering patterns of evidence rather than just a
single result.
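The decision rule itself is just a comparison against the cutoff. The sketch below, offered only as an illustration (the helper name is made up for this example), applies the rule to the coin example mentioned earlier: five heads in a row has a probability of 0.5 raised to the fifth power, about 3%, which falls under the conventional 5% line.

```python
ALPHA = 0.05   # the conventional 5% cutoff

def unlikely_to_be_chance(p_value, alpha=ALPHA):
    """True if an outcome this extreme would occur by chance less than alpha of the time."""
    return p_value < alpha

# The coin example from earlier: five heads in a row, purely by chance
p_five_heads = 0.5 ** 5          # 0.03125, roughly a 3% chance
print(unlikely_to_be_chance(p_five_heads))   # True: rarer than the 5% cutoff
```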
So, with all of this said, what’s
the actual difference between boys' and girls' aggression—with real data, not
the fictional data we’ve been considering so far (Figure 1.14)? The answer,
with carefully collected data, turns out to depend on how exactly we define the
dependent variable. If we measure physical
aggression—pushing, physical intimidation, punching, kicking, biting—then males
do tend to be more aggressive; and it doesn’t matter whether we’re considering
children or adults, or whether we assess males in Western cultures or Eastern
(e.g., Geary & Bjorklund, 2000). On the other hand, if we measure social aggression—ignoring someone,
gossiping about someone, trying to isolate someone from their friends—then the
evidence suggests that females are, in this regard, the more aggressive sex
(Oesterman et al., 1998). Clearly, our answer is complicated—and this certainly
turns out to be a study in which defining the dependent variable is crucial.