When you flip a coin, there are two possible outcomes: heads and
tails. Each outcome has a fixed probability, the same from trial
to trial. In the case of coins, heads and tails each have the
same probability of 1/2. More generally, there are situations in
which the coin is biased, so that heads and tails have different
probabilities. In the present section, we consider probability
distributions for which there are just two possible outcomes
with fixed probabilities summing to one. These distributions are
called binomial distributions.
The four possible outcomes that could occur if you flipped a
coin twice are listed in Table 1. Note that
the four outcomes are equally likely: each has probability
1/4. To see this, note that the tosses of the coin are independent
(neither affects the other). Hence, the probability of a head
on Flip 1 and a head on Flip 2 is the product of $\Pr(H)$ and
$\Pr(H)$, which is $1/2 \times 1/2 = 1/4$. The same calculation
applies to the probability of a head on Flip 1 and a tail on
Flip 2. Each is $1/2 \times 1/2 = 1/4$.
Table 1. Four Possible Outcomes
| Outcome | First Flip | Second Flip |
|---|---|---|
| 1 | Heads | Heads |
| 2 | Heads | Tails |
| 3 | Tails | Heads |
| 4 | Tails | Tails |
The four possible outcomes can be classified in
terms of the number of heads that come up. The number could be
two (Outcome 1), one (Outcomes 2 and 3), or zero (Outcome 4). The
probabilities of these possibilities are shown in Table 2 and in Figure 1. Since two of the
outcomes represent the case in which just one head appears in
the two tosses, the probability of this event is equal to
$1/4 + 1/4 = 1/2$. Table 2 summarizes the situation.
Table 2. Probabilities of Getting 0, 1, or 2 Heads
| Number of Heads | Probability |
|---|---|
| 0 | 1/4 |
| 1 | 1/2 |
| 2 | 1/4 |
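The probabilities for the number of heads in two flips can be reproduced by listing every equally likely sequence of flips and counting heads. The following is a minimal Python sketch, assuming a fair coin; the function name `heads_distribution` is introduced here for illustration only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def heads_distribution(n_flips):
    """Return {number of heads: probability} for n_flips flips of a fair coin."""
    # For a fair coin, every sequence of heads/tails is equally likely.
    outcomes = list(product("HT", repeat=n_flips))
    # Count how many sequences contain 0, 1, 2, ... heads.
    counts = Counter(seq.count("H") for seq in outcomes)
    total = len(outcomes)
    return {k: Fraction(counts[k], total) for k in sorted(counts)}


if __name__ == "__main__":
    for heads, prob in heads_distribution(2).items():
        print(heads, prob)
```

Exact fractions are used rather than floating-point numbers so that the results can be compared directly with the values 1/4, 1/2, and 1/4.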
Figure 1 is a discrete probability distribution:
It shows the probability for each of the values on the
X-axis. Defining a head as a "success," Figure 1
shows the probability of 0, 1, and 2 successes for two trials
(flips) for an event that has a probability of 0.5 of being a
success on each trial. This makes Figure 1 an
example of a binomial distribution.
We toss a coin 12 times. What is the probability that we get
from 0 to 3 heads? The answer is found by computing the
probability of exactly 0 heads, exactly 1 head, exactly 2
heads, and exactly 3 heads. The probability of getting from 0
to 3 heads is then the sum of these probabilities. The
probabilities are: 0.0002, 0.0029, 0.0161, and 0.0537. The sum
of the probabilities is 0.073. The calculation of cumulative
binomial probabilities can be quite tedious. Therefore we have
provided a binomial calculator to make it easy to calculate
these probabilities.
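The cumulative probability for 0 to 3 heads in 12 tosses can also be computed in a few lines. The sketch below is one possible way to do it in Python, using the standard binomial formula with `math.comb` (available in Python 3.8 and later); the function names `binomial_pmf` and `prob_between` are illustrative choices, not part of any calculator referred to in the text.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials,
    each with success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def prob_between(lo, hi, n, p):
    """Probability of at least lo and at most hi successes (inclusive)."""
    return sum(binomial_pmf(x, n, p) for x in range(lo, hi + 1))


if __name__ == "__main__":
    # Exactly 0, 1, 2, and 3 heads in 12 tosses of a fair coin.
    for x in range(4):
        print(x, round(binomial_pmf(x, 12, 0.5), 4))
    # Cumulative probability of 0 to 3 heads.
    print(round(prob_between(0, 3, 12, 0.5), 3))
```

Rounded to four decimal places, the individual probabilities are 0.0002, 0.0029, 0.0161, and 0.0537, and the cumulative probability rounds to 0.073.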
Consider a coin-tossing experiment in which you tossed a coin
12 times and recorded the number of heads. If you performed
this experiment over and over again, what would the mean
number of heads be? On average, you would expect half the coin
tosses to come up heads. Therefore the mean number of heads
would be 6. In general, the mean of a binomial distribution
with parameters N (the number of
trials) and π (the
probability of success for each trial) is:
$$m = N\pi$$
where m is the mean of the binomial
distribution. The variance of the binomial distribution is:
$$s^2 = N\pi(1 - \pi)$$
where $s^2$ is the variance of the binomial distribution.
Let's return to the coin tossing experiment. The coin was
tossed 12 times, so $N = 12$. A coin has a probability of 0.5 of coming up
heads. Therefore, $\pi = 0.5$. The mean and variance can therefore be
computed as follows:
$$m = N\pi = 12 \times 0.5 = 6$$
$$s^2 = N\pi(1 - \pi) = 12 \times 0.5 \times (1.0 - 0.5) = 3.0$$
Naturally, the standard deviation $s$ is the square root of the
variance $s^2$; here, $s = \sqrt{3.0} \approx 1.73$.
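As a check on the mean and variance formulas, both quantities can be computed directly from the distribution by summing over every possible number of heads. This is a rough Python sketch under the same assumptions as the example (12 tosses, success probability 0.5); the helper name `binomial_moments` is hypothetical.

```python
from math import comb, sqrt


def binomial_moments(n, p):
    """Return (mean, variance) computed by summing over the distribution,
    without using the shortcut formulas."""
    # Probability of exactly x successes for x = 0, 1, ..., n.
    probs = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]
    mean = sum(x * pr for x, pr in enumerate(probs))
    variance = sum((x - mean) ** 2 * pr for x, pr in enumerate(probs))
    return mean, variance


if __name__ == "__main__":
    n, p = 12, 0.5
    mean, variance = binomial_moments(n, p)
    print("mean from distribution:", mean, "formula:", n * p)
    print("variance from distribution:", variance, "formula:", n * p * (1 - p))
    print("standard deviation:", sqrt(variance))
```

The values obtained by direct summation should agree with the formula results up to floating-point rounding.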
- binomial distributions:
A
probability distribution for
independent events for which
there are only two possible outcomes such as a coin flip. If
one of the two outcomes is defined as a success, then the
probability of exactly x successes out of N trials
(events) is given by:
$$\Pr(x) = \frac{N!}{x!\,(N-x)!}\,\pi^{x}(1-\pi)^{N-x}$$
where π is the probability of
success on one trial.
- conditional probability:
The probability that event A occurs given that event B has
already occurred is called the conditional probability of A
given B. Symbolically, this is written as
$\Pr(A \mid B)$. The probability it rains on Monday given that it
rained on Sunday would be written as
$\Pr(\text{Rain on Monday} \mid \text{Rain on Sunday})$.
- continuous variables:
Variables that can take on any
value in a certain range. Time and distance are continuous;
gender, SAT score and "time rounded to the nearest second" are
not. Variables that are not continuous are known as
discrete variables. No measured
variable is truly continuous; however, discrete variables
measured with enough precision can often be considered
continuous for practical purposes.
- discrete:
Variables that can only take on a finite number of values are
called "discrete variables." All
qualitative variables are
discrete. Some
quantitative
variables are discrete, such as performance rated as
1, 2, 3, 4, or 5, or temperature rounded to the nearest
degree. Sometimes, a variable that takes on enough discrete
values can be considered to be continuous for practical
purposes. One example is time to the nearest millisecond.
Variables that can take on an infinite number of possible
values are called
continuous
variables.
- independent events:
Intuitively, two events A and B are independent if the
occurrence of one has no effect on the probability of the
occurrence of the other. For example, if you throw two dice,
the probability that the second one comes up 1 is independent
of whether the first die came up 1. Formally, this can be
stated in terms of
conditional
probabilities:
$\Pr(A \mid B) = \Pr(A)$ and $\Pr(B \mid A) = \Pr(B)$.
- levels of measurement:
Measurement scales differ in their level of measurement. There
are four common levels of measurement:
  - Nominal scales are only labels.
  - Ordinal scales are ordered but are not truly quantitative. Equal intervals on the ordinal scale do not imply equal intervals on the underlying trait.
  - Interval scales are ordered, and equal intervals on the scale imply equal intervals on the underlying trait. However, interval scales do not have a true zero point.
  - Ratio scales are interval scales that do have a true zero point. With ratio scales, it is sensible to talk about one value being twice as large as another, for example.
- nominal scale:
A nominal scale is one of four
Levels of Measurement. No ordering
is implied, and addition/subtraction and
multiplication/division would be inappropriate for a variable
on a nominal scale. {Female, Male} and
{Buddhist, Christian, Hindu, Muslim}
have no natural ordering (except
alphabetic). Occasionally, numeric values are nominal: for
instance, if a variable was coded as Female=1, Male=2, the set
{1, 2} is still nominal.
- ordinal scale:
One of four
levels of
measurement, an ordinal scale is a set of ordered
values. However, there is no set distance between scale
values. For instance, the scale (Very Poor, Poor,
Average, Good, Very Good) is an ordinal scale. You can assign
numerical values to an ordinal scale, such as rating performance
as 1 for "Very Poor," 2 for "Poor," etc., but there is no
assurance that the difference between a score of 1 and 2 means
the same thing as the difference between a score of 2 and 3.
- probability distribution:
For a
discrete random variable, a
probability distribution contains the probability
of each possible outcome. The sum of all probabilities is
always 1.0.
- qualitative variables:
Also known as categorical
variables, qualitative variables are
variables with no natural sense of
ordering. For instance, hair color (Black, Brown, Gray, Red,
Yellow) is a qualitative variable, as is name (Adam, Becky,
Christina, Dave . . .). Qualitative variables can be coded to
appear numeric but their numbers are meaningless, as in
male=1, female=2. Variables that are not qualitative are known
as
quantitative variables.
- quantitative variables:
Variables that are measured
on a numeric or quantitative scale.
Ordinal,
interval and
ratio scales are quantitative. A country's
population, a person's shoe size, or a car's speed are all
quantitative variables. Variables that are not quantitative
are known as
qualitative
variables.
- ratio scale:
One of the four basic
levels of
measurement, a ratio scale is a numerical scale with a
true zero point and in which a given size interval has the
same interpretation for the entire scale. Weight is a ratio
scale. Therefore, it is meaningful to say that a 200-pound
person weighs twice as much as a 100-pound person.
- variables:
Something that can take on different values. For example,
different subjects in an experiment weigh different
amounts. Therefore "weight" is a variable in the
experiment. Or, subjects may be given different doses of a
drug. This would make "dosage" a variable. Variables can be
dependent or
independent,
qualitative or
quantitative, and
continuous or
discrete.
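The formal condition given in the glossary entry for independent events, that the conditional probability of A given B equals the probability of A, can be checked by enumeration for the two-dice example. The Python sketch below assumes two fair six-sided dice; the helper names `pr`, `pr_given`, `first_is_one`, and `second_is_one` are introduced here purely for illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first die, second die) outcomes for two fair dice.
OUTCOMES = list(product(range(1, 7), repeat=2))


def pr(event):
    """Probability of an event, given as a function of an outcome."""
    favorable = sum(1 for outcome in OUTCOMES if event(outcome))
    return Fraction(favorable, len(OUTCOMES))


def pr_given(event_a, event_b):
    """Conditional probability Pr(A|B) = Pr(A and B) / Pr(B)."""
    both = pr(lambda outcome: event_a(outcome) and event_b(outcome))
    return both / pr(event_b)


def first_is_one(outcome):
    return outcome[0] == 1


def second_is_one(outcome):
    return outcome[1] == 1


if __name__ == "__main__":
    print("Pr(second is 1):", pr(second_is_one))
    print("Pr(second is 1 | first is 1):", pr_given(second_is_one, first_is_one))
```

Both printed probabilities are 1/6, which is what the definition requires for the two events to be independent.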