Suppose you measure a collection of scalars
x
1
,
…
,
x
N
x
1
,
…
,
x
N
. You believe the data is distributed in one of two
ways. Your first model, call it
H
0
H
0
, postulates the data to be governed by the density
f
0
x
f
0
x
(some fixed density). Your second model,
H
1
H
1
, postulates a different density
f
1
x
f
1
x
. These models, termed hypotheses, are
denoted as follows:
H
0
:
x
n
∼
f
0
x
,
n
=
1
…
N
H
0
:
x
n
f
0
x
,
n
=
1
…
N
H
1
:
x
n
∼
f
1
x
,
n
=
1
…
N
H
1
:
x
n
f
1
x
,
n
=
1
…
N
A hypothesis test is a rule that, given a
measurement xx, makes
a decision as to which hypothesis best "explains" the data.
Suppose you are confident that your data is
normally distributed with variance 1, but you are uncertain about
the sign of the mean. You might postulate
H
0
:
x
n
∼-11
H
0
:
x
n
-1
1
H
1
:
x
n
∼11
H
1
:
x
n
1
1
These densities are depicted in Figure 1.
Assuming each hypothesis is
a priori
equally likely, an intuitively appealing hypothesis test is to
compute the sample mean
x¯=1N∑n=1N
x
n
x
1
N
n
1
N
x
n
, and choose
H
0
H
0
if
x¯≤0
x
0
, and
H
1
H
1
if
x¯>0
x
0
. As we will see later, this test is in fact optimal
under certain assumptions.
The concepts introduced above can be extended in
several ways. In what follows we provide more rigorous
definitions, describe different kinds of hypothesis testing, and
introduce terminology.
In the most general setup, the observation is
a collection
x1
,
…
,
xN
x
1
,
…
,
x
N
of random vectors. A common assumption, which facilitates
analysis, is that the data are independent and identically
distributed (IID). The random vectors may be continuous,
discrete, or in some cases mixed. It is generally assumed
that all of the data is available at once, although for some
applications, such as Sequential
Hypothesis Testing, the data is a never ending
stream.
When there are two competing hypotheses, we
refer to a binary hypothesis test. When the
number of hypotheses is
M≥2
M
2
, we refer to an M-ary hypothesis
test. Clearly, binary is a special case of
MM-ary, but binary tests are
accorded a special status for certain reasons. These include
their simplicity, their prevalence in applications, and
theoretical results that do not carry over to the
MM-ary case.
Suppose we
wish to transmit a binary string of length
rr over a noisy communication
channel. We assign each of the
M=2r
M
2
r
possible bit sequences to a signal
sk
s
k
,
k=1…M
k
1
…
M
where
s
n
k
=cos2π
f
0
n+2πk-1M
s
n
k
2
f
0
n
2
k
1
M
This symboling scheme is known as phase-shift
keying (PSK). After transmitting a signal across
the noisy channel, the receiver faces an
MM-ary hypothesis testing
problem:
H
0
:
x=s1+w
H
0
:
x
s
1
w
⋮⋮
H
M
:
x=sM+w
H
M
:
x
s
M
w
where
w∼0σ2I
w
0
σ
2
I
.
In many binary hypothesis tests, one
hypothesis represents the absence of a ceratin
feature. In such cases, the hypothesis is usually
labelled
H
0
H
0
and called the null hypothesis.
The other hypothesis is labelled
H
1
H
1
and called the alternative
hypothesis.
Consider the problem of detecting a known
signal
s=
s
1
…
s
N
T
s
s
1
…
s
N
in additive white Gaussian noise (AWGN). This
scenario is common in sonar and radar systems. Denoting
the data as
x=
x
1
…
x
N
T
x
x
1
…
x
N
, our hypothesis testing problem is
H
0
:
x=w
H
0
:
x
w
H
1
:
x=s+w
H
1
:
x
s
w
where
w∼0σ2I
w
0
σ
2
I
.
H
0
H
0
is the null hypothesis, corresponding to
the absence of a signal.
Consider the general hypothesis testing
problem where we have NN
dd-dimensional observations
x1
,
…
,
xN
x
1
,
…
,
x
N
and MM hypotheses. If
the data are real-valued, for example, then a hypothesis
test is a mapping
φ
:
ℝdN
→1…M
→
φ
:
d
N
1
…
M
For every possible realization of the input, the test
outputs a hypothesis. The test
φφ partitions the input
space into a disjoint collection
R
1
,
…
,
R
M
R
1
,
…
,
R
M
, where
R
k
=
(
x1
,
…
,
xN
)
|
φx1…xN=k
R
k
(
x
1
,
…
,
x
N
)
|
φ
x
1
…
x
N
k
The sets
R
k
R
k
are called decision regions. The boundary
between two decision regions is a decision
boundary. Figure 2
depicts these concepts when
d=2
d
2
,
N=1
N
1
, and
M=3
M
3
.
If the distribution of the data under a
certain hypothesis is fully known, we call it a
simple hypothesis. All of the hypotheses in the
examples above are simple. In many cases, however, we only
know the distribution up to certain unknown parameters. For
example, in a Gaussian noise model we may not know the
variance of the noise. In this case, a hypothesis is said to
be composite.
Consider the problem of detecting the signal
s
n
=cos2π
f
0
n-k
∀n:n=1…N
s
n
2
f
0
n
k
n
n
1
…
N
where kk is an unknown delay
parameter. Then
H
0
:
x=w
H
0
:
x
w
H
1
:
x=s+w
H
1
:
x
s
w
is a binary test of a simple hypothesis
(
H
0
H
0
) versus a composite alternative. Here we
are assuming
wn∼0σ2
w
n
0
σ
2
, with
σ2
σ
2
known.
Often a test involving a composite
hypothesis has the form
H
0
:
θ=
θ
0
H
0
:
θ
θ
0
H
1
:
θ≠
θ
0
H
1
:
θ
θ
0
where
θ
0
θ
0
is fixed. Such problems are called
two-sided because the composite
alternative "lies on both sides of
H
0
H
0
." When
θθ is a scalar,
the test
H
0
:
θ≤
θ
0
H
0
:
θ
θ
0
H
1
:
θ>
θ
0
H
1
:
θ
θ
0
is called one-sided. Here, both
hypotheses are composite.
Suppose a coin turns up heads with
probability pp. We want to
assess whether the coin is fair
(
p=12
p
1
2
). We toss the coin
NN times and record
x
1
,
…
,
x
N
x
1
,
…
,
x
N
(
x
n
=1
x
n
1
means heads and
x
n
=0
x
n
0
means tails). Then
H
0
:
p=12
H
0
:
p
1
2
H
1
:
p≠12
H
1
:
p
1
2
is a binary test of a simple hypothesis
(
H
0
H
0
) versus a composite alternative. This is
also a two-sided test.
In binary hypothesis testing, assuming at least
one of the two models does indeed correspond to reality, there
are four possible scenarios:
- Case 1 -
H
0
H
0
is true, and we declare
H
0
H
0
to be true
- Case 2 -
H
0
H
0
is true, but we declare
H
1
H
1
to be true
- Case 3 -
H
1
H
1
is true, and we declare
H
1
H
1
to be true
- Case 4 -
H
1
H
1
is true, but we declare
H
0
H
0
to be true
In cases 2 and 4, errors occur. The names given to these
errors depend on the area of application. In statistics, they
are called
type I and
type II errors
respectively, while in signal processing they are known as a
false alarm or a
miss.
Consider the general binary hypothesis testing
problem
H
0
:
x∼
f
θ
x
,
θ∈
Θ
0
H
0
:
x
f
θ
x
,
θ
Θ
0
H
1
:
x∼
f
θ
x
,
θ∈
Θ
1
H
1
:
x
f
θ
x
,
θ
Θ
1
If
H
0
H
0
is simple, that is,
Θ
0
=
θ
0
Θ
0
θ
0
, then the size (denoted
αα), also called the
false-alarm probability
(
P
F
P
F
), is defined to be
α=
P
F
=Pr
θ
0
declare
H
1
α
P
F
θ
0
declare
H
1
When
Θ
0
Θ
0
is composite, we define
α=
P
F
=
sup
θ
∈
Θ
0
Prθ
declare
H
1
α
P
F
sup
θ
∈
Θ
0
θ
declare
H
1
For
θ∈
Θ
1
θ
Θ
1
, the power (denoted
ββ), or detection
probability
(
P
D
P
D
), is defined to be
β=
P
D
=Prθ
declare
H
1
β
P
D
θ
declare
H
1
The probability of a type II error, also called the miss
probability, is
P
M
=1-
P
D
P
M
1
P
D
If
H
1
H
1
is composite, then
β=βθ
β
β
θ
is viewed as a function of
θθ.
The design of a hypothesis test/detector often
involves constructing the solution to an optimization
problem. The optimality criteria used fall into two classes:
Bayesian and frequent.
Representing the former approach is the Bayes Risk Criterion. Representing the
latter is the Neyman-Pearson
Criterion. These two approaches are developed at length
in separate modules.
The following table, adapted from Kay, p.65, summarizes the different terminology for
hypothesis testing from statistics and signal processing:
| Statistics |
Signal Processing |
| Hypothesis Test |
Detector |
| Null Hypothesis |
Noise Only Hypothesis |
| Alternate Hypothesis |
Signal + Noise Hypothesis |
| Critical Region |
Signal Present Decision Region |
| Type I Error |
False Alarm |
| Type II Error |
Miss |
| Size of Test
(αα) |
Probability of False Alarm
(
P
F
P
F
)
|
| Power of Test
(ββ) |
Probability of Detection
(
P
D
P
D
) |
-
S. Kay. (1998). Detection Theory. Prentice Hall.