One of the more powerful tools in statistical communication
theory is the abstract concept of a linear vector
space. The key result that concerns us is the
representation theorem: a deterministic time
function can be uniquely represented by a sequence of numbers.
The stochastic version of this theorem states that a process can
be represented by a sequence of uncorrelated random variables.
These results will allow us to exploit the theory of hypothesis
testing to derive the optimum detection
strategy.
Basics
definition 1
A linear vector space $S$ is a collection of elements called vectors having the following properties:
- The vector-addition operation can be defined so that if $x, y, z \in S$:
  - $x + y \in S$ (the space is closed under addition)
  - $x + y = y + x$ (Commutativity)
  - $(x + y) + z = x + (y + z)$ (Associativity)
  - The zero vector exists and is always an element of $S$. The zero vector is defined by $x + 0 = x$.
  - For each $x \in S$, a unique vector $-x$ is also an element of $S$ so that $x + (-x) = 0$, the zero vector.
- Associated with the set of vectors is a set of scalars which constitute an algebraic field. A field is a set of elements which obey the well-known laws of associativity and commutativity for both addition and multiplication. If $a$, $b$ are scalars, the elements $x$, $y$ of a linear vector space have the properties that:
  - $ax$ (multiplication by scalar $a$) is defined and $ax \in S$.
  - $(ab)x = a(bx)$.
  - If "1" and "0" denote the multiplicative and additive identity elements, respectively, of the field of scalars, then $1x = x$ and $0x = 0$.
  - $a(x + y) = ax + ay$ and $(a + b)x = ax + bx$.
There are many examples of linear vector spaces. A familiar example is the set of column vectors of length $N$. In this case, we define the sum of two vectors to be:
$$\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_N \end{pmatrix} + \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{pmatrix} = \begin{pmatrix} x_1 + y_1 \\ x_2 + y_2 \\ \vdots \\ x_N + y_N \end{pmatrix} \tag{1}$$
and scalar multiplication to be
$$a \, (x_1 \; x_2 \; \dots \; x_N)^T = (a x_1 \; a x_2 \; \dots \; a x_N)^T.$$
All of the properties listed above are satisfied.
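As a quick sanity check, the sketch below tests the vector-space axioms on a few specific column vectors. It is an illustration under stated assumptions: the helper names `add` and `scale` are hypothetical, vectors are stored as Python lists, and exact rational arithmetic (`fractions.Fraction`) stands in for the field of scalars so that equalities can be tested without rounding error. Passing these checks for a handful of vectors illustrates the axioms; it does not prove them.

```python
from fractions import Fraction

def add(x, y):
    """Componentwise sum of two column vectors of equal length."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return [xi + yi for xi, yi in zip(x, y)]

def scale(a, x):
    """Multiply every component of x by the scalar a."""
    return [a * xi for xi in x]

# Arbitrary test vectors and scalars (assumed values, chosen for illustration).
x = [Fraction(1), Fraction(-2), Fraction(3)]
y = [Fraction(1, 2), Fraction(0), Fraction(4)]
z = [Fraction(-1), Fraction(5), Fraction(2, 3)]
zero = [Fraction(0)] * 3
a, b = Fraction(2), Fraction(-1, 3)

checks = {
    "commutativity": add(x, y) == add(y, x),
    "associativity": add(add(x, y), z) == add(x, add(y, z)),
    "zero vector": add(x, zero) == x,
    "negative": add(x, scale(-1, x)) == zero,
    "(ab)x = a(bx)": scale(a * b, x) == scale(a, scale(b, x)),
    "identities": scale(1, x) == x and scale(0, x) == zero,
    "a(x+y) = ax+ay": scale(a, add(x, y)) == add(scale(a, x), scale(a, y)),
    "(a+b)x = ax+bx": scale(a + b, x) == add(scale(a, x), scale(b, x)),
}

for name, ok in checks.items():
    print(f"{name}: {ok}")
```

Closure under addition and scalar multiplication is visible in the return types: both helpers return a list of the same length as their vector argument.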
A more interesting (and useful) example is the collection of square-integrable functions. A square-integrable function $x(t)$ satisfies:
$$\int_{T_i}^{T_f} |x(t)|^2 \, dt < \infty \tag{2}$$
One can verify that this collection constitutes a linear vector space. In fact, this space is so important that it has a special name, $L_2(T_i, T_f)$ (read this as el-two); the arguments denote the range of integration.
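To make the square-integrability condition concrete, the following sketch approximates the integral of $|x(t)|^2$ with a midpoint rule. The function name `energy`, the two test signals, and the interval $[0, 1]$ are assumptions chosen for illustration. A finite sum cannot establish that an integral diverges, so the second signal only suggests divergence: its approximation keeps growing as the grid is refined.

```python
import math

def energy(x, t_i, t_f, n=10_000):
    """Midpoint-rule approximation of the integral of |x(t)|^2 over [t_i, t_f]."""
    h = (t_f - t_i) / n
    return sum(abs(x(t_i + (k + 0.5) * h)) ** 2 for k in range(n)) * h

def ramp(t):
    """x(t) = t, whose energy on [0, 1] is exactly 1/3."""
    return t

def spike(t):
    """x(t) = t**(-1/2); |x(t)|^2 = 1/t is not integrable at t = 0."""
    return 1.0 / math.sqrt(t)

print(energy(ramp, 0.0, 1.0))          # close to 1/3
for n in (100, 10_000):
    print(n, energy(spike, 0.0, 1.0, n))  # grows as n increases
```

The midpoint rule is used because it never evaluates the integrand at the endpoint $t = 0$, where `spike` is undefined.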
definition 2
Let $S$ be a linear vector space. A subspace $\mathcal{T}$ of $S$ is a subset of $S$ which is closed under addition and scalar multiplication. In other words, if $x, y \in \mathcal{T}$, then $x, y \in S$ and all elements of $\mathcal{T}$ are elements of $S$, but some elements of $S$ are not elements of $\mathcal{T}$. Furthermore, the linear combination $ax + by \in \mathcal{T}$ for all scalars $a$, $b$. A subspace is sometimes referred to as a closed linear manifold.
Inner Product Spaces
A structure needs to be defined for linear vector spaces so
that definitions for the length of a vector and for the
distance between any two vectors can be obtained. The notions
of length and distance are closely related to the concept of
an inner product.
definition 3
An inner product of two real vectors $x, y \in S$ is denoted by $\langle x, y \rangle$ and is a scalar assigned to the vectors $x$ and $y$ which satisfies the following properties:
- $\langle x, y \rangle = \langle y, x \rangle$
- $\langle ax, y \rangle = a \langle x, y \rangle$, $a$ is a scalar
- $\langle x + y, z \rangle = \langle x, z \rangle + \langle y, z \rangle$, $z$ a vector.
- $\langle x, x \rangle > 0$ unless $x = 0$. In this case, $\langle x, x \rangle = 0$.
As an example, an inner product for the space consisting of column matrices can be defined as
$$\langle x, y \rangle = x^T y = \sum_{i=1}^{N} x_i y_i$$
The reader should verify that this is indeed a valid inner product (i.e., it satisfies all of the properties given above). It should be noted that this definition of an inner product is not unique: there are other inner product definitions which also satisfy all of these properties. For example, another valid inner product is
$$\langle x, y \rangle = x^T K y$$
where $K$ is an $N \times N$ symmetric, positive-definite matrix (symmetry is what guarantees property 1). Choices of the matrix $K$ which are not positive definite do not yield valid inner products (property 4 is not satisfied). The matrix $K$ is termed the kernel of the inner product. When this matrix is something other than an identity matrix, the inner product is sometimes written as $\langle x, y \rangle_K$ to denote explicitly the presence of the kernel in the inner product.
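The role of the kernel can be seen in a small numerical sketch. The function name `inner` and the matrices `K_good` and `K_bad` are hypothetical choices made for illustration; matrices are stored as nested lists and the vectors have length 2. The first kernel is symmetric and positive definite, the second is symmetric but indefinite, and the same vector yields a negative value of $x^T K x$ with the second one.

```python
def inner(x, y, K=None):
    """Return x^T y, or x^T K y when a kernel matrix K is supplied."""
    if K is None:
        return sum(xi * yi for xi, yi in zip(x, y))
    n = len(x)
    return sum(x[i] * K[i][j] * y[j] for i in range(n) for j in range(n))

K_good = [[2, 1], [1, 2]]   # symmetric, eigenvalues 1 and 3: positive definite
K_bad = [[1, 2], [2, 1]]    # symmetric, eigenvalues 3 and -1: indefinite

x = [1, -1]
y = [3, 2]

print(inner(x, y))                              # 1
print(inner(x, y, K_good), inner(y, x, K_good)) # 1 1  (symmetry)
print(inner(x, x, K_good))                      # 2   (positive)
print(inner(x, x, K_bad))                       # -2  (positivity fails)
```

With `K_bad`, the nonzero vector `x` has a negative "squared length", so no norm could be built from it.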
definition 4
The norm of a vector $x \in S$ is denoted by $\|x\|$ and is defined by:
$$\|x\| = \langle x, x \rangle^{1/2} \tag{3}$$
Because of the properties of an inner product, the norm of a vector is always greater than zero unless the vector is identically zero. The norm of a vector is related to the notion of the length of a vector. For example, if the vector $x$ is multiplied by the constant scalar $a$, the norm of the vector is multiplied by $|a|$.
$$\|ax\| = \langle ax, ax \rangle^{1/2} = |a| \, \|x\|$$
In other words, "longer" vectors ($|a| > 1$) have larger norms. A norm can also be defined when the inner product contains a kernel. In this case, the norm is written $\|x\|_K$ for clarity.
definition 5
An
inner product space is a linear vector
space in which an inner product can be defined for all
elements of the space and a norm is given by
Equation 3. Note in particular
that every element of an inner product space must satisfy
the axioms of a valid inner product.
For the space $S$ consisting of column matrices, the norm of a vector is given by (consistent with the first choice of an inner product)
$$\|x\| = \left( \sum_{i=1}^{N} x_i^2 \right)^{1/2}$$
This choice of a norm corresponds to the Cartesian definition
of the length of a vector.
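A minimal sketch of this norm, assuming vectors are stored as Python lists of floats (the name `norm` and the sample values are illustrative only). It also checks how the norm behaves when the vector is multiplied by a negative scalar.

```python
import math

def norm(x):
    """Euclidean norm: square root of the sum of squared components."""
    return math.sqrt(sum(xi * xi for xi in x))

x = [3.0, 4.0]
a = -2.0
scaled = [a * xi for xi in x]

print(norm(x))                          # 5.0
print(norm(scaled), abs(a) * norm(x))   # both 10.0
```

Scaling by $a = -2$ doubles the norm; the sign of the scalar does not affect the length.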
One of the fundamental properties of inner product spaces is the Schwarz inequality
$$|\langle x, y \rangle| \le \|x\| \, \|y\| \tag{4}$$
This is one of the most important inequalities we shall
encounter. To demonstrate this inequality, consider the norm
squared of
$x + ay$.
$$\|x + ay\|^2 = \langle x + ay, x + ay \rangle = \|x\|^2 + 2a \langle x, y \rangle + a^2 \|y\|^2$$
Let $a = -\langle x, y \rangle / \|y\|^2$. In this case:
$$\|x + ay\|^2 = \|x\|^2 - 2 \frac{|\langle x, y \rangle|^2}{\|y\|^2} + \frac{|\langle x, y \rangle|^2}{\|y\|^4} \|y\|^2 = \|x\|^2 - \frac{|\langle x, y \rangle|^2}{\|y\|^2}$$
As the left-hand side of this result is non-negative, the right-hand side is lower-bounded by zero: $\|x\|^2 \ge |\langle x, y \rangle|^2 / \|y\|^2$. Multiplying both sides by $\|y\|^2$ and taking square roots, the Schwarz inequality is thus obtained. Note that the equality occurs only when $x = -ay$, or equivalently when $x = cy$, where $c$ is any constant.
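The steps of this argument can be replayed numerically. The sketch below is an illustration under stated assumptions: the names `inner` and `schwarz_gap` are hypothetical, the vectors are arbitrary, and `fractions.Fraction` is used so that the identity for $\|x + ay\|^2$ can be compared exactly. The quantity `schwarz_gap` is $\|x\|^2 \|y\|^2 - \langle x, y \rangle^2$, which the Schwarz inequality says is non-negative and which vanishes when $x = cy$.

```python
from fractions import Fraction

def inner(x, y):
    """Standard inner product x^T y."""
    return sum(xi * yi for xi, yi in zip(x, y))

def schwarz_gap(x, y):
    """||x||^2 ||y||^2 - <x,y>^2; non-negative by the Schwarz inequality."""
    return inner(x, x) * inner(y, y) - inner(x, y) ** 2

x = [Fraction(1), Fraction(2), Fraction(3)]
y = [Fraction(4), Fraction(-5), Fraction(6)]

# The choice of a used in the derivation: a = -<x,y> / ||y||^2.
a = -inner(x, y) / inner(y, y)
residual = [xi + a * yi for xi, yi in zip(x, y)]
lhs = inner(residual, residual)                        # ||x + a y||^2
rhs = inner(x, x) - inner(x, y) ** 2 / inner(y, y)     # ||x||^2 - <x,y>^2 / ||y||^2

parallel = [-3 * yi for yi in y]                       # x = c y with c = -3

print(lhs == rhs)                 # True
print(schwarz_gap(x, y))          # 934
print(schwarz_gap(parallel, y))   # 0
```

The squared form is compared instead of the norms themselves so that no square roots, and hence no rounding, enter the check.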
definition 6
Two vectors are said to be orthogonal if the inner product of the vectors is zero: $\langle x, y \rangle = 0$.
Consistent with these results is the concept of the
"angle" between two vectors. The cosine
of this angle is defined by:
$$\cos(x, y) = \frac{\langle x, y \rangle}{\|x\| \, \|y\|}$$
Because of the Schwarz inequality, $|\cos(x, y)| \le 1$. The angle between orthogonal vectors is $\pm\pi/2$, and the angle between vectors satisfying the Schwarz inequality with equality ($x \propto y$) is zero when the constant of proportionality is positive (the vectors are parallel to each other) and $\pi$ when it is negative.
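A minimal sketch of this angle computation, assuming the standard inner product on column vectors stored as lists (the names `inner` and `angle` are illustrative). Two implementation choices are assumptions of this sketch, not part of the definition: the cosine is clamped to $[-1, 1]$ to guard against rounding error, and `math.acos` returns the principal value in $[0, \pi]$, so orthogonal vectors are reported at $+\pi/2$.

```python
import math

def inner(x, y):
    """Standard inner product x^T y."""
    return sum(xi * yi for xi, yi in zip(x, y))

def angle(x, y):
    """Angle in radians, in [0, pi], between two nonzero vectors."""
    denom = math.sqrt(inner(x, x) * inner(y, y))
    if denom == 0.0:
        raise ValueError("the angle is undefined for a zero vector")
    cosine = inner(x, y) / denom
    cosine = max(-1.0, min(1.0, cosine))  # guard against rounding just outside [-1, 1]
    return math.acos(cosine)

print(angle([1, 0], [0, 2]))     # orthogonal: pi/2
print(angle([1, 1], [3, 3]))     # x = 3y direction, positive constant: 0
print(angle([1, 1], [-2, -2]))   # negative constant: pi
```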
definition 7
The distance between two vectors is taken to be the norm of the difference of the vectors.
$$d(x, y) = \|x - y\|$$
In our example of the normed space of column matrices, the distance between $x$ and $y$ would be
$$\|x - y\| = \left( \sum_{i=1}^{N} (x_i - y_i)^2 \right)^{1/2}$$
which agrees with the Cartesian notion of
distance. Because of the properties of the inner product, this
distance measure (or
metric) has the following
properties:
- $d(x, y) = d(y, x)$ (Distance does not depend on how it is measured.)
- $d(x, y) = 0 \Rightarrow x = y$ (Zero distance means equality)
- $d(x, z) \le d(x, y) + d(y, z)$ (Triangle inequality)
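These three properties can be spot-checked for the Cartesian distance. The sketch below assumes vectors stored as Python lists; the name `distance` and the three sample points are chosen for illustration. A check on three points illustrates the properties without proving them.

```python
import math

def distance(x, y):
    """Norm of the difference: square root of the sum of squared component differences."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

x = [0, 0]
y = [3, 4]
z = [3, 0]

print(distance(x, y), distance(y, x))                      # 5.0 5.0 (symmetry)
print(distance(x, x))                                      # 0.0
print(distance(x, z) <= distance(x, y) + distance(y, z))   # True (triangle inequality)
```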
We use this distance measure to define what we mean by
convergence. When we say the sequence of vectors
$\{x_n\}$ converges to $x$ ($x_n \to x$), we mean
$$\lim_{n \to \infty} \|x_n - x\| = 0$$
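As an illustration of this definition, consider a hypothetical sequence $x_n = x + (1/n)\,(1 \; 1)^T$, for which $\|x_n - x\| = \sqrt{2}/n$. The sketch below (with assumed names `distance`, `limit`, and `x_n`) prints the distance to the limit for a few values of $n$. A finite table of values only suggests the limit; the closed form $\sqrt{2}/n$ is what shows it tends to zero.

```python
import math

def distance(x, y):
    """Norm of the difference of two vectors stored as lists."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

limit = [1.0, -2.0]

def x_n(n):
    """Hypothetical sequence: the limit plus a perturbation that shrinks like 1/n."""
    return [limit[0] + 1.0 / n, limit[1] + 1.0 / n]

indices = (1, 10, 100, 1000)
dists = [distance(x_n(n), limit) for n in indices]

for n, d in zip(indices, dists):
    print(n, d)   # approximately sqrt(2) / n
```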