We shall represent a monochrome (luminance) image by a matrix $x$ whose elements are $x_n$, where $n = (n_1, n_2)$ is the integer vector of row and column indexes. The energy of $x$ is defined as

$$\text{Energy of } x = \sum_{n} x_n^2 \qquad (1)$$

where the sum is performed over all $n$ in $x$.
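As a minimal sketch of equation (1), the energy can be computed by summing the squared value of every pixel. The function name `energy` and the 2x2 test image are assumptions made for illustration only; the image is stored as a plain list of rows.

```python
def energy(x):
    """Sum of squared pixel values over every index n = (n1, n2), as in equation (1)."""
    return sum(value * value for row in x for value in row)


# Hypothetical 2x2 luminance image, used only for illustration.
image = [[1, 2],
         [3, 4]]

print(energy(image))  # 1 + 4 + 9 + 16 = 30
```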
Figure 1 shows the main blocks in
any image coding system. The decoder is the inverse of the
encoder. The three encoder blocks perform the following tasks:
-
Energy compression - This is usually a transformation or filtering process which aims to concentrate a high proportion of the energy of the image $x$ into as few samples (coefficients) of $y$ as possible while preserving the total energy of $x$ in $y$. This minimises the number of non-zero samples of $y$ which need to be transmitted for a given level of distortion in the reconstructed image $\hat{x}$.
-
Quantisation - This represents the samples of $y$ to a given level of accuracy in the integer matrix $q$. The quantiser step size controls the tradeoff between distortion and bit rate and may be adapted to take account of human visual sensitivities. The inverse quantiser reconstructs $\hat{y}$, the best estimate of $y$ from $q$.
-
Entropy coding - This encodes the integers in $q$ into a serial bit stream $d$, using variable-length entropy codes which attempt to minimise the total number of bits in $d$, based on the statistics (PDFs) of various classes of samples in $q$.
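The three encoder blocks above can be sketched end to end in a few lines. This is an illustrative toy under stated assumptions, not the coder described by the source: the energy compression stage is stood in for by a one-level orthonormal Haar sum/difference of adjacent pixels in each row, the quantiser is a plain uniform quantiser, and the entropy coder is replaced by a first-order entropy estimate of the bits an ideal code would need for $q$. All function names, the sample image and the step size are hypothetical.

```python
import math
from collections import Counter

S = 1 / math.sqrt(2)  # scaling that makes the sum/difference pair orthonormal


def energy(m):
    """Sum of squared samples of a matrix stored as a list of rows."""
    return sum(v * v for row in m for v in row)


def compress_energy(x):
    """Stand-in transform: per row, Haar sums followed by Haar differences.

    Assumes every row has an even number of pixels.
    """
    y = []
    for row in x:
        pairs = list(zip(row[0::2], row[1::2]))
        y.append([(a + b) * S for a, b in pairs] + [(a - b) * S for a, b in pairs])
    return y


def reconstruct(y):
    """Inverse of compress_energy."""
    x = []
    for row in y:
        half = len(row) // 2
        x.append([v for s, d in zip(row[:half], row[half:])
                  for v in ((s + d) * S, (s - d) * S)])
    return x


def quantise(y, step):
    """Uniform quantiser: nearest integer multiple of the step size."""
    return [[math.floor(v / step + 0.5) for v in row] for row in y]


def dequantise(q, step):
    """Inverse quantiser: reconstruct y_hat at the centre of each interval."""
    return [[k * step for k in row] for row in q]


def entropy_bits(q):
    """First-order entropy of the integers in q, in total bits (an estimate only)."""
    symbols = [k for row in q for k in row]
    total = len(symbols)
    return -sum(c * math.log2(c / total) for c in Counter(symbols).values())


# Hypothetical smooth 2x4 image and step size, chosen only for illustration.
x = [[100, 102, 98, 96],
     [101, 103, 97, 95]]
step = 8.0

y = compress_energy(x)
q = quantise(y, step)
y_hat = dequantise(q, step)
x_hat = reconstruct(y_hat)

low_band = sum(v * v for row in y for v in row[:2])
print("energy of x and y:", energy(x), energy(y))
print("fraction of energy in the sum coefficients:", low_band / energy(y))
print("q =", q)
print("estimated bits for q:", entropy_bits(q))
```

Because neighbouring pixels in this sample are similar, almost all of the energy lands in the sum coefficients and the difference coefficients quantise to zero, which is what makes the integers in $q$ cheap to code. A larger step size would lower the bit estimate at the cost of more distortion in `x_hat`.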
The energy compression / reconstruction and the entropy coding / decoding processes are normally all lossless. Only the quantiser introduces loss and distortion: $\hat{y}$ is a distorted version of $y$, and hence $\hat{x}$ is a distorted version of $x$. In the absence of quantisation, if $\hat{y} = y$, then $\hat{x} = x$.
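The claim that only the quantiser introduces loss can be checked numerically on a single pair of pixels. The sketch below assumes a 2-point orthonormal sum/difference transform as the energy compression stage and a uniform quantiser; both choices, along with the function names, the pixel values and the step size, are hypothetical and serve only to illustrate the point.

```python
import math

S = 1 / math.sqrt(2)  # orthonormal scaling for the 2-point transform


def forward(pair):
    """Energy-preserving sum/difference transform of two pixels."""
    a, b = pair
    return ((a + b) * S, (a - b) * S)


def inverse(coeffs):
    """Exact inverse of forward()."""
    s, d = coeffs
    return ((s + d) * S, (s - d) * S)


def through_quantiser(coeffs, step):
    """Quantise and immediately dequantise, giving y_hat from y."""
    return tuple(math.floor(c / step + 0.5) * step for c in coeffs)


def error_energy(u, v):
    """Energy of the difference between two sample vectors."""
    return sum((p - q) ** 2 for p, q in zip(u, v))


x = (100.0, 103.0)              # hypothetical pixel pair
y = forward(x)

x_lossless = inverse(y)         # no quantiser: y_hat = y
y_hat = through_quantiser(y, step=4.0)
x_hat = inverse(y_hat)          # quantiser in the loop

print("error without quantiser:", error_energy(x, x_lossless))
print("error in y_hat and in x_hat:", error_energy(y, y_hat), error_energy(x, x_hat))
```

Without the quantiser the round trip reproduces the input up to floating-point rounding. With it, the error energy in `x_hat` matches the error energy in `y_hat`, which holds here because the assumed transform preserves energy.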