The power-of-two FFT algorithms, such as the radix-2
and radix-4 FFTs, and the common-factor
and prime-factor FFTs, achieve great reductions in computational complexity
of the DFT when the length, $N$,
is a composite number.
DFTs of prime length are sometimes needed, however, particularly for the short-length DFTs in
common-factor or prime-factor algorithms.
The methods described here, along with the composite-length algorithms, allow fast computation
of DFTs of any length.
There are two main ways of performing DFTs of prime length:
- Rader's conversion, which is most efficient, and the
- Chirp-z transform, which is simpler and more general.
Oddly enough, both work by turning prime-length DFTs into convolution!
The resulting convolutions can then be computed efficiently by either
- fast convolution via composite-length FFTs (simpler) or by
- Winograd techniques (more efficient).
Rader's conversion is a one-dimensional index-mapping scheme that turns a
length-$N$ DFT ($N$ prime) into a length-$(N-1)$ convolution and a few additions.
Rader's conversion works only for prime-length $N$.
An index map simply rearranges the order of the sum operation in the
DFT definition.
Because addition is a commutative operation, the same mathematical result is produced
from any order, as long as all of the same terms are added once and only once. (This is
the condition that defines an index map.)
Unlike the multi-dimensional index maps used in deriving
common-factor and prime-factor FFTs,
Rader's conversion uses a one-dimensional index map in a finite group of
$N$ integers:
$$k = r^m \bmod N$$
If $N$ is prime, there exists an
integer $r$, called a
primitive root, such that the index map
$k = r^m \bmod N$, $m = 0, 1, 2, \ldots, N-2$, uniquely generates all elements
$k = 1, 2, 3, \ldots, N-1$.
For example, for $N = 5$ and $r = 2$:
$$\begin{aligned}
2^0 \bmod 5 &= 1 \\
2^1 \bmod 5 &= 2 \\
2^2 \bmod 5 &= 4 \\
2^3 \bmod 5 &= 3
\end{aligned}$$
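The primitive-root index map can be sketched in a few lines of Python. The function names below are illustrative and not from the original text, and the brute-force search assumes $N$ is a small prime, as is typical for short DFT modules.

```python
def is_primitive_root(r, N):
    """True if r^0, r^1, ..., r^(N-2) mod N hit every value 1..N-1 exactly once."""
    seen = {pow(r, m, N) for m in range(N - 1)}
    return seen == set(range(1, N))


def smallest_primitive_root(N):
    """Brute-force search; assumes N is prime (fine for short lengths)."""
    for r in range(1, N):
        if is_primitive_root(r, N):
            return r
    raise ValueError("no primitive root found; is N prime?")


def index_map(r, N):
    """List k = r^m mod N for m = 0, 1, ..., N-2."""
    return [pow(r, m, N) for m in range(N - 1)]


print(smallest_primitive_root(5))  # 2
print(index_map(2, 5))             # [1, 2, 4, 3]
```

The three-argument `pow` performs modular exponentiation directly, so no large intermediate powers are formed.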
For $N$ prime, the inverse of $r$, i.e., the integer $r^{-1}$ such that
$(r^{-1} r) \bmod N = 1$, is also a primitive root.
For example, for $N = 5$ and $r = 2$, the inverse is $r^{-1} = 3$, since
$(2 \times 3) \bmod 5 = 1$:
$$\begin{aligned}
3^0 \bmod 5 &= 1 \\
3^1 \bmod 5 &= 3 \\
3^2 \bmod 5 &= 4 \\
3^3 \bmod 5 &= 2
\end{aligned}$$
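As a small illustration of the inverse root, the sketch below computes $r^{-1}$ and both index sequences. The helper name `inverse_root_maps` is hypothetical, and the code assumes Python 3.8 or newer, where the built-in `pow` accepts a negative exponent together with a modulus.

```python
def inverse_root_maps(r, N):
    """Return (r_inv, forward, backward) for a prime N and primitive root r."""
    r_inv = pow(r, -1, N)  # modular inverse; needs Python 3.8 or newer
    forward = [pow(r, m, N) for m in range(N - 1)]       # k = r^m mod N
    backward = [pow(r_inv, m, N) for m in range(N - 1)]  # n = r^(-m) mod N
    return r_inv, forward, backward


r_inv, forward, backward = inverse_root_maps(2, 5)
print(r_inv)     # 3
print(forward)   # [1, 2, 4, 3]
print(backward)  # [1, 3, 4, 2]

# Matching entries are mutual inverses: r^m * r^(-m) = 1 (mod N).
assert all((k * n) % 5 == 1 for k, n in zip(forward, backward))
```

Both sequences visit every nonzero index exactly once, only in different orders, which is what lets them serve as input and output orderings of the DFT sum.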
So why do we care? Because we can use these facts to turn a
DFT into a convolution!
Let
$$n = r^{-m} \bmod N, \quad m = 0, 1, \ldots, N-2, \quad n \in \{1, 2, \ldots, N-1\}$$
$$k = r^{p} \bmod N, \quad p = 0, 1, \ldots, N-2, \quad k \in \{1, 2, \ldots, N-1\}$$
$$X(k) = \sum_{n=0}^{N-1} x(n) W_N^{nk} =
\begin{cases}
x(0) + \sum_{n=1}^{N-1} x(n) W_N^{nk} & \text{if } k \neq 0 \\
\sum_{n=0}^{N-1} x(n) & \text{if } k = 0
\end{cases}$$
where for convenience
$W_N^{nk} = e^{-i \frac{2 \pi n k}{N}}$
in the DFT equation.
For $k \neq 0$,
$$X(r^p \bmod N) = \sum_{m=0}^{N-2} x(r^{-m} \bmod N)\, W_N^{r^p r^{-m}} + x(0)
= \sum_{m=0}^{N-2} x(r^{-m} \bmod N)\, W_N^{r^{p-m}} + x(0)
= x(0) + x(r^{-l} \bmod N) * W_N^{r^l} \tag{1}$$
where $l = 0, 1, \ldots, N-2$ and $*$ denotes length-$(N-1)$ circular convolution over the index $l$.
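Equation (1) can be checked numerically with a short sketch. The code below is an illustrative, unoptimized implementation that evaluates the circular convolution directly (so it is not fast); the function names `dft` and `rader_dft` are assumptions made for this example, and the primitive root `r` must be supplied by the caller.

```python
import cmath


def dft(x):
    """Direct O(N^2) DFT, used only as a reference."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N))
            for k in range(N)]


def rader_dft(x, r):
    """DFT of prime length N = len(x) via Rader's conversion.

    r must be a primitive root modulo N.
    """
    N = len(x)
    L = N - 1
    r_inv = pow(r, -1, N)  # Python 3.8+
    # Permuted input a[m] = x(r^(-m) mod N) and kernel b[l] = W_N^(r^l).
    a = [x[pow(r_inv, m, N)] for m in range(L)]
    b = [cmath.exp(-2j * cmath.pi * pow(r, l, N) / N) for l in range(L)]
    X = [0j] * N
    X[0] = sum(x)  # the k = 0 term is just the sum of the inputs
    for p in range(L):
        # Length-(N-1) circular convolution evaluated at index p.
        conv = sum(a[m] * b[(p - m) % L] for m in range(L))
        X[pow(r, p, N)] = x[0] + conv
    return X


x = [1 + 0j, 2 - 1j, 0.5j, -3 + 0j, 4 + 2j]
err = max(abs(u - v) for u, v in zip(dft(x), rader_dft(x, 2)))
print(err < 1e-9)  # True
```

The index `(p - m) % L` works because $r^{N-1} \bmod N = 1$, so exponents of $r$ can be reduced modulo $N-1$.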
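For example, for $N = 5$, $r = 2$, and $r^{-1} = 3$, the DFT in natural order is
$$\begin{bmatrix} X(0) \\ X(1) \\ X(2) \\ X(3) \\ X(4) \end{bmatrix} =
\begin{bmatrix}
0 & 0 & 0 & 0 & 0 \\
0 & 1 & 2 & 3 & 4 \\
0 & 2 & 4 & 1 & 3 \\
0 & 3 & 1 & 4 & 2 \\
0 & 4 & 3 & 2 & 1
\end{bmatrix}
\begin{bmatrix} x(0) \\ x(1) \\ x(2) \\ x(3) \\ x(4) \end{bmatrix}$$
Reordering the outputs as $k = r^p \bmod N$ and the inputs as $n = r^{-m} \bmod N$ gives
$$\begin{bmatrix} X(0) \\ X(1) \\ X(2) \\ X(4) \\ X(3) \end{bmatrix} =
\begin{bmatrix}
0 & 0 & 0 & 0 & 0 \\
0 & 1 & 3 & 4 & 2 \\
0 & 2 & 1 & 3 & 4 \\
0 & 4 & 2 & 1 & 3 \\
0 & 3 & 4 & 2 & 1
\end{bmatrix}
\begin{bmatrix} x(0) \\ x(1) \\ x(3) \\ x(4) \\ x(2) \end{bmatrix}$$
where for visibility the matrix entries represent only the power, $m$, of the corresponding
DFT term $W_N^m$.
Note that the 4-by-4 circulant matrix
$$\begin{bmatrix}
1 & 3 & 4 & 2 \\
2 & 1 & 3 & 4 \\
4 & 2 & 1 & 3 \\
3 & 4 & 2 & 1
\end{bmatrix}$$
corresponds to a length-4 circular convolution.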
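The circulant structure can also be checked mechanically. The following sketch builds the permuted exponent matrix from the two index maps ($k = r^p \bmod N$ for rows, $n = r^{-m} \bmod N$ for columns) and tests whether each row is a one-step cyclic shift of the previous one. The function names are illustrative only.

```python
def permuted_exponents(N, r):
    """Exponent matrix (n*k mod N) with rows k = r^p and columns n = r^(-m)."""
    r_inv = pow(r, -1, N)  # Python 3.8+
    ks = [pow(r, p, N) for p in range(N - 1)]
    ns = [pow(r_inv, m, N) for m in range(N - 1)]
    return [[(n * k) % N for n in ns] for k in ks]


def is_circulant(M):
    """True if row i equals row 0 cyclically shifted right by i positions."""
    L = len(M)
    return all(M[i][j] == M[0][(j - i) % L] for i in range(L) for j in range(L))


M = permuted_exponents(5, 2)
for row in M:
    print(row)
# [1, 3, 4, 2]
# [2, 1, 3, 4]
# [4, 2, 1, 3]
# [3, 4, 2, 1]
print(is_circulant(M))  # True
```

Each entry equals $r^{p-m} \bmod N$, so it depends only on the difference between the row and column indices, which is exactly the property that makes the matrix circulant.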
Rader's conversion turns a prime-length DFT into a few adds
and a composite-length ($N-1$) circular convolution, which can be computed
efficiently using either
- fast convolution via FFT and IFFT, or
- index-mapped convolution algorithms and short
Winograd convolution algorithms. (These are rather complicated and trade fewer multiplies
for many more adds, which may not be worthwhile on most modern processors.) See R.C. Agarwal and J.W. Cooley (1977).
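The first option, fast convolution via FFT and IFFT, can be sketched as follows. This is a minimal pure-Python illustration that assumes the convolution length is a power of two (as happens for $N = 5$ or $N = 17$, where $N - 1$ is 4 or 16); other lengths would need a composite-length FFT or zero-padding with wraparound handling, which is not shown. All function names are assumptions made for this example.

```python
import cmath


def fft_pow2(a, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    sign = 1 if inverse else -1
    even = fft_pow2(a[0::2], inverse)
    odd = fft_pow2(a[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def circular_convolve_fft(a, b):
    """Circular convolution: transform, multiply pointwise, inverse transform."""
    n = len(a)
    A = fft_pow2(a)
    B = fft_pow2(b)
    y = fft_pow2([u * v for u, v in zip(A, B)], inverse=True)
    return [v / n for v in y]  # apply the 1/n scaling of the inverse DFT


def circular_convolve_direct(a, b):
    """O(n^2) reference implementation."""
    n = len(a)
    return [sum(a[m] * b[(p - m) % n] for m in range(n)) for p in range(n)]


a = [1, 2, 3, 4]
b = [0.5, -1, 2j, 1]
fast = circular_convolve_fft(a, b)
slow = circular_convolve_direct(a, b)
print(max(abs(u - v) for u, v in zip(fast, slow)) < 1e-9)  # True
```

In a Rader-based DFT the kernel sequence is fixed for a given $N$, so its transform could be precomputed once and reused for every input.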
S. Winograd has proved that a
length-$N$ circular or linear
convolution or DFT requires fewer than
$2N$ multiplies (for real data), or
$4N$ real multiplies for complex data. (This doesn't
count multiplies by rational fractions, like
$3$ or $\frac{1}{N}$ or $\frac{5}{17}$, which can be computed with additions and one
overall scaling factor.) Furthermore, Winograd showed how to
construct algorithms achieving these counts. Winograd
prime-length DFTs and convolutions have the following
characteristics:
- Extremely efficient for small $N$ ($N < 20$)
- The number of adds becomes huge for large $N$.
Thus Winograd's minimum-multiply FFTs are useful only for
small $N$. They are
very important for
prime-factor
algorithms, which generally use Winograd modules to
implement the short-length DFTs. Tables giving the
multiplies and adds necessary to compute Winograd FFTs for
various lengths can be found in
C.S. Burrus (1988). Tables and FORTRAN
and TMS32010 programs for these short-length transforms can
be found in
C.S. Burrus and
T.W. Parks (1985). The theory and derivation of these
algorithms are quite elegant but require substantial
background in number theory and abstract algebra.
Fortunately for the practitioner, all of the short
algorithms one is likely to need have already been derived
and can simply be looked up without mastering the
details of their derivation.
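To give a feel for what such a short module looks like, here is a sketch of a length-3 DFT arranged so that it needs only two multiplies by constants and six adds. It is written in the style of a short Winograd module but is not copied from the cited tables; the function name and variable names are assumptions made for this example.

```python
import cmath
import math


def dft3_few_multiplies(x0, x1, x2):
    """Length-3 DFT using two constant multiplies and six additions."""
    c = math.cos(2 * math.pi / 3) - 1.0  # real constant, equals -1.5
    s = 1j * math.sin(2 * math.pi / 3)   # purely imaginary constant
    t1 = x1 + x2
    m0 = x0 + t1          # this is X(0)
    m1 = c * t1           # multiply 1
    m2 = s * (x2 - x1)    # multiply 2
    s1 = m0 + m1          # equals x0 + cos(2*pi/3) * (x1 + x2)
    return m0, s1 + m2, s1 - m2


# Compare against the DFT definition.
x = (1 + 2j, -0.5j, 3 + 0j)
W = cmath.exp(-2j * cmath.pi / 3)
ref = [sum(x[n] * W ** (n * k) for n in range(3)) for k in range(3)]
got = dft3_few_multiplies(*x)
print(all(abs(u - v) < 1e-12 for u, v in zip(ref, got)))  # True
```

For real input each constant multiply costs one or two real multiplies, consistent with the bound of fewer than $2N = 6$ multiplies quoted above for real data.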
The Winograd Fourier Transform Algorithm (WFTA) is
a technique that recombines the short Winograd modules in a
prime-factor FFT into a composite-$N$ structure with
fewer multiplies but more adds. While theoretically interesting,
WFTAs are complicated and different for every length, and on modern
processors with hardware multipliers the trade of multiplies for many
more adds is very rarely useful in practice today.
- R.C. Agarwal and J.W. Cooley. (1977, Oct). New Algorithms for Digital Convolution. IEEE Trans. on Acoustics, Speech, and Signal Processing, 25, 392-410.
- C.S. Burrus. (1988). Efficient Fourier Transform and Convolution Algorithms. In J.S. Lim and A.V. Oppenheim (Eds.), Advanced Topics in Signal Processing. Prentice-Hall.
- C.S. Burrus and T.W. Parks. (1985). DFT/FFT and Convolution Algorithms. Wiley-Interscience.