Friday, February 29, 2008

Types and the Meaning of DFT

Sometimes it's good to get outside your comfort zone. These past few weeks have been like that for me, leaving behind the comfortable world of lambda calculus, type-directed translation, web frameworks, and instead trying my hand at some digital signal processing (DSP). In this post we're going to wrap up the discussion of types, meaning, contract and documentation using the topic of the Discrete Fourier Transform (DFT).

DSP: A Different World

This all started with some customer visits I made earlier this month, customers at large corporations doing lots of DSP, with more of a fondness for producing field-programmable gate arrays than writing news aggregators in 600 lines or less. It's refreshing to peek under the lid of other technical fields and see their complexity, to read things like "this project required tight coordination between the FGPA designers and DSP programmers who typically do not share design tools." It's the realization that it's a big technical world out there - that there are whole categories of developers with different skillsets and toolchains, where in a sense C is not close enough to the metal. Remember the advice that you always need to ask "for what" when hearing a recommendation for a new language or framework? Yeah. It matters what you're doing.

One outgrowth of those visits was an initiative on my part to write some basic signal processing programs. For example, I wanted to simulate the Amplitude Modulation process: modulate a .wav file sound clip onto the carrier frequency from the standard US commercial AM radio band (600KHz to 1600KHz), look at the frequency domain graph to see the expected carrier frequency spike and sidebands (corresponding to the original sound clip), and then demodulate it and play back the resulting sound file. Apparently this style of simulating the over-the-air bitstream is commonplace for those in the business of making things that go over the air (or through optic fibers).

I Heard This is Supposed to Hertz

The essence of the DFT is not hard to grasp, once you understand the basic premise of Fourier analysis: that any given time-based function (a.k.a signal) can be equivalently represented as a sum of periodic functions, sines, of various frequencies and magnitudes. In very simple terms, if you hit four tuning forks of different music pitches, the resulting sound could be represented as either a complex-looking wavy line or the sum of four sine waves of different frequencies (and phases). The DFT simply works with discrete samples of the input signal or function and is more suited to processing with a digital computer.

What I discovered and reported in Look at the Types... If You're Lucky is that the writings about the Discrete Fourier Transform (DFT) are surprisingly slippery on the topic of physical interpretation. What I wanted most from the DFT was a frequency spectrum plot with Hertz (SI unit Hz, defined as cycles per second) on the bottom axis and power on the vertical axis. I could tell what programming language type the DFT functions took and returned (arrays of complex numbers). Given the array of complex numbers it is easily converted to an array of magnitudes which you can then plot. But the units of the plot? That was another matter.

Both in general discussions of the DFT and in library function documentation, I got lots of (a) definitions of the DFT array and (b) useful tips like "you can plot the magnitude to see the spectrum " and "the first element in the returned vector is the DC component".

But over and over I could not get a straight answer to the question "what are the actual frequencies" and "what units are the magnitudes"; essentially what are the units when plotting the magnitude of the DFT complex numbers.

Typical DFT Documentation

Here's a typical example of DFT documentation, the Apache Math library's FastFourierTransformer class:

public Complex[] transform(double[] f)
Transform the given real data set.

The formula is $ y_n = \Sigma_{k=0}^{N-1} e^{-2 \pi i nk/N} x_k $

f - the real data array to be transformed

the complex transformed array

Notice the DFT definition is given, though typically the free variables y_n, N, and x_n are not defined. The reader is assumed to be familiar with the definition.

How to Get the Hertz, the Power

Eventually I stumbled on Julius O. Smith III's excellent summary of the DFT. I say "excellent" because it spells out in satisfying detail what the notation means and what the units are.

Briefly, if the DFT function returns an zero-based array X, then X[k] represents the frequency k/Ttotal where Ttotal is the time duration of your input sample. That's it!

So, Smith gives us the Hertz. For the vertical axis, I had to consult a friend with a PhD in communication theory to get the explanation for the magnitude. Of course the meaning of the magnitude depends on the meaning of your input signal. After all, the time domain samples could have a range of 1 Volt or 1millivolt, the input array doesn't have units either. Briefly, whatever units of power the input squared has, the DFT magnitude squared also has.

Types, Contracts and the Real World

Every programmer learns how to model the real world with types as well as code. We realize there's not a one to one match between data and reality, and finding the right tradeoff is important to success. Some kinds of bugs can trace their origin to a data model that is too simple: there are real world states that can't be represented. Other bugs arise when the data representation has states that correspond to no real world state. Other bugs arise when the data model is so complex that the programmers can't keep up with it. I claim that the best software solutions are written by experts in the problem domain, and have a data model that models the problem domain as exactly as possible. The problem domain experts have no problem keeping up with the data model because it matches their understanding of the real-world processes that they carry with them when they're away from the source code.

The Discrete Fourier Transform is best described as a library function, because it has applications in so many fields and areas. The best software solutions using the DFT are undoubtedly written by people who have studied the topic away from a source code editor. Sitting in front of a Javadoc is no place to learn DSP. It's also no place to learn about Swing, or X.509 certificates. But I do think a modicum of domain explanation is appropriate in documentation for these types of libraries, for newcomers who know what they need out of the library without needing to know implementation details. It's a tricky line, but after being in this situation, I'm more sympathetic to this position of a library consumer.

In summary, I'd like to propose that library documentation generally have three kinds of information:

1. The function prototype, describing the types of parameters and the return value.

2. Low-level invariants and contract information about the arguments and return value.

3. A little bit about how the function relates to the real world.

Improved Documentation

Finally, here's my documentation attempt for the DFT function:

Complex[] dft(double[] signal);
returns the Discrete Fourier Transform of the input array signal.

If the input signal is an array of N signal samples taken at sampling period T, implying that the input signal has time duration NT seconds, then the returned Complex array also has N frequency components. The kth element of the array corresponds to the frequency k/(NT) cycles/second. If the squared value of the elements of signal is in power units P, then then squared value of the elements of the returned array also have power units P.

The elements in the first half of the returned array are mirrored by the elements in the second half, as they correspond to the positive and negative frequencies respectively.

signal - Input signal, a length N array of signal samples taken at sampling period T.

Array of length N of complex numbers. The kth element of the array corresponds to the time-domain frequency k/(NT) cycles/second. The magnitude squared of a complex-valued element in the array corresponds to the power for that frequency, and the angle of the complex-valued element corresponds to its phase.

IANADSPG (I Am Not A Digital Signal Processing Guy), so don't bet your company on this description, and do please send me corrections (citing sources). This is simply the type of thing I'd have liked to know when first coming to the DFT, and I hope by analogy you can find insights how users of your library will view it and how you may want to document it.

No comments: