FIR filters are one of two primary types of digital filters used in Digital Signal Processing (DSP) applications, the other type being IIR.
"FIR" means "Finite Impulse Response". If you put in an impulse, that is, a single "1" sample followed by many "0" samples, zeroes will come out after the "1" sample has made its way through the delay line of the filter.
In the common case, the impulse response is finite because there is no feedback in the FIR. A lack of feedback guarantees that the impulse response will be finite. Therefore, the term "finite impulse response" is nearly synonymous with "no feedback".
However, if feedback is employed yet the impulse response is finite, the filter still is a FIR. An example is the moving average filter, in which the Nth prior sample is subtracted (fed back) each time a new sample comes in. This filter has a finite impulse response even though it uses feedback: after N samples of an impulse, the output will always be zero.
Some people say the letters F-I-R; other people pronounce as if it were a type of tree. We prefer the tree. (The difference is whether you talk about an F-I-R filter or a FIR filter.)
DSP filters can also be "Infinite Impulse Response" (IIR). (See dspGuru's IIR FAQ.) IIR filters use feedback, so when you input an impulse the output theoretically rings indefinitely.
Each has advantages and disadvantages. Overall, though, the advantages of FIR filters outweigh the disadvantages, so they are used much more than IIRs.
Compared to IIR filters, FIR filters offer the following advantages:
- They can easily be designed to be "linear phase" (and usually are). Put simply, linear-phase filters delay the input signal but don’t distort its phase.
- They are simple to implement. On most DSP microprocessors, the FIR calculation can be done by looping a single instruction.
- They are suited to multi-rate applications. By multi-rate, we mean either "decimation" (reducing the sampling rate), "interpolation" (increasing the sampling rate), or both. Whether decimating or interpolating, the use of FIR filters allows some of the calculations to be omitted, thus providing an important computational efficiency. In contrast, if IIR filters are used, each output must be individually calculated, even if it that output will discarded (so the feedback will be incorporated into the filter).
- They have desireable numeric properties. In practice, all DSP filters must be implemented using finite-precision arithmetic, that is, a limited number of bits. The use of finite-precision arithmetic in IIR filters can cause significant problems due to the use of feedback, but FIR filters without feedback can usually be implemented using fewer bits, and the designer has fewer practical problems to solve related to non-ideal arithmetic.
- They can be implemented using fractional arithmetic. Unlike IIR filters, it is always possible to implement a FIR filter using coefficients with magnitude of less than 1.0. (The overall gain of the FIR filter can be adjusted at its output, if desired.) This is an important consideration when using fixed-point DSP's, because it makes the implementation much simpler.
Compared to IIR filters, FIR filters sometimes have the disadvantage that they require more memory and/or calculation to achieve a given filter response characteristic. Also, certain responses are not practical to implement with FIR filters.
- Impulse Response - The "impulse response" of a FIR filter is actually just the set of FIR coefficients. (If you put an "impluse" into a FIR filter which consists of a "1" sample followed by many "0" samples, the output of the filter will be the set of coefficients, as the 1 sample moves past each coefficient in turn to form the output.)
- Tap - A FIR "tap" is simply a coefficient/delay pair. The number of FIR taps, (often designated as "N") is an indication of 1) the amount of memory required to implement the filter, 2) the number of calculations required, and 3) the amount of "filtering" the filter can do; in effect, more taps means more stopband attenuation, less ripple, narrower filters, etc.
- Multiply-Accumulate (MAC) - In a FIR context, a "MAC" is the operation of multiplying a coefficient by the corresponding delayed data sample and accumulating the result. FIRs usually require one MAC per tap. Most DSP microprocessors implement the MAC operation in a single instruction cycle.
- Transition Band - The band of frequencies between passband and stopband edges. The narrower the transition band, the more taps are required to implement the filter. (A "small" transition band results in a "sharp" filter.)
- Delay Line - The set of memory elements that implement the "Z^-1" delay elements of the FIR calculation.
- Circular Buffer - A special buffer which is "circular" because incrementing at the end causes it to wrap around to the beginning, or because decrementing from the beginning causes it to wrap around to the end. Circular buffers are often provided by DSP microprocessors to implement the "movement" of the samples through the FIR delay-line without having to literally move the data in memory. When a new sample is added to the buffer, it automatically replaces the oldest one.
Most FIRs are linear-phase filters; when a linear-phase filter is desired, a FIR is usually used.
"Linear Phase" refers to the condition where the phase response of the filter is a linear (straight-line) function of frequency (excluding phase wraps at +/- 180 degrees). This results in the delay through the filter being the same at all frequencies. Therefore, the filter does not cause "phase distortion" or "delay distortion". The lack of phase/delay distortion can be a critical advantage of FIR filters over IIR and analog filters in certain systems, for example, in digital data modems.
FIR filters are usually designed to be linear-phase (but they don't have to be.) A FIR filter is linear-phase if (and only if) its coefficients are symmetrical around the center coefficient, that is, the first coefficient is the same as the last; the second is the same as the next-to-last, etc. (A linear-phase FIR filter having an odd number of coefficients will have a single coefficient in the center which has no mate.)
The formula is simple: given a FIR filter which has N taps, the delay is: (N - 1) / (2 * Fs), where Fs is the sampling frequency. So, for example, a 21 tap linear-phase FIR filter operating at a 1 kHz rate has delay: (21 - 1) / (2 * 1 kHz)=10 milliseconds.
Non-linear phase, of course. ;-) Actually, the most popular alternative is "minimum phase". Minimum-phase filters (which might better be called "minimum delay" filters) have less delay than linear-phase filters with the same amplitude response, at the cost of a non-linear phase characteristic, a.k.a. "phase distortion".
A lowpass FIR filter has its largest-magnitude coefficients in the center of the impulse response. In comparison, the largest-magnitude coefficients of a minimum-phase filter are nearer to the beginning . (See dspGuru's tutorial How To Design Minimum-Phase FIR Filters for more details.)
For an N-tap FIR filter with coefficients h(k), whose output is described by:
y(n)=h(0)x(n) + h(1)x(n-1) + h(2)x(n-2) + ... h(N-1)x(n-N-1),
the filter's Z transform is:
H(z)=h(0)z-0 + h(1)z-1 + h(2)z-2 + ... h(N-1)z-(N-1) , or
The variable z in H(z) is a continuous complex variable, and we can describe it as: z=r·ejw, where r is a magnitude and w is the angle of z. If we let r=1, then H(z) around the unit circle becomes the filter's frequency response H(jw). This means that substituting ejw for z in H(z) gives us an expression for the filter's frequency response H(w), which is:
H(jw)=h(0)e-j0w + h(1)e-j1w + h(2)e-j2w + ... h(N-1)e-j(N-1)w , or
Using Euler's identity, e-ja=cos(a) - jsin(a), we can write H(w) in rectangular form as:
H(jw)=h(0)[cos(0w) - jsin(0w)] + h(1)[cos(1w) - jsin(1w)] + ... h(N-1)[cos((N-1)w) - jsin((N-1)w)] , or
Yes. For an N-tap FIR, you can get N evenly-spaced points of the frequency response by doing a DFT on the filter coefficients. However, to get the frequency response of the filter at any arbitrary frequency (that is, at frequencies between the DFT outputs), you will need to use the formula above.
Consider a DC (zero Hz) input signal consisting of samples which each have value 1.0. After the FIR's delay line had filled with the 1.0 samples, the output would be the sum of the coefficients. Therefore, the gain of a FIR filter at DC is simply the sum of the coefficients.
This intuitive result can be checked against the formula above. If we set w to zero, the cosine term is always 1, and the sine term is always zero, so the frequency response becomes:
Simply multiply all coefficients by the scale factor.
Yes. Since they have no feedback elements, any bounded input results in a bounded output.
Again, the key is the lack of feedback. The numeric errors that occur when implementing FIR filters in computer arithmetic occur separately with each calculation; the FIR doesn't "remember" its past numeric errors. In contrast, the feedback aspect of IIR filters can cause numeric errors to compound with each calculation, as numeric errors are fed back.
The practical impact of this is that FIRs can generally be implemented using fewer bits of precision than IIRs. For example, FIRs can usually be implemented with 16 bits, but IIRs generally require 32 bits, or even more.
Because only a fraction of the calculations that would be required to implement a decimating or interpolating FIR in a literal way actually needs to be done.
Since FIR filters do not use feedback, only those outputs which are actually going to be used have to be calculated. Therefore, in the case of decimating FIRs (in which only 1 of N outputs will be used), the other N-1 outputs don't have to be calculated. Similarly, for interpolating filters (in which zeroes are inserted between the input samples to raise the sampling rate) you don't actually have to multiply the inserted zeroes with their corresponding FIR coefficients and sum the result; you just omit the multiplication-additions that are associated with the zeroes (because they don't change the result anyway.)
In contrast, since IIR filters use feedback, every input must be used, and every input must be calculated because all inputs and outputs contribute to the feedback in the filter.
Aside from "regular" and "extra crispy" there are:
- Boxcar - Boxcar FIR filters are simply filters in which each coefficient is 1.0. Therefore, for an N-tap boxcar, the output is just the sum of the past N samples. Because boxcar FIRs can be implemented using only adders, they are of interest primarily in hardware implementations, where multipliers are expensive to implement.
- Hilbert Transformer - Hilbert Transformers shift the phase of a signal by 90 degrees. They are used primarily for creating the imaginary part of a complex signal, given its real part.
- Differentiator - Differentiators have an amplitude response which is a linear function of frequency. They are not very popular nowadays, but are sometimes used for FM demodulators.
- Lth-Band - Also called "Nyquist" filters, these filters are a special class of filters used primarily in multirate applications. Their key selling point is that one of every L coefficients is zero--a fact which can be exploited to reduce the number of multiply-accumulate operations required to implement the filter. (The famous "half-band" filter is actually an Lth-band filter, with L=2.)
- Raised-Cosine - This is a special kind of filter that is sometimes used for digital data applications. (The frequency response in the passband is a cosine shape which has been "raised" by a constant.) See dspGuru's Raised-Cosine FAQ for more information.
- Lots of others.
The three most popular design methods are (in order):
- Parks-McClellan: The Parks-McClellan method (inaccurately called "Remez" by Matlab) is probably the most widely used FIR filter design method. It is an iteration algorithm that accepts filter specifications in terms of passband and stopband frequencies, passband ripple, and stopband attenuation. The fact that you can directly specify all the important filter parameters is what makes this method so popular. The PM method can design not only FIR "filters" but also FIR "differentiators" and FIR "Hilbert transformers".
- Windowing:. In the windowing method, an initial impulse response is derived by taking the Inverse Discrete Fourier Transform (IDFT) of the desired frequency response. Then, the impulse response is refined by applying a data window to it.
- Direct Calculation: The impulse responses of certain types of FIR filters (e.g. Raised Cosine and Windowed Sinc) can be calculated directly from formulas.
With a FIR filter design program, of course. ;-) Although it's possible to design FIR filters using manual methods, it is a whole lot easier just to use a FIR filter design program.
FIR filter design programs come in three broad categories:
- Filter Design Applications: See dspGuru's Digital Filter Design Software page for a list of filter design programs.
Near and dear to us here at dspGuru is Iowegian's own ScopeFIR product. We believe ScopeFIR offers an excellent combination of professional features, smooth user interface, and affordable price. We sell it for just $299, with a 30 day trial period. Even if you already use Matlab, ScopeFIR's "point and shoot" capabilities can improve your FIR filter design productivity.
- Math Programs: Matlab and its Free Clones offer built-in FIR filter design functions.
- Source code: One of the best places on the net to find source code to design FIR filters is Charles Poynton's Filter Design Software page.
Structurally, FIR filters consist of just two things: a sample delay line and a set of coefficients. To implement the filter:
- Put the input sample into the delay line.
- Multiply each sample in the delay line by the corresponding coefficient and accumulate the result.
- Shift the delay line by one sample to make room for the next input sample.
There are lots of possibilities, including a few tricks. To illustrate, we have provided a set of FIR filter algorithms implemented in C called "FirAlgs.c" in ScopeFIR's distribution file. FirAlgs.c includes the following functions:
- fir_basic: Illustrates the basic FIR calculation described above by implementing it very literally.
- fir_circular: Illustrates how circular buffers are used to implement FIRs.
- fir_shuffle: Illustrates the "shuffle down" technique used by some of Texas Instruments' processors.
- fir_split: Splits the FIR calculation into two flat (non-circular) pieces to avoid the use circular buffer logic and shuffling.
- fir_double_z: Uses a double-sized delay line so that the FIR calculation can be done using a flat buffer.
- fir_double_h: Similar to fir_double_z, this uses a double-sized coefficient so that the FIR calculation can be done using a flat buffer.
FIR assembly algorithms are quite processor-specific, but the most common system uses a circular buffer mechanism provided by the DSP processor. The basic steps are:
- Configure the circular buffer; load the coefficient and delay-line pointers. Then, for each input sample:
- Store the incoming data in the delay line; increment the delay-line pointer.
- Clear the multiplier-accumulator.
- Loop over all coefficients/delays; accumulate the values obtained by multiplying the coefficients by the delayed samples.
- Round or truncate the result as the FIR output.
Alternatively, a "shuffle down" method is used in Texas Instruments' older fixed-point processors to implement circular buffers. The processor literally moves each sample delay values by one slot during each multiply-accumulate (via the "MACD" instruction).
Each DSP microprocessor manufacturer provides example FIR assembly code in its data books or its application handbooks, so be sure to look at those before you "reinvent the circular buffer".
Here are a few methods:
- Impulse Test: A very simple and effective test is to put an impulse into it (which is just a "1" sample followed by at lest N - 1 zeroes.) You can also put in an "impulse train", with the "1" samples spaced at least N samples apart. If all the coefficients of the filter come out in the proper order, there is a good chance your filter is working correctly. (You might want to test with non-linear phase coefficients so you can see the order they come out.) We recommend you do this test whenever you write a new FIR filter routine.
- Step Test: Input N or more "1" samples. The output after N samples, should be the sum (DC gain) of the FIR filter.
- Sine Test: Input a sine wave at one or more frequencies and see if the output sine has the expected amplitude.
- Swept FM Test: From Eric Jacobsen: "My favorite test after an impulse train is to take two identical instances of the filter under test, use them as I and Q filters and put a complex FM linear sweep through them from DC to Fs/2. You can do an FFT on the result and see the complete frequency response of the filter, make sure the phase is nice and continuous everywhere, and match the response to what you'd expect from the coefficient set, the precision, etc."
FIR tricks center on two things 1) not calculating things that don't need to be calculated, and 2) "faking" circular buffers in software.
First, if your filter has zero-valued coefficients, you don't actually have to calculate those taps; you can leave them out. A common case of this is "half-band" filter, which have the property that every-other coefficient is zero.
Second, if your filter is "symmetric" (linear phase), you can "pre-add" the samples which will be multiplied by the same coefficient value, prior to doing the multiply. Since this technique essentially trades an add for a multiply, it isn't really useful in DSP microprocessors which can do a multiply in a single instruction cycle. However, it is useful in ASIC implementations (in which addition is usually much less expensive than multiplication); also, some newer DSP processors now offer special hardware and instructions to make use of this trick.
When hardware support for circular buffers isn't available, you have to "fake" them. Also, since ANSI C has no construct to describe circular buffers, most C compilers can't generate code to use them, even if the target processor has them.
You can always implement a circular buffer by duplicating the logic of a circular buffer in software (and many have), but the overhead can be prohibitive; the circular-fake might take several instructions to implement, compared to just a single instruction to do the multiply-accumulate operation. Therefore you need to fake it.
Here are several basic techniques to fake circular buffers:
- Split the calculation: You can split any FIR calculation into its "pre-wrap" and "post-wrap" parts. By splitting the calculation into these two parts, you essentially can do the circular logic only once, rather than once per tap. (See fir_double_z in FirAlgs.c above.)
- Duplicate the delay line: For a FIR with N taps, use a delay line of size 2N. Copy each sample to its proper location, as well as at location-plus-N. Therefore, the FIR calculation's MAC loop can be done on a flat buffer of N points, starting anywhere within the first set of N points. The second set of N delayed samples provides the "wrap around" comparable to a true circular buffer. (See fir_double_z in FirAlgs.c above.)
- Duplicate the coefficients: This is similar to the above, except that the duplication occurs in terms of the coefficients, not the delay line. Compared to the previous method, this has a calculation advantage of not having to store each incoming sample twice, and it also has a memory advantage when the same coefficient set will be used on multiple delay lines. (See fir_double_h in FirAlgs.c above.)
- Use block processing: In block processing, you use a delay line which is a multiple of the number of taps. You therefore only have to move the data once per block to implement the delay-line mechanism. When the block size becomes "large", the overhead of a moving the delay line once per block becomes negligible.
IIR filters are one of two primary types of digital filters used in Digital Signal Processing (DSP) applications (the other type being FIR). "IIR" means "Infinite Impulse Response".
The impulse response is "infinite" because there is feedback in the filter; if you put in an impulse (a single "1" sample followed by many "0" samples), an infinite number of non-zero values will come out (theoretically).
DSP filters can also be "Finite Impulse Response" (FIR). FIR filters do not use feedback, so for a FIR filter with N coefficients, the output always becomes zero after putting in N samples of an impulse response.
IIR filters can achieve a given filtering characteristic using less memory and calculations than a similar FIR filter.
- They are more susceptable to problems of finite-length arithmetic, such as noise generated by calculations, and limit cycles. (This is a direct consequence of feedback: when the output isn't computed perfectly and is fed back, the imperfection can compound.)
- They are harder (slower) to implement using fixed-point arithmetic.
- They don't offer the computational advantages of FIR filters for multirate (decimation and interpolation) applications.
Multirate simply means "multiple sampling rates". A multirate DSP system uses multiple sampling rates within the system. Whenever a signal at one rate has to be used by a system that expects a different rate, the rate has to be increased or decreased, and some processing is required to do so. Therefore "Multirate DSP" really refers to the art or science of changing sampling rates.
The most immediate reason is when you need to pass data between two systems which use incompatible sampling rates. For example, professional audio systems use 48 kHz rate, but consumer CD players use 44.1 kHz; when audio professionals transfer their recorded music to CDs, they need to do a rate conversion.
But the most common reason is that multirate DSP can greatly increase processing efficiency (even by orders of magnitude!), which reduces DSP system cost. This makes the subject of multirate DSP vital to all professional DSP practitioners.
Multirate consists of:
Right here. Our "multirate_algs" package includes a decimation, interpolation, and resampling routines. You can download it from dspGuru's DSP Algorithm Library.
Many DSP books omit the important subject of multirate altogether. But two introductory texts that briefly go into it are:
But it's a big subject. The two most popular "industrial strength" (advanced) books that cover multirate in depth are:
Loosely speaking, "decimation" is the process of reducing the sampling rate. In practice, this usually implies lowpass-filtering a signal, then throwing away some of its samples.
"Downsampling" is a more specific term which refers to just the process of throwing away samples, without the lowpass filtering operation. Throughout this FAQ, though, we'll just use the term "decimation" loosely, sometimes to mean "downsampling".
The decimation factor is simply the ratio of the input rate to the output rate. It is usually symbolized by "M", so input rate / output rate=M.
Tip: You can remember that "M" is the symbol for decimation factor by thinking of "deci-M-ation". (Exercise for the student: which letter is used as the symbol for interpo-L-ation factor?)
The most immediate reason to decimate is simply to reduce the sampling rate at the output of one system so a system operating at a lower sampling rate can input the signal. But a much more common motivation for decimation is to reduce the cost of processing: the calculation and/or memory required to implement a DSP system generally is proportional to the sampling rate, so the use of a lower sampling rate usually results in a cheaper implementation.
To that, Jim Thomas adds:
Almost anything you do to/with the signal can be done with fewer operations at a lower sample rate, and the workload is almost always reduced by more than a factor of M.
For example, if you double the sample rate, an equivalent filter will require four times as many operations to implement. This is because both amount of data (per second) and the length of the filter increase by two, so convolution goes up by four. Thus, if you can halve the sample rate, you can decrease the work load by a factor of four. I guess you could say that if you reduce the sample rate by M, the workload for a filter goes down to (1/M)^2.
Yes. Decimation involves throwing away samples, so you can only decimate by integer factors; you cannot decimate by fractional factors. (However, you can do interpolation prior to decimation to achieve an overall rational factor, for example, "4/5"; see Part 4: Resampling.)
A signal can be downsampled (without doing any filtering) whenever it is "oversampled", that is, when a sampling rate was used that was greater than the Nyquist criteria required. Specifically, the signal's highest frequency must be less than half the post-decimation sampling rate. (This just boils down to applying the Nyquist criteria to the input signal, relative to the new sampling rate.)
In most cases, though, you'll end up lowpass-filtering your signal prior to downsampling, in order to enforce the Nyquist criteria at the post-decimation rate. For example, suppose you have a signal sampled at a rate of 30 kHz, whose highest frequency component is 10 kHz (which is less than the Nyquist frequency of 15 kHz). If you wish to reduce the sampling rate by a factor of three to 10 kHz, you must ensure that you have no components greater than 5 kHz, which is the Nyquist frequency for the reduced rate. However, since the original signal has components up to 10 kHz, you must lowpass-filter the signal prior to downsampling to remove all components above 5 kHz so that no aliasing will occur when downsampling.
This combined operation of filtering and downsampling is called decimation.
You get aliasing--just as with other cases of violating the Nyquist criteria. (Aliasing is a type of distortion which cannot be corrected once it occurs.)
Yes, so long as the decimation factor, M, is not a prime number. For example, to decimate by a factor of 15, you could decimate by 5, then decimate by 3. The more prime factors M has, the more choices you have. For example you could decimate by a factor of 24 using:
- one stage: 24
- two stages: 6 and 4, or 8 and 3
- three stages: 4, 3, and 2
- four stages: 3, 2, 2, and 2
If you are simply downsampling (that is, throwing away samples without filtering), there's no benefit. But in the more common case of decimating (combining filtering and downsampling), the computational and memory requirements of the filters can usually be reduced by using multiple stages.
That's a tough one. There isn't a simple answer to this one: the answer varies depending on many things, so if you really want to find the optimum, you have to evaluate the resource requirements of each possibility.
However, here are a couple of rules of thumb which may help narrow down the choices:
The multirate book references give additional, more specific guidance.
Decimation consists of the processes of lowpass filtering, followed by downsampling.
To implement the filtering part, you can use either FIR or IIR filters.
To implement the downsampling part (by a downsampling factor of "M") simply keep every Mth sample, and throw away the M-1 samples in between. For example, to decimate by 4, keep every fourth sample, and throw three out of every four samples away.
Beauty, eh? ;-)
You may be onto something. In the case of FIR filters, any output is a function only of the past inputs (because there is no feedback). Therefore, you only have to calculate outputs which will be used.
For IIR filters, you still have to do part or all of the filter calculation for each input, even when the corresponding output won't be used. (Depending on the filter topology used, certain feed-forward parts of the calculation can be omitted.),. The reason is that outputs you do use are affected by the feedback from the outputs you don't use.
The fact that only the outputs which will be used have to be calculated explains why decimating filters are almost always implemented using FIR filters!
Since you compute only one of every M outputs, you save M-1 operations per output, or an overall "savings" of (M-1)/M. Therefore, the larger the decimation factor is, the larger the savings, percentage-wise.
A simple way to think of the amount of computation required to implement a FIR decimator is that it is equal to the computation required for a non-decimating N-tap filter operating at the output rate.
None. You still have to store every input sample in the FIR's delay line, so the memory requirement is the same size as for a non-decimated FIR having the same number of taps.
Just use your favorite FIR design method. The design criteria are:
- The passband lower frequency is zero; the passband upper frequency is whatever information bandwidth you want to preserve after decimating. The passband ripple is whatever your application can tolerate.
- The stopband lower frequency is half the output rate minus the passband upper frequency. The stopband attenuation is set according to whatever aliasing your application can stand. (Note that there will always be aliasing in a decimator, but you just reduce it to a negligible value with the decimating filter.)
- As with any FIR, the number of taps is whatever is required to meet the passband and stopband specifications.
A decimating FIR is actually the same as a regular FIR, except that you shift M samples into the delay line for each output you calculate. More specifically:
- Store M samples in the delay line.
- Calculate the decimated output as the sum-of-products of the delay line values and the filter coefficients.
- Shift the delay line by M places to make room for the inputs of the next decimation.
Also, just as with ordinary FIRs, circular buffers can be used to eliminate the requirement to literally shift the data in the delay line.
The major DSP vendors provide examples of FIR decimators in their data books and application notes; check their web sites.
You can test a decimating FIR in most of the ways you might test an ordinary FIR:
- A special case of a decimator is an "ordinary" FIR. When given a value of "1" for M, a decimator should act exactly like an ordinary FIR. You can then do impulse, step, and sine tests on it just like you can on an ordinary FIR.
- If you put in a sine whose frequency is within the decimator's passband, the output should be distortion-free (once the filter reaches steady-state), and the frequency of the output should be the same as the frequency of the input, in terms of absolute Hz.
- You also can extend the "impulse response" test used for ordinary FIRs by using a "fat impulse", consisting of M consecutive "1" samples followed by a series of "0" samples. In that case, if the decimator has been implemented correctly, the output will not be the literal FIR filter coefficients, but will be the sum of every subset of M coefficients.
- You can use a step response test. Given a unity-valued step input, the output should be the sum of the FIR coefficients once the filter has reached steady state.
"Upsampling" is the process of inserting zero-valued samples between original samples to increase the sampling rate. (This is called "zero-stuffing".) Upsampling adds to the original signal undesired spectral images which are centered on multiples of the original sampling rate.
"Interpolation", in the DSP sense, is the process of upsampling followed by filtering. (The filtering removes the undesired spectral images.) As a linear process, the DSP sense of interpolation is somewhat different from the "math" sense of interpolation, but the result is conceptually similar: to create "in-between" samples from the original samples. The result is as if you had just originally sampled your signal at the higher rate.
The primary reason to interpolate is simply to increase the sampling rate at the output of one system so that another system operating at a higher sampling rate can input the signal.
The interpolation factor is simply the ratio of the output rate to the input rate. It is usually symbolized by "L", so output rate / input rate=L.
Tip: You can remember that "L" is the symbol for interpolation factor by thinking of "interpo-L-ation".
Yes. Since interpolation relies on zero-stuffing you can only interpolate by integer factors; you cannot interpolate by fractional factors. (However, you can combine interpolation and decimation to achieve an overall rational factor, for example, 4/5; see Part 4: Resampling.)
All. There is no restriction.
Yes. Otherwise, you're doing upsampling. ;-)
Upsampling adds undesired spectral images to the signal at multiples of the original sampling rate, so unless you remove those by filtering, the upsampled signal is not the same as the original: it's distorted.
Some applications may be able to tolerate that, for example, if the images get removed later by an analog filter, but in most applications you will have to remove the undesired images via digital filtering. Therefore, interpolation is far more common that upsampling alone.
Yes, so long as the interpolation ratio, L, is not a prime number. For example, to interpolate by a factor of 15, you could interpolate by 3 then interpolate by 5. The more factors L has, the more choices you have. For example you could interpolate by 16 in:
- one stage: 16
- two stages: 4 and 4
- three stages: 2, 2, and 4
- four stages: 2, 2, 2, and 2
Just as with decimation, the computational and memory requirements of interpolation filtering can often be reduced by using multiple stages.
There isn't a simple answer to this one: the answer varies depending on many things. However, here are a couple of rules of thumb:
The multirate book references give additional, more specific guidance.
Interpolation always consists of two processes:
- Inserting L-1 zero-valued samples between each pair of input samples. This operation is called "zero stuffing".
- Lowpass-filtering the result.
The result (assuming an ideal interpolation filter) is a signal at L times the original sampling rate which has the same spectrum over the input Nyquist (0 to Fs/2) range, and with zero spectral content above the original Fs/2.
- The zero-stuffing creates a higher-rate signal whose spectrum is the same as the original over the original bandwidth, but has images of the original spectrum centered on multiples of the original sampling rate.
- The lowpass filtering eliminates the images.
This idea is appealing because, intuitively, this "stairstep" output seems more similar to the original than the zero-stuffed version. But in this case, intuition leads us down the garden path. This process causes a "zero-order hold" distortion in the original passband, and still creates undesired images (see below).
Although these effects could be un-done by filtering, it turns out that zero-stuffing approach is not only more "correct", it actually reduces the amount of computation required to implement a FIR interpolation filter. Therefore, interpolation is always done via zero-stuffing.
The output of a FIR filter is the sum each coefficient multiplied by each corresponding input sample. In the case of a FIR interpolation filter, some of the input samples are stuffed zeros. Each stuffed zero gets multiplied by a coefficient and summed with the others. However, this adding-and-summing processing has no effect when the data sample is zero--which we know in advance will be the case for L-1 out of each L input samples of a FIR interpolation filter. So why bother to calculate these taps?
The net result is that to interpolate by a factor of L, you calculate L outputs for each input using L different "sub-filters" derived from your original filter.
Here's an example of a 12-tap FIR filter that implements interpolation by a factor of four. The coefficients are h0-h11, and three data samples, x0-x2 (with the newest, x2, on the left) have made their way into the filter's delay line:
h0 h1 h2 h3 h4 h5 h6 h7 h8 h9 h10 h11 Result x2 0 0 0 x1 0 0 0 x0 0 0 0 x2·h0+x1·h4+x0·h8 0 x2 0 0 0 x1 0 0 0 x0 0 0 x2·h1+x1·h5+x0·h9 0 0 x2 0 0 0 x1 0 0 0 x0 0 x2·h2+x1·h6+x0·h10 0 0 0 x2 0 0 0 x1 0 0 0 x0 x2·h3+x1·h7+x0·h11
The table suggests the following general observations about FIR interpolators:
- Since the interpolation ratio is four (L=4), there are four "sub-filters" (whose coefficient sets are marked here with matching colors.) These sub-filters are officially called "polyphase filters".
- For each input, we calculate L outputs by doing L basic FIR calculations, each using a different set of coefficients.
- The number of taps per polyphase filter is 3, or, expressed as a formula: Npoly=Ntotal / L.
- The coefficients of each polyphase filter can be determined by skipping every Lth coefficient, starting at coefficients 0 through L-1, to calculate corresponding outputs 0 through L-1.
- Alternatively, if you rearranged your coefficients in advance in "scrambled" order like this:
h0, h4, h8, h1, h5, h9, h2, h6, h10, h3, h7, h11
then you could just step through them in order.
- We have hinted here at the fact that N should be a multiple of L. This isn't absolutely necessary, but if N isn't a multiple of L, the added complication of using a non-multiple of L often isn't worth it. So if the minimum number of taps that your filter specification requires doesn't happen to be a multiple of L, your best bet is usually to just increase N to the next multiple of L. You can do this either by adding some zero-valued coefficients onto the end of the filter, or by re-designing the filter using the larger N value.
Since each output is calculated using only N/L coefficients (rather than N coefficients), you get an overall computational "savings" of (N - N/L) per output .
A simple way to think of the amount of computation required to implement a FIR interpolator is that it is equal to the computation required for a non-interpolating N-tap filter operating at the input rate. In effect, you have to calculate L filters using N/L taps each, so that's N total taps calculated per input.
Compared to the straight-forward implementation of interpolation by upsampling the signal by stuffing it with L-1 zeros , then filtering it, you save memory by a factor of (L-1)/L. In other words, you don't have to store L-1 zero-stuffed "upsamples" per actual input sample.
Just use your favorite FIR design method. The design criteria are:
An interpolating FIR is actually the same as a regular FIR, except that, for each input, you calculate L outputs per input using L polyphase filters, each having N/L taps. More specifically:
- Store a sample in the delay line. (The size of the delay line is N/L.)
- For each of L polyphase coefficient sets, calculate an output as the sum-of-products of the delay line values and the filter coefficients.
- Shift the delay line by one to make room for the next input.
Also, just as with ordinary FIRs, circular buffers can be used to eliminate the requirement to literally shift the data in the delay line.
The major DSP vendors provide examples of FIR interpolators in their data books and application notes, so check their web sites.
You can test an interpolating FIR in most of the ways you might test an ordinary FIR:
- A special case of an interpolator is an ordinary FIR. When given a value of 1 for L, an interpolator should act exactly like an ordinary FIR. You can then do impulse, step, and sine tests on it just like you can on an ordinary FIR.
- If you put in a sine whose frequency is within the interpolator's passband, the output should be distortion-free (once the filter reaches steady state), and the frequency of the output should be the same as the frequency of the input, in terms of absolute Hz.
- You can use a step response test. Given a unity-valued step input, every group of L outputs should be the same as the sums of the coefficients of the L individual polyphase filters, once the filter has reached steady state.
"Resampling" means combining interpolation and decimation to change the sampling rate by a rational factor.
Resampling is usually done to interface two systems which have different sampling rates. If the ratio of two system's rates happens to be an integer, decimation or interpolation can be used to change the sampling rate (depending on whether the rate is being decreased or increased); otherwise, interpolation and decimation must be used together to change the rate
A practical and well-known example results from the fact that professional audio equipment uses a sampling rate of 48 kHz, but consumer audio equipment uses a rate of 44.1 kHz. Therefore, to transfer music from a professional recording to a CD, the sampling rate must be changed by a factor of:
(44100 / 48000) = (441 / 480) = (147 / 160)
There are no common factors in 147 and 160, so we must stop factoring at that point. Therefore, in this example, we would interpolate by a factor of 147 then decimate by a factor of 160.
The interpolation factor is simply the ratio of the output rate to the input rate. Given that the interpolation factor is L and the decimation factor is M, the resampling factor is L / M. In the above example, the resampling factor is 147 / 160 = 0.91875
Yes. As always, the Nyquist criteria must be met relative to the resulting output sampling rate, or aliasing will result. In other words, the output rate cannot be less than twice the highest frequency (of interest) of the input signal.
Yes. Since resampling includes interpolation, you need an interpolation filter. Otherwise, the images created by the zero-stuffing part of interpolation will remain, and the interpolated signal will not be "the same" as the original.
Likewise, since resampling includes decimation, you seemingly need a decimation filter. Or do you? Since the interpolation filter is in-line with the decimation filter, you could just combine the two filters by convolving their coefficients into a single filter to use for decimation. Better yet, since both are lowpass filters, just use whichever filter has the lowest cutoff frequency as the interpolation filter.
As hinted at above:
Yes, but there are a couple of restrictions:
- If either the interpolation or decimation factors are prime numbers, you won't be able to decompose those parts of the resampler into stages.
- You must preserve the Nyquist criteria at each stage or else aliasing will result. That is, no stage can have an output rate which is less than twice the highest frequency of interest.
Just as with interpolation and decimation, the computational and/or memory requirements of the resampling filtering can sometimes be greatly reduced by using multiple stages.
The straight-forward implementation of resampling is to do interpolation by a factor of L, then decimation by a factor of M. (You must do it in that order; otherwise, the decimator would remove part of the desired signal--which the interpolator could not restore.)
No. The problem is that for resampling factors close to 1.0, the interpolation factor can be quite large. For example, in the case described above of changing from the sampling rate from 48 kHz to 44.1 kHz, the ratio is only 0.91875, yet the interpolation factor is 147!
Also, you are filtering the signal twice: once in the interpolator and once in the decimator. However, one of the filters has a larger bandwidth than the other, so the larger-bandwidth filter is redundant.
Just combine the computational and memory advantages that FIR interpolator and decimator implementations can provide. (If you don't already understand those, be sure to read and understand Part 2: Decimation, and Part 3: Interpolation before continuing.)
First, let's briefly review what makes FIR interpolation and decimation efficient:
- When interpolating by a factor of L, you only have to actually calculate 1/L of the FIR taps per interpolator output.
- When decimating by a factor of M, you only have to calculate one output for every M decimator inputs.
So, combining these ideas, we will calculate only the outputs we actually need, using only a subset of the interpolation coefficients to calculate each output. That makes it possible to efficiently implement even FIR resamplers which have large interpolation and/or decimation factors.
The tricky part is figuring out which polyphase filters to apply to which inputs, to calculate the desired outputs, as a function of L and M. There are various ways of doing that, but they're all beyond our scope here.
Iowegian's ScopeFIR comes with a free set of multirate algorithms, including FIR resampling functions in C. Just download and install the ScopeFIR distribution file.
The Fast Fourier Transform (FFT) is simply a fast (computationally efficient) way to calculate the Discrete Fourier Transform (DFT).
By making use of periodicities in the sines that are multiplied to do the transforms, the FFT greatly reduces the amount of calculation required. Here's a little overview.
Functionally, the FFT decomposes the set of data to be transformed into a series of smaller data sets to be transformed. Then, it decomposes those smaller sets into even smaller sets. At each stage of processing, the results of the previous stage are combined in special way. Finally, it calculates the DFT of each small data set. For example, an FFT of size 32 is broken into 2 FFTs of size 16, which are broken into 4 FFTs of size 8, which are broken into 8 FFTs of size 4, which are broken into 16 FFTs of size 2. Calculating a DFT of size 2 is trivial.
Here's a slightly more rigorous explanation: It turns out that it is possible to take the DFT of the first N/2 points and combine them in a special way with the DFT of the second N/2 points to produce a single N-point DFT. Each of these N/2-point DFTs can be calculated using smaller DFTs in the same way. One (radix-2) FFT begins, therefore, by calculating N/2 2-point DFTs. These are combined to form N/4 4-point DFTs. The next stage produces N/8 8-point DFTs, and so on, until a single N-point DFT is produced.
The DFT takes N^2 operations for N points. Since at any stage the computation required to combine smaller DFTs into larger DFTs is proportional to N, and there are log2(N) stages (for radix 2), the total computation is proportional to N * log2(N). Therefore, the ratio between a DFT computation and an FFT computation for the same N is proportional to N / log2(n). In cases where N is small this ratio is not very significant, but when N becomes large, this ratio gets very large. (Every time you double N, the numerator doubles, but the denominator only increases by 1.)
No. The most common and familiar FFTs are "radix 2". However, other radices are sometimes used, which are usually small numbers less than 10. For example, radix-4 is especially attractive because the "twiddle factors" are all 1, -1, j, or -j, which can be applied without any multiplications at all.
Also, "mixed radix" FFTs also can be done on "composite" sizes. In this case, you break a non-prime size down into its prime factors, and do an FFT whose stages use those factors. For example, an FFT of size 1000 might be done in six stages using radices of 2 and 5, since 1000 = 2 * 2 * 2 * 5 * 5 * 5. It might also be done in three stages using radix 10, since 1000 = 10 * 10 * 10.
Yes, although these are less efficient than single-radix or mixed-radix FFTs. It is almost always possible to avoid using prime sizes.
The "radix" is the size of an FFT decomposition. In the example above, the radix was 2. For single-radix FFTs, the transform size must be a power of the radix. In the example above, the size was 32, which is 2 to the 5th power.
"Twiddle factors" are the coefficients used to combine results from a previous stage to form inputs to the next stage.
An "in place" FFT is simply an FFT that is calculated entirely inside its original sample memory. In other words, calculating an "in place" FFT does not require additional buffer memory (as some FFTs do.)
"Bit reversal" is just what it sounds like: reversing the bits in a binary word from left to right. Therefore the MSBs become LSBs and the LSBs become MSBs. But what does that have to do with FFTs? Well, the data ordering required by radix-2 FFTs turns out to be in "bit reversed" order, so bit-reversed indexes are used to combine FFT stages. It is possible (but slow) to calculate these bit-reversed indices in software; however, bit reversals are trivial when implemented in hardware. Therefore, almost all DSP processors include a hardware bit-reversal indexing capability (which is one of the things that distinguishes them from other microprocessors.)
FFTs can be decomposed using DFTs of even and odd points, which is called a Decimation-In-Time (DIT) FFT, or they can be decomposed using a first-half/second-half approach, which is called a "Decimation-In-Frequency" (DIF) FFT. Generally, the user does not need to worry which type is being used.
Except as a learning exercise, you generally will never have to. Many good FFT implementations are available in C, Fortran and other languages, and microprocessor manufacturers generally provide free optimized FFT implementations in their processors' assembly code, Therefore, it is not so important to understand how the FFT really works, as it is to understand how to use it.
(Gosh you're difficult!) Well, virtually every DSP book on the planet covers the FFT in detail. However, if you want to read something online right now, see The Scientists and Engineer's Guide to DSP.
- If you want an assembly language implementation, check out the web site of the manufacturer of your chosen DSP microprocessor. They generally provide highly optimized assembly implementations in their user's guides and application manuals, and also as part of the library of their C compilers.
- Here are a couple of the best C implemenations:
- There are several great FFT hotlists on the net. One of the best is jj's useful and ugly FFT page, and FFTW also has a very good FFT links page.
The following people have contributed questions, answers, or helpful suggestions: Dale Grover, Marius Vollmer
COordinate Rotation DIgital Computer. (Doesn't help much, does it?!)
It calculates the trigonometric functions of sine, cosine, magnitude and phase (arctangent) to any desired precision. It can also calculate hyperbolic functions, but we don't cover that here.
CORDIC revolves around the idea of "rotating" the phase of a complex number, by multiplying it by a succession of constant values. However, the multiplies can all be powers of 2, so in binary arithmetic they can be done using just shifts and adds; no actual multiplier is needed.
Compared to other approaches, CORDIC is a clear winner when a hardware multiplier is unavailable, e.g. in a microcontroller, or when you want to save the gates required to implement one, e.g. in an FPGA. On the other hand, when a hardware multiplier is available, e.g. in a DSP microprocessor, table-lookup methods and good old-fashioned power series are generally faster than CORDIC.
Funny you should ask: I just happen to: cordic-xls.zip. I highly recommend you open this up and look it over a little before you ask any more questions.
Yup, it's your lucky day: cordic.zip.
|Given a complex value:||C = Ic + jQc|
|we will create a rotated value:||C' = Ic' + jQc'|
|by multiplying by a rotation value:||R = Ir + jQr|
|To add R's
phase to C:
|To subtract R's
phase from C:
|To add 90 degrees,
multiply by R = 0 + j1:
|To subtract 90 degrees,
multiply by R = 0 - j1:
|To add a phase,
multiply by R = 1 + jK:
|To subtract a phase,
multiply by R = 1 - jK:
|L||K = 2^-L||R = 1 + jK||Phase of R
|Magnitude of R||CORDIC Gain|
|0||1.0||1 + j1.0||45.00000||1.41421356||1.414213562|
|1||0.5||1 + j0.5||26.56505||1.11803399||1.581138830|
|2||0.25||1 + j0.25||14.03624||1.03077641||1.629800601|
|3||0.125||1 + j0.125||7.12502||1.00778222||1.642484066|
|4||0.0625||1 + j0.0625||3.57633||1.00195122||1.645688916|
|5||0.03125||1 + j0.031250||1.78991||1.00048816||1.646492279|
|6||0.015625||1 + j0.015625||0.89517||1.00012206||1.646693254|
|7||0.007813||1 + j0.007813||0.44761||1.00003052||1.646743507|
You can calculate the magnitude of a complex number C = Ic + jQc if you can rotate it to have a phase of zero; then its new Qc value would be zero, so the magnitude would be given entirely by the new Ic value.
"So how do I rotate it to zero," you ask? Well, I thought you might ask:
- You can determine whether or not the complex number "C" has a positive phase just by looking at the sign of the Qc value: positive Qc means positive phase. As the very first step, if the phase is positive, rotate it by -90 degrees; if it's negative, rotate it by +90 degrees. To rotate by +90 degrees, just negate Qc, then swap Ic and Qc; to rotate by -90 degrees, just negate Ic, then swap. The phase of C is now less than +/- 90 degrees, so the "1 +/- jK" rotations to follow can rotate it to zero.
- Next, do a series of iterations with successively smaller values of K, starting with K=1 (45 degrees). For each iteration, simply look at the sign of Qc to decide whether to add or subtract phase; if Qc is negative, add a phase (by multiplying by "1 + jK"); if Qc is positive, subtract a phase (by multiplying by "1 - jK"). The accuracy of the result converges with each iteration: the more iterations you do, the more accurate it becomes.
[Editorial Aside: Since each phase is a little more than half the previous phase, this algorithm is slightly underdamped. It could be made slightly more accurate, on average, for a given number of iterations, by using "ideal" K values which would add/subtract phases of 45.0, 22.5, 11.25 degrees, etc. However, then the K values wouldn't be of the form 2^-L, they'd be 1.0, 0.414, 0.199, etc., and you couldn't multiply using just shift/add's (which would eliminate the major benefit of the algorithm). In practice, the difference in accuracy between the ideal K's and these binary K's is generally negligible; therefore, for a multiplier-less CORDIC, go ahead and use the binary Ks, and if you need more accuracy, just do more iterations.]
Now, having rotated our complex number to have a phase of zero, we end up with "C = Ic + j0". The magnitude of this complex value is just Ic, since Qc is zero. However, in the rotation process, C has been multiplied by a CORDIC Gain (cumulative magnitude) of about 1.647. Therefore, to get the true value of magnitude we must multiply by the reciprocal of 1.647, which is 0.607. (Remember, the exact CORDIC Gain is a function of the how many iterations you do.) Unfortunately, we can't do this gain-adjustment multiply using a simple shift/add; however, in many applications this factor can be compensated for in some other part of the system. Or, when relative magnitude is all that counts (e.g. AM demodulation), it can simply be neglected.
To calculate phase, just rotate the complex number to have zero phase, as you did to calculate magnitude. Just a couple of details are different.
- For each phase-addition/subtraction step, accumulate the actual number of degrees (or radians) you have rotated. The actuals will come from a table of "atan(K)" values like the "Phase of R" column in the table above. The phase of the complex input value is the negative of the accumulated rotation required to bring it to a phase of zero.
- Of course, you can skip compensating for the CORDIC Gain if you are interested only in phase.
Yes--you're very astute.
You basically do the inverse of calculating magnitude/phase by adding/subtracting phases so as to "accumulate" a rotation equal to the given phase. Specifically:
- Start with a unity-magnitude value of C = Ic + jQc. The exact value depends on the given phase. For angles greater than +90, start with C = 0 + j1 (that is, +90 degrees); for angles less than -90, start with C = 0 - j1 (that is, -90 degrees); for other angles, start with C = 1 + j0 (that is, zero degrees). Initialize an "accumulated rotation" variable to +90, -90, or 0 accordingly. (Of course, you also could do all this in terms of radians.)
- Do a series of iterations. If the desired phase minus the accumulated rotation is less than zero, add the next angle in the table; otherwise, subtract the next angle. Do this using each value in the table.
- The "cosine" output is in "Ic"; the "sine" output is in "Qc".
A couple of notes:
- Again, the accuracy improves by about a factor of two with each iteration; use as many iterations as your application's accuracy requires.
- This algorithm gives you both cosine (Ic) and sine (Qc). Since CORDIC uses complex values to do its magic, it's not possible to calculate sine and cosine separately.
- A survey of CORDIC algorithms for FPGAs by Ray Andraka of Andraka Consulting Group. This very readable paper's review of CORDIC's principles is more comprehensive and rigorous than this FAQ, so it's a good place to go from here.
- Wikipedia's CORDIC Page
- Fixed-Point Trigonometry With CORDIC Iterations by Ken Turkowski
- New Virtually Scaling-Free Adaptive CORDIC Rotator
- Jack E. Volder, The CORDIC Trigonometric Computing Technique, IRE Transactions on Electronic Computers, September 1959.
- J. E. Meggitt, Pseudo Division and Pseudo Multiplication Processes, IBM Journal, April 1962.
- M. E. Frerking , Digital Signal Processing in Communication Systems [Fre94].
- Henry Briggs, Arithmetica Logarithmica, 1624.
Thanks to Ray Andraka for helping me understand. Thanks to Rick Lyons for review. Thanks to Mark Brown and Bill Wiese for providing links.
RC and RRC filters are generally implemented as FIR (Finite Impulse Response) filters. The linear-phase property of FIR filters makes them. Also, FIR filters are easy-to-implement and efficient. See our FIR Filter FAQ for more FIR information.
The frequency responses of RC and RRC filters are fully specified by a single paramter, the "rolloff factor"; this is a number between 0.0 and 1.0. It often is symbolized as either alpha or beta. But to design a FIR filter that will actually implement an RC or RRC response, you also need two more parameters, the number of taps ("NTaps"), and the number of samples per symbol ("
RC and RRC filters can be designed in either the time domain (FIR impulse reponse) or the frequency domain (FIR frequency response). To design in the time domain, calculate the FIR filter coefficients directly from a formula (see below). To design in the frequency domain, use the definition of the RC or RRC response to fill an array representing the RC/RRC frequency response. Then, use the Inverse Discrete Fourier Transform (IDFT) to calculate the FIR filter's impulse response (coefficients).
See dspGuru's Raised-Cosine and Root-Raised-Cosine Formulas page.