I finally got off my lazy behind and got an addons account…I was going to make a small ofxFFT addon for the audacity FFT…using the LGPL. I’m assuming it makes sense to have 2 FFTs…one compact LGPL one and a faster GPL one…or what do you think? I think people have been using the code on my server for a while…but it makes more sense to put it up on an official OF repository.
I agree, it totally makes sense to have two. It would be nice to have yours in the core, as FFT is helpful in general. What would be awesome is if both addons had the exact same interface, so you could swap them out just by changing the datatype and include. We could:
Just agree on a good structure to use, or
Propose a pure virtual FFT interface for the core that we both extend, or
You make your methods virtual, and I could extend yours (you could just copy the windowing code, for example, and I wouldn’t have to rewrite it)
A few questions:
1 Is it ok that I have an extra argument to the constructor, so long as it has a default value? (i.e., you won’t need quickStart)
2 How should IFFT be handled? I was thinking void ifft(float*, float*) for symmetry purposes.
3 Is the “assume the user has memory allocated for you” model the right way of moving things around, or should we store things internally and let the user use memcpy on a returned ptr?
4 We should double check that our results are normalized the same way. Mine are normalized 0-1, where 1 is a sine wave centered on a frequency bin with gain 1.
5 Are there any other structural differences you were thinking of between your addon and this on?
Having the same interface sounds like a great idea. If we can come up with a generic interface that we can both derive from that should be good. However I’d like to make sure (at least in my version) that people need very limited FFT experience to be able to use the addon, which means providing minimal parameters other than the necessary ones. This probably can be achieved using default values. Is there any way to avoid the quickstart argument (perhaps pass it to a setup() function)?
I think memcpy should be avoided…so encouraging users to allocate stuff previously is better, and is more typical of OF style I think (video stuff is typically done this way).
I posted a new version above that divides up the generic fft stuff from the fftw stuff (in ofFft and ofxFftw respectively). I was thinking yours would be ofxFft?
I didn’t put normalization or cartesian to polar conversion in ofFft, as those depend on the fft implementation. I did, however, include:
Great job with the addon!
I am trying to use this for inverse ffts. I noticed that the ofxFftw._ifft is a blank implementation. Have you gotten around to implementing it yet?
Let me know.
Ok, I just upped a new version that wraps the ifft functionality of fftw.
If you want to do frequency domain processing of audio, like audio->fft->transform->ifft->audio, I’ll need to add a few more things. Really, I’m just not dealing with the phase information (which is essential for reconstructing a signal from its fft).
An ifft is basically a bank of oscillators, and without phase you’ll definitely be able to hear that
hey, I tried the addon and it seems to work great.- Still I am a little bit confused about the normalization, since I also get values bigger than one (ie 1.5034) so I was wondering if thats an inaccuracy, what can I do about it?
Also what is a good way to bring every band of the spectrum to a similar value? since “raw” fft has way bigger values in the low frequencies compared to the high or even mid ones.
Hey moka, if you could post an example of a sounds that produces larger values than 1, that would be great!
The thing you’re seeing with the lower frequencies having larger values is due to the fact that the Fourier transform decomposes into linearly spaced frequency bins. Let’s say most musical content is in the 100Hz - 2kHz range. If you’re working with a 512 size FFT at 44.1 kHz, that gives you 256 bins for 0-22.05 kHz, which is 86 Hz per bin. That means only the first 35 out of 256 bins provide information that you’re “looking for”.
You can either sample just that region, or you can use a larger FFT size, or you can try to interpolate between the bins (this is what most “logarithmic” spectrum analyzers do – a linear transform, then interpolation using logarithmically spaced samples).
The normalization I applied is to sine waves with a gain of 1, maybe if your input is peaking it creates a square-ish wave that has more energy than a sine wave? I never checked with square waves I’m pretty confident that the values won’t be larger than 1 as long as there is no clipping happening.
Assuming that it’s not clipping, in which case the distortion will produce bad data, you can make it independent of the input by normalizing the spectrum to the maximum value.
In one pass you look for the maximum value, then in a second you divide everything by that maximum.
If you do this constantly, you won’t be able to tell the difference between “quiet” and “loud” though. So you want to have a maximum value that slowly adapts over time. This is sometimes called automatic gain control (AGC).