It’s a very simple thing I want: I want to get the spectrum of my audio file without playing it. I want to give an interval of, let’s say, 100 ms and get the spectrum of those 100 ms for every part of the song, without playing it!
In short, I want a GetSpectrumData function without the “must play” limitation.
I want to analyze the whole audio in a short time before I play it!
Can’t be too difficult, right? Well, I searched for hours for a solution and found nothing. Maybe you guys can help me, because I’m completely desperate!
I just implemented the FFT algorithm myself in C#. I first created a Complex number struct, which is used by the FFT function. You can find it [here on my pastebin]. As I mentioned in the info header, I basically implemented [Paul Bourke’s version], but instead of using two arrays I used one array of my Complex type. This method calculates the FFT in place, so it transforms the given sample array (which needs to have a length that is a power of two) from the time domain into the frequency domain, or the other way round.
A few important notes:
- As already mentioned, the array always needs to have a size that is a power of two (i.e. 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384, 32768, 65536, …)
- If you have your samples as a float array you need to copy them into a Complex array first. Keep in mind that you can reuse the array.
- The transformed frequencies reach from 0 up to the sample frequency. However, it makes no sense to look at frequencies that are higher than the [Nyquist-frequency], which is half the sampling frequency, so the second half of the FFT can be ignored. The usual way is to take only the first half and multiply the results by two to compensate for the discarded half of the power.
- Based on the last point it should be clear that if you want for example 1024 frequency bins / bands you need 2048 samples.
- If you want to use the FFT class as a filter you should use the whole FFT, so that when calculating the inverse FFT you get back exactly the samples that you originally fed in. Here are three examples [Ex1], [Ex2], [Ex3]. Green is the input signal, red the FFT and yellow the inverse FFT. Those examples have been made from 65k samples. The FFT of 65k (1<<16) samples took around 60 ms on my PC; 4k (1<<12) samples took about 3 ms.
- I haven’t implemented any windowing functions, so the results are as if you had used a “rectangle window”. If you want / need one, you just need to preprocess your data with your desired windowing function.
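A few of the notes above can be sketched as small helpers. This is only an illustration, not part of the pastebin class: the class name `SpectrumHelpers` and all method names are made up here, and the Hann window is just one common choice of windowing function.

```csharp
using System;

// Hypothetical helper class; none of these names come from the original FFT code.
public static class SpectrumHelpers
{
    // True if n is a power of two, which the FFT described above requires.
    public static bool IsPowerOfTwo(int n)
    {
        return n > 0 && (n & (n - 1)) == 0;
    }

    // Frequency in Hz that bin `bin` stands for when `fftSize` samples
    // recorded at `sampleRate` Hz were transformed.
    public static float BinToFrequency(int bin, int fftSize, float sampleRate)
    {
        return bin * sampleRate / fftSize;
    }

    // Multiplies the samples in place by a Hann window. Call this on the
    // float samples before copying them into the complex array.
    public static void ApplyHannWindow(float[] samples)
    {
        int n = samples.Length;
        if (n < 2) return;
        for (int i = 0; i < n; i++)
        {
            double w = 0.5 * (1.0 - Math.Cos(2.0 * Math.PI * i / (n - 1)));
            samples[i] = (float)(samples[i] * w);
        }
    }
}
```

With 2048 samples at 44100 Hz, bin 1024 maps to 22050 Hz, which is the Nyquist frequency, so only bins 0 to 1023 carry useful information.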
I’ve made a quick test to compare Unity’s “GetSpectrumData” with my own implementation. The result is pretty much the same:
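// [ ... ]
// Array sizes are assumed here (1024 bins need 2048 samples, see the notes above).
float[] spec = new float[1024];
float[] tmp = new float[2048];
Complex[] spec2 = new Complex[2048];
// Unity's FFT function
AudioListener.GetSpectrumData(spec, 0, FFTWindow.Rectangular);
for (int i = 0; i < spec.Length; i++)
    Debug.DrawLine(new Vector3(i, 0), new Vector3(i, spec[i]), Color.cyan);
// My FFT based on the output samples.
AudioListener.GetOutputData(tmp, 0);
// copy the output data into the complex array
for (int i = 0; i < tmp.Length; i++)
    spec2[i] = new Complex(tmp[i], 0);
// calculate the FFT
// (the call itself is missing from this snippet; FFT.CalculateFFT is an assumed
// name for the static in-place method, with "false" meaning forward transform)
FFT.CalculateFFT(spec2, false);
for (int i = 0; i < spec2.Length / 2; i++) // plot only the first half
    // multiply the magnitude of each value by 2
    Debug.DrawLine(new Vector3(i, 4), new Vector3(i, 4 + (float)spec2[i].magnitude * 2), Color.white);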
Of course, instead of feeding the FFT function the samples from “AudioListener.GetOutputData” you can also feed it chunks of samples from an audio file. Since the FFT function is static and doesn’t use any global variables / state, it can easily be multithreaded if needed. Just ensure that each FFT call / thread has its own sample array to work on.
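To get back to the original question (one spectrum per 100 ms of the song), here is a rough sketch of how the samples of a file could be cut into chunks. It assumes the samples are already available as one mono float array. `OfflineFraming`, `NextPowerOfTwo` and `SplitIntoFrames` are names invented for this example. Since 100 ms is usually not a power-of-two number of samples (4410 at 44100 Hz), each chunk is zero-padded up to the next power of two.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; the names are not from the original FFT class.
public static class OfflineFraming
{
    // Smallest power of two that is >= n.
    public static int NextPowerOfTwo(int n)
    {
        int p = 1;
        while (p < n) p <<= 1;
        return p;
    }

    // Splits mono samples into consecutive chunks of intervalSeconds each.
    // Every chunk is copied into its own zero-padded power-of-two buffer,
    // so each buffer can be handed to a separate FFT call (or thread).
    public static List<float[]> SplitIntoFrames(float[] samples, int sampleRate, float intervalSeconds)
    {
        int hop = Math.Max(1, (int)Math.Round(sampleRate * intervalSeconds));
        int frameSize = NextPowerOfTwo(hop);
        var frames = new List<float[]>();
        for (int start = 0; start < samples.Length; start += hop)
        {
            var frame = new float[frameSize]; // new arrays are zero-initialised
            int count = Math.Min(hop, samples.Length - start);
            Array.Copy(samples, start, frame, 0, count);
            frames.Add(frame);
        }
        return frames;
    }
}
```

In Unity the sample array could come from `AudioClip.GetData`; note that clips with more than one channel return interleaved samples, so they would have to be split or mixed down first. Zero-padding only interpolates the spectrum and does not add real resolution, so simply using 4096 samples per chunk (about 93 ms at 44100 Hz) may be an equally good choice.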
If you’re interested in how the FFT (or the DFT in general) works, I recommend [this video]. Even though the guy messes up a lot of his math and equations, most mistakes are fixed by annotations. What’s great about that presentation is the visual representation of what’s happening. If you want to make sense of the code, I recommend looking at the [DFT implementation] (Appendix A.) first. To make sense of the FFT code you should be familiar with complex numbers, though you don’t have to understand it to use it.
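For reference, a plain DFT is short enough to write out. The following is a minimal sketch of the textbook O(N²) DFT for real input, not the linked implementation and not the FFT from the pastebin. The class name `NaiveDft` is made up, and the scaling by 1/N is an assumption about the normalisation you want.

```csharp
using System;

// Hypothetical illustration of a direct DFT; far too slow for long arrays,
// but it shows what the FFT computes.
public static class NaiveDft
{
    // Returns the magnitudes of the first N/2 bins of a real signal.
    public static float[] Magnitudes(float[] samples)
    {
        int n = samples.Length;
        var result = new float[n / 2];
        for (int k = 0; k < n / 2; k++)
        {
            double re = 0.0, im = 0.0;
            for (int t = 0; t < n; t++)
            {
                // Correlate the signal with a cosine and a sine of frequency k.
                double angle = -2.0 * Math.PI * k * t / n;
                re += samples[t] * Math.Cos(angle);
                im += samples[t] * Math.Sin(angle);
            }
            // Scale by 1/N (assumed normalisation). Bins above 0 are doubled
            // to make up for the mirrored second half that is thrown away;
            // the DC bin (k = 0) has no mirror image.
            double scale = (k == 0 ? 1.0 : 2.0) / n;
            result[k] = (float)(scale * Math.Sqrt(re * re + im * im));
        }
        return result;
    }
}
```

Feeding it a sine wave with amplitude 1 that completes exactly four cycles within the array gives a value close to 1 in bin 4 and values close to 0 everywhere else.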
[Nyquist-frequency]: Nyquist frequency - Wikipedia
[Ex2]: Dropbox - FFT02.png
[Ex3]: Dropbox - FFT03.png
[this video]: Intuitive Understanding of the Fourier Transform and FFTs - YouTube
I don’t know how to make your own FFT but if you want a hack for using GetSpectrumData without hearing the audio you can do this:
Go to the Audio Mixer and add a new group (I called mine “input”)
Click on the Input group and add the effect Duck Volume
In Duck Volume set Make-up Gain all the way to the left and Ratio all the way to the right. Make sure Duck Volume is at the bottom of the stack. This effectively mutes the channel after you’ve read the spectrum data.
On your Audio Source, set the Output to your new group in the Audio Mixer (in my case, “input”)
Adding to Bunny’s awesome code, here are my findings. It may be lacking window functions (rectangular, Blackman-Harris, etc.). The default window seems to give major blurring along the frequency axis, which shows up as long lines wherever there is some chirp noise and the algorithm is not sure where the noise is. I know there shouldn’t be as many lines as this; I will have to work out what is causing it:
Okay… I have looked at the same birdsong through other FFTs, and it has lines along the frequency axis there too. These are noise chirps. I am working with filter banks at the moment, which give me 100 times higher resolution than the FFT, so I will use the FFT to optimize the high-res scan as a second sweep. Nice…