I made an onset detector in C#, with Unity and games in mind.
It tries to detect beats, snares, hats and other peaks in certain frequency ranges. It does a decent job with most genres. It can detect the presence of singing and melodies too. It analyzes an entire song within a couple of seconds. The data can then be used in a game, or something else.
I’m planning to make a game out of this. In the video you can see some of the basic functionality.
Another feature is that it can import MP3 files at runtime, which is important for this kind of game.
I’ll probably release the source for the onset-detector, when it’s all cleaned up. It’s a huge mess now.
I’m doing a Fourier transform on samples from AudioClip.GetData. With the resulting spectrum, I can detect the onsets. More info on onset detection can be found here.
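For readers who want something concrete, here is a rough C++ sketch of one common way to go from a spectrum to onsets: spectral flux with a moving-average threshold. This is not the detector described in this post. The function names (`magnitudeSpectrum`, `spectralFlux`, `pickOnsets`), the window size and the threshold multiplier are all made up for illustration, and the naive DFT only stands in for a proper FFT.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Magnitude spectrum of one frame using a naive O(n^2) DFT.
// A real FFT would replace this; it is only here to keep the sketch self-contained.
std::vector<float> magnitudeSpectrum(const std::vector<float>& frame)
{
    const std::size_t n = frame.size();
    const double kTwoPi = 6.283185307179586;
    std::vector<float> mags(n / 2 + 1);
    for (std::size_t k = 0; k < mags.size(); ++k) {
        double re = 0.0, im = 0.0;
        for (std::size_t t = 0; t < n; ++t) {
            const double angle = kTwoPi * double(k) * double(t) / double(n);
            re += frame[t] * std::cos(angle);
            im -= frame[t] * std::sin(angle);
        }
        mags[k] = static_cast<float>(std::sqrt(re * re + im * im));
    }
    return mags;
}

// Spectral flux: the summed positive magnitude increase between two consecutive spectra.
float spectralFlux(const std::vector<float>& prev, const std::vector<float>& cur)
{
    assert(prev.size() == cur.size());
    float flux = 0.0f;
    for (std::size_t k = 0; k < cur.size(); ++k) {
        const float diff = cur[k] - prev[k];
        if (diff > 0.0f) flux += diff;  // only rising energy counts towards an onset
    }
    return flux;
}

// Frame indices where the flux is a local peak above (multiplier * local mean).
// 'window' and 'multiplier' are hypothetical tuning parameters.
std::vector<std::size_t> pickOnsets(const std::vector<float>& flux,
                                    std::size_t window, float multiplier)
{
    std::vector<std::size_t> onsets;
    for (std::size_t i = 0; i < flux.size(); ++i) {
        const std::size_t lo = i > window ? i - window : 0;
        const std::size_t hi = std::min(flux.size(), i + window + 1);
        float mean = 0.0f;
        for (std::size_t j = lo; j < hi; ++j) mean += flux[j];
        mean /= static_cast<float>(hi - lo);
        const bool above = flux[i] > mean * multiplier;
        const bool risesFromLeft = (i == 0) || flux[i] > flux[i - 1];
        const bool notBelowRight = (i + 1 == flux.size()) || flux[i] >= flux[i + 1];
        if (above && risesFromLeft && notBelowRight) onsets.push_back(i);
    }
    return onsets;
}
```

To use it, split the samples into fixed-size frames, call `magnitudeSpectrum` on each, feed consecutive pairs to `spectralFlux`, and pass the resulting flux curve to `pickOnsets`. Restricting the flux sum to a range of bins would be one way to look for peaks in a certain frequency range only.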
By the way, I have already cleaned up and released the scripts for importing MP3 files at runtime.
Thanks for that!
I don’t have Unity installed, so I can’t test the mp3 import code. What output do you get when you use an empty mp3? (Audacity → Generate → Silence)
From what I gathered, the resulting samples should be all zeros, right?
The following is what I used (it shouldn’t be much different from C# P/Invoke).
The mp3 I used, 1163298–44532–$zero_silenced.zip (464 KB), was single-track (mono), so I didn’t bother dealing with interleaved values.
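#include <mpg123.h>

// Map a signed 16-bit sample to the range [-1.0, 1.0]
inline float short_float(short val)
{
    return val < 0 ? val * (1 / 32768.0f) : val * (1 / 32767.0f);
}

mpg123_handle *m = NULL;
int channels = 0, encoding = 0;
long rate = 0;
int err = MPG123_OK;
err = mpg123_init();
m = mpg123_new(NULL, &err);
err = mpg123_open(m, "L:\\zero_silenced.mp3");
err = mpg123_getformat(m, &rate, &channels, &encoding);
// Lock the output to signed 16-bit, since the byte decoding below assumes it
err = mpg123_format_none(m);
err = mpg123_format(m, rate, channels, MPG123_ENC_SIGNED_16);
// Get the first 2048 samples
const int TIME = 2048;
// 16-bit integer encoded in bytes, hence x2 size
unsigned char* buffer = new unsigned char[TIME * 2];
size_t done = 0;  // number of bytes actually decoded
err = mpg123_read(m, buffer, TIME * 2, &done);
float* samples = new float[TIME];
int index = 0;
// Iterate 2 bytes at a time; stop before a dangling odd byte
for (size_t i = 0; i + 1 < done; i += 2)
{
    unsigned char first = buffer[i];
    unsigned char second = buffer[i + 1];
    // Low byte first: mpg123 writes native-endian samples, so this assumes a little-endian machine
    short val = (short)(first | (second << 8));
    samples[index++] = short_float(val);
}
// Release the byte buffer and the decoder once the samples are converted
delete[] buffer;
mpg123_close(m);
mpg123_delete(m);
mpg123_exit();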
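For a stereo file the 16-bit samples come back interleaved (left, right, left, right, ...), so they would need to be split per channel first. A minimal sketch of that step, assuming the bytes have already been combined into signed 16-bit values; `deinterleave` and `to_float` are names I made up for this example and are not part of mpg123:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Same scaling as short_float above: map a signed 16-bit sample to [-1.0, 1.0].
inline float to_float(std::int16_t v)
{
    return v < 0 ? v / 32768.0f : v / 32767.0f;
}

// Split interleaved PCM (ch0, ch1, ch0, ch1, ...) into one float buffer per channel.
// An incomplete trailing frame is dropped.
std::vector<std::vector<float>> deinterleave(const std::vector<std::int16_t>& pcm,
                                             int channels)
{
    assert(channels > 0);
    const std::size_t ch = static_cast<std::size_t>(channels);
    const std::size_t frames = pcm.size() / ch;
    std::vector<std::vector<float>> out(ch, std::vector<float>(frames));
    for (std::size_t f = 0; f < frames; ++f) {
        for (std::size_t c = 0; c < ch; ++c) {
            out[c][f] = to_float(pcm[f * ch + c]);
        }
    }
    return out;
}
```

With `channels` taken from mpg123_getformat, `out[0]` would hold the left channel and `out[1]` the right. For a mono file this reduces to the plain conversion loop above.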
I don’t want to rush you (not that I could), but I just had to add my voice to those who will be very happy the moment you release the source for that.
I am just beginning my research into the topic and having a working example in Unity would be awesome.
A question since you linked to the badlogic tutorial:
Did you basically just follow along with that tutorial to produce the same thing in C#, or did you use it as a base and continue from there? I guess my question is whether you consider the method used in the tutorial good enough for basic functionality, even if you went further than that.
I used the basic principles mentioned in that tutorial. I spent most time finding a decent FFT implementation that gave good results. This one seems to give the best results.
Currently I’m working on the game that is going to use this. I’m going to have to clean up the code for the onset detector, to be able to implement it properly. Right now it has a lot of debugging code and ambiguous and redundant parts. That’s mainly because of a lot of experimenting. I don’t think that’s going to help anyone to understand it.
I’m going at it soon. I’m a bit of a procrastinator though.
“I’m going at it soon. I’m a bit of a procrastinator though.”
I just began overhauling it. I’m rebuilding it from scratch. The old detector did all of the analyzing and detecting beforehand, which could take up to 20 seconds for songs that were over 10 minutes long, and even longer on older computers.
Right now I’m making it so it will do everything on the fly. The actual analysis can be given a head start, so the beat and onset information will be available before they happen, just like in the old version, which is very important if you want to use it in a game.
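To illustrate the head-start idea, here is a rough C++ sketch of an analysis cursor that stays a fixed number of seconds ahead of playback. It is not the actual detector: the `LookaheadAnalyzer` class, its methods and the fixed flux threshold are assumptions made up for this example, and the precomputed per-frame flux values only stand in for whatever analysis would really be done per frame.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <utility>
#include <vector>

// Hypothetical helper: analyzes frames on the fly, but always up to
// 'headStartSeconds' ahead of the current playback position.
class LookaheadAnalyzer {
public:
    LookaheadAnalyzer(std::vector<float> fluxPerFrame, double secondsPerFrame,
                      double headStartSeconds, float threshold)
        : flux_(std::move(fluxPerFrame)), secondsPerFrame_(secondsPerFrame),
          headStart_(headStartSeconds), threshold_(threshold) {}

    // Call once per game update with the current playback time.
    // Analyzes only the frames between the previous cursor and playback + head start.
    void update(double playbackSeconds)
    {
        const double target = playbackSeconds + headStart_;
        while (next_ < flux_.size() && next_ * secondsPerFrame_ <= target) {
            if (flux_[next_] > threshold_) {
                pending_.push_back(next_ * secondsPerFrame_);  // onset time in seconds
            }
            ++next_;
        }
    }

    // Onsets that are already known but have not been consumed yet (may lie in the future).
    const std::deque<double>& pending() const { return pending_; }

    // Remove and return every known onset at or before 'timeSeconds'.
    std::vector<double> takeOnsetsUpTo(double timeSeconds)
    {
        std::vector<double> due;
        while (!pending_.empty() && pending_.front() <= timeSeconds) {
            due.push_back(pending_.front());
            pending_.pop_front();
        }
        return due;
    }

private:
    std::vector<float> flux_;
    double secondsPerFrame_;
    double headStart_;
    float threshold_;
    std::size_t next_ = 0;        // index of the next frame to analyze
    std::deque<double> pending_;  // onset times found so far, in ascending order
};
```

A game loop would call `update` with the current playback time every frame, read `pending()` to spawn things ahead of time, and call `takeOnsetsUpTo` with the playback time to trigger events at the moment an onset is actually heard.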
This thing is still a bit messy. A lot of its functionality and properties are interwoven with the prototype I’m working on. Once I get my concept pinned down, I’ll redo it so it will be usable in a wide variety of games.
Hello KoningStoma,
Has there been any progress with the project? I’ve been trying to accomplish onset detection, but unsurprisingly it’s not accurate at all.
Hello KoningStoma,
In my free time I’m working on a game about music, and I’m trying to use your code to draw cubes into an array, but I can’t get the cubes to move at the exact moment of the beat. Could you please help me with this? Which methods do I need for this?
What exactly do you use to get raw PCM data from an mp3?
I’ve tried to use your code (thinking that AudioClip.GetData will yield me raw PCM), but it only filled the array with 0s. I used an array of size 1024 as a parameter.