I’m trying to build a very rudimentary pitch contour from a voice sample. As I don’t need much accuracy, the easiest way seemed to be to check the fundamental frequency (F0) of the recording at regular intervals, build an array containing the F0 at each of those times, then draw that array as a graph to show the general trend of the speaker’s pitch.
I’m trying to accomplish this with two functions. ParseAudioData takes an AudioClip and uses a float, interval, which is the number of samples I want to gather, spaced evenly throughout the file. GetFundamentalFrequency takes an AudioSource and a time within the file, and should return the F0 at that time.
float interval = 10f;
float[] ParseAudioData(AudioClip myClip) {
    mySource.clip = myClip; //Give the clip I want to analyze to the AudioSource
    float timeInterval = myClip.length / interval; //set timeInterval to the time in seconds I should advance through myClip each time I take a new sample. If interval=4f and myClip.length = 1f, timeInterval=0.25f
    float[] frequencyArray = new float[(int)interval]; //Create a new array with one index for each sample I want to take
    mySource.time = 0; //Set the clip to 0 minutes, 0 seconds
    float totalInterval = 0f; //This will be used to keep track of how far I've advanced through the file
    for (int i = 0; i < interval; i++) { //For each sample I want to take...
        frequencyArray[i] = GetFundamentalFrequency(mySource, totalInterval); //Run GetFundamentalFrequency, which returns the F0 as a float, and store that number in frequencyArray
        totalInterval += timeInterval; //Increment totalInterval by the amount of time we want to advance before taking the next sample
    }
    return frequencyArray; //Hand a completed array filled with F0s taken from various points in the audio clip to whoever called the function
}
float GetFundamentalFrequency(AudioSource mySource, float sourceTime) //Given an AudioSource, and the time within the file we should get data from, this should return the fundamental frequency of the voice recorded at time sourceTime.
{
    float fundamentalFrequency = 0.0f;
    float[] data = new float[8192];
    mySource.time = sourceTime;
    mySource.Play();
    mySource.GetSpectrumData(data, 0, FFTWindow.BlackmanHarris);
    mySource.Stop();
    float s = 0.0f;
    int i = 0;
    for (int j = 1; j < 8192; j++)
    {
        if (s < data[j])
        {
            s = data[j];
            i = j;
        }
    }
    fundamentalFrequency = i * samplerate / 8192;
    return fundamentalFrequency;
}
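To sanity-check the last step in isolation, here is a minimal sketch of the bin-to-frequency expression exactly as written in GetFundamentalFrequency, outside Unity. It assumes samplerate is 44100 (its declaration isn't shown above), and the class and method names are hypothetical. It only compares the expression with an int samplerate against a float one; it makes no claim about which bin GetSpectrumData actually reports as the peak.

```csharp
using System;

public static class BinToFrequencyCheck
{
    // Hypothetical helper mirroring `i * samplerate / 8192` when samplerate is an int.
    // All operands are ints, so the division truncates before the result becomes a float.
    public static float WithIntRate(int bin, int samplerate, int binCount)
    {
        return bin * samplerate / binCount;
    }

    // Same expression with a float samplerate: the arithmetic is done in floating point.
    public static float WithFloatRate(int bin, float samplerate, int binCount)
    {
        return bin * samplerate / binCount;
    }

    public static void Main()
    {
        // Assumed values: 44100 Hz sample rate, 8192 bins, peak found at bin 1024.
        Console.WriteLine(WithIntRate(1024, 44100, 8192));
        Console.WriteLine(WithFloatRate(1024, 44100f, 8192));
    }
}
```

Under those assumed values, bin 1024 comes out as 5512 with the int version and 5512.5 with the float version.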
This looks right to me, and I expect ParseAudioData to return an array filled with F0s from different points in the AudioClip. But every time I run it, no matter what audio file I feed it, every value in my F0 array is 5512. I’ve been staring at this for two days and can’t see where my mistake is. Is AudioSource not meant to be used this way?