C# create new audio format?

I recently created a custom image and compression format, a custom 3D model format including custom shape keys, some basic encryption, and so on, so I wondered…

Is it possible to create a custom audio format and component for playing a custom audio file?

For example, is it possible to talk to the system speakers directly, live, and feed them rolling data, such as one frequency at a time during an update? Think of a device like the theremin, where manipulating a field directly influences the signal that drives the speaker.

Right now I face a choice: make audio in float precision, or make audio in double precision. If I go with double precision, I have to write a new script to create that file and potentially to read and play it back. I am sure it can be done in wav. Nevertheless, the basic audio clip creation technique only accepts a float array. But I am certain finer control can be achieved.

Any info on where I might find script examples or general info on these kinds of direct sound manipulation would be great.

An alternative I was debating was making the wav double precision, but this requires creating the wav via another method. And I’m not sure what’s going on in between the wav file data and the speakers. Is float rounding occurring anyway?

let me know what you’re thinking

While it’s generally a good idea to expand your knowledge, many of your posts start with literally zero. You always start with some wild assumptions without any reasoning behind them. Why would you use double precision floats for audio? Almost all hardware and audio formats use 16 bit. On rare occasions you may find 24 bit precision, which is total overkill for most music / audio, with some rare exceptions such as classical music recordings, which may have a high dynamic range. Almost all pop, rock, electro, … music is compressed to death anyway. Even audio CDs only contain 16 bit samples. So floats are already more than enough to represent an audio sample. Even the most wasteful common audio format (wav) typically just stores plain 16 bit PCM samples.
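To make the precision point concrete, here is a minimal sketch. The `PcmConvert` helper is made up for illustration and is not a Unity API. A 32 bit float has a 24 bit significand, so every 16 bit sample divided by 32768 is stored exactly and survives a round trip without any rounding.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class PcmConvert
{
    // Maps a 16 bit PCM sample to the [-1, 1) float range used by audio buffers.
    // Dividing by a power of two is exact for every 16 bit input.
    public static float ToFloat(short sample)
    {
        return sample / 32768f;
    }

    // Maps a float sample back to 16 bit PCM, clamping out-of-range input.
    public static short ToInt16(float sample)
    {
        double scaled = Math.Round(sample * 32768.0);
        if (scaled > short.MaxValue) scaled = short.MaxValue;
        if (scaled < short.MinValue) scaled = short.MinValue;
        return (short)scaled;
    }
}
```

Going from 16 bit to float and back gives the original value for all 65,536 possible samples, so double precision would add nothing for data that starts out as 16 bit PCM.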

Storing audio data efficiently is quite tricky since the amount of data is the main issue. You usually have a sample rate of 44.1 kHz or 48 kHz. For stereo you have two 16 bit channels, so 1 second of audio at 44.1 kHz needs 2 x 44,100 samples of 2 bytes each. That is about 176 kB of raw audio data per second, in the form the audio hardware requires. So about 6 seconds already takes over 1 MB.
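The arithmetic can be sketched as a small function, assuming the CD rate of 44,100 Hz. `PcmSize` is a hypothetical name used only for this example.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class PcmSize
{
    // Bytes of raw, uncompressed PCM needed for the given duration.
    public static long BytesFor(int sampleRate, int channels, int bitsPerSample, double seconds)
    {
        long bytesPerSecond = (long)sampleRate * channels * (bitsPerSample / 8);
        return (long)(bytesPerSecond * seconds);
    }
}
```

`PcmSize.BytesFor(44100, 2, 16, 1.0)` gives 176,400 bytes, and six seconds gives 1,058,400 bytes, which is where the "over 1 MB" figure comes from.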

What’s the point of “creating your own format”? Do you still think that it would prevent people from copying your data? What are the goals you want to achieve? Most modern audio formats perform very complex analytics and exploit certain features of the human ear to reduce the number of bytes required to achieve a certain quality level.

Anyways, in Unity you can use the OnAudioFilterRead callback, which is a direct hook into the audio system. Note: this callback is called on a separate audio thread, not the main thread, so you have to be careful what you do in it. However you store your “custom format”, in the end the data has to fit into that float array. The size of the float array is chosen by Unity. The callback is called quite often (roughly every 20 ms according to the documentation, so about 50 times per second), and the whole buffer / array has to be filled before the callback returns.
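As a rough sketch of what filling that array could look like, here is a plain C# class with a `Fill` method that has the same parameters as `OnAudioFilterRead(float[] data, int channels)`. The class name `SineFiller`, its fields, and the 440 Hz default are assumptions for illustration. The buffer is interleaved, so each frame holds one sample per channel.

```csharp
using System;

// Hypothetical helper, for illustration only. Not part of Unity.
public class SineFiller
{
    private readonly double sampleRate;
    private double phase; // radians, kept between calls so the wave stays continuous

    public double Frequency = 440.0;
    public float Gain = 0.25f;

    public SineFiller(double sampleRate)
    {
        this.sampleRate = sampleRate;
    }

    // Fills the whole interleaved buffer with a sine tone on every channel.
    public void Fill(float[] data, int channels)
    {
        double step = 2.0 * Math.PI * Frequency / sampleRate;
        for (int i = 0; i + channels <= data.Length; i += channels)
        {
            float sample = Gain * (float)Math.Sin(phase);
            for (int c = 0; c < channels; c++)
                data[i + c] = sample;

            phase += step;
            if (phase >= 2.0 * Math.PI)
                phase -= 2.0 * Math.PI; // wrap to keep the phase value small
        }
    }
}
```

In a MonoBehaviour you would call `filler.Fill(data, channels)` from inside `OnAudioFilterRead`, and read the sample rate from `AudioSettings.outputSampleRate` on the main thread beforehand (for example in `Awake`). Because the phase is stored between calls, consecutive buffers join without a jump.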

I even though it was subtly mean about me.

thanks bunny

Unity - Scripting API: AudioClip.PCMReaderCallback

Can a PCM reader callback run inside an Update loop each frame without producing a new clip? I just don’t feel like making a new clip each frame. Last night I made an extremely nice sfx maker, the best I have ever made, as a base to make effects with and keep a memory of effects. And I thought the only way I could improve this is by not making a new clip for everything I do but instead rolling that data live, like a microphone to a speaker. I want to cut out the middleman and feed raw data to the speakers.

I did indeed google double precision audio and apparently it exists.

But in the end I’m not trying to stop people from getting my stuff, though ultimately I will as a byproduct, I am just wanting to be expanding my knowledge of the things I work with. In order to do things that are not immediately obvious or available.

Don’t reply

this answer has a lot of info in it. ^
Thanks again bunnykins mc lovin

I’ll explore

thread may not contain scripts anymore so feel free to dump it somewhere else :wink:

Have great day

adding relevant news recently, can you make plugin or similar to this? : D

and would it be possible to directly output audio into native code/directly to device, instead of using unity audiosystem?
although then you lose some control and easy 3d audio etc…

“You get 4,294,967,296 different combinations of binary digits with 32-bit audio”

this is a good question^^ my exact question in fact. I would love to know. I am sure I can compress audio no doubt. And write the file no doubt. It’s only the read back of that file there is a shoulder that I do not want to stand on. I wonder if it’s even possible as little info of the pursuit exists.

Better suited to the Audio/Video forum. I can move your post if you wish.

It’s a good idea :slight_smile:
Not sure how active it is over there

See how i have to make a new sound effect each time? But is it possible to feed the data to the speakers directly without making a clip

It’s not so much about activity, it’s about it being appropriate. It’s irrelevant if there’s 1000 people here uninterested in Audio. :wink:

EDIT: Moved.

No, I don’t see where you make a new clip every time. AudioClips are meant to represent one finite stream of audio data with a certain length. Even when you use the PCM callback to actually fill the data in, the clip has a certain length. The OnAudioFilterRead callback is the direct callback from the DSP of the audio system. The data you feed it is sent directly to the audio output (of course it usually goes through the operating system mixer or whatever driver may sit in between). That’s why we have abstract interfaces and standards, so it works the same on all hardware. During the DOS era we had to manually configure the Sound Blaster settings for the exact sound card you had. There were always some exotic manufacturers which were not SB compatible, and their cards simply didn’t work at all unless the game had specific support for that type of hardware. Luckily we’re kind of beyond that now. I say kind of because there are still too many incompatible standards and not all manufacturers provide drivers for all operating systems.

As I said, you can use the OnAudioFilterRead callback to produce procedural audio on the fly. You are free to combine and mix whatever data you want yourself. However, be warned that mixing audio is an art in itself. The short video you’ve shown is of a procedural sine wave generator I implemented myself, with a quite rudimentary mixer. It can mix any number of sources, but proper scaling is not that trivial. Technically, if you mix two sources which each have a certain amplitude, the output can have a peak amplitude that is the sum of the two. Mixing 10 or 100 things could make it 10 or 100 times “louder”. Dynamic compression is a tricky topic. I dug quite deep into that several years ago, but it’s a nightmare to get right. It gets even trickier when the number of sources changes over time. Keep in mind that the sound hardware can only “play” a single audio signal, since that’s what is sent to the speaker. If you want to have multiple sources, you have to mix them yourself. That’s what Unity’s audio mixer usually does for you.
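To illustrate the scaling problem, here is a deliberately simple sketch. It is not the mixer from the video and not dynamic compression. The `NaiveMixer` name is made up. It sums the sources sample by sample and divides by the number of sources, so the peak can never exceed the loudest input.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class NaiveMixer
{
    // Sums all sources sample by sample and scales the sum by 1 / sourceCount.
    // Shorter sources are treated as silent once they run out.
    public static float[] Mix(params float[][] sources)
    {
        int length = 0;
        foreach (var source in sources)
            length = Math.Max(length, source.Length);

        var output = new float[length];
        if (sources.Length == 0)
            return output;

        foreach (var source in sources)
            for (int i = 0; i < source.Length; i++)
                output[i] += source[i];

        float scale = 1f / sources.Length;
        for (int i = 0; i < length; i++)
        {
            // Clamp as a safety net in case an input was already out of range.
            output[i] = Math.Max(-1f, Math.Min(1f, output[i] * scale));
        }
        return output;
    }
}
```

The trade-off is that every source gets quieter as more sources are added, and the level of each one jumps whenever the source count changes. That is exactly why proper scaling and compression take more work than this.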

Procedurally generating audio without causing any clicks or crackles is also quite tricky. Outputting a smooth continuous “wave” is hard when you want to suddenly enable, disable or change the frequency immediately. I’ve generated the sine wave using a complex number phasor that I’ve written myself.
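For readers unfamiliar with the phasor idea, here is a minimal sketch of the general technique. It is not the code mentioned in the post, and `PhasorOscillator` is a hypothetical name. A unit-length complex number is rotated by a fixed angle for every sample, and its imaginary part traces out a sine wave.

```csharp
using System;

// Hypothetical helper, for illustration only.
public class PhasorOscillator
{
    private double re = 1.0, im = 0.0;       // current phasor, starts at angle 0
    private double rotRe = 1.0, rotIm = 0.0; // rotation applied per sample

    // Changes only the rotation per sample. The phasor keeps its current
    // position, so the output continues from where it was.
    public void SetFrequency(double frequency, double sampleRate)
    {
        double angle = 2.0 * Math.PI * frequency / sampleRate;
        rotRe = Math.Cos(angle);
        rotIm = Math.Sin(angle);
    }

    // Returns the current sample and advances the phasor by one step.
    public float Next()
    {
        float sample = (float)im;

        // Complex multiplication: (re + i*im) * (rotRe + i*rotIm)
        double newRe = re * rotRe - im * rotIm;
        double newIm = re * rotIm + im * rotRe;

        // Renormalise so rounding errors do not let the amplitude drift.
        double magnitude = Math.Sqrt(newRe * newRe + newIm * newIm);
        re = newRe / magnitude;
        im = newIm / magnitude;
        return sample;
    }
}
```

Because a frequency change only swaps the rotation step, the waveform has no jump at the moment of the change, which avoids one common source of clicks. Starting and stopping a tone would still need a short fade.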

Audio engineering is a really tough topic. So prepare to read a lot of theory and to go insane while trying to put the theory into practice :slight_smile:

You seem quite knowledgeable on it. Have you explored waves like the above ^^? It simply repeats from zero at the first detection of a value less than zero.

Combining waves can get scratchy for sure. But both methods, the parsed-out method and the consecutive combine, can produce some great results. Sonic the Hedgehog quality.

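        // Build one clip from three waves. SINE_WAVE, SINE_WAVE02 and SINE_WAVE03
        // are float lists generated elsewhere in the script.
        if (NUMBER == 3)
        {
            var COMBINED_WAV = new List<float>();
            /*
            // Earlier variant: interleave the three waves sample by sample.
            for (int i = 0; i < SINE_WAVE.Count + SINE_WAVE02.Count + SINE_WAVE03.Count; i++)
            {
                if (i < SINE_WAVE.Count)
                    COMBINED_WAV.Add(SINE_WAVE[i]);
                if (SINE_WAVE02.Count > i)
                    COMBINED_WAV.Add(SINE_WAVE02[i]);
                if (SINE_WAVE03.Count > i)
                    COMBINED_WAV.Add(SINE_WAVE03[i]);
            }
            */
            // Current variant: append the three waves so they play back to back.
            COMBINED_WAV.AddRange(SINE_WAVE);
            COMBINED_WAV.AddRange(SINE_WAVE02);
            COMBINED_WAV.AddRange(SINE_WAVE03);
            // Write the combined samples into C (presumably the target AudioClip),
            // starting at OFFSET_SAMPLES.
            C.SetData(COMBINED_WAV.ToArray(), OFFSET_SAMPLES);
        }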

The wave in the image I am yet to test. In a few hours I will write a test script for OnAudioFilterRead.

arc wav
definitely seems to root out some of the clipping