Native Audio - Lower audio latency via OS's native audio library (iOS/Android)

Native Audio
Lower audio latency via a direct, unmixed audio stream on the native side
Requirement : Unity 2019.3 LTS+ for iOS and Android.

Asset Store Link : Native Audio | Audio | Unity Asset Store
Release Notes : on the website, Changelog | Native Audio

So your Unity game outputs audio WAY slower than other apps, even on the same device? It turns out Unity can add as much as 79% of the total latency you hear. Not to worry, because Native Audio will take care of all of it. I have researched the cause for a long time, and the solution is right here.

What can we skip by going directly to the native API?
Unity has an internal mixing system, backed by FMOD. You can go crazy with a method like AudioSource.PlayOneShot and the sound magically overlaps with itself, even though on the native side Unity asked Android for only one “AudioTrack”. How can that be? Because Unity spends time mixing everything down to one bus. On top of that you have all the wonders of the audio mixer system introduced in Unity 5.0.0: effects, sends, mixers, etc. all stacked into that.

A great design for a game engine. But a certain subset of us game developers absolutely do not want any of that “spend time” if possible. Unfortunately, Unity does not give us a choice to bypass it and just go native.

For genres of apps and games that need critical audio timing (basically any response sound from input, rhythm games, etc.) this is not good. The idea behind the fix is to call directly into native methods and have them read a raw audio file we prepared without Unity’s importer, bypassing Unity’s audio path entirely.

I have researched for a long time into the fastest native approach on each respective platform. For iOS it is OpenAL (Objective-C/Swift), and for Android it is OpenSL ES (NDK/C). For more information about the other alternatives and why they are not good enough, please go to the implementation page.

But having to interface with multiple different sets of libraries separately from Unity is a pain, so Native Audio is here to help…

“Native Audio is a plugin that helps you easily load and play audio using each platform’s fastest native method, from the same common interface in Unity.”

Android High-Performance Audio Ready
It improves latency on iOS greatly, but I guess many of you came here to fix the already-horrible-without-Unity Android latency.

I am proud to present that Native Audio follows all of Google’s official best practices required to achieve High-Performance Audio on Android. I have additionally fixed all the latency-related compromises that Unity unfortunately had to choose for their “versatile” audio engine.

  • Uses C/NDK-level OpenSL ES and not Java MediaPlayer, SoundPool, or AudioTrack. Plus, the most latency-critical interfacing methods from Unity go through extern to C rather than AndroidJavaClass to Java. Features of OpenSL ES that would add latency have been deliberately removed.
  • Ensures a “fast track” audio sink is instantiated at the hardware level, not a normal one. Native Audio does not have any kind of application-level mixer and each sound goes straight to this fast track. Currently Unity does not get the fast track due to a sample rate mismatch, and moreover somehow ends up on a deep buffer thread, which is designed for power saving but has the worst latency.
  • Built-in resampler. Resamples your audio on the fly to match the “device native sample rate”. Each phone has its own preferred sample rate. (Required for the fast track; a sketch of how to query these device values follows after this list.)
  • Minimum jitter by zero-padding audio so that its length is exactly a multiple of the “device native buffer size”, ensuring consistent scheduling. Each phone has its own preferred buffer size.
  • Double buffering so your audio starts playing as soon as possible, unlike lazy single buffering where we must push the entirety of the audio into the playing buffer first. (This is not the same step as loading the audio; we must go through this step on every play.) Combined with the previous point, the workload of each callback is deterministic.
  • Support for AAudio, the new and even faster standard from Google, is coming in the future. Players on Oreo (8.0) or higher will automatically get the AAudio implementation with no modification to your code.
  • Automatically receives better audio latency from future system performance improvements.
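
To make the “device native” values above concrete, here is a minimal sketch (not part of Native Audio’s API) of how one could query the device’s preferred sample rate and buffer size from Unity through AndroidJavaClass, assuming Android API 17+:

```csharp
using UnityEngine;

public static class DeviceAudioProperties
{
    // Sketch: query the device's preferred ("native") sample rate and buffer size,
    // the same values a fast track depends on. Illustrative only, not Native Audio code.
    public static void LogNativeAudioProperties()
    {
#if UNITY_ANDROID && !UNITY_EDITOR
        using (var unityPlayer = new AndroidJavaClass("com.unity3d.player.UnityPlayer"))
        using (var activity = unityPlayer.GetStatic<AndroidJavaObject>("currentActivity"))
        using (var audioManager = activity.Call<AndroidJavaObject>("getSystemService", "audio"))
        {
            // AudioManager.getProperty is available since Android API 17.
            string sampleRate = audioManager.Call<string>("getProperty", "android.media.property.OUTPUT_SAMPLE_RATE");
            string framesPerBuffer = audioManager.Call<string>("getProperty", "android.media.property.OUTPUT_FRAMES_PER_BUFFER");
            Debug.Log("Native sample rate: " + sampleRate + " Hz, native buffer size: " + framesPerBuffer + " frames");
        }
#endif
    }
}
```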

Of course, all of this comes with publicly available, thorough research and confirmations. It means Native Audio can perform even better than a native Android app coded naively/lazily in regard to audio, since even pure Android developers might not want to go out of Java into C/NDK.

How much faster?
Here are some benchmarks. This is not a standard latency measurement approach like the loopback cable method, and the numbers alone are not comparable between devices, but the time difference on the same device clearly shows how much the latency decreased.

The website contains much more to read : Native Audio - Unity Plugins by Exceed7 Experiments

Please do not hesitate to ask anything here or join the Discord channel. Thank you.


I sincerely hope that your plugin could support Unity 5.6.

Hello @liuxuan , in fact it should work with every Unity version out there (down to 5.0.0 or even below) as long as it supports basic iOS + Android native plugins. Being only an interface to a native implementation, the features depend on the device’s OS rather than on Unity.

(And in that sense only Android Jelly Bean or newer is supported, because of a special fast-path constructor of AudioTrack. For iOS I am not sure how long OpenAL has been around.)

But the current newest version of Unity (2018.2.0b11) can only go back as far as 2017.1.3f1; any further back and the project format mismatches and requires a Reimport All. And I just don’t want to commit to making sure it works down to a very old version that is difficult to reach just to find bugs for my users. (Also Unity Hub installs start at 2017.1.4f1, so my commitment of support is actually a little more than required.)

You could try it in 5.6 and I would not be so surprised if it works perfectly. But if it does not work it would be difficult for me to find out what went wrong for you.

By the way BIG NEWS about the next update :

  • The Android part is undergoing a big migration to OpenSL ES instead of AudioTrack. Unlike AudioTrack (which is built on top of OpenSL ES, with similar latency in my tests), it is one of the officially mentioned “high-performance audio” ways of playing audio: High-performance audio  |  Android NDK  |  Android Developers. It will be awesome. And being in the C part, Unity can invoke methods via extern as opposed to via AndroidJavaClass like what we currently have. (Speed!)

  • Additionally, I will go as far as resampling the audio file on the fly (we don’t know which device the player will use, but practically we can only ship one sample rate of audio) to match each device’s differing native sample rate (either 44100 Hz or 48000 Hz) so that the special “fast path” audio is enabled. Would be awesome for any music games out there. (But it adds some load time if resampling is required; that is the price to pay.)

About resampling quality, do not worry: instead of writing my own resampler, which would be super slow and sound bad, I will incorporate the impressive libsamplerate (Secret Rabbit Code). It has a very permissive BSD license that just requires you to include some attribution, not to open source your game or anything.

  • Even more, I will intentionally zero-pad the audio buffer so that it is a multiple of the “native buffer size” of each device, further reducing jitter when pushing data to the output stream. (A sketch of this padding follows below.)
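
As a rough illustration of that zero-padding idea (a hypothetical helper, not the actual plugin code): given the device’s native buffer size in frames, the decoded samples are extended with silence until they divide evenly:

```csharp
// Sketch: pad interleaved PCM samples with zeros so the total length is an exact
// multiple of the device's native buffer size. Hypothetical helper for illustration.
static float[] ZeroPadToBufferMultiple(float[] samples, int nativeBufferFrames, int channels)
{
    int samplesPerBuffer = nativeBufferFrames * channels;
    int remainder = samples.Length % samplesPerBuffer;
    if (remainder == 0) return samples; // already aligned, nothing to do

    var padded = new float[samples.Length + (samplesPerBuffer - remainder)];
    System.Array.Copy(samples, padded, samples.Length);
    return padded; // the trailing region stays zero (silence) by default in C#
}
```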

And now that I have got used to C programming on Android, this paves the way for AAudio, the new Google-developed native Android audio library, accessible from C like OpenSL ES but with even more potential to write data directly to an area very close to the audio unit.

It is only usable on Android Oreo (8.0, better on 8.1) or higher, so I will make it so that the player gets AAudio if they are on Oreo and OpenSL ES otherwise. (Read more : AAudio and MMAP  |  Android Open Source Project , AAudio  |  Android NDK  |  Android Developers)

This development actually happened because of a user request in the Discord channel Exceed7 Experiments, and it is great for my own game too, so I decided to start working on it. You can follow the discussion about delivering these features there, or tell me any feature you would like to have and we will see about the possibility.

Hey there,

I’m working on a rhythm game where I want to sync gameplay (player movement) with music. Every beat or multiple of beats, the ball hits the platform. I use dspTime for syncing. It works perfectly on Windows but Android is a disaster. The ball doesn’t hit the platform at the correct moment, and also because of dspTime latency the ball bouncing is very bad, it’s not smooth. Now I want to know: is using this native plugin useful for my game? How can it solve the dspTime latency and the latency in playing the music the first time?

Hello. (I am also making a rhythm game and this plugin is actually a core part of the game)

I do not recommend this plugin for music, since the requirement is that the native side cannot decompress and the file must be a .wav. That means 4 minutes of music could add 20 MB to your game. A good plan is to use normal Unity audio for compressed tracks like BGM (which you will then have to use various tricks to make “stick” to one of Unity’s time values as a reference; I will come back to this later) and use Native Audio only for response sounds, which are not large.

For example, a drumming game where you drum to the compressed BGM. You could get the fastest response by using Native Audio for the drum sound, while not costing much space. Native Audio is for solving the latency problem of playing short response sounds that you cannot predict will play or not, but that you want to play as fast as possible if they should play. By that I mean, for example, the drumming game will not play the drum sound if you did not hit the screen, and a game with coins to collect will not play the coin-collection sound if the player missed them. But if there is only a backing track in your game (which plays for sure) then the problem is solvable without Native Audio, with a fixed offset/calibration.

I will take this opportunity to write about the basics of music synchronization in rhythm games / music games.

The backing track problem

This is no longer specific to Native Audio but Unity in general. In a rhythm game, first you have to get the backing track to line up with the first “note” (what that is depends on your game) and the rest will stay correct UNLESS the game lags or the audio lags. 90% of the time the game lags and the audio goes ahead of the game, since the audio is no longer in the same processing unit as the game after the play command. That lag requires a separate resolution and I will not talk about it right now.

Audio in Unity is “fire and forget”. When you ask an AudioSource to play, it will take a variable amount of time and play whenever it feels like it. That can be immediately in the same frame, a bit later but still in the same frame, or in the next frame. You get the idea. It is no longer frame-dependent from the moment you call AudioSource.Play. And each device, especially on Android, has a different audio latency.

We cannot easily calculate the latency in-game, so the backing track problem on Android is usually solved by having the user calibrate it themselves, since each Android device has a different audio latency. If there is only a backing track and no response sound, then I think this is the best way to do it.

After the user gets the correct offset for the device, it is then your job to make the offset stay true, the same on every start of a play. Some rhythm games have problems even with manual calibration because on each restart of the game the offset is different. This is the programmer’s mistake and could make the user question whether they got the calibration right or not.

Starting the music precisely

As mentioned in the previous section, after your player has solved the device-specific latency for you, it is now your job to make that value hold every time. (Every restart; score-attack players will “retry” a lot.)

  1. Preload the audio accordingly.
  2. Immediate playing is not possible; the only solution is to use AudioSource.PlayScheduled, which can specify a precise point of time in the future. This method uses dspTime, so be mindful of where you read AudioSettings.dspTime, since the value may or may not change even between lines of code. The only thing to ensure is that the scheduled time is far enough in the future for the audio to “prepare”.
  3. Based on the future time you chose, start your game’s events as close to that time as possible. It is impossible to make a future frame land exactly on that time, so maybe the best option is to just use the first frame that comes after it. (This is similar to WaitForSeconds in StartCoroutine: it does not wait for exactly that duration, but possibly a bit more depending on where the frame lands.) A minimal sketch of this scheduling follows after this list.
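
Here is a minimal sketch of points 2 and 3. The 0.5-second lead time and the calibration offset are just example values, and the class and variable names are mine:

```csharp
using UnityEngine;

public class SongConductor : MonoBehaviour
{
    public AudioSource backingTrack;
    public double userCalibrationOffset; // from the player's calibration screen, in seconds

    double scheduledDspTime;

    public void StartSong()
    {
        // Read dspTime ONCE and reuse it; it can change even between lines of code.
        scheduledDspTime = AudioSettings.dspTime + 0.5; // 0.5 s of "prepare" time, just an example
        backingTrack.PlayScheduled(scheduledDspTime);
    }

    void Update()
    {
        // Negative until the track actually starts; the first frame where this is >= 0
        // is the closest frame to the scheduled start.
        double songTime = AudioSettings.dspTime - scheduledDspTime - userCalibrationOffset;
        // ...drive game events from songTime...
    }
}
```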

How to execute code the earliest in the frame

Given that dspTime and realtimeSinceStartup can change their values even between consecutive lines of code, it may be desirable to grab the value at the same point in code every frame, as early as possible, remember it, and use it with code that comes later in the frame.

In Unity this is a bit troublesome since Update runs a bit late even with Script Execution Order moved to the topmost. The earliest step is the “Initialization” step, but getting your code into that step is currently not easy.

  1. With the new experimental PlayerLoop API you can add custom code to that step. See http://beardphantom.com/ghost-stories/unity-2018-and-playerloop/
  2. With Unity’s new ECS/Entities package (get it from the Unity Package Manager) you can create a system with [UpdateBefore(typeof(Initialization))] to position its OnUpdate as early as possible.
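
For instance, a minimal sketch of the “grab it once, early” idea described above (the class name is mine; you would push this script as early as possible with one of the approaches listed):

```csharp
using UnityEngine;

// Sketch: cache dspTime at one fixed point per frame so the rest of the frame
// reads one consistent value. Push this script to run as early as possible
// (Script Execution Order, a custom PlayerLoop step, or an early ECS system).
public class FrameDspTime : MonoBehaviour
{
    public static double Value { get; private set; }

    void Update()
    {
        Value = AudioSettings.dspTime;
    }
}
```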

The response sound problem

With the backing track correct, the only problem left is if your game has any kind of response sound. Response sounds cannot be calibrated/compensated, so the best you can do is rely on a way to play the sound with the shortest latency possible.

This is finally what Native Audio tries to solve. Use it and get the most immediate playing possible. Note that immediate playing will always still be less accurate than a correct calibration, but calibration cannot work with response sounds, since you would have to move the sound earlier in time. (Unless you are a psychic and can predict that the player will surely hit the screen and activate the response sound.)
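
A rough usage sketch of that response-sound case (the exact loading call is an assumption on my part; the pointer-then-play pattern follows what is described later in this thread):

```csharp
using UnityEngine;

// Sketch: keep a loaded pointer around and play it the moment input happens.
// NativeAudio.Load is assumed here as the loading entry point; the file is a .wav
// placed in StreamingAssets, per the requirements discussed above.
public class DrumPad : MonoBehaviour
{
    NativeAudioPointer drumHit;

    void Start()
    {
        drumHit = NativeAudio.Load("DrumHit.wav"); // assumed call; load once up front
    }

    public void OnPadTouched()
    {
        drumHit.Play(); // response sound: play as immediately as the platform allows
    }
}
```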

Bonus : Synching with dspTime problem

For most rhythm games, getting the backing track accurate is enough. But what if you want to know where the audio is right now?

You want the real current audio time, as real-time as Time.realtimeSinceStartup. That API changes its value even across 2 consecutive lines of code, indicating that it is very real-time.

From my research, AudioSettings.dspTime and audioSource.time update in a separate, discrete step. Within the same frame, if you ask for the value it may or may not have changed, depending on whether that update step happened in between the lines of code or not. But across 2 consecutive lines of code it is very likely to stay the same, unlike Time.realtimeSinceStartup.
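
You can see the difference with a trivial experiment like this:

```csharp
using UnityEngine;

public class ClockComparison : MonoBehaviour
{
    void Update()
    {
        // Sketch: compare how the two clocks behave across consecutive lines of code.
        double dspA = AudioSettings.dspTime;
        float realA = Time.realtimeSinceStartup;
        double dspB = AudioSettings.dspTime;
        float realB = Time.realtimeSinceStartup;

        // dspB - dspA is usually exactly 0 within a frame (discrete update step),
        // while realB - realA is almost never 0 (truly real-time).
        Debug.Log($"dsp delta: {dspB - dspA}  realtime delta: {realB - realA}");
    }
}
```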

And now we come to the native time. In version 2.0 you can ask for the dspTime of audio played by Native Audio. Unfortunately I found that both Android and iOS report a time that also updates in discrete steps like dspTime. It seems all audio engines are like this and nothing is truly real-time.

There are differences though :

Android - the step is independent from Unity’s dsp step (AudioSettings.dspTime and audioSource.time). If those two change in between lines of code, the Android time may not change; if the Android time changes, those two do not necessarily have to change.

iOS - the time from OpenAL is surprisingly in the same lock step as AudioSettings.dspTime and audioSource.time, indicating that they internally use the same system. If one of them stays still, the rest also stay still.

On iOS you can see that the yellow and blue lines overlap often. The time from the native side often stays closer to, or even the same as, the current dspTime than the time asked from Unity’s audio source. It might be that, because there is less latency, the audio starts sooner and only the red line is delayed.

Anyway, this is the new GetPlaybackTime method and it will be in the next release, whether it turns out to be useful or not.


Hello everyone. I guess I have finally weeded out all the performance bugs and other bugs of Native Audio, so from now on it is entering the benchmarking phase! Soon I will be able to release it for real in the store.

As a teaser, this is the latency reduction we can expect from Native Audio 2.0.

(Yes… Unity’s AudioSource is THAT slow. I too did not feel it until I had Native Audio to compare with.)

From 323 ms to 79 ms. That means a 244 ms latency reduction (−75.54%)!!

And this phone is not old; the Mi A2 just came out. What you hear is the best latency you can get from the default Unity AudioSource (with Best Latency already selected in the Audio Settings panel).

Here’s a list of devices I own, and I will benchmark them thoroughly. The benchmark data will be publicly available, including the data before averaging and the recorded sound files.

The version 2.0 trailer is now online, just to show you how much latency Unity can add to your game on Android.
It works on iOS too, by the way!

Version 2.0.0 has been released in the store today! The store text and infographics have been updated thoroughly too. Check it out : Native Audio | Audio | Unity Asset Store

Moreover, some more benchmarks have been added to the website :

I am trying to get my hands on more popular devices and will add them soon. (That is, the Samsung Galaxy anything.)

Version 2.1.0 is underway :

[IOS] 2D PANNING The iOS backend, OpenAL, is a 3D positional audio engine. 2D panning is emulated by deinterleaving a stereo source audio into 2 mono sources, then adjusting their distances from the listener so that it sounds like 2D panning.
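
Conceptually, the deinterleaving step looks something like this (a hypothetical helper in C# just to illustrate the idea; the real work happens on the native side):

```csharp
// Sketch: split interleaved stereo samples [L, R, L, R, ...] into two mono buffers,
// which are then placed as two separate sources, one per ear.
static void Deinterleave(float[] stereo, out float[] left, out float[] right)
{
    int frames = stereo.Length / 2;
    left = new float[frames];
    right = new float[frames];
    for (int i = 0; i < frames; i++)
    {
        left[i] = stereo[2 * i];      // even indices are the left channel
        right[i] = stereo[2 * i + 1]; // odd indices are the right channel
    }
}
```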

[ALL PLATFORMS] PLAY ADJUSTMENT There is a new member playAdjustment in PlayOptions that you can use on each nativeAudioPointer.Play(). You can adjust volume and pan right away BEFORE the play. This is because I discovered that on iOS it is too late to adjust the volume immediately after play with NativeAudioController without briefly hearing the full volume.

Trade-offs : - Now iOS can play only 16 concurrent sounds instead of 32, because one stereo file now takes up 2 sources, one for each ear.

It is impossible to adjust only one channel of a stereo file on iOS to achieve a “balancing” effect (not “panning”, although the method says “pan” anyway).

BUG FIXES
Previously the Android panning that was supposed to work had no effect. Now it works, alongside the new iOS 2D panning. (I am sorry.)

Demo APK
Plus, I have added a demo APK to Native Audio - Unity Plugins by Exceed7 Experiments, near the release notes. I am not sure how I can make a demo for iOS; maybe a manual TestFlight invite, or uploading the entire Xcode project (200 MB, big and unwieldy).


Benchmark update
Thanks to my friend, I am able to get the benchmark of some of the very popular Samsung Galaxy S phones.

The S9+, the current Samsung flagship, is currently holding the record for the best pure Unity audio latency. No other device tested has been able to get sub-100 ms.

This might mean Unity’s added latency is highly CPU-bound. When using Native Audio, the device’s CPU matters less for playing audio. (Several very old phones have a time almost equal to the S9+.)

Hello,

The last version really improved Android audio, thanks !

Will you consider adding new native audio methods like :

  • Pause / Resume ← this one is a must-have !
  • Fade in / out
  • Play( float startPosition)

Thanks

Hello,

Pause / Resume, Play( float startPosition)

These 2 are surprisingly closely related. For the pause/resume design I have looked at SoundPool’s API design. In summary, both SoundPool and Native Audio govern and manipulate native AudioTracks in some way :
https://developer.android.com/reference/android/media/SoundPool

  • Play uses an audio ID as an argument; by some algorithm (instantiate a new track, or if that is not possible, overwrite the oldest track) a track is selected for that audio. It returns a track ID.
  • Pause/Resume/etc. require a track ID that you must keep, not an audio ID.

This means this sequence of unexpected behaviour is possible :

  • You play sound A; after a while you call Pause on the returned track ID.
  • You play sounds B C D E … so many times that all the tracks get recycled for other sounds.
  • When you use the stored track ID to call Resume, the sound will no longer be A. In fact, the “pause” state was already gone the moment the track with sound A got overwritten; Resume will do nothing.

I believe this is what SoundPool does. However, what if we fix it like this :

  • Pausing returns a completely unique “pause ID”; it contains an audio ID and the time at which it was paused.
  • On resume we start a completely new play (the track selection algorithm runs again) but not from the beginning.

The pause ID might add some complexity to the API, requiring a new class. (It cannot be the same as NativeAudioController, since that is a representation of an AudioTrack.)

Also, this requires the “Play() from any point in the audio” feature. I intend to add this function for sure.

Suppose that we can now play from any point. Currently there is a GetPlaybackTime function on NativeAudioController. If we use that to ask for the time, Stop it with that controller, then start a new play from that time, it would be equivalent to my redesign of Pause() and Resume(). So it is possible to emulate Pause and Resume with only a “play from any point” function. (Makes the API cleaner? But at the same time it feels like a hack around pause and resume.)
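
In other words, something like this. The return type of GetPlaybackTime and the way the offset would be passed back into Play are assumptions, since that part of the API does not exist yet at the time of writing:

```csharp
// Sketch: emulating Pause/Resume with GetPlaybackTime + Stop + "play from any point".
public class PauseEmulation
{
    float pausedAtSeconds;

    public void PauseLikeThis(NativeAudioController controller)
    {
        pausedAtSeconds = controller.GetPlaybackTime(); // remember where we were (seconds assumed)
        controller.Stop();                              // release the track immediately
    }

    public NativeAudioController ResumeLikeThis(NativeAudioPointer pointer)
    {
        // Start a completely new play (track selection runs again), but not from
        // the beginning. Passing pausedAtSeconds as the start offset is the assumed part.
        return pointer.Play(/* start offset = pausedAtSeconds */);
    }
}
```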

So I would like to ask: if you had to do the time keeping + start-from-time yourself instead of Pause/Resume, would that be comfortable? (Only Play( float startPosition) would be implemented.) Because SoundPool also can’t do an “intuitive pause” like Unity’s AudioSource.

Fade In/Out
For fade in/out, currently you can DIY with SetVolume called repeatedly on the returned NativeAudioController after play. In my opinion I don’t want to put time-dependent helper methods on the API.

Hi,

Pause/resume would be more convenient but the feature is so important that we could deal with time keeping + start from time :slight_smile:

Will this cut down on microphone input latency?

Unfortunately no, Native Audio currently works only on output. : (
OpenSL ES is definitely able to do input, but it is not implemented currently.

It is unlikely to be implemented soon as well, since there are still a lot of missing output functions waiting to be implemented (Pause/Resume/Loop/compressed ogg support/relaxing file format restrictions), and personally my game does not use audio input.

Some articles that mention native input :
https://developer.android.com/ndk/guides/audio/audio-latency#input-latency
https://developer.android.com/ndk/guides/audio/opensl/opensl-prog-notes#perform

@5argon so could the pause/resume feature using time keeping + start-from be planned? Can we expect that feature in the near future? :slight_smile:

Thanks

I decided to take a shot at it today; I will come back to you later. I expect it to be done within a week.

Additionally, you can PM me an invoice number, and you can have the version with that function early, before I submit it for Asset Store approval.

Ok, I got it done earlier than I thought. Android is a pain, but fortunately on iOS it’s quite easy with OpenAL.
I would like to take this opportunity to preview the release notes of Native Audio v2.2.0 :

[ALL PLATFORMS] TRACK’S PLAYHEAD MANIPULATION METHODS ADDED

  • NativeAudio.Play(playOptions) : Able to specify play offset in seconds via playAdjustment in the PlayOptions argument.
  • NativeAudioController : Added track-based pause, resume, get playback time, and set playback time (even while the track is playing). To pause and resume you can either use this track-based pause/resume, or use get playback time and store the value for a new Play(playOptions) later while Stop()-ing the track immediately, if you fear that the track’s audio content might be overwritten before you can resume.
  • NativeAudioPointer : Added a Length property. It contains the audio’s length in seconds, cached after loading.

This video demonstrates something only possible with these features. I have to take some time to check everything before submitting.

@Kiupe a beta of this version will be messaged to you soon.

Thanks !!!

Hello. A new experimental feature, Native Audio Generator, is currently in the works. It is initially for my own game; I am still deciding whether or not it is good enough to be in the release.

Some of you might have already made a kind of wrapper over Native Audio to map to your sounds or something similar. This feature will help you achieve that faster; you don’t even have to type any code.

By script generation it will create a C# file with all of your audio in a selectable subfolder of the StreamingAssets folder hard-coded (it supports one more level of subfolder as a “group”, as pictured). This is called NativeAudioLibrary.

…along with an asset file (ScriptableObject) to remember other settings for each sound, which you can modify. This file has to be in Resources, since a static variable will fetch it.

For instance, all the string paths are stored in there so there is no need to type them in code. (You don’t want to edit this.) Also, we can store a Volume value which will be multiplied automatically with the volume you can already use when playing. This way we can use that asset file as an individual sound mixer of sorts. You can then modify the code to include other things on each one as you like. Each one of these is called a NativeAudioObject. (Responsible for storing a loaded NativeAudioPointer inside.)


Using the generated script looks like this.

  • Calling Play() immediately on it is safe; it will load and keep the pointer automatically if it has not already.
  • In the generated script, group operations are available, such as loading/unloading every sound in a group. You can also still load individual sounds. (A hypothetical usage sketch follows below.)
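
As a hypothetical example of what calling the generated code could look like (the group name “UI” and the sound name “ButtonClick” are invented for this example, not actual generated output):

```csharp
using UnityEngine;

public class MenuSounds : MonoBehaviour
{
    void Start()
    {
        // Hypothetical: preload every sound in the "UI" group up front.
        NativeAudioLibrary.UI.LoadAll();
    }

    public void OnButtonClicked()
    {
        // Hypothetical: safe even without preloading; loads on first use, then plays.
        NativeAudioLibrary.UI.ButtonClick.Play();
    }
}
```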