Hello.
I’m researching ways to integrate one of the leading voice modification tools with Vivox for Unity, and I’m quite disappointed that it still doesn’t provide full access to the underlying audio buffers right after capture and before the audio data is sent to the server (which is essential for voice modification on the sender side). The audio taps feature mentioned in this closed topic seems to work only on the “receiving” side: it just provides a copy of various parts of the data (participant, channel or self) in addition to Vivox sending it to the output. All this despite the first message of that topic declaring: “Add an audio effect that modifies the voice (for example, a robot voice, radio distortion, or static)”.
On the other hand, the low-level Vivox Core SDK and the Vivox Unreal SDK do provide access to it (pf_on_audio_unit_after_capture_audio_read). Maybe I’m missing something? Or, if it’s really not implemented yet, is it possible to implement it ASAP? We are interested in even the lowest possible level of implementation, like setting a callback on the VivoxConfigurationOptions object.
Capture-side integration with Audio Sources wasn’t included in the implementation that was released due to limitations within the audio subsystem. These could lead to very noticeable hangs in an application, and rather than expose users to that scenario we made the decision to hold back capture. When that risk is mitigated, we’re eager to add capture support to the Audio Source integration.
This year we plan on releasing an audio engine agnostic plugin for all flavors of the SDK that replicates the buffer access available in Core in a way that is more easily used than what is needed to use the existing low level API.
Thanks for clarifying on this, Nick.
Your worries about performance: do they apply more to how to bridge the capture audio buffer to the C# side so that it is represented in a similar way to the other Audio Taps (an Audio Source component that internally manages its Audio Clip to feed audio)? Or are there worries about the performance of this API even at the low level (Vivox C++ code)?
I’m asking that because of two factors:
- I actually found a way to set up this callback. It is a bit hacky but not too hacky, i.e. no “heavy artillery” like IL code weaving or binary patching is needed.
It only involves some reflection hacking into the Vivox C# wrapper (the VivoxConfigurationOptions class): it manipulates the internal C# wrapper class over the C++ struct vx_sdk_config_t to set the pf_on_audio_unit_after_capture_audio_read property to a pointer to our own C++ callback function. The pointer to the C++ function (an IntPtr) is requested from our own C++ DLL.
Sure, it is fragile and may break with the next Vivox Unity SDK updates, especially with the update you promise, but currently it works, even in IL2CPP build mode and with high levels of script optimization (it needs some script optimization excludes to be tweaked in link.xml). And at this point it works the same way as it would in the Vivox Core SDK, because in this scenario we use C# code only for setting up the callback; after that it is pure C++ ⇆ C++ interaction.
Unfortunately I cannot share sample code now, because I’m bound by an NDA with the company for which I’m implementing this, but I hope you get the idea. Maybe I can share it later, when it gets close to release.
- We cannot wait for this update and are eager to try it in its existing form. Of course we will be actively monitoring updates, and when this update gets released, we will re-evaluate this approach.
Anyway, many thanks for paying attention to this part of the plugin.
The issue arises from an interaction between OnAudioFilterRead and the garbage collector rather than from the SDK itself. It can occur with a high number of objects, and because we can’t guarantee how users will manage their objects, we decided that the potential risk was more than what we wanted to expose users to.
The performance issue we decided to avoid is very undesirable. The best implementation for user experience would have attached the garbage collector to the audio thread from just one usage of the capture modifier. Large garbage collection pauses would then put the audio thread to sleep, causing audible discontinuities in all audio.
I’m glad you found a way to make the native callback work. That is really the best option in the short term. It’s feasible that we could provide similar access in C# like you’ve gotten working, but it wouldn’t come anytime soon. Knowing the details of what you discovered could be helpful, but no rush!
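For anyone reading along, here is a rough sketch of what the native half of the approach discussed above could look like: a C++ DLL that exposes a function pointer for the C# side to fetch as an IntPtr, and a capture callback that modifies the PCM buffer in place (a simple ring modulation as a stand-in for a “robot voice”). This is not the poster’s code. The names `ring_modulate`, `on_after_capture_audio_read`, `get_capture_callback` and `VOICEFX_EXPORT` are made up for illustration, the 30 Hz carrier is arbitrary, and the callback’s parameter list is an assumption that must be checked against the `pf_on_audio_unit_after_capture_audio_read` declaration in the SDK headers you actually ship with.

```cpp
#include <algorithm>
#include <cmath>

#if defined(_WIN32)
#define VOICEFX_EXPORT __declspec(dllexport)
#else
#define VOICEFX_EXPORT __attribute__((visibility("default")))
#endif

// Multiplies interleaved 16-bit PCM by a sine carrier, in place.
// `phase` is carried across calls so the carrier stays continuous.
void ring_modulate(short* pcm, int frame_count, int frame_rate,
                   int channels, double carrier_hz, double& phase) {
    if (pcm == nullptr || frame_count <= 0 || frame_rate <= 0 || channels <= 0) {
        return;  // nothing sensible to do; leave the buffer untouched
    }
    const double two_pi = 6.283185307179586;
    const double step = two_pi * carrier_hz / frame_rate;
    for (int f = 0; f < frame_count; ++f) {
        const double gain = std::sin(phase);
        for (int c = 0; c < channels; ++c) {
            short& sample = pcm[f * channels + c];
            long scaled = std::lround(sample * gain);
            // Clamp: -32768 * -1.0 would not fit in a 16-bit sample.
            scaled = std::max(-32768L, std::min(32767L, scaled));
            sample = static_cast<short>(scaled);
        }
        phase += step;
        if (phase >= two_pi) phase -= two_pi;
    }
}

extern "C" {

// ASSUMPTION: this parameter list is a guess at the callback shape;
// copy the real one from the SDK header before using anything like this.
typedef void (*capture_callback_t)(void* callback_handle,
                                   const char* session_group_handle,
                                   const char* initial_target_uri,
                                   short* pcm_frames,
                                   int pcm_frame_count,
                                   int audio_frame_rate,
                                   int channels_per_frame);

// Runs on the capture path: no allocation, no locks, no managed code.
void on_after_capture_audio_read(void*, const char*, const char*,
                                 short* pcm_frames, int pcm_frame_count,
                                 int audio_frame_rate, int channels_per_frame) {
    static double phase = 0.0;  // assumes a single capture thread
    ring_modulate(pcm_frames, pcm_frame_count, audio_frame_rate,
                  channels_per_frame, 30.0, phase);
}

// Hypothetical export: the C# side would P/Invoke this and store the
// returned pointer in the SDK config via reflection.
VOICEFX_EXPORT capture_callback_t get_capture_callback() {
    return &on_after_capture_audio_read;
}

}  // extern "C"
```

Keeping the effect in a plain function that takes the phase by reference makes it testable without the SDK, and keeping everything native after setup sidesteps the garbage collector concern raised above, since no managed code runs on the audio path.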