How to reduce latency with GPT & Unity Requests.

Hi, I want to develop a real-time AI assistant with Unity. I’m using the models below:

Whisper
GPT 4o
TTS-1

But the response time is too long: 5-20 seconds.

Can I use the API behind Voice Mode in GPT? Are there any ways to reduce latency?
Thank you!

The response times are what they are. Note that OpenAI applies rate limiting to its GPT API, even for paying customers.

This isn’t actually Unity-related though. You should take note that Unity has Muse Chat, which provides more suitable responses for all things Unity-related. If I recall correctly, we will get to see this integrated in the editor.


Only to a very limited degree, by having a faster connection between the user’s device and the servers. AI-based technologies take time to generate results, and the companies apply their own rate limits to keep the load down on their servers.
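To see where the 5-20 seconds actually go, it may help to time each stage of the speech-to-text, chat, and text-to-speech chain separately, since a sequential pipeline's total latency is roughly the sum of its stages. The sketch below is only an illustration in Python rather than Unity C#: `time_pipeline` and the three `fake_*` functions are hypothetical stand-ins that sleep instead of calling any real API, so the numbers it prints mean nothing beyond showing the measurement approach.

```python
import time
from typing import Callable, Dict, List, Tuple

Stage = Tuple[str, Callable[[object], object]]


def time_pipeline(stages: List[Stage], payload: object) -> Tuple[object, Dict[str, float]]:
    """Run the stages one after another and record seconds spent in each."""
    timings: Dict[str, float] = {}
    for name, stage in stages:
        start = time.perf_counter()
        payload = stage(payload)  # output of one stage feeds the next
        timings[name] = time.perf_counter() - start
    # The stages run sequentially, so the total is the sum of the parts.
    timings["total"] = sum(timings.values())
    return payload, timings


# Hypothetical stand-ins for the real network calls (assumptions, not real APIs).
def fake_transcribe(audio: object) -> object:
    time.sleep(0.01)  # pretend speech-to-text round trip
    return "hello"


def fake_chat(text: object) -> object:
    time.sleep(0.02)  # pretend chat-completion round trip
    return str(text).upper()


def fake_speak(text: object) -> object:
    time.sleep(0.01)  # pretend text-to-speech round trip
    return str(text).encode("utf-8")


if __name__ == "__main__":
    stages = [("stt", fake_transcribe), ("chat", fake_chat), ("tts", fake_speak)]
    _, timings = time_pipeline(stages, b"raw-audio")
    for name, seconds in timings.items():
        print(f"{name}: {seconds * 1000:.1f} ms")
```

Swapping the stand-ins for the real requests would show which of the three stages dominates, which is the part worth looking at first.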
