We’re developing an educational game focused on building teamwork and communication. The game includes a voice chat component, and we’d like to analyze how players communicate based on their chat conversations.
To support this analysis, we’d like to use speech-to-text functionality so that we can run further natural language processing on the resulting text.
It seems that most Unity developers who do speech recognition have used Pocketsphinx successfully. However, has anyone tried using Kaldi instead, with the processing done on a server backend? If so, what was the development experience like, and did it work well?
Basics about our game environment:
Linux server backend on AWS, running Java + Groovy + SmartFox
Unity standalone game client, which could run on a variety of operating systems at this point, but not on mobile devices