Hello guys,
I am thinking about making a TTS system for my game.
My plan is the following:
- record the required words
- give each recording a string ID
When something needs to speak, the system takes the input string, splits it into words, and looks up the registered AudioClips.
But my question is, how can I do something like this?
Should I play each part separately, make some kind of audio merge or…?
Thanks for any feedback.
If you want an actual TTS that can say anything and you don’t know where to start, you might want to use an existing library for this. There is a lot of freely usable stuff out there that is way better than what you or I could probably make, even together.
If it’s just simple preselected phrases, you need to index every option and every recording and couple them with a dictionary or something similar.
It depends on how many options you want to have. If it’s only a few, you can use something like this:
// One clip per action, assigned elsewhere (for example in the Inspector).
AudioClip followMeSound, fallBackSound, holdPositionSound;

// Returns the clip for the given action index, or null if there is none.
public AudioClip GetSoundForAction(int actionIndex){
    if(actionIndex == 0){
        return followMeSound;
    } else if(actionIndex == 1){
        return fallBackSound;
    } else if(actionIndex == 2){
        return holdPositionSound;
    } else {
        Debug.LogWarning("No sound for action " + actionIndex);
        return null;
    }
}
This kind of setup can deal with a low number of options relatively well, but once you get into double digits or more you’re going to want something more sophisticated.
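For a bigger word list, one possible approach is the dictionary idea mentioned above, combined with a coroutine that plays the clips one after another instead of merging them. This is only a rough sketch under some assumptions: the names WordSpeaker, WordEntry, entries and source are made up for illustration, words are assumed to be separated by spaces, and the AudioSource is assumed to play at normal pitch with normal time scale.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using UnityEngine;

// Hypothetical component: maps word IDs to recordings and plays them in order.
public class WordSpeaker : MonoBehaviour
{
    [Serializable]
    public struct WordEntry
    {
        public string id;      // string ID of the word, e.g. "follow"
        public AudioClip clip; // recording of that word
    }

    [SerializeField] private WordEntry[] entries; // filled in the Inspector
    [SerializeField] private AudioSource source;  // plays the clips

    // Case-insensitive lookup from word ID to clip.
    private readonly Dictionary<string, AudioClip> clips =
        new Dictionary<string, AudioClip>(StringComparer.OrdinalIgnoreCase);

    private void Awake()
    {
        // Unity does not serialize dictionaries, so build one from the array.
        foreach (WordEntry entry in entries)
        {
            clips[entry.id] = entry.clip;
        }
    }

    public void Speak(string sentence)
    {
        StartCoroutine(SpeakRoutine(sentence));
    }

    private IEnumerator SpeakRoutine(string sentence)
    {
        // Assumption: words are separated by single or repeated spaces.
        string[] words = sentence.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        foreach (string word in words)
        {
            if (!clips.TryGetValue(word, out AudioClip clip) || clip == null)
            {
                Debug.LogWarning("No recording for word '" + word + "'");
                continue; // skip unknown words instead of stopping
            }

            source.clip = clip;
            source.Play();

            // Wait for the clip to finish before starting the next word.
            yield return new WaitForSeconds(clip.length);
        }
    }
}
```

Calling Speak("follow me") would then play the "follow" and "me" recordings back to back. Note that calling Speak again while a sentence is still playing would make the two overlap, so you would probably want a queue or a StopAllCoroutines() call first. If the small gaps between words turn out to be audible, AudioSource.PlayScheduled with AudioSettings.dspTime might be worth a look, but I have not tested that here.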