According to OpenAI, you can prompt Whisper so that it outputs alternative words or phrasing, for example Aimee instead of Amy, or Shawn instead of Sean.
Is there a way to support this when running Whisper in Sentis?
Thanks,
Ray
Hi, Ray!
According to the Whisper paper, you can prepend prompt tokens before the SOT token.
For example, the token sequence can consist of:
<|startofprev|> + prompt_tokens + SOT + language tag + transcribe + no timestamps + …
The <|startofprev|> token ID is 50361, and prompt_tokens can be generated from text using a BPE (byte-pair encoding) tokenizer. You can refer to the BPE implementation in the Phi1.5 sample in our Hugging Face repository.
According to the paper, the prompt is limited to 224 tokens. It also works differently from GPT's regular prompting style: this method can only be used for grammar correction, style modification, and word alternatives.
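The token layout above can be sketched in a few lines of Python. This is only an illustration of the ordering, not Sentis code: the helper name build_decoder_prefix is hypothetical, every token ID other than 50361 is an assumption about the multilingual Whisper vocabulary, and keeping the last 224 prompt tokens when the prompt is too long is also an assumption. Check the IDs against the tokenizer of the model you actually load.

```python
from typing import List, Sequence

START_OF_PREV = 50361  # <|startofprev|>, the value quoted in this thread

# Assumed IDs for the multilingual Whisper vocabulary; verify before use.
SOT = 50258            # <|startoftranscript|>
LANG_EN = 50259        # <|en|> language tag
TRANSCRIBE = 50359     # <|transcribe|>
NO_TIMESTAMPS = 50363  # <|notimestamps|>

MAX_PROMPT_TOKENS = 224  # prompt limit mentioned above


def build_decoder_prefix(
    prompt_tokens: Sequence[int],
    language_token: int = LANG_EN,
) -> List[int]:
    """Return <|startofprev|> + prompt + SOT + language + task + no-timestamps."""
    # Assumption: when the prompt is too long, keep only its last tokens.
    prompt = list(prompt_tokens)[-MAX_PROMPT_TOKENS:]

    prefix: List[int] = []
    if prompt:
        # Only emit <|startofprev|> when there is a prompt to attach to it.
        prefix.append(START_OF_PREV)
        prefix.extend(prompt)

    prefix.extend([SOT, language_token, TRANSCRIBE, NO_TIMESTAMPS])
    return prefix
```

The prompt_tokens argument would be the BPE-encoded IDs of a string such as "Aimee, Shawn". The resulting list is what you would feed to the decoder as its initial tokens before generating the transcript.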
Thanks,
Sky.