WhisperX missing pieces of transcript compared to Whisper API #931

tomhayw · 2024-11-25T12:37:51Z

I've been using WhisperX, but I keep running into an issue where parts of the transcript are missing entirely (e.g. half of a sentence). I have run the same audio file through OpenAI's Whisper API and it works perfectly fine.

Has anyone else had this issue and if so, how did you remediate it?

Thanks.

sulutian · 2024-11-25T15:14:40Z

You can try lowering the VAD thresholds, e.g. --vad_onset 0.1 --vad_offset 0.1
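
To illustrate why lowering these two values can recover dropped speech, here is a toy sketch of onset/offset (hysteresis) thresholding over frame-level speech probabilities. It assumes the flags behave this way: a segment opens when the probability rises above the onset and closes when it falls below the offset. The function name `speech_segments` and the probability values are made up for illustration, and this is not WhisperX's actual code. The 0.5 / 0.363 pair is used as a stand-in for the defaults and should be checked against your installed version.

```python
from typing import List, Tuple


def speech_segments(probs: List[float], onset: float, offset: float) -> List[Tuple[int, int]]:
    """Toy hysteresis thresholding (hypothetical helper, not WhisperX code).

    Returns (start_frame, end_frame) pairs, end exclusive. A segment opens
    when the probability exceeds `onset` and closes when it drops below `offset`.
    """
    segments = []
    active = False
    start = 0
    for i, p in enumerate(probs):
        if not active and p > onset:
            active = True
            start = i
        elif active and p < offset:
            active = False
            segments.append((start, i))
    if active:
        # Speech was still active when the audio ended.
        segments.append((start, len(probs)))
    return segments


# Made-up probabilities: a clear word followed by quieter trailing speech, then silence.
probs = [0.05, 0.9, 0.8, 0.3, 0.25, 0.2, 0.3, 0.05, 0.05]

default_like = speech_segments(probs, onset=0.5, offset=0.363)
lowered = speech_segments(probs, onset=0.1, offset=0.1)

print(default_like)  # the quiet tail is cut off
print(lowered)       # the quiet tail is kept
```

With the higher thresholds the quieter tail of the utterance is never passed to the transcription model, which would match the "half of a sentence missing" symptom. The trade-off is that lower thresholds also let more non-speech audio through.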

klausackermann · 2024-12-23T08:42:08Z

I ran into this as well, and switching back to the large_v2 model instead of large_v3 gave much better results. large_v3 also produces hallucinations in the middle of the transcript, which large_v2 does not.
