Training Matcha-TTS on my own data: synthesized speech is silent. #122

zhaojingxin123 · 2024-12-01T09:18:44Z

I trained the matcha_tts model on speech from KeQing (a game character in GenShin), but the speech synthesized with the trained model is silent.
The synthesized output does have a mel spectrogram, though.
[Image: the synthesized mel spectrogram]

Has this ever happened to you? What do you think is the cause?
Thank you.

shivammehta25 · 2024-12-01T16:37:16Z

Could you try replacing the vocoder with BigVGAN or even Griffin-Lim for testing? I haven't faced this issue before; usually, HiFi-GAN works just fine for the audio I have tested it with.

shivammehta25 · 2024-12-02T15:08:29Z

Also, could you just verify that the sample rate of the input audio is 22050?

Matcha-TTS/configs/data/ljspeech.yaml

Line 13 in 108906c

sample_rate: 22050
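
One way to run this check over a whole dataset is sketched below. It is only a sketch: it assumes the training clips are PCM WAV files (Python's standard `wave` module raises `wave.Error` on other encodings), and the function names `wav_sample_rate` and `find_mismatched_files` are made up for illustration; they are not part of Matcha-TTS.

```python
import wave
from pathlib import Path

# Value of sample_rate in configs/data/ljspeech.yaml (quoted above).
EXPECTED_SAMPLE_RATE = 22050


def wav_sample_rate(path):
    """Return the sample rate stored in a PCM WAV file's header."""
    with wave.open(str(path), "rb") as wav_file:
        return wav_file.getframerate()


def find_mismatched_files(wav_dir, expected=EXPECTED_SAMPLE_RATE):
    """Return (filename, sample_rate) pairs for WAV files whose rate differs from `expected`.

    Hypothetical helper: searches `wav_dir` recursively for *.wav files.
    """
    mismatched = []
    for path in sorted(Path(wav_dir).rglob("*.wav")):
        rate = wav_sample_rate(path)
        if rate != expected:
            mismatched.append((path.name, rate))
    return mismatched
```

Calling `find_mismatched_files("path/to/wavs")` (the path is a placeholder) returns an empty list when every clip is already at 22050 Hz. Any file it does list would need to be resampled, or the config value changed to match the data.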
