Originally posted by ibonsc November 28, 2024
Hello
I'm training Matcha for different languages, and I think it would be good to fine-tune the universal HiFiGAN vocoder on these new languages. Have you done anything like this?
To do that, I need ground-truth-aligned spectrograms of the training material, which would be used as input for the HiFiGAN fine-tuning. Is there any way to get them?
Thank you very much for your help.
Then, you would have to hack the inference code and use these alignments instead. One approach I can suggest is to use the batched dataset in the CLI and also load the transcripts and durations.
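To make "use these alignments instead" a bit more concrete, here is a minimal sketch of how per-token durations can be expanded into a hard monotonic alignment, so that the generated spectrogram has the same number of frames as the ground-truth audio. This is plain Python, not Matcha-TTS code: the function name `durations_to_alignment` is hypothetical, and it assumes the durations are integer frame counts, one per input token.

```python
from typing import List


def durations_to_alignment(durations: List[int]) -> List[List[int]]:
    """Expand per-token frame counts into a 0/1 alignment matrix.

    Hypothetical helper, not part of any library. The result has shape
    (num_tokens, num_frames); token i owns the next durations[i]
    consecutive frames, so every frame maps to exactly one token.
    """
    if any(d < 0 for d in durations):
        raise ValueError("durations must be non-negative")

    num_frames = sum(durations)
    alignment = [[0] * num_frames for _ in durations]

    start = 0
    for token_idx, dur in enumerate(durations):
        for frame in range(start, start + dur):
            alignment[token_idx][frame] = 1
        start += dur
    return alignment


# Three tokens lasting 2, 1 and 3 frames give a 3 x 6 alignment.
alignment = durations_to_alignment([2, 1, 3])
for row in alignment:
    print(row)
```

With an alignment like this fixed in advance, the model no longer predicts its own durations at inference time, which is what keeps the output frames in step with the recorded audio used as the vocoder target.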
Discussed in #120