Training Matcha for other languages: Fine-tuning HiFiGAN to specific speakers #124

Open
ibonsc opened this issue Dec 6, 2024 · Discussed in #120 · 1 comment

ibonsc commented Dec 6, 2024

Discussed in #120

Originally posted by ibonsc November 28, 2024
Hello,
I'm training Matcha for different languages, and I think it would be good to fine-tune the universal HiFiGAN vocoder on these new languages. Have you done such a thing?
To do that, I need the ground-truth-aligned spectrograms of the training material, which would be used as input for the HiFiGAN fine-tuning. Is there any way to obtain them?
Thank you very much for your help.

@shivammehta25 (Owner)

Yeah! Of course. What you can do is extract alignments from a trained model, using the first part of: https://github.com/shivammehta25/Matcha-TTS/wiki/Extracting-phoneme-alignments-and-improving-GPU-utilisation

Then, you would have to hack the inference code and use these alignments instead. One way I can suggest is to use the batched dataset in the CLI and also load the transcripts and durations:

Matcha-TTS/matcha/cli.py, lines 292 to 300 in 108906c:

```python
class BatchedSynthesisDataset(torch.utils.data.Dataset):
    def __init__(self, processed_texts):
        self.processed_texts = processed_texts

    def __len__(self):
        return len(self.processed_texts)

    def __getitem__(self, idx):
        return self.processed_texts[idx]
```
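
For illustration, here is a minimal sketch of how that dataset could be extended to also return the extracted durations. The class name, the `.npy` file layout, and the per-phoneme frame-count format are assumptions for this example, not something the repo ships:

```python
import numpy as np
import torch


class BatchedSynthesisWithDurationsDataset(torch.utils.data.Dataset):
    """Hypothetical variant of BatchedSynthesisDataset that also returns the
    ground-truth per-phoneme durations saved during alignment extraction."""

    def __init__(self, processed_texts, duration_paths):
        assert len(processed_texts) == len(duration_paths)
        self.processed_texts = processed_texts
        self.duration_paths = duration_paths

    def __len__(self):
        return len(self.processed_texts)

    def __getitem__(self, idx):
        item = self.processed_texts[idx]
        # Durations assumed to be stored as one .npy file of integer frame
        # counts per phoneme; adapt the loading to however you saved them.
        durations = torch.from_numpy(np.load(self.duration_paths[idx])).long()
        return item, durations
```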

Then, use these saved durations in place of the predicted durations `w`:

```python
w = torch.exp(logw) * x_mask
```
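
A rough sketch of what that substitution could look like inside the synthesis path. Everything except `w`, `logw`, and `x_mask` is a hypothetical name introduced for this example; adapt it to wherever your checkout computes the durations:

```python
# Inside the model's synthesise step, after the encoder has produced logw
# and x_mask (a hypothetical `durations` argument added for this example):
if durations is not None:
    # Use the ground-truth per-phoneme frame counts extracted earlier, so the
    # generated mels line up with the original audio for HiFiGAN fine-tuning.
    # Assumes durations is [batch, n_phonemes] and x_mask is [batch, 1, n_phonemes].
    w = durations.to(x_mask.dtype).unsqueeze(1) * x_mask
else:
    # Fall back to the durations predicted by the model.
    w = torch.exp(logw) * x_mask
```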

And it should be fine. Hope this helps, let me know if you have any more doubts.
