When starting the training, there is no response #131

Open
EobardThawne721 opened this issue Dec 26, 2024 · 7 comments

Comments

@EobardThawne721

Based on the output below: when I start the training, nothing more is printed after the model parameters are shown; it stays stuck at the last line and nothing happens afterward. Why is that?

Also, no information shows up in TensorBoard.

[2024-12-26 15:33:23,763][matcha.utils.utils][INFO] - Enforcing tags! <cfg.extras.enforce_tags=True>
[2024-12-26 15:33:23,787][matcha.utils.utils][INFO] - Printing config tree with Rich! <cfg.extras.print_config=True>
CONFIG
├── data
│ └── _target_: matcha.data.text_mel_datamodule.TextMelDataModule
│ name: ljspeech
│ train_filelist_path: D:/Artificial Intelligence/python_project/Matcha-T
│ valid_filelist_path: D:/Artificial Intelligence/python_project/Matcha-T
│ batch_size: 32
│ num_workers: 20
│ pin_memory: true
│ cleaners:
│ - english_cleaners2
│ add_blank: true
│ n_spks: 1
│ n_fft: 1024
│ n_feats: 80
│ sample_rate: 22050
│ hop_length: 256
│ win_length: 1024
│ f_min: 0
│ f_max: 8000
│ data_statistics:
│ mel_mean: -5.517027
│ mel_std: 2.064394
│ seed: 1234
│ load_durations: false

├── model
│ └── _target_: matcha.models.matcha_tts.MatchaTTS
│ n_vocab: 178
│ n_spks: 1
│ spk_emb_dim: 64
│ n_feats: 80
│ data_statistics:
│ mel_mean: -5.517027
│ mel_std: 2.064394
│ out_size: null
│ prior_loss: true
│ use_precomputed_durations: false
│ encoder:
│ encoder_type: RoPE Encoder
│ encoder_params:
│ n_feats: 80
│ n_channels: 192
│ filter_channels: 768
│ filter_channels_dp: 256
│ n_heads: 2
│ n_layers: 6
│ kernel_size: 3
│ p_dropout: 0.1
│ spk_emb_dim: 64
│ n_spks: 1
│ prenet: true
│ duration_predictor_params:
│ filter_channels_dp: 256
│ kernel_size: 3
│ p_dropout: 0.1
│ decoder:
│ channels:
│ - 256
│ - 256
│ dropout: 0.05
│ attention_head_dim: 64
│ n_blocks: 1
│ num_mid_blocks: 2
│ num_heads: 2
│ act_fn: snakebeta
│ cfm:
│ name: CFM
│ solver: euler
│ sigma_min: 0.0001
│ optimizer:
│ _target_: torch.optim.Adam
│ _partial_: true
│ lr: 0.0001
│ weight_decay: 0.0

├── callbacks
│ └── model_checkpoint:
│ _target_: lightning.pytorch.callbacks.ModelCheckpoint
│ dirpath: D:\Artificial Intelligence\python_project\Matcha-TTS-main\lo
│ filename: checkpoint_{epoch:03d}
│ monitor: epoch
│ verbose: false
│ save_last: true
│ save_top_k: 10
│ mode: max
│ auto_insert_metric_name: true
│ save_weights_only: false
│ every_n_train_steps: null
│ train_time_interval: null
│ every_n_epochs: 100
│ save_on_train_epoch_end: null
│ model_summary:
│ _target_: lightning.pytorch.callbacks.RichModelSummary
│ max_depth: 3
│ rich_progress_bar:
│ _target_: lightning.pytorch.callbacks.RichProgressBar

├── logger
│ └── tensorboard:
│ _target_: lightning.pytorch.loggers.tensorboard.TensorBoardLogger
│ save_dir: D:\Artificial Intelligence\python_project\Matcha-TTS-main\l
│ name: null
│ log_graph: false
│ default_hp_metric: true
│ prefix: ''

├── trainer
│ └── _target_: lightning.pytorch.trainer.Trainer
│ default_root_dir: D:\Artificial Intelligence\python_project\Matcha-TTS-
│ max_epochs: -1
│ accelerator: gpu
│ devices:
│ - 0
│ precision: 16-mixed
│ check_val_every_n_epoch: 1
│ deterministic: false
│ gradient_clip_val: 5.0

├── paths
│ └── root_dir: D:\Artificial Intelligence\python_project\Matcha-TTS-main
│ data_dir: D:\Artificial Intelligence\python_project\Matcha-TTS-main/dat
│ log_dir: D:\Artificial Intelligence\python_project\Matcha-TTS-main/logs
│ output_dir: D:\Artificial Intelligence\python_project\Matcha-TTS-main\l
│ work_dir: D:\Artificial Intelligence\python_project\Matcha-TTS-main\mat

├── extras
│ └── ignore_warnings: false
│ enforce_tags: true
│ print_config: true

├── task_name
│ └── train
├── run_name
│ └── ljspeech
├── tags
│ └── ['ljspeech']
├── train
│ └── True
├── test
│ └── True
├── ckpt_path
│ └── None
└── seed
└── 1234
Global seed set to 1234
[2024-12-26 15:33:24,106][__main__][INFO] - Instantiating datamodule <matcha.data.text_mel_datamodule.TextMelDataModule>
[2024-12-26 15:33:27,319][__main__][INFO] - Instantiating model <matcha.models.matcha_tts.MatchaTTS>
D:\anaconda\envs\matcha-tts\lib\site-packages\diffusers\models\lora.py:393: FutureWarning: LoRACompatibleLinear is deprecated and will be removed in version 1.0.0. Use of LoRACompatibleLinear is deprecated. Please switch to PEFT backend by installing PEFT: pip install peft.
deprecate("LoRACompatibleLinear", "1.0.0", deprecation_message)
[2024-12-26 15:33:28,402][__main__][INFO] - Instantiating callbacks...
[2024-12-26 15:33:28,403][matcha.utils.instantiators][INFO] - Instantiating callback <lightning.pytorch.callbacks.ModelCheckpoint>
[2024-12-26 15:33:28,410][matcha.utils.instantiators][INFO] - Instantiating callback <lightning.pytorch.callbacks.RichModelSummary>
[2024-12-26 15:33:28,414][matcha.utils.instantiators][INFO] - Instantiating callback <lightning.pytorch.callbacks.RichProgressBar>
[2024-12-26 15:33:28,419][__main__][INFO] - Instantiating loggers...
[2024-12-26 15:33:28,420][matcha.utils.instantiators][INFO] - Instantiating logger <lightning.pytorch.loggers.tensorboard.TensorBoardLogger>
[2024-12-26 15:33:28,428][__main__][INFO] - Instantiating trainer <lightning.pytorch.trainer.Trainer>
Using 16bit Automatic Mixed Precision (AMP)
Trainer already configured with model summary callbacks: [<class 'lightning.pytorch.callbacks.rich_model_summary.RichModelSummary'>]. Skipping setting a default ModelSummary callback.
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
[2024-12-26 15:33:28,590][__main__][INFO] - Logging hyperparameters!
[2024-12-26 15:33:28,932][__main__][INFO] - Starting training!
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
┌────┬───────────────────────────────────┬───────────────────┬────────┐
│ │ Name │ Type │ Params │
├────┼───────────────────────────────────┼───────────────────┼────────┤
│ 0 │ encoder │ TextEncoder │ 7.2 M │
│ 1 │ encoder.emb │ Embedding │ 34.2 K │
│ 2 │ encoder.prenet │ ConvReluNorm │ 591 K │
│ 3 │ encoder.prenet.conv_layers │ ModuleList │ 553 K │
│ 4 │ encoder.prenet.norm_layers │ ModuleList │ 1.2 K │
│ 5 │ encoder.prenet.relu_drop │ Sequential │ 0 │
│ 6 │ encoder.prenet.proj │ Conv1d │ 37.1 K │
│ 7 │ encoder.encoder │ Encoder │ 6.2 M │
│ 8 │ encoder.encoder.drop │ Dropout │ 0 │
│ 9 │ encoder.encoder.attn_layers │ ModuleList │ 889 K │
│ 10 │ encoder.encoder.norm_layers_1 │ ModuleList │ 2.3 K │
│ 11 │ encoder.encoder.ffn_layers │ ModuleList │ 5.3 M │
│ 12 │ encoder.encoder.norm_layers_2 │ ModuleList │ 2.3 K │
│ 13 │ encoder.proj_m │ Conv1d │ 15.4 K │
│ 14 │ encoder.proj_w │ DurationPredictor │ 345 K │
│ 15 │ encoder.proj_w.drop │ Dropout │ 0 │
│ 16 │ encoder.proj_w.conv_1 │ Conv1d │ 147 K │
│ 17 │ encoder.proj_w.norm_1 │ LayerNorm │ 512 │
│ 18 │ encoder.proj_w.conv_2 │ Conv1d │ 196 K │
│ 19 │ encoder.proj_w.norm_2 │ LayerNorm │ 512 │
│ 20 │ encoder.proj_w.proj │ Conv1d │ 257 │
│ 21 │ decoder │ CFM │ 11.0 M │
│ 22 │ decoder.estimator │ Decoder │ 11.0 M │
│ 23 │ decoder.estimator.time_embeddings │ SinusoidalPosEmb │ 0 │
│ 24 │ decoder.estimator.time_mlp │ TimestepEmbedding │ 1.2 M │
│ 25 │ decoder.estimator.down_blocks │ ModuleList │ 3.1 M │
│ 26 │ decoder.estimator.mid_blocks │ ModuleList │ 2.8 M │
│ 27 │ decoder.estimator.up_blocks │ ModuleList │ 3.7 M │
│ 28 │ decoder.estimator.final_block │ Block1D │ 197 K │
│ 29 │ decoder.estimator.final_proj │ Conv1d │ 20.6 K │
└────┴───────────────────────────────────┴───────────────────┴────────┘
Trainable params: 18.2 M
Non-trainable params: 0
Total params: 18.2 M
Total estimated model params size (MB): 72


@EobardThawne721 (Author)

Is there anything else I need to change? I followed the steps in the README.

@shivammehta25 (Owner)

I am not sure what the reason could be; I could not reproduce it. Could you update your CUDA version, create a new environment, and retry?
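For example, something along these lines (a sketch assuming the conda/pip flow from the README; the Python version and CUDA build are examples, not requirements):

```bash
# Hypothetical fresh environment; pick versions matching your setup.
conda create -n matcha-tts-new python=3.10 -y
conda activate matcha-tts-new
# Install a PyTorch build matching your CUDA driver (cu121 is just an example):
pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu121
# Install Matcha-TTS from your source checkout:
pip install -e .
```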

@shivammehta25 (Owner)

Are you seeing real-time console logs, or are these logs from a .log file? I know the Rich progress bar can have issues when its output is redirected to a .log file.
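If the Rich bar turns out to be the culprit, one thing to try is dropping that callback. A sketch using a Hydra command-line override (the entry point is the one from the README; the `~callbacks.rich_progress_bar` delete syntax assumes the callback group names shown in the config tree above):

```bash
# Remove the Rich progress bar callback for this run;
# Lightning then falls back to its default plain TQDM progress bar on stdout.
python matcha/train.py experiment=ljspeech ~callbacks.rich_progress_bar
```

Alternatively, configuring lightning.pytorch.callbacks.TQDMProgressBar in place of RichProgressBar writes plain-text progress, which tends to survive redirection to .log files better.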

@EobardThawne721 (Author)

Thank you for your reply. I found that during training neither the real-time console nor the log files output any information, yet training is in fact running. Could this be caused by a conflict in the packages that print this information? The model is still saved normally, but nothing is printed.
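One way to confirm that training is progressing despite the silent console is to read the TensorBoard event files directly. A minimal sketch (the path is a placeholder for whatever run directory the TensorBoardLogger above creates under paths.log_dir):

```python
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

# Placeholder path: point this at the run directory created by TensorBoardLogger.
ea = EventAccumulator("logs/.../tensorboard")
ea.Reload()
# Print the latest value of every scalar that has been logged so far.
for tag in ea.Tags()["scalars"]:
    last = ea.Scalars(tag)[-1]
    print(tag, "step:", last.step, "value:", last.value)
```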

@EobardThawne721 (Author)


I'm not sure whether it's caused by a conflict among the packages that do the printing.

@shivammehta25 (Owner)

shivammehta25 commented Jan 8, 2025

You can just launch TensorBoard and also get more information: tensorboard --logdir logs
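With the TensorBoardLogger config above, the event files land under paths.log_dir, so from the project root something like this should work (the port is just an example):

```bash
tensorboard --logdir logs --port 6006
# then open http://localhost:6006 in a browser
```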

@EobardThawne721 (Author)

> You can just launch TensorBoard and also get more information: tensorboard --logdir logs

Anyway, thank you for your contribution and the timely response. Perhaps it was a package conflict.
