When starting the training, there is no response #131
Comments
Is there anything else I need to change? I followed the steps in the README.
I am not sure what could be causing it, and I could not reproduce it. Could you update your CUDA version, create a new environment, and then retry?
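In case it helps, recreating the environment from scratch could look roughly like this. This is a minimal sketch assuming a conda-based setup like the one in the README; the Python version and the editable install from the repository root are assumptions, so adjust them to your setup.

# remove the old environment and start fresh
conda deactivate
conda remove -n matcha-tts --all -y

# create a clean environment and reinstall Matcha-TTS from the repository root
conda create -n matcha-tts python=3.10 -y
conda activate matcha-tts
pip install -e .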
Are you seeing real-time console logs, or are these logs from a .log file? I know that the Rich progress bar can have some issues when logging into a .log file.
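If those lines came from a .log file, it may be worth running the training directly in an interactive terminal so the Rich progress bar can render in real time. A rough example, assuming the entry point and experiment name match the README and the config tree below:

# run training in an interactive terminal so the Rich progress bar updates in place
python matcha/train.py experiment=ljspeech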
You can also just launch TensorBoard to get more information:
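A typical invocation would be along these lines; the exact log directory is an assumption based on the paths.log_dir and run_name values shown in the config tree below.

# point TensorBoard at the Lightning log directory, then open http://localhost:6006
tensorboard --logdir logs/train/ljspeech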
Anyway, thank you for your contribution and timely response. Perhaps there was a package conflict.
Based on the information shown below, why is it that when I restart the training, no further information is displayed? It stays stuck at the last line: only the model parameters are printed, and nothing happens afterward.
Also, there is no information displayed in TensorBoard.
[2024-12-26 15:33:23,763][matcha.utils.utils][INFO] - Enforcing tags! <cfg.extras.enforce_tags=True>
[2024-12-26 15:33:23,787][matcha.utils.utils][INFO] - Printing config tree with Rich! <cfg.extras.print_config=True>
CONFIG
├── data
│ └── _target_: matcha.data.text_mel_datamodule.TextMelDataModule
│ name: ljspeech
│ train_filelist_path: D:/Artificial Intelligence/python_project/Matcha-T
│ valid_filelist_path: D:/Artificial Intelligence/python_project/Matcha-T
│ batch_size: 32
│ num_workers: 20
│ pin_memory: true
│ cleaners:
│ - english_cleaners2
│ add_blank: true
│ n_spks: 1
│ n_fft: 1024
│ n_feats: 80
│ sample_rate: 22050
│ hop_length: 256
│ win_length: 1024
│ f_min: 0
│ f_max: 8000
│ data_statistics:
│ mel_mean: -5.517027
│ mel_std: 2.064394
│ seed: 1234
│ load_durations: false
│
├── model
│ └── _target_: matcha.models.matcha_tts.MatchaTTS
│ n_vocab: 178
│ n_spks: 1
│ spk_emb_dim: 64
│ n_feats: 80
│ data_statistics:
│ mel_mean: -5.517027
│ mel_std: 2.064394
│ out_size: null
│ prior_loss: true
│ use_precomputed_durations: false
│ encoder:
│ encoder_type: RoPE Encoder
│ encoder_params:
│ n_feats: 80
│ n_channels: 192
│ filter_channels: 768
│ filter_channels_dp: 256
│ n_heads: 2
│ n_layers: 6
│ kernel_size: 3
│ p_dropout: 0.1
│ spk_emb_dim: 64
│ n_spks: 1
│ prenet: true
│ duration_predictor_params:
│ filter_channels_dp: 256
│ kernel_size: 3
│ p_dropout: 0.1
│ decoder:
│ channels:
│ - 256
│ - 256
│ dropout: 0.05
│ attention_head_dim: 64
│ n_blocks: 1
│ num_mid_blocks: 2
│ num_heads: 2
│ act_fn: snakebeta
│ cfm:
│ name: CFM
│ solver: euler
│ sigma_min: 0.0001
│ optimizer:
│ _target_: torch.optim.Adam
│ _partial_: true
│ lr: 0.0001
│ weight_decay: 0.0
│
├── callbacks
│ └── model_checkpoint:
│ _target_: lightning.pytorch.callbacks.ModelCheckpoint
│ dirpath: D:\Artificial Intelligence\python_project\Matcha-TTS-main\lo
│ filename: checkpoint_{epoch:03d}
│ monitor: epoch
│ verbose: false
│ save_last: true
│ save_top_k: 10
│ mode: max
│ auto_insert_metric_name: true
│ save_weights_only: false
│ every_n_train_steps: null
│ train_time_interval: null
│ every_n_epochs: 100
│ save_on_train_epoch_end: null
│ model_summary:
│ _target_: lightning.pytorch.callbacks.RichModelSummary
│ max_depth: 3
│ rich_progress_bar:
│ _target_: lightning.pytorch.callbacks.RichProgressBar
│
├── logger
│ └── tensorboard:
│ _target_: lightning.pytorch.loggers.tensorboard.TensorBoardLogger
│ save_dir: D:\Artificial Intelligence\python_project\Matcha-TTS-main\l
│ name: null
│ log_graph: false
│ default_hp_metric: true
│ prefix: ''
│
├── trainer
│ └── _target_: lightning.pytorch.trainer.Trainer
│ default_root_dir: D:\Artificial Intelligence\python_project\Matcha-TTS-
│ max_epochs: -1
│ accelerator: gpu
│ devices:
│ - 0
│ precision: 16-mixed
│ check_val_every_n_epoch: 1
│ deterministic: false
│ gradient_clip_val: 5.0
│
├── paths
│ └── root_dir: D:\Artificial Intelligence\python_project\Matcha-TTS-main
│ data_dir: D:\Artificial Intelligence\python_project\Matcha-TTS-main/dat
│ log_dir: D:\Artificial Intelligence\python_project\Matcha-TTS-main/logs
│ output_dir: D:\Artificial Intelligence\python_project\Matcha-TTS-main\l
│ work_dir: D:\Artificial Intelligence\python_project\Matcha-TTS-main\mat
│
├── extras
│ └── ignore_warnings: false
│ enforce_tags: true
│ print_config: true
│
├── task_name
│ └── train
├── run_name
│ └── ljspeech
├── tags
│ └── ['ljspeech']
├── train
│ └── True
├── test
│ └── True
├── ckpt_path
│ └── None
└── seed
└── 1234
Global seed set to 1234
[2024-12-26 15:33:24,106][main][INFO] - Instantiating datamodule <matcha.data.text_mel_datamodule.TextMelDataModule>
[2024-12-26 15:33:27,319][main][INFO] - Instantiating model <matcha.models.matcha_tts.MatchaTTS>
D:\anaconda\envs\matcha-tts\lib\site-packages\diffusers\models\lora.py:393: FutureWarning: `LoRACompatibleLinear` is deprecated and will be removed in version 1.0.0. Use of `LoRACompatibleLinear` is deprecated. Please switch to PEFT backend by installing PEFT: `pip install peft`.
  deprecate("LoRACompatibleLinear", "1.0.0", deprecation_message)
[2024-12-26 15:33:28,402][main][INFO] - Instantiating callbacks...
[2024-12-26 15:33:28,403][matcha.utils.instantiators][INFO] - Instantiating callback <lightning.pytorch.callbacks.ModelCheckpoint>
[2024-12-26 15:33:28,410][matcha.utils.instantiators][INFO] - Instantiating callback <lightning.pytorch.callbacks.RichModelSummary>
[2024-12-26 15:33:28,414][matcha.utils.instantiators][INFO] - Instantiating callback <lightning.pytorch.callbacks.RichProgressBar>
[2024-12-26 15:33:28,419][main][INFO] - Instantiating loggers...
[2024-12-26 15:33:28,420][matcha.utils.instantiators][INFO] - Instantiating logger <lightning.pytorch.loggers.tensorboard.TensorBoardLogger>
[2024-12-26 15:33:28,428][main][INFO] - Instantiating trainer <lightning.pytorch.trainer.Trainer>
Using 16bit Automatic Mixed Precision (AMP)
Trainer already configured with model summary callbacks: [<class 'lightning.pytorch.callbacks.rich_model_summary.RichModelSummary'>]. Skipping setting a default `ModelSummary` callback.
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
[2024-12-26 15:33:28,590][main][INFO] - Logging hyperparameters!
[2024-12-26 15:33:28,932][main][INFO] - Starting training!
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
┌────┬───────────────────────────────────┬───────────────────┬────────┐
│ │ Name │ Type │ Params │
├────┼───────────────────────────────────┼───────────────────┼────────┤
│ 0 │ encoder │ TextEncoder │ 7.2 M │
│ 1 │ encoder.emb │ Embedding │ 34.2 K │
│ 2 │ encoder.prenet │ ConvReluNorm │ 591 K │
│ 3 │ encoder.prenet.conv_layers │ ModuleList │ 553 K │
│ 4 │ encoder.prenet.norm_layers │ ModuleList │ 1.2 K │
│ 5 │ encoder.prenet.relu_drop │ Sequential │ 0 │
│ 6 │ encoder.prenet.proj │ Conv1d │ 37.1 K │
│ 7 │ encoder.encoder │ Encoder │ 6.2 M │
│ 8 │ encoder.encoder.drop │ Dropout │ 0 │
│ 9 │ encoder.encoder.attn_layers │ ModuleList │ 889 K │
│ 10 │ encoder.encoder.norm_layers_1 │ ModuleList │ 2.3 K │
│ 11 │ encoder.encoder.ffn_layers │ ModuleList │ 5.3 M │
│ 12 │ encoder.encoder.norm_layers_2 │ ModuleList │ 2.3 K │
│ 13 │ encoder.proj_m │ Conv1d │ 15.4 K │
│ 14 │ encoder.proj_w │ DurationPredictor │ 345 K │
│ 15 │ encoder.proj_w.drop │ Dropout │ 0 │
│ 16 │ encoder.proj_w.conv_1 │ Conv1d │ 147 K │
│ 17 │ encoder.proj_w.norm_1 │ LayerNorm │ 512 │
│ 18 │ encoder.proj_w.conv_2 │ Conv1d │ 196 K │
│ 19 │ encoder.proj_w.norm_2 │ LayerNorm │ 512 │
│ 20 │ encoder.proj_w.proj │ Conv1d │ 257 │
│ 21 │ decoder │ CFM │ 11.0 M │
│ 22 │ decoder.estimator │ Decoder │ 11.0 M │
│ 23 │ decoder.estimator.time_embeddings │ SinusoidalPosEmb │ 0 │
│ 24 │ decoder.estimator.time_mlp │ TimestepEmbedding │ 1.2 M │
│ 25 │ decoder.estimator.down_blocks │ ModuleList │ 3.1 M │
│ 26 │ decoder.estimator.mid_blocks │ ModuleList │ 2.8 M │
│ 27 │ decoder.estimator.up_blocks │ ModuleList │ 3.7 M │
│ 28 │ decoder.estimator.final_block │ Block1D │ 197 K │
│ 29 │ decoder.estimator.final_proj │ Conv1d │ 20.6 K │
└────┴───────────────────────────────────┴───────────────────┴────────┘
Trainable params: 18.2 M
Non-trainable params: 0
Total params: 18.2 M
Total estimated model params size (MB): 72