RuntimeError: CUDA out of memory. Tried to allocate 42.00 MiB (GPU 0; 22.02 GiB total capacity; 20. #60

How much VRAM does LLM LoRA fine-tuning need on a single GPU? On a 23 GB A100 with batch_size=1, max_source_seq_len=4, and max_target_seq_len=2, I still get an out-of-memory error.

Comments
Has anyone solved this?
Same here with a 24 GB GPU: batch_size=1, max_source_seq_len=50 still fails. Has anyone solved this?
@kbwzy @chenchaoac @wuguangshuo Have any of you solved this?
24 GB of VRAM, same problem here. Was it ever resolved?
My GPU has 16 GB, and I also hit OOM during training. In theory 16 GB is enough for LoRA fine-tuning. Looking at the code, the cause is that after training finishes, when saving the model, the original code merges the base model and the fine-tuned LoRA parameters into a single model and writes them out as one model file (the merge code makes a deep copy of the trained model, so roughly twice the memory is needed). The fix: modify the save_model function in train.py so it does not merge the parameters and instead saves only the fine-tuned LoRA parameters. Since the parameters are no longer merged into one model, the inference code also needs a corresponding change so it can load the base model plus the LoRA parameters. Save the LoRA parameters separately:
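A minimal sketch of such a save_model, assuming the repo wraps the base model in a peft.PeftModel (the original code block appears lost here; the function signature is illustrative, not the repo's exact code):

```python
def save_model(model, save_dir: str):
    # model is assumed to be a peft.PeftModel. Calling save_pretrained
    # on it writes only the adapter weights (adapter_model.bin +
    # adapter_config.json, a few MB) instead of a fully merged model,
    # avoiding the deep copy that roughly doubles peak memory.
    model.save_pretrained(save_dir)
```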
At inference time, load the LoRA parameter model separately:
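A sketch of the corresponding inference-side loading, assuming the peft library and a Hugging Face base checkpoint (the base model path and adapter directory below are placeholders, not values from the repo):

```python
from transformers import AutoModel, AutoTokenizer
from peft import PeftModel

base_model_path = "THUDM/chatglm-6b"   # placeholder: whichever base model was fine-tuned
lora_dir = "checkpoints/model_best"    # placeholder: directory written by save_model above

tokenizer = AutoTokenizer.from_pretrained(base_model_path, trust_remote_code=True)
base_model = AutoModel.from_pretrained(base_model_path, trust_remote_code=True).half().cuda()

# Attach the separately saved LoRA adapter to the base model; no merge
# happens here, so no extra full-size copy of the weights is created.
model = PeftModel.from_pretrained(base_model, lora_dir)
model.eval()
```

If a single merged checkpoint is still wanted later, peft's merge_and_unload() can be run offline on a machine with enough memory.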