RuntimeError: CUDA out of memory. Tried to allocate 42.00 MiB (GPU 0; 22.02 GiB total capacity; 20. #60

Open
chenchaoac opened this issue May 5, 2023 · 5 comments

@chenchaoac
How much GPU memory does single-GPU LLM LoRA fine-tuning need? On a 23 GB A100, even with batch_size=1, max_source_seq_len=4, and max_target_seq_len=2, I still get an out-of-memory error.

@haizeiwanglf
Did you manage to solve this?

@kbwzy
Copy link

kbwzy commented May 15, 2023

Same here with a 24 GB card: batch_size=1 and max_source_seq_len=50 still fails. Has this been solved?

@a6225301

a6225301 commented Jun 3, 2023

@kbwzy @chenchaoac @wuguangshuo Has anyone solved this?

@hsauod

hsauod commented Jun 8, 2023

24 GB of GPU memory, same problem. Has it been solved?

@rainkin1993

My GPU has 16 GB, and training also failed with an OOM error.

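In theory, 16 GB is enough for LoRA fine-tuning. Looking at the code, I found the cause: once fine-tuning finishes and the model is being saved, the original code merges the original model's parameters and the fine-tuned parameters into a single model and writes it out as one model file. That merge code makes a deep copy of the trained model, so it needs roughly twice the memory.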
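
As a rough sanity check on the "twice the memory" point, here is a back-of-the-envelope sketch. It assumes about 6.2e9 parameters for chatglm-6b held in fp16 (2 bytes each), and it counts weights only, ignoring activations, LoRA parameters, and optimizer state. The function name `weights_gib` is made up for this illustration.

```python
def weights_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate size of the model weights in GiB (fp16 = 2 bytes per parameter)."""
    return num_params * bytes_per_param / 1024**3


# Assumption: roughly 6.2e9 parameters for chatglm-6b, stored in fp16.
base = weights_gib(6.2e9)

# copy.deepcopy(model) keeps a second full copy alive next to the original.
with_deepcopy = 2 * base

print(f"weights only:       {base:.1f} GiB")           # 11.5 GiB
print(f"with deepcopy copy: {with_deepcopy:.1f} GiB")  # 23.1 GiB
```

Under these assumptions, a single copy fits on a 16 GB card, while two copies already exceed the 22.02 GiB total capacity reported in the error message.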

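The fix: modify the save_model function in train.py so that it does not merge the parameters and instead saves only the fine-tuned parameters on their own. Since nothing is merged into a single model anymore, the inference code also has to be changed so that it loads the original model plus the LoRA parameters.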

Saving the LoRA parameters separately:

diff --git a/LLM/finetune/train.py b/LLM/finetune/train.py
index 4483fc0..53dc4e9 100644
--- a/LLM/finetune/train.py
+++ b/LLM/finetune/train.py
@@ -155,12 +155,13 @@ def save_model(
     Args:
         cur_save_path (str): path to save to.
     """
-    if args.use_lora:                       # merge lora params with origin model
-        merged_model = copy.deepcopy(model)
-        merged_model = merged_model.merge_and_unload()
-        merged_model.save_pretrained(cur_save_dir)
-    else:
-        model.save_pretrained(cur_save_dir)
+    # if args.use_lora:                       # merge lora params with origin model
+    #     merged_model = copy.deepcopy(model)
+    #     merged_model = merged_model.merge_and_unload()
+    #     merged_model.save_pretrained(cur_save_dir)
+    # else:
+    #     model.save_pretrained(cur_save_dir)
+    model.save_pretrained(cur_save_dir)
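
If you would rather keep the merge path available instead of commenting it out, one option is to put it behind a switch. This is only a sketch and not the repository's code: the `merge_lora` parameter is hypothetical, and `model` is assumed to be a peft-wrapped model, for which `save_pretrained` writes only the adapter weights.

```python
import copy


def save_model(model, cur_save_dir: str, merge_lora: bool = False) -> None:
    """Save a checkpoint, writing only the LoRA adapter unless merge_lora is set.

    merge_lora is a hypothetical switch added for this illustration.
    """
    if merge_lora:
        # Original behavior: deepcopy needs room for a second full copy of the
        # model, which is what triggered the OOM described above.
        merged_model = copy.deepcopy(model).merge_and_unload()
        merged_model.save_pretrained(cur_save_dir)
    else:
        # For a peft-wrapped model this stores just the adapter weights.
        model.save_pretrained(cur_save_dir)
```

With the default `merge_lora=False`, this behaves like the patched train.py above.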

At inference time, load the LoRA parameters separately:

diff --git a/LLM/finetune/inference.py b/LLM/finetune/inference.py
index f7d1311..183241a 100644
--- a/LLM/finetune/inference.py
+++ b/LLM/finetune/inference.py
@@ -1,3 +1,4 @@
+# coding: utf8
 # !/usr/bin/env python3
 """
 ==== No Bugs in code, just some Random Unexpected FEATURES ====
@@ -23,6 +24,7 @@ Date: 2023/03/17
 import time
 import torch

+from peft import PeftModel
 from transformers import AutoTokenizer, AutoModel
 torch.set_default_tensor_type(torch.cuda.HalfTensor)

@@ -64,18 +66,21 @@ if __name__ == '__main__':

     device = 'cuda:0'
     max_new_tokens = 300
-    model_path = "checkpoints/model_1000"
+    lora_model_path = "checkpoints/finetune/model_1000"

     tokenizer = AutoTokenizer.from_pretrained(
-        model_path,
+        "D:\\software\\chatglm-6b\\chatglm-6b",  # change this to the path of the original chatglm-6b model
         trust_remote_code=True
     )

     model = AutoModel.from_pretrained(
-        model_path,
+        "D:\\software\\chatglm-6b\\chatglm-6b",  # change this to the path of the original chatglm-6b model
         trust_remote_code=True
     ).half().to(device)

+    model = PeftModel.from_pretrained(model, lora_model_path, adapter_name="lora")
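
The same loading steps can be wrapped in a small helper so the paths are not hard-coded. This is a sketch under assumptions: the helper name `load_lora_model` and its parameters are invented for illustration, and it presumes `peft` and `transformers` are installed and that a CUDA device is available.

```python
def load_lora_model(base_model_path: str, lora_model_path: str, device: str = "cuda:0"):
    """Load the original model, then attach separately saved LoRA weights.

    Hypothetical helper: base_model_path points at the original chatglm-6b
    directory, lora_model_path at the adapter directory written during training.
    """
    # Imported lazily so that defining the helper does not require the GPU stack.
    from peft import PeftModel
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_model_path, trust_remote_code=True)
    model = AutoModel.from_pretrained(
        base_model_path, trust_remote_code=True
    ).half().to(device)

    # Wrap the base model with the LoRA adapter saved by save_pretrained.
    model = PeftModel.from_pretrained(model, lora_model_path)
    model.eval()
    return tokenizer, model
```

Usage would look like `tokenizer, model = load_lora_model("<path to chatglm-6b>", "checkpoints/finetune/model_1000")`, with the first argument replaced by your local path to the original model.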
