
Problem with LoRA fine-tuning Mamba-Codestral-7B-v0.1 #6434

Open
1 task done
tongzeliang opened this issue Dec 24, 2024 · 0 comments
Labels
pending This problem is yet to be addressed

Comments

@tongzeliang

Reminder

  • I have read the README and searched the existing issues.

System Info

  • llamafactory version: 0.9.1.dev0
  • Platform: Linux-6.5.0-18-generic-x86_64-with-glibc2.35
  • Python version: 3.9.0
  • PyTorch version: 2.5.1+cu124 (GPU)
  • Transformers version: 4.45.0
  • Datasets version: 2.21.0
  • Accelerate version: 0.34.2
  • PEFT version: 0.12.0
  • TRL version: 0.9.6
  • GPU type: NVIDIA GeForce RTX 4090

Reproduction

This is the error I encountered:

  File "/home/tzl/.conda/envs/SEM/lib/python3.9/site-packages/torch/_compile.py", line 32, in inner
    return disable_fn(*args, **kwargs)
  File "/home/tzl/.conda/envs/SEM/lib/python3.9/site-packages/torch/_dynamo/eval_frame.py", line 632, in _fn
    return fn(*args, **kwargs)
  File "/home/tzl/.conda/envs/SEM/lib/python3.9/site-packages/torch/utils/checkpoint.py", line 489, in checkpoint
    return CheckpointFunction.apply(function, preserve, *args)
  File "/home/tzl/.conda/envs/SEM/lib/python3.9/site-packages/torch/autograd/function.py", line 575, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "/home/tzl/.conda/envs/SEM/lib/python3.9/site-packages/torch/utils/checkpoint.py", line 264, in forward
    outputs = run_function(*args)
  File "/home/tzl/.conda/envs/SEM/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/tzl/.conda/envs/SEM/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/tzl/.conda/envs/SEM/lib/python3.9/site-packages/transformers/models/mamba2/modeling_mamba2.py", line 649, in forward
    hidden_states = self.mixer(
  File "/home/tzl/.conda/envs/SEM/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/tzl/.conda/envs/SEM/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/tzl/.conda/envs/SEM/lib/python3.9/site-packages/transformers/models/mamba2/modeling_mamba2.py", line 608, in forward
    return self.torch_forward(hidden_states, cache_params, cache_position, attention_mask)
  File "/home/tzl/.conda/envs/SEM/lib/python3.9/site-packages/transformers/models/mamba2/modeling_mamba2.py", line 535, in torch_forward
    G_intermediate = C[:, :, :, None, :, :] * B[:, :, None, :, : ,:]  # shape: (b, c, l, s, h, n)
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 16.00 GiB. GPU 0 has a total capacity of 23.64 GiB of which 9.08 GiB is free. Including non-PyTorch memory, this process has 14.56 GiB memory in use. Of the allocated memory 14.09 GiB is allocated by PyTorch, and 19.65 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
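
For reference, here is a rough back-of-the-envelope estimate of why this single broadcast can request 16 GiB. The chunk size, head count, and state size below are my assumptions based on typical Mamba2 defaults, not values read from the checkpoint config:

# Rough size estimate of G_intermediate, shape (b, c, l, s, h, n), in fp32.
# Assumed (unverified): chunk_size = 256, num_heads = 128, ssm_state_size = 128.
batch, num_chunks = 1, 4          # e.g. one ~1024-token sample split into 256-token chunks
chunk_len, num_heads, state = 256, 128, 128
elements = batch * num_chunks * chunk_len * chunk_len * num_heads * state
print(f"{elements * 4 / 2**30:.1f} GiB")  # -> 16.0 GiB, matching the failed allocation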

I have confirmed that no other processes are using the GPU, but I still get the CUDA out of memory error, so I would like to ask for your advice. Thank you.
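
One additional observation: the traceback goes through torch_forward (modeling_mamba2.py line 608), which as far as I can tell is the pure-PyTorch fallback path used when the fused mamba-ssm / causal-conv1d kernels are not installed, and that path materializes the full (b, c, l, s, h, n) intermediate shown above. A minimal sketch to check whether the kernel packages are present (package names are the PyPI module names):

# Check whether the fused Mamba kernels are importable; if either is missing,
# transformers falls back to the memory-hungry torch_forward path seen in the traceback.
import importlib.util

for pkg in ("mamba_ssm", "causal_conv1d"):
    found = importlib.util.find_spec(pkg) is not None
    print(f"{pkg}: {'installed' if found else 'missing'}")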

Expected behavior

No response

Others

No response

@github-actions github-actions bot added the pending This problem is yet to be addressed label Dec 24, 2024