-
Notifications
You must be signed in to change notification settings - Fork 74
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
How to properly run model training on 1 RTX4090 graphics card? #203
Comments
Wow I totally forgot to push those training script fixes. I just pushed them up to The script has to know how to properly mask out the user's request and only train on the assistant response portion. I have a method to auto detect this but that can fail which causes the training script to mask the entire message and causes loss to immediately go to zero. Some tokenizers don't work properly with the auto-detect and you need to provide a set of prefix and suffix tokens like this: https://github.com/acon96/home-llm/blob/develop/train.py#L747 The updated training script should also print out a warning if the entire message is masked as well as print out the tokenized messages so you can determine the tokens you would need to provide for the masking to work. There's a few examples from the various models I have tried in the script. |
Thanks for explanation, but from practical site what I should change? 😀 I'm not that proficient with model training tools yet.. How you recognise those values for this specific model? Should I uncomment this code?
|
Yes exactly. And if you have a new model to fine tune you just need to determine the prefix and suffix tokens that surround the response from the assistant. |
hm.. in your message it sounds like it's something very simple 😀 but can you tell me how to determine those values? I think I will be able to add 2 additional params to set these values if required so that I don't have to comment the code 😀 |
I updated script to latest version. I also updated the way how I run the main train script:
I don't understand where is a problem but now I see lot of this kind of errors:
now I see 11201 warnings but script still calls to DataCollatorForSupervisedFineTuning class. 🧐 |
So what is happening is the script is printing the warning along with the tokenized version of the example that is being trained on. This is just a helpful thing I did to help determine what the correct prefix_ids and suffix_ids. The idea is that the training script need to build a mask (array) that is The training script here attempts to auto-detect which tokens are for the assistant, but that is not trivial, and sometimes you need to manually provide the tokens that start an assistant response, and the tokens that end an assistant response. For a model like TinyLlama that uses Zephyr format, the prefix is The other issue is that tokenizers perform differently based on if a token is preceded by white-space or if it is adjacent to the token that came before it. For example, check out https://gpt-tokenizer.dev/ to mess around with the GPT tokenizers. Try tokenizing the word I tried to make a script to show this and potentially assist in determining the correct prefix and suffix tokens for your model: |
Thank you for the very good explanation. Now I understand the problem more broadly. Also, many thanks for creating this
How should I decide which to choose? 😀 |
BTW. @acon96 your knowledge is amazing, so I took the liberty of updating the documentation to be helpful for people like me who are learning this whole process. I hope you don't mind me making these changes. Also I little updated script to try not hardcode any value if the are not needed. 😀 Thanks again for your work 💪🏻 If you have a time please check my PR #204 |
it looks like the Llama 3 tokenizer doesn't have any issues with the white-space. I would use
And thanks for adding the new docs page. I haven't had the time to properly document that part of the process. Mostly been focused on user guides for people using the HA integration. |
It is very strange.. because even if I set properly prefix and suffix the script still can show me info about no assistant response. In example below I checked and exist this params for prefix and suffix (now I understand why you create this output) but still I see this info.
BTW. I did small mistake in |
Just figured it out. You're using the correct token IDs, the parameters needed to be converted from string form to integers for the mask detection function to work properly. I just pushed a fix for that to |
Thank you for find and fix the problem. I try tu ryn this script for
Do you know why this might be happening? |
Those models don't support system prompts. I can try and tweak things to support that but most of the code assumes that you are able to pass a system prompt to the model. |
I understand. If possible, I would be very grateful for adding such a support option because these are the largest Polish models and it seems to me that they would work best for the Polish language. |
speakleash/Bielik-7B-Instruct-v0.1 supports system prompt, so the problem must be with data: "Conversation roles must alternate user/assistant/user/assistant/..." |
@acon96 I made a minor correction and now it looks like the templates are working. You can find the code changes below:
in output then we have something like this:
or in different model:
another example:
|
BTW. I checked template also with system prompt and it also works for all above examples:
PS. I have information that models: |
@acon96 When I try to run DPO training I get this message. What is it related to?
I use this command:
|
I don't think I ever properly got the DPO stuff working. The loss never went down with the DPO dataset and the code hasn't been updated in a while. |
Hey @acon96, It seems to me that this is not a bug, but an incorrect configuration. I try to run
train.py
script with some params, on my 1 x RTX4090 graphic card. I try to trainllama-3.1-8B
models with polish dataset. All command you can find below:Generate data:
Run trains script:
Describe the bug
I use LoRA to run bigger model on my card. I don't really know where the problem is so that I can train such a model. I don't really know how to set the indicated parameters, I relied on your examples to try to get it working. I run models 1B and 3B without LoRA and they can fit in the VRAM of the graphics card. Could you help me how to set the indicated parameters so that I can train such a model?
Expected behavior
I would expect that after running this command and waiting a few hours I would receive a new model :)
Logs
running command
nvidia-smi
I see this output:The text was updated successfully, but these errors were encountered: