Commit
deepspeed-chat: fix incorrect lr when using lora only (#756)
When using lora only, get_optimizer_grouped_parameters() returns a list of 3 parameter groups, where only the second is not empty. Then, deepspeed removes empty parameter groups.
[ref: DeepSpeedEngine._configure_optimizer() deepspeed v0.10.3]

However, the lr_scheduler still contains 3 groups. This causes the lr scheduler to update the lora params with the wrong lr.

Fix it by removing all empty groups in get_optimizer_grouped_parameters().

Change-Id: I520841312bdedd6a572cf4c827e0bbf06f983575
Signed-off-by: Moshe Island <[email protected]>
Co-authored-by: Moshe Island <[email protected]>
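
The mismatch described in the commit message can be illustrated with a minimal pure-Python sketch. This is not the code from the commit: the helper name `drop_empty_groups`, the learning-rate values, and the string stand-ins for parameter tensors are all hypothetical. Pairing groups with learning rates by position is a simplified model of how a scheduler writes rates back to optimizer groups.

```python
def drop_empty_groups(param_groups):
    """Return only the groups that actually contain parameters (hypothetical helper)."""
    return [g for g in param_groups if g["params"]]


# Hypothetical LoRA-only setup: only the second group has parameters.
# Strings stand in for real parameter tensors; lr values are made up.
groups = [
    {"params": [], "weight_decay": 0.1, "lr": 1e-5},
    {"params": ["lora_A", "lora_B"], "weight_decay": 0.1, "lr": 5e-4},
    {"params": [], "weight_decay": 0.0, "lr": 1e-5},
]

# Before the fix: the scheduler tracks one lr per original group (3 entries),
# while the engine later drops the empty groups (1 entry remains).
scheduler_lrs = [g["lr"] for g in groups]
engine_groups = drop_empty_groups(groups)

# Positional pairing hands the LoRA group the first group's lr.
mismatched_lr = [lr for _, lr in zip(engine_groups, scheduler_lrs)][0]

# After the fix: empty groups are removed before anything else sees them,
# so the scheduler and the engine agree on a single group.
fixed_groups = drop_empty_groups(groups)
fixed_lrs = [g["lr"] for g in fixed_groups]
matched_lr = [lr for _, lr in zip(fixed_groups, fixed_lrs)][0]

print(mismatched_lr, matched_lr)
```

In this sketch the LoRA group is paired with the first group's learning rate when filtering happens too late, and with its own learning rate when the empty groups are dropped up front.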