Release DeepSpeed v0.9.0 · microsoft/DeepSpeed

New features

🚀 DeepSpeed Chat: Easy, Fast and Affordable RLHF Training of ChatGPT-like Models at All Scales 🚀

What's Changed

[docs] add MCR-DL paper to readme/docs by @Quentin-Anthony in #3066
Several fixes to unblock CI by @loadams in #3047
Assert mp_size is factor of model dimensions by @molly-smith in #2891
[CI] follow-up fixes by @jeffra in #3072
fix return prev key and value, added strides to from_blob by @mzusman in #2828
Remove bf16 from inference config dtype enum by @molly-smith in #3010
Softmax Scheduling Cleanup by @cmikeh2 in #3046
Fix nebula in save_16bit_model issue by @FreyaRao in #3023
Allow lists by @satpalsr in #3042
Goodbye Torch 1.8 by @mrwyattii in #3082
Empty ZeRO3 partition cache by @tjruwase in #3060
pre-commit check for torch.cuda in code by @delock in #2981
Move cuda check into utils by @loadams in #3074
update yapf version and style settings by @jeffra in #3098
Fix comms benchmark import issues and support MPI/slurm launching by @Quentin-Anthony in #2932
Disable Stage 1&2 CPUAdam pathways by @mrwyattii in #3097
♻️ replace deprecated functions for communication by @mayank31398 in #2995
Make fp32 default communication data type by @tjruwase in #2970
Update DeepSpeed copyright license to Apache 2.0 by @mrwyattii in #3111
Add Full Apache License by @mrwyattii in #3119
VL MoE Blog by @yaozhewei in #3120
Update SD triton version in requirements-sd.txt by @lekurile in #3135
Fix launch issue by @tjruwase in #3137
Fix CI badges by @mrwyattii in #3138
Optimize Softmax Kernel by @molly-smith in #3112
Use generic O_DIRECT by @tjruwase in #3115
Enable autoTP for bloom by @sywangyi in #3035
[cleanup] remove pass calls where they aren't needed by @stas00 in #2826
[ci] nv-transformers-v100 - use the same torch version as transformers CI by @stas00 in #3096
Fixes code and tests skipping/asserting incorrectly on torch 2+. by @loadams in #3136
fix example symlink about DeepSpeed+AzureML by @EeyoreLee in #3127
Remove Extra Bracket by @VHellendoorn in #3101
Recover shared parameters by @ShijieZZZZ in #3033
Fix for Diffusers 0.14.0 by @molly-smith in #3142
Fix copyright check, add copyright replace script by @mrwyattii in #3141
Update curriculum-learning.md by @goodship1 in #3031
Remove benchmark code by @mrwyattii in #3157
fixing a bug in CPU Adam and Adagrad by @xiexbing in #3109
op_builder: conditionally compute relative path for hip compiled files by @adammoody in #3095
zero.Init() should pin params in GPU memory as requested by @tjruwase in #2953
deepspeed/runtime/utils.py: reset_peak_memory_stats when empty cache by @guoyejun in #2803
Add DeepSpeed-Chat Blogpost by @awan-10 in #3185
[docs] add run command for 13b by @awan-10 in #3187
add news item. by @awan-10 in #3188
DeepSpeed Chat by @tjruwase in #3186
Fix references to figures by @tohtana in #3189
Fix typo by @zhouzaida in #3183
Fix typo by @dawei-wang in #3164
Chatgpt chinese blog by @yaozhewei in #3193
Add Japanese version of ChatGPT-like pipeline blog by @tohtana in #3194
fix hero figure by @conglongli in #3199
feat: Add support for NamedTuple when sharding parameters [#3029] by @alexandervaneck in #3037
fix license badge by @conglongli in #3200
Update AMD workflows by @loadams in #3179
[CPU support] Optionally bind each rank to different cores on host by @delock in #2881

New Contributors

@mzusman made their first contribution in #2828
@FreyaRao made their first contribution in #3023
@sywangyi made their first contribution in #3035
@EeyoreLee made their first contribution in #3127
@VHellendoorn made their first contribution in #3101
@goodship1 made their first contribution in #3031
@zhouzaida made their first contribution in #3183
@dawei-wang made their first contribution in #3164
@alexandervaneck made their first contribution in #3037

Full Changelog: v0.8.3...v0.9.0
