v0.9.2: Patch release
What's Changed
- MiCS implementation by @zarzen in #2964
- Fix formatting by @mrwyattii in #3343
- [ROCm] Hipify cooperative_groups headers by @rraminen in #3323
- Diffusers 0.15.0 bug fix by @molly-smith in #3345
- Print default values for DeepSpeed --help by @mrwyattii in #3347
- Add BF16 CUDA kernel support by @dc3671 in #3092
- README.md: Update MosaicML docs link by @kobindra in #3344
- hybrid_engine: check tuple size when fusing lora params by @adammoody in #3311
- Fix MPICH launcher issue in multi-node runs by @sywangyi in #3078
- Update DS-Chat issue template by @mrwyattii in #3368
- Add DeepSpeed Chat blog links and tags by @conglongli in #3369
- Fix redundant shared_params in zero_to_fp32.py by @ShijieZZZZ in #3149 (example below)
- Fix default communication_data_type for bfloat16_enabled and update docs by @clumsy in #3370 (example below)
- Auto TP Tutorial with T5 Example by @molly-smith in #2962
- stage_1_and_2.py: apply gradient scaling only for fp16 by @guoyejun in #3166
- Fix memory leak in zero2 contiguous gradients by @hablb in #3306
- Remove megatron-lm, as it is no longer pip installable, by @jeffra in #3389
- Fix pipeline module evaluation when contiguous activation checkpointing by @hablb in #3005
- Doc updates by @goodship1 in #3415
- Save tensors in context of memory_efficient_linear by @tohtana in #3413
- Add HE support for the rest of model containers by @RezaYazdaniAminabadi in #3191
- Update PyTorch Lightning/DeepSpeed examples links by @loadams in #3424
- Fix PipelineEngine.eval_batch result by @nrailgun in #3316
- OPT Activation Function Hotfix by @cmikeh2 in #3400
- Add ZeRO 1 support to PP for BF16 by @jomayeri in #3399 (example below)
- [zero_to_fp32] Fix shared param recovery by @stas00 in #3407 (example below)
- Adagrad support in ZeRO by @jomayeri in #3401 (example below)
- Update 2020-09-09-sparse-attention.md by @goodship1 in #3432
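
For the communication_data_type change (#3370), a minimal config sketch, assuming the standard DeepSpeed dict/JSON config format; the batch size and the explicit "fp32" setting are illustrative, not taken from the PR:

```python
# Illustrative DeepSpeed config: #3370 concerns the default
# communication_data_type when bf16 is enabled; setting it explicitly
# avoids depending on that default either way.
ds_config = {
    "train_batch_size": 8,             # placeholder value
    "bf16": {"enabled": True},
    "communication_data_type": "fp32"  # dtype used for gradient communication
}
```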
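For the ZeRO-1 pipeline-parallel BF16 support (#3399) and Adagrad-under-ZeRO support (#3401), a combined config sketch; all hyperparameter values are placeholders:

```python
# Illustrative DeepSpeed config exercising both changes: ZeRO stage 1
# with bf16 (now usable with pipeline parallelism) and the Adagrad
# optimizer running under ZeRO.
ds_config = {
    "train_batch_size": 32,            # placeholder value
    "bf16": {"enabled": True},
    "zero_optimization": {"stage": 1},
    "optimizer": {
        "type": "Adagrad",
        "params": {"lr": 1e-2}         # placeholder hyperparameter
    }
}
# Typical use (model definition omitted):
# engine, optimizer, _, _ = deepspeed.initialize(model=model, config=ds_config)
```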
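For the zero_to_fp32 fixes (#3149, #3407), a usage sketch assuming DeepSpeed's deepspeed.utils.zero_to_fp32 helpers; the checkpoint directory and output filename are placeholders:

```python
import torch
from deepspeed.utils.zero_to_fp32 import get_fp32_state_dict_from_zero_checkpoint

# Consolidate ZeRO-partitioned checkpoint shards (including shared
# params, per the fixes above) into a single fp32 state dict.
state_dict = get_fp32_state_dict_from_zero_checkpoint("checkpoints/")  # placeholder path
torch.save(state_dict, "pytorch_model.bin")  # placeholder filename
```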
New Contributors
- @dc3671 made their first contribution in #3092
- @kobindra made their first contribution in #3344
- @hablb made their first contribution in #3306
- @nrailgun made their first contribution in #3316
Full Changelog: v0.9.1...v0.9.2