v0.6.3: Patch release
What's Changed
- Fix setup.py crash when torch is not installed. by @PaperclipBadger in #1866
- Add support for AWS SageMaker. by @matherit in #1868
- Fix broken links by @tjruwase in #1873
- [docs] add amd blog to website by @jeffra in #1874
- [docs] add moe paper by @jeffra in #1875
- Supporting multiple modules injection with a single policy when they … by @samyam in #1869
- [docs] fix dead links by @jeffra in #1877
- add now required -lcurand to solve undefined symbol: curandCreateGenerator by @stas00 in #1879
- Bug fix for flops profilers output by @VisionTheta in #1885
- Bump nokogiri from 1.13.3 to 1.13.4 in /docs by @dependabot in #1889
- [docs] fix commonmarker security issue by @jeffra in #1892
- bf16+pipeline parallelism by @tjruwase in #1801
- fix file ordering by @szhengac in #1822
- Use f-strings where possible by @manuelciosici in #1900
- [partition_parameters.py] better diagnostics by @stas00 in #1887
- comm backend: cast bool when not supported by torch2cupy by @conglongli in #1894
- Use cuda events to improve timing for multi-stream execution by @tjruwase in #1881
- Fix multiple zero 3 tracing errors by @tjruwase in #1901
- Improve ds_report output for HIP/ROCm by @mrwyattii in #1906
- Fix launcher for reading env vars by @szhengac in #1907
- Fix OOM and type mismatch by @tjruwase in #1884
New Contributors
- @PaperclipBadger made their first contribution in #1866
- @matherit made their first contribution in #1868
- @VisionTheta made their first contribution in #1885
- @szhengac made their first contribution in #1822
Misc
- v0.6.2 was skipped due to a build/deploy issue with that release
Full Changelog: v0.6.1...v0.6.3