v0.0.4

@HanGuo97 released this 27 Jul 22:37
  1. Adds support for LLaMA-3.1 405B.
  2. Lightly tunes BF16 performance, though it is still worse than FP16, especially in 3-bit settings.
  3. Uses a newer vLLM version.