What's new in 1.1.1 (2024-12-27)
These are the changes in inference v1.1.1.
New features
- FEAT: support F5-TTS-MLX by @qinxuye in #2671
- FEAT: Support qwen2.5-coder-instruct model for tool calls by @Timmy-web in #2681
- FEAT: Support minicpm-4B on vllm by @Jun-Howie in #2697
- FEAT: support scheduling-policy for vllm by @hwzhuhao in #2700 (see the sketch after this list)
- FEAT: Support QvQ-72B-Preview by @Jun-Howie in #2712
- FEAT: support SD3.5 series model by @qinxuye in #2706
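
To make the vLLM scheduling-policy addition concrete, here is a minimal sketch of launching one of the newly supported models through the Python client. The endpoint address, the registered model name, and the `scheduling_policy` keyword (assumed to be forwarded to the vLLM engine per #2700, with vLLM-style values such as `"fcfs"` or `"priority"`) are assumptions, not confirmed API details.

```python
from xinference.client import Client

# Connect to a running Xinference endpoint (address is a placeholder).
client = Client("http://localhost:9997")

# Launch one of the newly supported models on the vLLM backend.
# `scheduling_policy` is assumed to be passed through to vLLM (#2700);
# the values ("fcfs"/"priority") mirror vLLM's own --scheduling-policy
# flag and may differ in practice.
model_uid = client.launch_model(
    model_name="QvQ-72B-Preview",  # registered model name may differ
    model_engine="vllm",
    scheduling_policy="priority",
)
print(f"Launched model with uid: {model_uid}")
```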
Enhancements
- ENH: Guided Decoding OpenAIClient compatibility by @wxiwnd in #2673 (see the sketch after this list)
- ENH: resample F5-TTS-MLX reference audio when the sample rate does not match by @qinxuye in #2678
- ENH: support no images for MLX vlm by @qinxuye in #2670
- ENH: Update fish speech 1.5 by @codingl2k1 in #2672
- ENH: Update cosyvoice 2 by @codingl2k1 in #2684
- REF: Reduce code redundancy by setting default values by @pengjunfeng11 in #2711
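
For the guided-decoding compatibility change (#2673), here is a hedged sketch of calling an Xinference endpoint through the standard OpenAI client. The base URL, the model name, and the `guided_json` extra-body field (borrowed from the vLLM-style guided-decoding convention) are assumptions; the actual field name exposed may differ.

```python
from openai import OpenAI

# Point the standard OpenAI client at an Xinference endpoint
# (base URL and API key are placeholders).
client = OpenAI(base_url="http://localhost:9997/v1", api_key="not-needed")

# Constrain the reply to a JSON schema. The `guided_json` extra-body field
# is an assumption about what #2673 exposes through the OpenAI client.
schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "year": {"type": "integer"}},
    "required": ["name", "year"],
}

response = client.chat.completions.create(
    model="qwen2.5-coder-instruct",  # any launched chat model
    messages=[{"role": "user", "content": "Return a JSON object describing Python."}],
    extra_body={"guided_json": schema},
)
print(response.choices[0].message.content)
```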
Bug fixes
- BUG: Fix f5tts audio ref by @codingl2k1 in #2680
- BUG: glm4-chat cannot apply continuous batching with transformers backend by @ChengjieLi28 in #2695
New Contributors
- @Timmy-web made their first contribution in #2681
Full Changelog: v1.1.0...v1.1.1