Description
Feature Description
This epic tracks quantization support status for Omni models (targeting vllm-omni).
The main dtypes are INT4 and MXFP8; MXFP4 is a future target.
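For readers unfamiliar with what the INT4 column refers to, here is a minimal sketch of symmetric group-wise INT4 weight quantization. It is illustrative only and is not the quantization algorithm used by this project; the function names `quantize_int4` and `dequantize_int4` and the `group_size` parameter are hypothetical.

```python
def quantize_int4(weights, group_size=4):
    """Symmetric per-group INT4 quantization (illustrative, hypothetical helper).

    Each group of `group_size` weights shares one float scale, and every
    weight is mapped to an integer code in the signed 4-bit range [-8, 7].
    """
    codes, scales = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        max_abs = max(abs(w) for w in group)
        # Map the largest magnitude in the group to code 7; avoid dividing by zero.
        scale = max_abs / 7 if max_abs > 0 else 1.0
        scales.append(scale)
        codes.extend(max(-8, min(7, round(w / scale))) for w in group)
    return codes, scales


def dequantize_int4(codes, scales, group_size=4):
    """Reconstruct approximate float weights from INT4 codes and group scales."""
    return [c * scales[i // group_size] for i, c in enumerate(codes)]


weights = [0.7, -0.35, 0.0, 0.1, 2.0, -1.0, 0.5, 0.25]
codes, scales = quantize_int4(weights)
restored = dequantize_int4(codes, scales)
```

Under this scheme the per-weight reconstruction error is bounded by half of the group's scale, which is why smaller groups generally preserve accuracy better at the cost of storing more scales.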
Motivation and Use Case
Omni model enablement status:
| model | architecture | owner | INT4 | MXFP8 |
|---|---|---|---|---|
| Qwen3-omni | Qwen3OmniMoeForConditionalGeneration | Liang | ok | |
| Qwen2.5-omni | Qwen2_5OmniForConditionalGeneration | Liang | ok | |
| HunyuanImage-3.0 | HunyuanImage3ForCausalMM | Chang | ok | |
| Z-Image | ZImageTransformer2DModel | Xin | ok | |
| Next-Step | ZImageTransformer2DModel | Xin | WIP (Feature request in vllm-omni) | |
| MiMo-Audio-7B-Instruct | MiMoAudioForCausalLM | Weiwei | fail | |
| Qwen3-TTS-12Hz-1.7B-CustomVoice | Qwen3TTSForConditionalGeneration | Weiwei | fail | |
| GLM-Image | GlmImageForConditionalGeneration, GlmImageTransformer2DModel | Liang | ok | |
| BAGEL-7B-MoT | BagelForConditionalGeneration | Liang | | |
| Ovis-Image | OvisImageTransformer2DModel | Liang | | |
Alternatives Considered
No response
Definition of Done
No response
Additional Context
No response
Labels: enhancement