
Bump vLLM version to 0.11.0 #481

Merged
SumanthRH merged 1 commit into NovaSky-AI:main from tyler-griggs:tgriggs/vllm_0110
Oct 15, 2025

Conversation


@tyler-griggs tyler-griggs commented Oct 15, 2025

This required a flashinfer bump; we now install flashinfer directly from PyPI, and since they now provide prebuilt JIT compilation caches, startup is still fast.

@tyler-griggs tyler-griggs marked this pull request as ready for review October 15, 2025 01:42
@SumanthRH left a comment


🚀🚀

@SumanthRH merged commit 8389f22 into NovaSky-AI:main Oct 15, 2025
3 checks passed
erictang000 added a commit that referenced this pull request Oct 17, 2025
Reverts #481 due to dependency issues with Megatron.
SumanthRH pushed a commit that referenced this pull request Oct 24, 2025
… to 0.11.0 + pin minimum uv version for extra-build-dependencies (#528)

## Separates vllm + megatron deps
After #481, there were Megatron flashinfer issues with `--extra vllm`. This PR separates the vllm version that Megatron relies on from the general vllm version, allowing us to bump vllm to 0.11.0 for the rest of the training stack.
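
One way to keep the two separately pinned vllm stacks from being resolved together is uv's conflicting-extras declaration. A minimal sketch, assuming the `vllm` and `mcore` extra names used in this repo's `pyproject.toml` (the exact placement is an assumption, not taken from this PR):

```
# Sketch: declare the vllm and mcore extras as mutually exclusive,
# so uv refuses to sync an environment that enables both at once.
[tool.uv]
conflicts = [
    [
        { extra = "vllm" },
        { extra = "mcore" },
    ],
]
```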

## Update flash-attn installation
Updates the flash-attn installation to use the `extra-build-dependencies` feature from uv, which requires uv >= 0.8.10. This feature lets us do the following, removing the need for markers and extras to specify a URL source for each set of extras.

```
[tool.uv.extra-build-dependencies]
flash-attn = [{requirement = "torch", match-runtime = true}]

[tool.uv.extra-build-variables]
flash-attn = { FLASH_ATTENTION_SKIP_CUDA_BUILD = "TRUE"}

[project.optional-dependencies]
vllm = [
    "vllm==0.11.0",
    "flash-attn==2.8.3",
...
]
mcore = [
     "flash-attn==2.7.4.post1"
...
]
```
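
The "pin minimum uv version" part of the title can be expressed in `pyproject.toml` via uv's `required-version` setting. A minimal sketch (the `>= 0.8.10` bound comes from the PR description above; the exact form of the pin is an assumption):

```
# Sketch: fail fast on uv versions that predate
# extra-build-dependencies support (uv 0.8.10).
[tool.uv]
required-version = ">=0.8.10"
```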
li-boxuan pushed commits to li-boxuan/SkyRL that referenced this pull request Nov 23, 2025 (same commit messages as above).
dzorlu pushed commits to fleet-ai/SkyRL that referenced this pull request Feb 4, 2026 (same commit messages as above).
