[train][fix] Temporarily rename /update_weights endpoint to prevent conflict with native vllm endpoint #1265

Merged
SumanthRH merged 4 commits into main from sumanthrh/weight-sync-temp-fix on Mar 3, 2026
Conversation


@SumanthRH SumanthRH commented Mar 3, 2026

What does this PR do?

Temporary fix for weight sync CI failures reported in #1242 to unblock release.

Changes

test_weight_sync.py was failing because SkyRL's /update_weights endpoint was conflicting with vLLM's /update_weights endpoint.

This PR fixes the issue by renaming our update weights endpoint to /update_weights_skyrl.

The same rename is applied to the remote server path in the old InferenceEngineClient codepath (generators/inference_engines).

The long-term fix is to migrate SkyRL's new inference servers codepath (generators/inference_servers) to use the native vLLM API endpoints.



Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
@SumanthRH SumanthRH marked this pull request as ready for review March 3, 2026 19:42

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request addresses a conflict with a native vllm endpoint by renaming /update_weights to /update_weights_skyrl. However, a critical security vulnerability was identified: these sensitive endpoints lack any form of authentication. This allows anyone with network access to the inference server to trigger model weight updates, which could lead to model hijacking or denial of service. It is highly recommended to implement authentication for these custom endpoints, especially if the servers are reachable over a network. Additionally, to improve maintainability, consider using a constant for the new endpoint path to avoid hardcoding it in multiple locations.
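As a rough illustration of the authentication suggestion, the sketch below checks a bearer token before a weight update would be accepted. It is not part of this PR: the function name `is_authorized`, the use of an `Authorization: Bearer` header, and the shared-secret scheme are all assumptions made for the example, and it uses only the standard library rather than any SkyRL or vLLM API.

```python
import hmac
from typing import Optional


def is_authorized(authorization_header: Optional[str], expected_token: str) -> bool:
    """Return True only for a well-formed 'Bearer <token>' header with the expected token.

    Hypothetical helper: a server handler for /update_weights_skyrl could call this
    and reject the request (e.g. with HTTP 401) when it returns False.
    """
    if not authorization_header:
        return False
    # Split "Bearer abc" into scheme and token on the first space.
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer" or not token:
        return False
    # Constant-time comparison avoids leaking token contents through timing.
    return hmac.compare_digest(token.encode(), expected_token.encode())
```

The client side would then need to send the same header on its POST to the endpoint; how the secret is distributed to trainers and inference servers is left open here.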


  resp = await session.post(
-     f"{self._url}/update_weights",
+     f"{self._url}/update_weights_skyrl",

medium

The endpoint path /update_weights_skyrl is hardcoded. This string is also used in the corresponding server implementation. To improve maintainability and ensure consistency, consider defining this path as a constant in a shared module (e.g., in a new constants.py file) and importing it where needed. This would make future changes to the endpoint easier and less error-prone.
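A minimal sketch of the shared-constant suggestion, assuming a hypothetical constant `UPDATE_WEIGHTS_ENDPOINT` and helper `build_update_weights_url` (neither exists in the PR; in practice the constant would live in a shared module such as the suggested constants.py):

```python
# Hypothetical shared constant: the single place where the endpoint path is defined.
UPDATE_WEIGHTS_ENDPOINT = "/update_weights_skyrl"


def build_update_weights_url(base_url: str) -> str:
    """Join a server base URL with the shared endpoint path.

    Strips a trailing slash from base_url so the result never contains '//'.
    """
    return f"{base_url.rstrip('/')}{UPDATE_WEIGHTS_ENDPOINT}"
```

With this in place, the client call sites and the server route decorator would reference `UPDATE_WEIGHTS_ENDPOINT` instead of repeating the string, so a later rename (for example, when migrating to the native vLLM endpoints) touches one line.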

  Dict mapping server_url to response.
  """
- return await self._call_all_servers("/update_weights", request.to_json_dict())
+ return await self._call_all_servers("/update_weights_skyrl", request.to_json_dict())

medium

The endpoint path /update_weights_skyrl is hardcoded. This string is also used in the corresponding server implementation. To improve maintainability and ensure consistency, consider defining this path as a constant in a shared module (e.g., in a new constants.py file) and importing it where needed. This would make future changes to the endpoint easier and less error-prone.

  return {"status": "ok", "server_id": server_id}

- @app.post("/update_weights")
+ @app.post("/update_weights_skyrl")

medium

The endpoint path /update_weights_skyrl is hardcoded in this test's mock server. To ensure tests stay in sync with the application code, it would be better to import the endpoint path from the same shared constant used in the application code.

  # Verify correct endpoint called
  call_args = mock_session.post.call_args
- assert call_args[0][0] == f"{url}/update_weights"
+ assert call_args[0][0] == f"{url}/update_weights_skyrl"

medium

The endpoint path /update_weights_skyrl is hardcoded in this test assertion. To ensure tests stay in sync with the application code, it would be better to import the endpoint path from the same shared constant used in the application code and use it in the assertion.
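The suggestion could look roughly like the following, where the assertion is built from the same constant the client uses. This is a self-contained sketch, not the repository's test: `UPDATE_WEIGHTS_ENDPOINT`, `send_update`, and `check_endpoint_called` are hypothetical names, and the constant is defined inline only so the example runs on its own.

```python
import asyncio
from unittest.mock import AsyncMock

# Hypothetical shared constant; in real code this would be imported, not redefined.
UPDATE_WEIGHTS_ENDPOINT = "/update_weights_skyrl"


async def send_update(session, url: str, payload: dict):
    """Stand-in for the client method under test: POST the payload to the endpoint."""
    return await session.post(f"{url}{UPDATE_WEIGHTS_ENDPOINT}", json=payload)


def check_endpoint_called() -> str:
    """Call the client with a mocked session and verify which URL was posted to."""
    mock_session = AsyncMock()
    url = "http://localhost:8000"
    asyncio.run(send_update(mock_session, url, {"names": []}))

    # Verify correct endpoint called, using the constant rather than a literal.
    call_args = mock_session.post.call_args
    assert call_args[0][0] == f"{url}{UPDATE_WEIGHTS_ENDPOINT}"
    return call_args[0][0]
```

Because both the code under test and the assertion read the same constant, renaming the endpoint again would not require editing the test.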

@SumanthRH SumanthRH merged commit a93c11e into main Mar 3, 2026
5 of 7 checks passed


Development

Successfully merging this pull request may close these issues.

[train] Update new inference servers codepath after vllm upgrade to 0.16.0