fix(mnnchat): reuse loaded runtime session for API start #4318

Draft
Juude wants to merge 2 commits into alibaba:master from Juude:codex/mnnchat-api-runtime-reuse
Juude (Collaborator) commented Mar 25, 2026

Summary

  • reuse an already loaded runtime session when API service starts for the same model
  • avoid releasing and reloading the model when chat already opened the model with app config
  • add a focused unit test for the API runtime session start policy

Why

Opening the API service after entering chat could hit a config-domain mismatch:

  • chat loads the model via useAppConfig=true
  • API startup previously called ensureSession(modelId) with the default base-config path
  • the runtime controller treated that as non-reusable and released/rebuilt the same model session

That extra reload at the API startup / first-request boundary could cause freezes or crashes for local models such as Qwen3.5-0.8B and 2B.
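The mismatch described above can be illustrated with a minimal sketch. This is not the code in this PR: `LoadedSession`, `StartAction`, `ApiStartPolicySketch`, and both method names are hypothetical, and the sketch assumes the decision can be made from the loaded model id and the config flag alone.

```java
/** Hypothetical snapshot of the session the runtime currently holds. */
final class LoadedSession {
    final String modelId;
    final boolean useAppConfig;

    LoadedSession(String modelId, boolean useAppConfig) {
        this.modelId = modelId;
        this.useAppConfig = useAppConfig;
    }
}

/** Hypothetical outcome of the API start decision. */
enum StartAction { REUSE_LOADED_SESSION, LOAD_NEW_SESSION }

final class ApiStartPolicySketch {
    private ApiStartPolicySketch() {}

    /**
     * Models the old behaviour as described above: a session loaded with a
     * different config domain is treated as non-reusable, so the same model
     * is released and rebuilt.
     */
    static StartAction decideStrict(LoadedSession loaded,
                                    String requestedModelId,
                                    boolean requestedUseAppConfig) {
        boolean sameModel = loaded != null
                && requestedModelId != null
                && requestedModelId.equals(loaded.modelId);
        if (sameModel && loaded.useAppConfig == requestedUseAppConfig) {
            return StartAction.REUSE_LOADED_SESSION;
        }
        return StartAction.LOAD_NEW_SESSION;
    }

    /**
     * Models the intent of this change: if the requested model is already
     * loaded, reuse that session regardless of which config path loaded it.
     */
    static StartAction decide(LoadedSession loaded, String requestedModelId) {
        boolean sameModel = loaded != null
                && requestedModelId != null
                && requestedModelId.equals(loaded.modelId);
        return sameModel
                ? StartAction.REUSE_LOADED_SESSION
                : StartAction.LOAD_NEW_SESSION;
    }
}
```

In this sketch, a chat session loaded with useAppConfig=true followed by an API start on the base-config path yields LOAD_NEW_SESSION under `decideStrict` and REUSE_LOADED_SESSION under `decide`. A different model id, or no loaded session at all, still results in a fresh load.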

Verification

  • ./gradlew :app:testStandardDebugUnitTest --tests com.alibaba.mnnllm.api.openai.service.ApiRuntimeSessionStartPolicyTest
  • Focused device validation on 2112123AC before the adb connection dropped:
    • entered chat for Qwen3.5-0.8B-MNN
    • confirmed dumpapp llm status reported an active loaded session
    • started API service for the same model
    • verified logcat hit Reusing loaded runtime session for API start...
    • verified device-local API request returned HTTP 200

Known blockers

  • On clean origin/master, ./tests/smoke/scripts/08_regress_api_dumpapp.sh currently fails at baseline because it sources tests/smoke/scripts/smoke_state_helpers.sh, which is not present on that base.
  • ./tests/smoke/scripts/09_regress_api_uiautomator.sh rerun on this clean PR branch was blocked after the adb device disconnected from the host.
