feat: add distributed mode #9124

Draft: mudler wants to merge 1 commit into master from feat/distributed-mode

@mudler mudler commented Mar 23, 2026

Description

The objective of this PR is to make LocalAI horizontally scalable and to delegate processing to remote gRPC LocalAI workers.

Distributed mode enables horizontal scaling of LocalAI across multiple machines, using PostgreSQL for state and the node registry, and NATS for real-time coordination. Unlike P2P mode, distributed mode is designed for production deployments and Kubernetes environments where you need centralized management, health monitoring, and deterministic routing. To enable it, pass --distributed to LocalAI. A Docker Compose file is also provided to quickly start the full stack with a single command.

Note: unlike other ways to run LocalAI, distributed mode requires authentication to be enabled and backed by a PostgreSQL database; SQLite is not supported. This is because the node registry, job store, and other distributed state are stored in PostgreSQL tables.

Architecture:

                    ┌─────────────────┐
                    │   Load Balancer  │
                    └────────┬────────┘
                             │
              ┌──────────────┼──────────────┐
              │              │              │
      ┌───────▼──────┐ ┌────▼─────┐ ┌─────▼──────┐
      │  Frontend #1 │ │ Frontend │ │ Frontend #N│
      │  (LocalAI)   │ │  #2      │ │  (LocalAI) │
      └──────┬───────┘ └────┬─────┘ └─────┬──────┘
             │              │              │
     ┌───────▼──────────────▼──────────────▼───────┐
     │              PostgreSQL + NATS               │
     │  (node registry, jobs, coordination)         │
     └───────┬──────────────┬──────────────┬───────┘
             │              │              │
      ┌──────▼──────┐ ┌────▼─────┐ ┌─────▼──────┐
      │  Worker #1  │ │ Worker   │ │ Worker #N  │
      │  (generic)  │ │ #2       │ │  (generic) │
      └─────────────┘ └──────────┘ └────────────┘

Frontends are stateless LocalAI instances that receive API requests and route them to worker nodes via the SmartRouter. All frontends share state through PostgreSQL and coordinate via NATS.

Workers are generic processes that self-register with a frontend. They don't have a fixed backend type — the SmartRouter dynamically installs the required backend via NATS backend.install events when a model request arrives.
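As a rough illustration of the worker side described above, here is a minimal sketch in which a generic worker reacts to a backend.install event by installing the requested backend once. Everything except the subject name backend.install is an assumption made for the example: the InstallEvent payload shape, the Worker type, and the in-memory Bus (a stand-in for NATS, not the real client) are hypothetical and not taken from this PR.

```go
package main

import "fmt"

// InstallEvent is a hypothetical payload for a "backend.install" message.
type InstallEvent struct {
	NodeID  string // worker that should install the backend
	Backend string // backend name, e.g. "llama-cpp"
}

// Bus is an in-memory stand-in for NATS publish/subscribe (not the real client).
type Bus struct {
	handlers map[string][]func(InstallEvent)
}

func NewBus() *Bus {
	return &Bus{handlers: make(map[string][]func(InstallEvent))}
}

// Subscribe registers a handler for a subject.
func (b *Bus) Subscribe(subject string, h func(InstallEvent)) {
	b.handlers[subject] = append(b.handlers[subject], h)
}

// Publish delivers the event synchronously to every handler on the subject.
func (b *Bus) Publish(subject string, ev InstallEvent) {
	for _, h := range b.handlers[subject] {
		h(ev)
	}
}

// Worker is a hypothetical generic worker with no fixed backend type.
type Worker struct {
	ID        string
	Installed map[string]bool // backends currently installed
	Installs  int             // number of installs actually performed
}

// NewWorker creates a worker and subscribes it to install events.
func NewWorker(id string, bus *Bus) *Worker {
	w := &Worker{ID: id, Installed: make(map[string]bool)}
	bus.Subscribe("backend.install", func(ev InstallEvent) {
		// Ignore events for other nodes and backends already present.
		if ev.NodeID != w.ID || w.Installed[ev.Backend] {
			return
		}
		w.Installed[ev.Backend] = true // a real worker would download the backend here
		w.Installs++
	})
	return w
}

func main() {
	bus := NewBus()
	w := NewWorker("worker-1", bus)
	bus.Publish("backend.install", InstallEvent{NodeID: "worker-1", Backend: "llama-cpp"})
	fmt.Println(w.ID, "has llama-cpp:", w.Installed["llama-cpp"])
}
```

Making the handler idempotent is a design choice of the sketch: it means a router could publish the install event on every model request without tracking whether the worker has already seen it.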

Scheduling Algorithm

The SmartRouter uses idle-first scheduling:

  1. If the model is already loaded on a node → use it (least in-flight)
  2. If no node has the model → prefer truly idle nodes (zero models, zero in-flight requests), and try to fit the model within each node's reported free VRAM/RAM
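
The two steps above can be sketched roughly as follows. This is not the SmartRouter code from the PR: the Node type, the PickNode function, and the needVRAM parameter are hypothetical names chosen for the example. Two behaviors are also assumptions that the description does not specify: when no idle node fits, the sketch falls back to a busy node with enough free VRAM, and ties are broken by the largest reported free VRAM.

```go
package main

import (
	"errors"
	"fmt"
)

// Node is a hypothetical view of a worker as a router might see it.
type Node struct {
	ID           string
	LoadedModels map[string]bool
	InFlight     int    // requests currently being served
	FreeVRAM     uint64 // bytes, as reported by the node
}

// ErrNoNode is returned when no node can host the model.
var ErrNoNode = errors.New("no suitable node")

// isIdle reports whether a node has zero models and zero in-flight requests.
func isIdle(n *Node) bool {
	return len(n.LoadedModels) == 0 && n.InFlight == 0
}

// PickNode sketches idle-first scheduling for a model needing needVRAM bytes.
func PickNode(nodes []Node, model string, needVRAM uint64) (Node, error) {
	var best *Node

	// Step 1: the model is already loaded somewhere, so pick the
	// node with the fewest in-flight requests.
	for i := range nodes {
		n := &nodes[i]
		if n.LoadedModels[model] && (best == nil || n.InFlight < best.InFlight) {
			best = n
		}
	}
	if best != nil {
		return *best, nil
	}

	// Step 2: no node has the model. Among nodes with enough free VRAM,
	// prefer truly idle ones; break ties by most free VRAM (assumption).
	for i := range nodes {
		n := &nodes[i]
		if n.FreeVRAM < needVRAM {
			continue
		}
		switch {
		case best == nil:
			best = n
		case isIdle(n) && !isIdle(best):
			best = n
		case isIdle(n) == isIdle(best) && n.FreeVRAM > best.FreeVRAM:
			best = n
		}
	}
	if best == nil {
		return Node{}, ErrNoNode
	}
	return *best, nil
}

func main() {
	nodes := []Node{
		{ID: "a", LoadedModels: map[string]bool{"llama": true}, InFlight: 3, FreeVRAM: 1 << 30},
		{ID: "b", FreeVRAM: 8 << 30},
	}
	n, err := PickNode(nodes, "qwen", 4<<30)
	fmt.Println(n.ID, err)
}
```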

Nodes page:

[Screenshot: LocalAI Nodes page, 2026-03-24]

Notes for Reviewers

TODO:

  • Make sure we also sync to nodes the files that are referenced inside options (this is a bit more challenging), as well as mmproj files
  • Re-use the VRAM detection logic to route models more efficiently to nodes that have free VRAM, not only based on capacity
  • Add hints in the UI on how to start workers
  • Backend management in distributed mode (it should be possible to install/delete backends as well)
  • Model management (if a model is deleted from the frontend, it should be removed from the nodes too)
  • Dynamic auth tokens for nodes (currently users have to specify a registration token manually, and it must be the same on the frontend and on the workers) -> went with approval/auto-approval mode

Signed commits

  • Yes, I signed my commits.

netlify bot commented Mar 23, 2026

Deploy Preview for localai failed.

🔨 Latest commit: 5aa34de
🔍 Latest deploy log: https://app.netlify.com/projects/localai/deploys/69c315ff7eb98e000832c96d

@mudler mudler force-pushed the feat/distributed-mode branch 2 times, most recently from 36bed92 to 23f3831 Compare March 24, 2026 21:35
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
@mudler mudler force-pushed the feat/distributed-mode branch from 23f3831 to 5aa34de Compare March 24, 2026 22:53
@mudler mudler changed the title from "feat: add distributed mode (experimental)" to "feat: add distributed mode" Mar 24, 2026