feat: add distributed mode by mudler · Pull Request #9124 · mudler/LocalAI

mudler · 2026-03-23T23:47:35Z

Description

This PR makes LocalAI horizontally scalable by delegating processing to remote LocalAI workers over gRPC.

Distributed mode enables horizontal scaling of LocalAI across multiple machines using PostgreSQL for state and node registry, and NATS for real-time coordination. Unlike P2P mode, distributed mode is designed for production deployments and Kubernetes environments where you need centralized management, health monitoring, and deterministic routing. To enable it, pass --distributed to LocalAI. A Docker Compose file is also provided to quickly start the full stack with a single command.

Note: unlike other ways of running LocalAI, distributed mode requires authentication to be enabled and backed by a PostgreSQL database; SQLite is not supported. This is because the node registry, job store, and other distributed state are stored in PostgreSQL tables.

Architecture:

                    ┌─────────────────┐
                    │  Load Balancer  │
                    └────────┬────────┘
                             │
              ┌──────────────┼──────────────┐
              │              │              │
      ┌───────▼──────┐ ┌────▼─────┐ ┌─────▼──────┐
      │  Frontend #1 │ │ Frontend │ │ Frontend #N│
      │  (LocalAI)   │ │  #2      │ │  (LocalAI) │
      └──────┬───────┘ └────┬─────┘ └─────┬──────┘
             │              │              │
     ┌───────▼──────────────▼──────────────▼───────┐
     │              PostgreSQL + NATS               │
     │  (node registry, jobs, coordination)         │
     └───────┬──────────────┬──────────────┬───────┘
             │              │              │
      ┌──────▼──────┐ ┌────▼─────┐ ┌─────▼──────┐
      │  Worker #1  │ │ Worker   │ │ Worker #N  │
      │  (generic)  │ │ #2       │ │  (generic) │
      └─────────────┘ └──────────┘ └────────────┘

Frontends are stateless LocalAI instances that receive API requests and route them to worker nodes via the SmartRouter. All frontends share state through PostgreSQL and coordinate via NATS.

Workers are generic processes that self-register with a frontend. They don't have a fixed backend type — the SmartRouter dynamically installs the required backend via NATS backend.install events when a model request arrives.
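
To illustrate the "generic worker" idea, here is a minimal Go sketch of how a worker might react to a backend.install event. It is not the code in this PR: the `InstallEvent` payload, its JSON field names, the `Worker` type, and `HandleInstall` are all hypothetical, and the NATS subscription itself is left out so the example stays stdlib-only. The installation step is simulated by recording the backend name.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// InstallEvent is a HYPOTHETICAL payload for a "backend.install" message.
// The real event schema is not described in the PR text.
type InstallEvent struct {
	NodeID  string `json:"node_id"`
	Backend string `json:"backend"`
}

// Worker models a generic worker: it starts with no backend installed
// and gains backends only when asked to.
type Worker struct {
	ID        string
	Installed map[string]bool
}

// HandleInstall decodes an event and installs the requested backend if the
// event targets this worker and the backend is not installed yet.
// It reports whether an installation was performed.
func (w *Worker) HandleInstall(payload []byte) (bool, error) {
	var ev InstallEvent
	if err := json.Unmarshal(payload, &ev); err != nil {
		return false, fmt.Errorf("decode backend.install event: %w", err)
	}
	// Ignore events addressed to other nodes or carrying no backend name.
	if ev.NodeID != w.ID || ev.Backend == "" {
		return false, nil
	}
	// Already installed: nothing to do, so repeated events are harmless.
	if w.Installed[ev.Backend] {
		return false, nil
	}
	if w.Installed == nil {
		w.Installed = map[string]bool{}
	}
	// A real worker would download and start the backend here (assumption).
	w.Installed[ev.Backend] = true
	return true, nil
}
```

Under these assumptions, a worker with ID "worker-1" that receives `{"node_id":"worker-1","backend":"llama-cpp"}` twice installs the backend once and treats the second event as a no-op.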

Scheduling Algorithm

The SmartRouter uses idle-first scheduling:

If the model is already loaded on one or more nodes → use the one with the fewest in-flight requests
If no node has the model loaded → prefer truly idle nodes (zero models loaded, zero in-flight requests), trying to fit the model within each node's reported free VRAM/RAM
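
The two rules above can be sketched in Go roughly as follows. This is an illustration under stated assumptions, not the SmartRouter implementation: the `Node` fields, `PickNode`, and `ErrNoNode` are hypothetical names, and the fallback used when no truly idle node fits (pick the fitting node with the fewest in-flight requests) is an assumption, since the description only says idle nodes are preferred.

```go
package main

import "errors"

// Node is a HYPOTHETICAL view of a worker as seen by the router.
// Field names are illustrative, not LocalAI's registry schema.
type Node struct {
	ID           string
	LoadedModels map[string]bool // models currently loaded on the node
	InFlight     int             // requests currently being processed
	FreeVRAM     uint64          // free VRAM (or RAM) in bytes, as reported by the node
}

// ErrNoNode is returned when no node can serve or load the model.
var ErrNoNode = errors.New("no suitable node")

// isIdle reports whether a node has zero models loaded and zero in-flight requests.
func isIdle(n *Node) bool { return len(n.LoadedModels) == 0 && n.InFlight == 0 }

// better reports whether a should be preferred over b for a cold model load:
// idle nodes first, then fewer in-flight requests (the second rule is an assumption).
func better(a, b *Node) bool {
	if isIdle(a) != isIdle(b) {
		return isIdle(a)
	}
	return a.InFlight < b.InFlight
}

// PickNode sketches idle-first scheduling for a model of the given size in bytes.
func PickNode(nodes []Node, model string, modelSize uint64) (Node, error) {
	var best *Node

	// Rule 1: if the model is already loaded somewhere, the node with the
	// fewest in-flight requests wins.
	for i := range nodes {
		n := &nodes[i]
		if n.LoadedModels[model] && (best == nil || n.InFlight < best.InFlight) {
			best = n
		}
	}
	if best != nil {
		return *best, nil
	}

	// Rule 2: no node has the model. Only consider nodes whose reported free
	// memory fits the model, preferring truly idle ones.
	for i := range nodes {
		n := &nodes[i]
		if n.FreeVRAM < modelSize {
			continue
		}
		if best == nil || better(n, best) {
			best = n
		}
	}
	if best == nil {
		return Node{}, ErrNoNode
	}
	return *best, nil
}
```

With this sketch, a request for a model that is loaded on two nodes goes to the less busy one, while a request for a model that is not loaded anywhere goes to an idle node with enough free memory, even if a busier node reports more free VRAM.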

Nodes page:

[Screenshot of the LocalAI Nodes page, 2026-03-24]

Notes for Reviewers

TODO:

Make sure we also sync to nodes the files referenced inside options (this is a bit more challenging), as well as mmproj files
Re-use the VRAM detection logic to route models more efficiently to nodes that have free VRAM, not only based on capacity
Add hints in the UI on how to start workers
Backend management in distributed mode (should be able to install/delete backends as well)
Model management (if a model is deleted from the frontend, it should be removed from the nodes too)
Dynamic auth tokens for nodes (currently users have to specify a registration token manually, and it has to be the same in the frontend and in the workers) -> went with approval/auto-approval mode

Signed commits

Yes, I signed my commits.

netlify · 2026-03-23T23:47:46Z

❌ Deploy Preview for localai failed.

Name	Link
🔨 Latest commit	`5aa34de`
🔍 Latest deploy log	https://app.netlify.com/projects/localai/deploys/69c315ff7eb98e000832c96d

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

mudler mentioned this pull request Mar 24, 2026

add local remote llamas #9122

Open

mudler force-pushed the feat/distributed-mode branch 2 times, most recently from 36bed92 to 23f3831 Compare March 24, 2026 21:35

feat: add distributed mode (experimental)

5aa34de

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

mudler force-pushed the feat/distributed-mode branch from 23f3831 to 5aa34de Compare March 24, 2026 22:53

mudler changed the title ~~feat: add distributed mode (experimental)~~ feat: add distributed mode Mar 24, 2026

mudler wants to merge 1 commit into master from feat/distributed-mode
