fix catalog server filtering and implement bm25 search #423

Merged
slimslenderslacks merged 2 commits into docker:main from slimslenderslacks:import-json-profile
Feb 25, 2026

@slimslenderslacks
Collaborator

BM25 Search Implementation

  • Replace keyword search with Okapi BM25 ranking algorithm for mcp-find
  • Implement field weighting (name ×4, title ×3, tools ×2, description ×1)
  • Add comprehensive test suite with 8 test scenarios
  • Add timing logs for index build, scoring, and total query time
  • Build fresh index on each query to include dynamically activated servers
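
The following is a minimal sketch of weighted BM25 scoring, not the code from this PR. It assumes field weights are applied by counting each term occurrence with its field's weight before the usual Okapi BM25 formula; the actual change may weight fields differently. The `Server`, `doc`, and `index` types, the `tokenize`, `buildIndex`, and `score` helpers, and the `k1`/`b` values are all hypothetical.

```go
package main

import (
	"fmt"
	"math"
	"strings"
	"unicode"
)

// Server is a hypothetical, simplified catalog entry (not the gateway's type).
type Server struct {
	Name, Title, Description string
	Tools                    []string
}

// doc stores weighted term frequencies and the weighted length of one server.
type doc struct {
	name   string
	tf     map[string]float64
	length float64
}

type index struct {
	docs   []doc
	df     map[string]int // number of documents containing each term
	avgLen float64
}

// tokenize lowercases s and splits on any rune that is not a letter or digit.
func tokenize(s string) []string {
	return strings.FieldsFunc(strings.ToLower(s), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
}

// buildIndex folds all fields into one bag of terms, counting each occurrence
// with its field weight (name 4, title 3, tools 2, description 1).
func buildIndex(servers []Server) index {
	idx := index{df: map[string]int{}}
	var total float64
	for _, s := range servers {
		d := doc{name: s.Name, tf: map[string]float64{}}
		add := func(text string, weight float64) {
			for _, t := range tokenize(text) {
				d.tf[t] += weight
				d.length += weight
			}
		}
		add(s.Name, 4)
		add(s.Title, 3)
		add(strings.Join(s.Tools, " "), 2)
		add(s.Description, 1)
		for t := range d.tf {
			idx.df[t]++
		}
		total += d.length
		idx.docs = append(idx.docs, d)
	}
	if len(idx.docs) > 0 {
		idx.avgLen = total / float64(len(idx.docs))
	}
	return idx
}

// Assumed BM25 parameters; common defaults, not taken from the PR.
const k1, b = 1.2, 0.75

// score returns the BM25 score of d for query, using weighted frequencies.
func score(idx index, d doc, query string) float64 {
	n := float64(len(idx.docs))
	var sum float64
	for _, t := range tokenize(query) {
		f := d.tf[t]
		if f == 0 {
			continue
		}
		df := float64(idx.df[t])
		idf := math.Log(1 + (n-df+0.5)/(df+0.5))
		sum += idf * f * (k1 + 1) / (f + k1*(1-b+b*d.length/idx.avgLen))
	}
	return sum
}

func main() {
	idx := buildIndex([]Server{
		{Name: "github", Title: "GitHub", Description: "Manage repositories", Tools: []string{"create_issue"}},
		{Name: "notes", Title: "Notes", Description: "Sync notes to github"},
	})
	for _, d := range idx.docs {
		fmt.Printf("%s %.3f\n", d.name, score(idx, d, "github"))
	}
}
```

With these weights, a query term that matches a server's name contributes more than the same term appearing only in a description, which is the ranking behavior the bullets above describe.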

Critical Bug Fix: Catalog Server Filtering

  • Fix FilterByPolicy removing all catalog servers from mcp-find results
  • Preserve catalog servers in configuration.servers map for search
  • Only apply policy filtering to enabled servers in serverNames
  • Result: mcp-find now searches all 340 servers (1 enabled + 339 catalog)
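
As a rough illustration of the fix described above, the sketch below applies a policy check only to the enabled server names and leaves the full server map intact for search. The `Configuration` struct, its fields, and `filterEnabledByPolicy` are hypothetical stand-ins, not the gateway's real types or its `FilterByPolicy` signature.

```go
package main

import "fmt"

// Configuration is a hypothetical, trimmed-down stand-in for the gateway
// configuration: ServerNames lists enabled servers, Servers holds every
// known server (enabled and catalog) keyed by name.
type Configuration struct {
	ServerNames []string
	Servers     map[string]string
}

// filterEnabledByPolicy drops enabled servers that allow rejects.
// It deliberately does not touch cfg.Servers, so catalog entries stay searchable.
func filterEnabledByPolicy(cfg Configuration, allow func(name string) bool) Configuration {
	kept := make([]string, 0, len(cfg.ServerNames))
	for _, name := range cfg.ServerNames {
		if allow(name) {
			kept = append(kept, name)
		}
	}
	cfg.ServerNames = kept
	return cfg
}

func main() {
	cfg := Configuration{
		ServerNames: []string{"github", "blocked"},
		Servers: map[string]string{
			"github":   "GitHub tools",
			"blocked":  "Denied by policy",
			"postgres": "Catalog only",
		},
	}
	out := filterEnabledByPolicy(cfg, func(name string) bool { return name != "blocked" })
	fmt.Println(out.ServerNames, len(out.Servers))
}
```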

Profile Management Improvements

  • Refactor ProfileActivator interface to accept WorkingSet directly
  • Separate profile loading from activation logic
  • Add profiles.json support for project-level profiles
  • Implement atomic file writes with temp file + rename pattern
  • Add comprehensive tests for SaveProfile and LoadProfiles
  • Add mutex protection for concurrent configuration access

Breaking Changes

  • ProfileActivator.ActivateProfile now takes WorkingSet instead of string

  This commit introduces significant improvements to server discovery and
  profile management, as itemized in the sections above.
@slimslenderslacks slimslenderslacks requested a review from a team as a code owner February 25, 2026 02:03
@slimslenderslacks
Collaborator Author

This fixes the problem introduced by policy where Catalog.servers is filtered down to only the enabled servers.

sequenceDiagram
    participant Client as AI Client
    participant Gateway
    participant Handler as bm25Strategy
    participant Config as Configuration
    participant Index as BM25 Index

    Client->>Gateway: CallTool("mcp-find", {query: "github"})
    Gateway->>Handler: Invoke handler

    Note over Handler: Build fresh index
    Handler->>Config: Read configuration.servers
    Config-->>Handler: 340 servers (1 enabled + 339 catalog)

    Handler->>Index: buildBM25Index(servers)
    loop For each server
        Index->>Index: Tokenize name (×4 weight)
        Index->>Index: Tokenize title (×3 weight)
        Index->>Index: Tokenize description (×1)
        Index->>Index: Tokenize tool names (×2)
        Index->>Index: Tokenize tool descriptions (×1)
    end
    Index-->>Handler: Index with 340 documents

    Note over Handler: Score query
    Handler->>Handler: Tokenize query: ["github"]
    Handler->>Index: Score all documents
    Index-->>Handler: Scored results

    Handler->>Handler: Sort by score descending
    Handler->>Handler: Apply limit
    Handler-->>Gateway: Formatted results
    Gateway-->>Client: Server matches with metadata
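
The last steps of the diagram (sort by score descending, apply limit) might look like the sketch below, with a timing measurement similar in spirit to the timing logs mentioned in the PR description. The `scored` type and `rank` function are hypothetical, and dropping zero-score entries is an assumption rather than confirmed behavior.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// scored is a hypothetical pairing of a server name with its BM25 score.
type scored struct {
	Name  string
	Score float64
}

// rank keeps entries with a positive score, sorts them by score descending,
// and truncates to limit. A limit of 0 or less means "no limit".
func rank(results []scored, limit int) []scored {
	matches := make([]scored, 0, len(results))
	for _, r := range results {
		if r.Score > 0 {
			matches = append(matches, r)
		}
	}
	sort.SliceStable(matches, func(i, j int) bool {
		return matches[i].Score > matches[j].Score
	})
	if limit > 0 && len(matches) > limit {
		matches = matches[:limit]
	}
	return matches
}

func main() {
	start := time.Now()
	top := rank([]scored{
		{Name: "notes", Score: 0.47},
		{Name: "github", Score: 0.86},
		{Name: "postgres", Score: 0},
	}, 2)
	fmt.Printf("results=%v elapsed=%s\n", top, time.Since(start))
}
```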

@slimslenderslacks
Collaborator Author

This also allows us to import directly from a profiles.json file written to the project.

flowchart TD
    Start[Profile Activation Request]

    CheckSource{Where to<br/>look first?}

    InProject{Found in<br/>profiles.json?}
    InDB{Found in<br/>database?}

    ValidateConfig{All servers<br/>valid?}
    ValidateSecrets{Required<br/>secrets available?}
    ValidatePolicy{Passes<br/>policy check?}

    AlreadyActive{Server already<br/>in session?}

    AddToConfig[Add to configuration.servers]
    AddToNames[Add to configuration.serverNames]

    ReloadCaps[Reload capabilities]

    Success[Activation Successful]
    PartialSuccess[Partial Activation<br/>Some servers skipped]
    Failure[Activation Failed]

    Start --> CheckSource
    CheckSource -->|Project-local| InProject
    CheckSource -->|Saved| InDB

    InProject -->|Yes| ValidateConfig
    InProject -->|No| InDB
    InDB -->|Yes| ValidateConfig
    InDB -->|No| Failure

    ValidateConfig -->|Invalid| Failure
    ValidateConfig -->|Valid| ValidateSecrets

    ValidateSecrets -->|Missing| PartialSuccess
    ValidateSecrets -->|Available| ValidatePolicy

    ValidatePolicy -->|Denied| PartialSuccess
    ValidatePolicy -->|Allowed| AlreadyActive

    AlreadyActive -->|Yes, skip| ReloadCaps
    AlreadyActive -->|No| AddToConfig

    AddToConfig --> AddToNames
    AddToNames --> ReloadCaps

    ReloadCaps --> Success

    style Success fill:#27ae60,color:#fff
    style PartialSuccess fill:#f39c12,color:#fff
    style Failure fill:#e74c3c,color:#fff

saucow previously approved these changes Feb 25, 2026
Contributor

@saucow saucow left a comment

lgtm - from perspective of policy and search improvements.

Comment on lines -27 to 32
-func LoadProfiles(ctx context.Context, activator ProfileActivator) error {
+// LoadProfiles loads and returns all profiles from profiles.json in the current working directory
+func LoadProfiles(_ context.Context) (ProfilesConfig, error) {
 	// Get current working directory
 	cwd, err := os.Getwd()
 	if err != nil {
-		return fmt.Errorf("failed to get current working directory: %w", err)
+		return nil, fmt.Errorf("failed to get current working directory: %w", err)
 	}
Contributor

@cmrigney @slimslenderslacks - do we need to handle any backwards compatibility here if the old format is detected?

Collaborator Author

This is called in two places:

  • from middleware, when the client identifies itself as Claude Code (the only client handled this way today)
  • when an agent explicitly calls activate-profile in a session

However, there hasn't been a format change to the profile metadata here.

* do not complain about missing properties with defaults
@slimslenderslacks slimslenderslacks merged commit 5d74672 into docker:main Feb 25, 2026
5 checks passed