Skip to content

improve: enhance tool descriptions for better LLM agent tool selection#651

Closed
spidershield-contrib wants to merge 2 commits intografana:mainfrom
spidershield-contrib:improve/tool-descriptions
Closed

improve: enhance tool descriptions for better LLM agent tool selection#651
spidershield-contrib wants to merge 2 commits intografana:mainfrom
spidershield-contrib:improve/tool-descriptions

Conversation

@spidershield-contrib
Copy link
Copy Markdown

@spidershield-contrib spidershield-contrib commented Mar 12, 2026

What

Rewrote 60 MCP tool descriptions in Go source files using an action-verb-first format designed for LLM agent planning.

Why

MCP tool descriptions serve as planning hints — LLMs use them to select which tool to call and in what order. Current descriptions like "Search for Grafana teams by a query string" lack the scenario triggers and disambiguation boundaries needed for reliable agent tool selection.

Format applied

<Action verb> ...

Use when the user wants to [concrete scenario].
Do not use when [boundary] (use [alternative] instead).
Accepts `param` (required/optional). e.g., param="example".
Raises an error if [failure condition].

Results

Metric Before After
Avg description quality 2.7/10 9.8/10
Tools with disambiguation boundary 0/60 60/60
Tools scoring ≥ 9.8/10 0/60 59/60

🤖 Descriptions improved with spidershield — open-source MCP quality scanner.


Note

Low Risk
Changes are limited to human/LLM-facing tool description strings; no execution logic or API wiring is modified. Risk is primarily around potential mismatch between described parameters/behavior and the actual tool schemas/handlers.

Overview
Updates MCP tool metadata by rewriting descriptions across Grafana, datasource, and incident/oncall/query tools to be more action-oriented and LLM-planning friendly, adding clear use/do-not-use boundaries, parameter callouts, examples, and failure conditions.

No functional behavior changes are introduced; only the text passed into mcpgrafana.MustTool(...) is modified to improve tool selection/disambiguation during agent planning.

Written by Cursor Bugbot for commit 9873de6. This will update automatically on new commits. Configure here.

@spidershield-contrib spidershield-contrib requested a review from a team as a code owner March 12, 2026 19:30
@cla-assistant
Copy link
Copy Markdown

cla-assistant bot commented Mar 12, 2026

CLA assistant check
All committers have signed the CLA.

@cla-assistant
Copy link
Copy Markdown

cla-assistant bot commented Mar 12, 2026

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


spidershield-contrib seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@spidershield-contrib spidershield-contrib force-pushed the improve/tool-descriptions branch from 05aadeb to 012e2d1 Compare March 12, 2026 20:37
spidershield-contrib and others added 2 commits March 13, 2026 18:29
Rewrote 60 tool descriptions in Go source files using action-verb-first
format with scenario triggers, disambiguation boundaries, parameter docs,
and error guidance. Avg quality: 2.7/10 → 9.8/10.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Truncated leftover text appended after closing quotes in tool
description strings for UpdateDashboard and several Loki tools
(ListLokiLabelNames, ListLokiLabelValues, QueryLokiLogs,
QueryLokiStats, QueryLokiPatterns). These fragments broke the
Go string literal syntax and caused build failures across all CI checks.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@spidershield-contrib spidershield-contrib force-pushed the improve/tool-descriptions branch from 012e2d1 to 9873de6 Compare March 14, 2026 01:32
@sd2k
Copy link
Copy Markdown
Collaborator

sd2k commented Mar 17, 2026

This is a nice idea but it inflates our token usage by a further 25% and would cost all of our users a significant amount of money so I can't in good conscience merge it, sorry. Perhaps after some tool consolidation this will make more sense.

mcp-grafana on  improve/tool-descriptions [$?] via 🐹 v1.26.1-X:nodwarf5
❯ make build && mcp-tokens analyze --baseline baseline.json -- ./dist/mcp-grafana
go build -o dist/mcp-grafana ./cmd/mcp-grafana
Connecting to MCP server: ./dist/mcp-grafana []
time=2026-03-17T11:02:02.076Z level=INFO msg="Using Grafana configuration" url=http://localhost:3000 api_key_set=false basic_auth_set=false org_id=0 extra_headers_count=0
time=2026-03-17T11:02:02.077Z level=ERROR msg="failed to initialize proxied tools for stdio" error="failed to discover MCP datasources: failed to list datasources: Get \"http://localhost:3000/api/datasources\": dial tcp [::1]:3000: connect: connection refused"
time=2026-03-17T11:02:02.077Z level=INFO msg="Starting Grafana MCP server using stdio transport" version=v0.11.4-0.20260314012944-9873de6be2e1+dirty
time=2026-03-17T11:02:02.077Z level=INFO msg="Using Grafana configuration" url=http://localhost:3000 api_key_set=false basic_auth_set=false org_id=0 extra_headers_count=0
Connected to mcp-grafana v0.11.4-0.20260314012944-9873de6be2e1+dirty (50 tools)

Using anthropic counter (model: claude-sonnet-4-5-20250929)
MCP Token Analysis: mcp-grafana v0.11.4-0.20260314012944-9873de6be2e1+dirty
Counter: anthropic (claude-sonnet-4-5-20250929)
============================================================

Total tokens: 18411

Tools (50 tools, 18411 tokens):
----------------------------------------
    2335 tokens  alerting_manage_rules (desc: 117, schema: 2218)
    1129 tokens  update_dashboard (desc: 206, schema: 923)
    1105 tokens  query_loki_logs (desc: 209, schema: 896)
    1081 tokens  query_prometheus (desc: 202, schema: 879)
    1061 tokens  get_panel_image (desc: 167, schema: 894)
    1030 tokens  fetch_pyroscope_profile (desc: 205, schema: 825)
    1014 tokens  list_alert_groups (desc: 202, schema: 812)
     998 tokens  alerting_manage_routing (desc: 128, schema: 870)
     989 tokens  create_annotation (desc: 174, schema: 815)
     970 tokens  list_prometheus_label_values (desc: 137, schema: 833)
     954 tokens  list_prometheus_label_names (desc: 137, schema: 817)
     951 tokens  get_annotations (desc: 154, schema: 797)
     913 tokens  create_incident (desc: 161, schema: 752)
     902 tokens  query_prometheus_histogram (desc: 109, schema: 793)
     898 tokens  generate_deeplink (desc: 153, schema: 745)
     873 tokens  query_loki_stats (desc: 167, schema: 706)
     872 tokens  query_loki_patterns (desc: 147, schema: 725)
     866 tokens  get_assertions (desc: 141, schema: 725)
     825 tokens  list_pyroscope_label_values (desc: 106, schema: 719)
     819 tokens  find_error_pattern_logs (desc: 158, schema: 661)
     816 tokens  list_loki_label_values (desc: 124, schema: 692)
     803 tokens  find_slow_requests (desc: 142, schema: 661)
     794 tokens  list_pyroscope_label_names (desc: 122, schema: 672)
     794 tokens  update_annotation (desc: 144, schema: 650)
     791 tokens  get_dashboard_property (desc: 175, schema: 616)
     790 tokens  list_loki_label_names (desc: 142, schema: 648)
     788 tokens  get_sift_analysis (desc: 166, schema: 622)
     772 tokens  list_pyroscope_profile_types (desc: 114, schema: 658)
     771 tokens  list_prometheus_metric_names (desc: 133, schema: 638)
     766 tokens  list_prometheus_metric_metadata (desc: 132, schema: 634)
     762 tokens  list_oncall_schedules (desc: 151, schema: 611)
     755 tokens  get_dashboard_panel_queries (desc: 131, schema: 624)
     753 tokens  add_activity_to_incident (desc: 138, schema: 615)
     753 tokens  list_datasources (desc: 128, schema: 625)
     753 tokens  list_oncall_users (desc: 143, schema: 610)
     744 tokens  create_folder (desc: 132, schema: 612)
     735 tokens  list_incidents (desc: 131, schema: 604)
     732 tokens  get_datasource (desc: 150, schema: 582)
     728 tokens  search_dashboards (desc: 126, schema: 602)
     716 tokens  get_sift_investigation (desc: 130, schema: 586)
     714 tokens  get_current_oncall_users (desc: 150, schema: 564)
     700 tokens  get_oncall_shift (desc: 141, schema: 559)
     692 tokens  get_dashboard_by_uid (desc: 140, schema: 552)
     685 tokens  get_annotation_tags (desc: 121, schema: 564)
     679 tokens  list_sift_investigations (desc: 129, schema: 550)
     675 tokens  get_alert_group (desc: 119, schema: 556)
     672 tokens  get_dashboard_summary (desc: 123, schema: 549)
     671 tokens  get_incident (desc: 128, schema: 543)
     667 tokens  list_oncall_teams (desc: 124, schema: 543)
     659 tokens  search_folders (desc: 118, schema: 541)


Baseline Comparison
============================================================

Baseline: 14715 tokens
Current:  18411 tokens
Change:   +3696 tokens (+25.1%)

Tool Changes:
----------------------------------------
  [~] create_annotation (+137 tokens)
  [~] get_sift_analysis (+134 tokens)
  [~] get_annotations (+130 tokens)
  [~] query_prometheus (+128 tokens)
  [~] find_error_pattern_logs (+116 tokens)
  [~] update_annotation (+114 tokens)
  [~] find_slow_requests (+113 tokens)
  [~] get_assertions (+113 tokens)
  [~] list_prometheus_label_names (+109 tokens)
  [~] create_folder (+106 tokens)
  [~] get_current_oncall_users (+104 tokens)
  [~] create_incident (+102 tokens)
  [~] list_prometheus_metric_metadata (+102 tokens)
  [~] get_panel_image (+102 tokens)
  [~] get_annotation_tags (+98 tokens)
  [~] get_oncall_shift (+96 tokens)
  [~] get_incident (+95 tokens)
  [~] list_sift_investigations (+94 tokens)
  [~] add_activity_to_incident (+93 tokens)
  [~] list_incidents (+93 tokens)
  [~] list_oncall_teams (+93 tokens)
  [~] generate_deeplink (+92 tokens)
  [~] list_prometheus_label_values (+92 tokens)
  [~] list_datasources (+92 tokens)
  [~] get_alert_group (+91 tokens)
  [~] get_datasource (+88 tokens)
  [~] search_folders (+87 tokens)
  [~] list_oncall_schedules (+87 tokens)
  [~] search_dashboards (+86 tokens)
  [~] list_prometheus_metric_names (+85 tokens)
  [~] list_oncall_users (+80 tokens)
  [~] get_dashboard_by_uid (+77 tokens)
  [~] list_loki_label_names (+74 tokens)
  [~] get_dashboard_summary (+72 tokens)
  [~] get_sift_investigation (+66 tokens)
  [~] get_dashboard_property (+56 tokens)
  [~] list_alert_groups (+54 tokens)
  [~] query_loki_logs (+42 tokens)
  [~] query_loki_patterns (+41 tokens)
  [~] list_loki_label_values (+37 tokens)
  [~] query_loki_stats (+25 tokens)

Result: FAILED - Token increase of 25.1% exceeds threshold of 5.0%

(using https://github.com/sd2k/mcp-tokens)

@sd2k sd2k added the cla: yes Contributor License Agreement is signed label Mar 26, 2026
@sd2k sd2k closed this Mar 26, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla: yes Contributor License Agreement is signed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants