feat(observability): replace console.log with diagnostics_channel, add typed subscribe helper and ai-chat events #1024

Merged
threepointone merged 3 commits into main from log-diagnostics
Feb 28, 2026

@threepointone
Contributor

Summary

Overhaul the observability system to use Node.js diagnostics_channel instead of console.log. Events are now silent by default (zero overhead when nobody is listening), fully typed with strict payloads, and automatically forwarded to Tail Workers in production.

30 files changed, 849 insertions, 472 deletions.


What changed

1. diagnostics_channel replaces console.log

The default genericObservability implementation no longer logs every event to the console. Instead, events are published to named diagnostics channels using the Node.js diagnostics_channel API. Publishing to a channel with no subscribers is a no-op — this eliminates the logspam that the old system produced.
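
To make the no-op behaviour concrete, here is a minimal sketch that uses only node:diagnostics_channel. The channel name comes from this PR; the event object and the variable names are illustrative assumptions, not code from the change.

```typescript
import { channel, subscribe, unsubscribe } from "node:diagnostics_channel";

// Channel name from this PR; the event shape below is illustrative only.
const rpcChannel = channel("agents:rpc");

const received: unknown[] = [];
const onMessage = (message: unknown): void => {
  received.push(message);
};

// No subscribers yet: publish() has nobody to deliver to.
const hadSubscribersBefore = rpcChannel.hasSubscribers;
rpcChannel.publish({ type: "rpc", payload: { method: "ping" }, timestamp: Date.now() });
const countAfterSilentPublish = received.length;

// With a subscriber attached, the same publish call is delivered synchronously.
subscribe("agents:rpc", onMessage);
rpcChannel.publish({ type: "rpc", payload: { method: "ping" }, timestamp: Date.now() });
const countAfterSubscribedPublish = received.length;

// Unsubscribing returns the channel to its silent state.
unsubscribe("agents:rpc", onMessage);
rpcChannel.publish({ type: "rpc", payload: { method: "ping" }, timestamp: Date.now() });
const countAfterUnsubscribe = received.length;
```

Only the middle publish reaches the listener, so the agent can emit unconditionally without producing any output by default.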

Seven named channels, one per event domain:

| Channel | Events |
| --- | --- |
| agents:state | state:update |
| agents:rpc | rpc, rpc:error |
| agents:message | message:request/response/clear/cancel/error, tool:result/approval |
| agents:schedule | schedule:create/execute/cancel/retry/error, queue:retry/error |
| agents:lifecycle | connect, destroy |
| agents:workflow | workflow:start/event/approved/rejected/terminated/paused/resumed/restarted |
| agents:mcp | mcp:client:preconnect/connect/authorize/discover |
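
The routing in the table can be sketched as a prefix lookup. The function name channelNameFor is hypothetical, and the real getChannel() in this PR may be implemented differently; the sketch only mirrors the mapping listed above.

```typescript
// Hypothetical router that mirrors the channel table above.
// This is not the PR's getChannel(); it only illustrates the mapping.
type ChannelName =
  | "agents:state"
  | "agents:rpc"
  | "agents:message"
  | "agents:schedule"
  | "agents:lifecycle"
  | "agents:workflow"
  | "agents:mcp";

function channelNameFor(eventType: string): ChannelName {
  // Lifecycle events have no prefix, so they are matched by full name.
  if (eventType === "connect" || eventType === "destroy") {
    return "agents:lifecycle";
  }
  const prefix = eventType.split(":")[0];
  switch (prefix) {
    case "state":
      return "agents:state";
    case "rpc":
      return "agents:rpc";
    case "message":
    case "tool": // tool events share the message channel
      return "agents:message";
    case "schedule":
    case "queue": // queue events share the schedule channel
      return "agents:schedule";
    case "workflow":
      return "agents:workflow";
    case "mcp":
      return "agents:mcp";
    default:
      throw new Error(`Unknown event type: ${eventType}`);
  }
}
```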

2. Strict BaseEvent type

BaseEvent has been simplified to three fields: type, payload, and timestamp. The removed fields are:

  • id (was nanoid()-generated) — removed because diagnostics channels provide ordering; IDs added no value and increased bundle size
  • displayMessage (human-readable string) — removed because structured type + payload is more useful for programmatic consumption; display formatting belongs in the subscriber

The payload type is now strict — accessing undeclared fields is a type error. This forces consumers to narrow on event.type before accessing payload properties, catching mismatches at compile time:

// Before (loose): event.payload.anything was allowed
// After (strict): must narrow first
if (event.type === "rpc:error") {
  console.log(event.payload.method); // ✅ typed
}
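
For readers who want to see the narrowing in isolation, here is a self-contained sketch. The BaseEvent generic, the three event aliases, and the payload field types are assumptions made for illustration; the definitions in base.ts may differ.

```typescript
// Assumed shapes for illustration; the PR's actual definitions may differ.
type BaseEvent<
  T extends string,
  P extends Record<string, unknown> = Record<string, never>
> = {
  type: T;
  payload: P;
  timestamp: number;
};

type RpcEvent = BaseEvent<"rpc", { method: string }>;
type RpcErrorEvent = BaseEvent<"rpc:error", { method: string; error: string }>;
type ConnectEvent = BaseEvent<"connect">; // no declared payload fields

type DemoEvent = RpcEvent | RpcErrorEvent | ConnectEvent;

function summarize(event: DemoEvent): string {
  // Before narrowing, event.payload.error does not type-check because
  // not every member of the union declares an `error` field.
  switch (event.type) {
    case "rpc":
      return `rpc ${event.payload.method}`;
    case "rpc:error":
      // Narrowed: both method and error are known to exist here.
      return `rpc ${event.payload.method} failed: ${event.payload.error}`;
    case "connect":
      return "connect";
  }
}

const summary = summarize({
  type: "rpc:error",
  payload: { method: "add", error: "boom" },
  timestamp: Date.now()
});
```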

3. New error events

Error events are emitted at failure sites alongside console.error:

  • rpc:error — when a @callable method throws (includes method and error)
  • schedule:error — when a schedule callback fails after all retries
  • queue:error — when a queue callback fails after all retries

Design decision: we kept console.error at these sites so errors remain visible in local dev even without subscribers, while also emitting structured events for production monitoring.
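
A rough sketch of that dual-reporting pattern follows. The wrapper callWithErrorEvent is hypothetical and synchronous for brevity; it is not the code at the actual failure sites, and the payload shape follows the rpc:error description above.

```typescript
import { channel, subscribe, unsubscribe } from "node:diagnostics_channel";

const rpcChannel = channel("agents:rpc");

// Hypothetical wrapper around a callable method (not the PR's actual code).
function callWithErrorEvent<T>(method: string, fn: () => T): T {
  try {
    return fn();
  } catch (e) {
    const error = e instanceof Error ? e.message : String(e);
    // For humans: visible in local dev even with no subscribers.
    console.error(`RPC ${method} failed: ${error}`);
    // For machines: a structured event, delivered only if someone subscribed.
    rpcChannel.publish({
      type: "rpc:error",
      payload: { method, error },
      timestamp: Date.now()
    });
    throw e;
  }
}

// Usage: the structured event is captured while the error still propagates.
const captured: unknown[] = [];
const onMessage = (message: unknown): void => {
  captured.push(message);
};
subscribe("agents:rpc", onMessage);
let rethrown = false;
try {
  callWithErrorEvent("add", () => {
    throw new Error("boom");
  });
} catch {
  rethrown = true;
}
unsubscribe("agents:rpc", onMessage);
```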

4. _emit() private helper

All 28 inline emit blocks in the Agent class have been replaced with a private _emit(type, payload?) helper that auto-generates timestamps. This reduced each call site from ~6 lines to 1 and eliminated the scattered nanoid() / displayMessage boilerplate.
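
As a sketch of the shape of that helper: DemoAgent, DemoEvent, and the two methods below are stand-ins invented for this example, and the real helper lives on the Agent class with the full event union.

```typescript
// Assumed minimal shapes; the real types live in agents/observability.
type DemoEvent =
  | { type: "connect"; payload: Record<string, never>; timestamp: number }
  | { type: "rpc"; payload: { method: string }; timestamp: number };

interface Observability {
  emit(event: DemoEvent): void;
}

class DemoAgent {
  private observability?: Observability;

  constructor(observability?: Observability) {
    this.observability = observability;
  }

  // One place that stamps the time and forwards to the observability sink.
  private _emit(
    type: DemoEvent["type"],
    payload: Record<string, unknown> = {}
  ): void {
    this.observability?.emit({
      type,
      payload,
      timestamp: Date.now()
    } as DemoEvent);
  }

  // Each call site shrinks to a single line.
  onConnect(): void {
    this._emit("connect");
  }

  callMethod(method: string): void {
    this._emit("rpc", { method });
  }
}

// Usage: collect events in memory.
const emitted: DemoEvent[] = [];
const agent = new DemoAgent({ emit: (event) => emitted.push(event) });
agent.onConnect();
agent.callMethod("add");
```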

5. Typed subscribe() helper

A new subscribe() function exported from agents/observability provides type-safe event subscription per channel:

import { subscribe } from "agents/observability";

const unsub = subscribe("rpc", (event) => {
  // event is typed as rpc | rpc:error — full autocomplete
  if (event.type === "rpc:error") {
    console.error(event.payload.method, event.payload.error);
  }
});

unsub(); // clean up

Implementation: wraps node:diagnostics_channel subscribe/unsubscribe with a ChannelEventMap that maps each channel key to its Extract<>-ed event union. The cast through unknown in the handler is safe because we control both the publish and subscribe sides.
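
Roughly, the wrapper looks like the sketch below. The name subscribeTyped, the two-channel ChannelEventMap, and the event shapes are simplified assumptions for this example; the exported subscribe() covers all seven channels and may differ in detail.

```typescript
import {
  channel,
  subscribe as dcSubscribe,
  unsubscribe as dcUnsubscribe
} from "node:diagnostics_channel";

// Assumed event shapes for illustration only.
type DemoEvent =
  | { type: "rpc"; payload: { method: string }; timestamp: number }
  | { type: "rpc:error"; payload: { method: string; error: string }; timestamp: number }
  | { type: "state:update"; payload: Record<string, never>; timestamp: number };

// Each channel key maps to the slice of the union it carries.
type ChannelEventMap = {
  rpc: Extract<DemoEvent, { type: "rpc" | "rpc:error" }>;
  state: Extract<DemoEvent, { type: "state:update" }>;
};

function subscribeTyped<K extends keyof ChannelEventMap>(
  key: K,
  callback: (event: ChannelEventMap[K]) => void
): () => void {
  const name = `agents:${key}`;
  // diagnostics_channel hands over `unknown`; the cast relies on the
  // publisher only ever sending events that belong to this channel.
  const handler = (message: unknown): void => {
    callback(message as ChannelEventMap[K]);
  };
  dcSubscribe(name, handler);
  return () => {
    dcUnsubscribe(name, handler);
  };
}

// Usage: only rpc:error events are recorded, and only while subscribed.
const seen: string[] = [];
const unsub = subscribeTyped("rpc", (event) => {
  if (event.type === "rpc:error") {
    seen.push(`${event.payload.method}: ${event.payload.error}`);
  }
});
const rpc = channel("agents:rpc");
rpc.publish({ type: "rpc:error", payload: { method: "add", error: "boom" }, timestamp: Date.now() });
unsub();
rpc.publish({ type: "rpc:error", payload: { method: "add", error: "again" }, timestamp: Date.now() });
```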

6. AIChatAgent observability events

Added 5 new event types emitted by AIChatAgent in @cloudflare/ai-chat:

| Event | Payload | When |
| --- | --- | --- |
| message:clear | {} | Client clears chat history |
| message:cancel | { requestId } | Client cancels a streaming request |
| message:error | { error } | Chat stream fails |
| tool:result | { toolCallId, toolName } | Client sends a tool execution result |
| tool:approval | { toolCallId, approved } | Client approves or rejects a tool call |

These are routed to the agents:message channel alongside the existing message:request and message:response events. The tool: prefix is deliberately routed to message (not a separate channel) because tool interactions are part of the chat message lifecycle.
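
Since displayMessage is gone, a subscriber that wants readable lines has to format these events itself. The sketch below shows one way to do that for the five new types. The payload field names come from the table; the field value types (string, boolean) and the formatChatEvent helper are assumptions.

```typescript
// Payload field names follow the table above; value types are assumed.
type ChatEvent =
  | { type: "message:clear"; payload: Record<string, never>; timestamp: number }
  | { type: "message:cancel"; payload: { requestId: string }; timestamp: number }
  | { type: "message:error"; payload: { error: string }; timestamp: number }
  | { type: "tool:result"; payload: { toolCallId: string; toolName: string }; timestamp: number }
  | { type: "tool:approval"; payload: { toolCallId: string; approved: boolean }; timestamp: number };

// Hypothetical subscriber-side formatter: display text is derived from
// the structured type and payload rather than shipped with the event.
function formatChatEvent(event: ChatEvent): string {
  switch (event.type) {
    case "message:clear":
      return "chat history cleared";
    case "message:cancel":
      return `request ${event.payload.requestId} cancelled`;
    case "message:error":
      return `chat stream failed: ${event.payload.error}`;
    case "tool:result":
      return `tool ${event.payload.toolName} returned a result for call ${event.payload.toolCallId}`;
    case "tool:approval":
      return `tool call ${event.payload.toolCallId} ${event.payload.approved ? "approved" : "rejected"}`;
  }
}

const line = formatChatEvent({
  type: "tool:approval",
  payload: { toolCallId: "t1", approved: false },
  timestamp: Date.now()
});
```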

7. Missing mcp:client:authorize emit

The mcp:client:authorize event type existed but was never actually fired. Added the emit in MCPClientManager.connectToServer() when the OAuth flow enters AUTHENTICATING state with an authUrl.

8. Test agent cleanup

Removed observability = undefined overrides from all test agents. These were silencing observability during tests, which masked bugs and prevented testing the observability system itself. The new diagnostics_channel approach is silent by default (no subscribers = no work), so these overrides are unnecessary.


Design decisions

Why diagnostics_channel over EventEmitter or custom pub/sub?

  1. Zero overhead: channel.publish() with no subscribers is a single boolean check, not a function call
  2. Tail Worker integration: Cloudflare Workers automatically forward all diagnostics channel messages to Tail Workers via event.diagnosticsChannelEvents. No agent-side code needed for production observability
  3. TracingChannel future: diagnostics_channel also provides TracingChannel for start/end/error spans with AsyncLocalStorage integration, opening the door to end-to-end distributed tracing
  4. Standard API: part of the Node.js compatibility layer in Workers, no additional dependencies
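
On the consuming side, a Tail Worker could pick the agent events out of diagnosticsChannelEvents along these lines. The local interfaces are minimal stand-ins for the Workers runtime types, with field names assumed apart from diagnosticsChannelEvents, and collectAgentEvents is a hypothetical helper; treat this as a sketch rather than a reference handler.

```typescript
// Minimal local stand-ins for the Workers runtime types (assumed shapes).
interface DiagnosticsChannelEventLike {
  channel: string;
  message: unknown;
  timestamp: number;
}

interface TraceItemLike {
  scriptName: string | null;
  diagnosticsChannelEvents: DiagnosticsChannelEventLike[];
}

// Hypothetical helper: keep only messages published on agents:* channels.
function collectAgentEvents(
  items: TraceItemLike[]
): DiagnosticsChannelEventLike[] {
  const result: DiagnosticsChannelEventLike[] = [];
  for (const item of items) {
    for (const entry of item.diagnosticsChannelEvents) {
      if (entry.channel.startsWith("agents:")) {
        result.push(entry);
      }
    }
  }
  return result;
}

// In a real Tail Worker this object would be the module's default export.
const tailWorker = {
  async tail(items: TraceItemLike[]): Promise<void> {
    for (const entry of collectAgentEvents(items)) {
      // Forward to a log sink of your choice; console.log is a placeholder.
      console.log(JSON.stringify(entry));
    }
  }
};

// Usage with fabricated input, to show the filtering only.
const sample: TraceItemLike[] = [
  {
    scriptName: "my-agent",
    diagnosticsChannelEvents: [
      { channel: "agents:rpc", message: { type: "rpc" }, timestamp: 1 },
      { channel: "other:channel", message: {}, timestamp: 2 }
    ]
  }
];
const agentEvents = collectAgentEvents(sample);
```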

Why strict payloads instead of Record<string, unknown>?

The loose payload type allowed consumers to access arbitrary fields without narrowing, leading to runtime errors when event shapes changed. The strict type (Record<string, never> default) forces narrowing on event.type first, which is the correct pattern for discriminated unions and catches payload mismatches at compile time.

Why route tool: events to the message channel?

Tool results and approvals are part of the chat conversation flow — they happen between message:request and message:response. A subscriber monitoring chat interactions needs to see tool events to understand the full picture. A separate agents:tool channel would fragment this view.

Why keep console.error alongside error event emission?

Error events are the most time-sensitive — developers need to see them immediately during local development. Requiring a diagnostics_channel subscriber just to see errors would be a poor DX regression. The pattern is: console.error for humans, _emit("rpc:error", ...) for machines.


Breaking changes

  • BaseEvent: id and displayMessage fields removed. payload type is now strict.
  • Observability.emit(): Removed the optional ctx second parameter.
  • AgentObservabilityEvent: Each event type now has its own discriminant (was a combined union with shared fields). This enables Extract<>-based type narrowing.

If you have a custom Observability implementation, update your emit signature to emit(event: ObservabilityEvent): void.
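
A custom implementation under the new signature might look like the sketch below. The ObservabilityEvent alias here is a loose stand-in for the exported union, and CountingObservability is a made-up example class.

```typescript
// Loose stand-in for the exported ObservabilityEvent union (assumption).
type ObservabilityEvent = {
  type: string;
  payload: Record<string, unknown>;
  timestamp: number;
};

interface Observability {
  // New signature: a single event argument, no ctx parameter.
  emit(event: ObservabilityEvent): void;
}

// Hypothetical custom sink that counts events per type.
class CountingObservability implements Observability {
  readonly counts = new Map<string, number>();

  emit(event: ObservabilityEvent): void {
    this.counts.set(event.type, (this.counts.get(event.type) ?? 0) + 1);
  }
}

const sink = new CountingObservability();
sink.emit({ type: "rpc", payload: { method: "add" }, timestamp: Date.now() });
sink.emit({ type: "rpc", payload: { method: "sub" }, timestamp: Date.now() });
sink.emit({ type: "connect", payload: {}, timestamp: Date.now() });
```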


Notes for reviewers

  • packages/agents/src/index.ts — The diff is large because every inline emit() block was replaced with a one-liner _emit() call. The logic is unchanged; it is purely mechanical cleanup. Focus on the _emit helper definition at line ~739.
  • packages/agents/src/observability/index.ts — This is the core of the change. Review getChannel() routing, ChannelEventMap type mapping, and the subscribe() helper.
  • packages/agents/src/observability/base.ts — The Record<string, never> default payload is intentional. It means events with no declared payload fields will error if you try to access payload.foo. This is the desired behavior.
  • packages/ai-chat/src/index.ts — 8 emit sites total. The emit calls use this.observability?.emit() directly (not _emit) because AIChatAgent does not have the private helper — it inherits observability from the base Agent class.
  • Test agents (src/tests/agents/*.ts) — All observability = undefined lines were removed. This is safe because genericObservability with no subscribers does zero work.
  • mcp/client.ts line 849 — New mcp:client:authorize emit. Was a dead type before this change.
  • 331-line new test file (observability.test.ts) — Covers subscribe/unsubscribe lifecycle, channel routing for all 7 channels, and integration tests for rpc:error and queue:error emission via real WebSocket connections.

Testing

  • 45/45 TypeScript projects typecheck
  • 1031 tests pass (764 workers + 267 ai-chat)
  • Zero test failures, zero new skips

Replace console.log-based observability with Node's diagnostics_channel API. Events are now published to named channels (agents:state, agents:rpc, agents:message, agents:schedule, agents:lifecycle, agents:workflow, agents:mcp) via a new channels map and a getChannel(type) helper that routes event types to the appropriate channel. The genericObservability.emit implementation now publishes events to diagnostics_channel instead of printing them, and the previous local-mode console-printing logic and getCurrentAgent usage were removed. A changeset was added documenting the change and noting that messages are forwarded to Tail Workers in production.
Replace console.log-based observability with Node's diagnostics_channel API and update docs/changeset. Events are now published to named channels (agents:state, agents:rpc, agents:message, agents:schedule, agents:lifecycle, agents:workflow, agents:mcp) instead of unconditionally logging to stdout. Add a ChannelEventMap type and a typed subscribe(channel, callback) helper that returns an unsubscribe function. Changes also document Tail Worker integration where published events are forwarded to production tailing.
Replace console.log observability with Node diagnostics_channel and stricter typed events. Breaking changes to agents/observability types: BaseEvent no longer includes id or displayMessage and payloads are now strict types; Observability.emit signature removed the optional ctx parameter (emit(event: ObservabilityEvent): void). Exported ObservabilityEvent type and refined per-channel unions (including new error event types such as rpc:error, schedule:error, queue:error). Add Agent._emit helper to auto-generate timestamps and replace ~20 inline emit blocks (removed nanoid usage and per-call ids). Update MCP observability emissions to drop legacy fields. Docs updated to describe named channels, typed subscribe helper, Tail Worker integration, and event reference. Tests and example agents cleaned up to stop overriding observability. Bump agents package to minor. If you implement a custom Observability, update your emit signature and narrow on event.type before accessing payload fields.
@changeset-bot

changeset-bot bot commented Feb 28, 2026

🦋 Changeset detected

Latest commit: cc06793

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
| Name | Type |
| --- | --- |
| agents | Minor |

@pkg-pr-new

pkg-pr-new bot commented Feb 28, 2026

npm i https://pkg.pr.new/agents@1024
npm i https://pkg.pr.new/@cloudflare/ai-chat@1024
npm i https://pkg.pr.new/@cloudflare/codemode@1024
npm i https://pkg.pr.new/hono-agents@1024

commit: 2786a22

@threepointone
Contributor Author

@dmmulroy — heads up, this PR intersects with your ALS access API work in #1001 in a useful way.

How this helps #1001

Right now, observability events are emitted without any agent identity — there is no agent name, class, or instance ID attached to events. This is the biggest gap in the current system. Your AgentContext ALS in #1001 solves this cleanly.

Once #1001 lands, the _emit helper in index.ts (line ~739) can pull the current agent context from the ALS and automatically attach it to every event:

private _emit(
  type: ObservabilityEvent["type"],
  payload: Record<string, unknown> = {}
): void {
  const ctx = getAgentContext(); // from #1001
  this.observability?.emit({
    type,
    payload,
    timestamp: Date.now(),
    // New: auto-attached from ALS
    agent: ctx ? { name: ctx.agent.name, id: ctx.agent.id } : undefined
  } as ObservabilityEvent);
}

This means every event across all 7 diagnostics channels — RPC, state, schedule, workflow, message, MCP, lifecycle — gets agent identity for free without changing any of the 28+ call sites.
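
To illustrate the mechanism, here is a standalone sketch of how an ALS-backed lookup could feed agent identity into an event. Everything in it is hypothetical: AgentContext, agentContextStorage, getAgentContext, and buildEvent are invented for this example, and the API in #1001 may look different.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Hypothetical context shape and accessor; #1001's actual API may differ.
interface AgentContext {
  agent: { name: string; id: string };
}

const agentContextStorage = new AsyncLocalStorage<AgentContext>();

function getAgentContext(): AgentContext | undefined {
  return agentContextStorage.getStore();
}

// Builds an event and attaches identity when a context is active.
function buildEvent(type: string, payload: Record<string, unknown> = {}) {
  const ctx = getAgentContext();
  return {
    type,
    payload,
    timestamp: Date.now(),
    agent: ctx ? { name: ctx.agent.name, id: ctx.agent.id } : undefined
  };
}

// Inside run(), the store is visible to everything called from the callback.
const inside = agentContextStorage.run(
  { agent: { name: "ChatAgent", id: "abc" } },
  () => buildEvent("connect")
);

// Outside any run(), there is no store and no identity is attached.
const outside = buildEvent("connect");
```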

What changes in #1001 after this PR

  1. internal_context.ts: Only #1001 modifies this file; this PR does not touch it. No merge conflict expected, just FYI.

  2. packages/agents/src/index.ts: Both PRs modify the Agent class. This PR replaces all inline emit() blocks with _emit() calls and adds the _emit helper. #1001 adds the public context API. These changes are orthogonal, but #1001 will need a rebase because the line numbers in index.ts have shifted significantly.

  3. packages/ai-chat/src/index.ts: Both PRs touch this file. This PR adds 8 observability?.emit() calls. #1001 adds ALS context. Again orthogonal, but a rebase will be needed.

Suggested sequencing

I would suggest merging this PR first since it is a larger structural change (the _emit refactor touches 28 sites). Then #1001 can rebase on top and the context enrichment of events becomes a small, focused follow-up — just updating _emit and BaseEvent to include the agent identity from the ALS.

Happy to coordinate on the merge order. Let me know if you have questions about the diagnostics_channel approach or the event type system.

@threepointone threepointone merged commit e9ae070 into main Feb 28, 2026
3 checks passed
@threepointone threepointone deleted the log-diagnostics branch February 28, 2026 11:01
@github-actions github-actions bot mentioned this pull request Feb 28, 2026