feat(observability): replace console.log with diagnostics_channel, add typed subscribe helper and ai-chat events #1024
Replace console.log-based observability with Node's diagnostics_channel API. Events are now published to named channels (agents:state, agents:rpc, agents:message, agents:schedule, agents:lifecycle, agents:workflow, agents:mcp) via a new channels map and a getChannel(type) helper that routes event types to the appropriate channel. The genericObservability.emit implementation now publishes events to diagnostics_channel instead of printing them, and the previous local-mode console-printing logic and getCurrentAgent usage were removed. A changeset was added documenting the change and noting that messages are forwarded to Tail Workers in production.
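The routing from event type to channel could be sketched roughly as follows. This is an illustration only: `channelNameFor` and `CHANNEL_BY_PREFIX` are hypothetical names, the prefix-matching strategy is an assumption, and the real `getChannel(type)` helper presumably returns a channel object rather than a name. The channel names and the event-to-channel assignments are the ones listed in this PR.

```typescript
// Hypothetical routing table: event-type prefix -> channel name.
// Channel names come from the PR; matching on the first ":"-separated
// segment is an assumption made for this sketch.
const CHANNEL_BY_PREFIX = new Map<string, string>([
  ["state", "agents:state"],
  ["rpc", "agents:rpc"],
  ["message", "agents:message"],
  ["tool", "agents:message"], // tool events share the message channel
  ["schedule", "agents:schedule"],
  ["queue", "agents:schedule"], // queue events share the schedule channel
  ["connect", "agents:lifecycle"],
  ["destroy", "agents:lifecycle"],
  ["workflow", "agents:workflow"],
  ["mcp", "agents:mcp"]
]);

// Resolve the channel name for an event type such as "rpc:error".
function channelNameFor(type: string): string {
  const prefix = type.split(":")[0] ?? type;
  const name = CHANNEL_BY_PREFIX.get(prefix);
  if (name === undefined) {
    throw new Error(`No channel registered for event type "${type}"`);
  }
  return name;
}
```

A table keyed on the prefix keeps the routing in one place, so adding an event type to an existing domain needs no routing change.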
Replace console.log-based observability with Node's diagnostics_channel API and update docs/changeset. Events are now published to named channels (agents:state, agents:rpc, agents:message, agents:schedule, agents:lifecycle, agents:workflow, agents:mcp) instead of unconditionally logging to stdout. Add a ChannelEventMap type and a typed subscribe(channel, callback) helper that returns an unsubscribe function. Changes also document Tail Worker integration where published events are forwarded to production tailing.
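A minimal sketch of what a typed `subscribe(channel, callback)` helper might look like, assuming it wraps the top-level `subscribe`/`unsubscribe` functions of `node:diagnostics_channel`. The event shapes below are a simplified subset invented for illustration (in particular, `error` and `requestId` being strings is an assumption), and the real `ChannelEventMap` covers all seven channels.

```typescript
import {
  channel,
  subscribe as dcSubscribe,
  unsubscribe as dcUnsubscribe
} from "node:diagnostics_channel";

// Simplified, hypothetical event shapes for two of the seven channels.
type RpcEvent = {
  type: "rpc:error";
  payload: { method: string; error: string };
  timestamp: number;
};
type MessageEvent =
  | { type: "message:cancel"; payload: { requestId: string }; timestamp: number }
  | { type: "message:clear"; payload: Record<string, never>; timestamp: number };

// Maps a short channel key to the event union published on that channel.
type ChannelEventMap = { rpc: RpcEvent; message: MessageEvent };

// Subscribe to one channel; the returned function removes the subscription.
function subscribe<K extends keyof ChannelEventMap>(
  key: K,
  callback: (event: ChannelEventMap[K]) => void
): () => void {
  const name = `agents:${key}`;
  // diagnostics_channel hands over `unknown`; the cast relies on publishers
  // only ever sending events that match ChannelEventMap.
  const handler = (message: unknown): void => {
    callback(message as ChannelEventMap[K]);
  };
  dcSubscribe(name, handler);
  return () => {
    dcUnsubscribe(name, handler);
  };
}

// Usage: collect failing RPC method names, then stop listening.
const failedMethods: string[] = [];
const stop = subscribe("rpc", (event) => {
  failedMethods.push(event.payload.method);
});
const rpc = channel("agents:rpc");
rpc.publish({ type: "rpc:error", payload: { method: "add", error: "boom" }, timestamp: Date.now() });
stop();
rpc.publish({ type: "rpc:error", payload: { method: "ignored", error: "boom" }, timestamp: Date.now() });
```

Returning the unsubscribe function keeps the handler reference private to the helper, so callers cannot accidentally pass a different function when unsubscribing.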
Replace console.log observability with Node diagnostics_channel and stricter typed events. Breaking changes to agents/observability types: BaseEvent no longer includes id or displayMessage and payloads are now strict types; Observability.emit signature removed the optional ctx parameter (emit(event: ObservabilityEvent): void). Exported ObservabilityEvent type and refined per-channel unions (including new error event types such as rpc:error, schedule:error, queue:error). Add Agent._emit helper to auto-generate timestamps and replace ~20 inline emit blocks (removed nanoid usage and per-call ids). Update MCP observability emissions to drop legacy fields. Docs updated to describe named channels, typed subscribe helper, Tail Worker integration, and event reference. Tests and example agents cleaned up to stop overriding observability. Bump agents package to minor. If you implement a custom Observability, update your emit signature and narrow on event.type before accessing payload fields.
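For a custom `Observability`, the migration might look something like the sketch below. The types here are simplified stand-ins for the exported ones, not the real definitions; the payload field names follow this PR, while their types (strings) and the class name `CollectingObservability` are assumptions.

```typescript
// Simplified stand-ins for the exported types.
type ObservabilityEvent =
  | { type: "rpc:error"; payload: { method: string; error: string }; timestamp: number }
  | { type: "message:cancel"; payload: { requestId: string }; timestamp: number }
  | { type: "message:clear"; payload: Record<string, never>; timestamp: number };

interface Observability {
  // New signature: a single event argument, no optional ctx parameter.
  emit(event: ObservabilityEvent): void;
}

// Hypothetical implementation that formats events into readable lines.
class CollectingObservability implements Observability {
  readonly lines: string[] = [];

  emit(event: ObservabilityEvent): void {
    // Reading event.payload.method here, before narrowing, is a compile error
    // because not every member of the union declares that field.
    switch (event.type) {
      case "rpc:error":
        this.lines.push(`rpc ${event.payload.method} failed: ${event.payload.error}`);
        break;
      case "message:cancel":
        this.lines.push(`cancelled ${event.payload.requestId}`);
        break;
      case "message:clear":
        this.lines.push("cleared");
        break;
    }
  }
}

// Usage
const observability = new CollectingObservability();
observability.emit({ type: "rpc:error", payload: { method: "add", error: "boom" }, timestamp: Date.now() });
observability.emit({ type: "message:clear", payload: {}, timestamp: Date.now() });
```

Since `displayMessage` is gone from the event, any human-readable formatting like the strings above now lives in the consumer.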
🦋 Changeset detected

Latest commit: cc06793

The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
@dmmulroy, heads up: this PR intersects with your ALS access API work in #1001 in a useful way.

How this helps #1001

Right now, observability events are emitted without any agent identity: there is no agent name, class, or instance ID attached to events. This is the biggest gap in the current system. Your Once #1001 lands, the

```typescript
private _emit(
  type: ObservabilityEvent["type"],
  payload: Record<string, unknown> = {}
): void {
  const ctx = getAgentContext(); // from #1001
  this.observability?.emit({
    type,
    payload,
    timestamp: Date.now(),
    // New: auto-attached from ALS
    agent: ctx ? { name: ctx.agent.name, id: ctx.agent.id } : undefined
  } as ObservabilityEvent);
}
```

This means every event across all 7 diagnostics channels (RPC, state, schedule, workflow, message, MCP, lifecycle) gets agent identity for free without changing any of the 28+ call sites.

What changes in #1001 after this PR

Suggested sequencing

I would suggest merging this PR first since it is a larger structural change (the Happy to coordinate on the merge order. Let me know if you have questions about the diagnostics_channel approach or the event type system.
Summary
Overhaul the observability system to use Node.js diagnostics_channel instead of console.log. Events are now silent by default (zero overhead when nobody is listening), fully typed with strict payloads, and automatically forwarded to Tail Workers in production.

30 files changed, 849 insertions, 472 deletions.
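The "silent by default" behavior can be seen with the plain `node:diagnostics_channel` API. This is a small sketch, not code from the PR: it assumes nothing else in the process is subscribed to `agents:rpc`, and the message shape (with `error` as a string) is illustrative.

```typescript
import { channel, subscribe, unsubscribe } from "node:diagnostics_channel";

const rpcChannel = channel("agents:rpc");
const received: unknown[] = [];
const onMessage = (message: unknown): void => {
  received.push(message);
};

// Illustrative event; the field names follow the rpc:error description in this PR.
const event = {
  type: "rpc:error",
  payload: { method: "add", error: "boom" },
  timestamp: Date.now()
};

// With no subscribers, publish() has nobody to deliver to and the event is dropped.
const idleBefore = !rpcChannel.hasSubscribers;
rpcChannel.publish(event);

// Once a subscriber is attached, the same call delivers synchronously.
subscribe("agents:rpc", onMessage);
rpcChannel.publish(event);

// After unsubscribing, the channel goes back to being idle.
unsubscribe("agents:rpc", onMessage);
const idleAfter = !rpcChannel.hasSubscribers;
```

Publishers that need to build an expensive payload can also check `hasSubscribers` first and skip the work entirely.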
What changed
1. diagnostics_channel replaces console.log

The default genericObservability implementation no longer logs every event to the console. Instead, events are published to named diagnostics channels using the Node.js diagnostics_channel API. Publishing to a channel with no subscribers is a no-op, which eliminates the log spam that the old system produced.

Seven named channels, one per event domain:

| Channel | Event types |
| --- | --- |
| agents:state | state:update |
| agents:rpc | rpc, rpc:error |
| agents:message | message:request/response/clear/cancel/error, tool:result/approval |
| agents:schedule | schedule:create/execute/cancel/retry/error, queue:retry/error |
| agents:lifecycle | connect, destroy |
| agents:workflow | workflow:start/event/approved/rejected/terminated/paused/resumed/restarted |
| agents:mcp | mcp:client:preconnect/connect/authorize/discover |

2. Strict BaseEvent type

BaseEvent has been simplified to three fields: type, payload, and timestamp. The removed fields are:

- id (was nanoid()-generated): removed because diagnostics channels provide ordering; IDs added no value and increased bundle size
- displayMessage (human-readable string): removed because structured type + payload is more useful for programmatic consumption; display formatting belongs in the subscriber

The payload type is now strict: accessing undeclared fields is a type error. This forces consumers to narrow on event.type before accessing payload properties, catching mismatches at compile time.

3. New error events
Error events are emitted at failure sites alongside console.error:

- rpc:error, when a @callable method throws (includes method and error)
- schedule:error, when a schedule callback fails after all retries
- queue:error, when a queue callback fails after all retries

Design decision: we kept console.error at these sites so errors remain visible in local dev even without subscribers, while also emitting structured events for production monitoring.

4. _emit() private helper

All 28 inline emit blocks in the Agent class have been replaced with a private _emit(type, payload?) helper that auto-generates timestamps. This reduced each call site from ~6 lines to 1 and eliminated the scattered nanoid()/displayMessage boilerplate.

5. Typed subscribe() helper

A new subscribe() function exported from agents/observability provides type-safe event subscription per channel.

Implementation: wraps node:diagnostics_channel subscribe/unsubscribe with a ChannelEventMap that maps each channel key to its Extract<>-ed event union. The cast through unknown in the handler is safe because we control both the publish and subscribe sides.

6. AIChatAgent observability events
Added 5 new event types emitted by AIChatAgent in @cloudflare/ai-chat:

| Event | Payload |
| --- | --- |
| message:clear | {} |
| message:cancel | { requestId } |
| message:error | { error } |
| tool:result | { toolCallId, toolName } |
| tool:approval | { toolCallId, approved } |

These are routed to the agents:message channel alongside the existing message:request and message:response events. The tool: prefix is deliberately routed to message (not a separate channel) because tool interactions are part of the chat message lifecycle.

7. Missing mcp:client:authorize emit

The mcp:client:authorize event type existed but was never actually fired. Added the emit in MCPClientManager.connectToServer() when the OAuth flow enters AUTHENTICATING state with an authUrl.

8. Test agent cleanup
Removed observability = undefined overrides from all test agents. These were silencing observability during tests, which masked bugs and prevented testing the observability system itself. The new diagnostics_channel approach is silent by default (no subscribers = no work), so these overrides are unnecessary.

Design decisions
Why diagnostics_channel over EventEmitter or custom pub/sub?

- channel.publish() with no subscribers is a single boolean check, not a function call
- In production, published events are forwarded to Tail Workers, where they arrive in event.diagnosticsChannelEvents. No agent-side code is needed for production observability
- diagnostics_channel also provides TracingChannel for start/end/error spans with AsyncLocalStorage integration, opening the door to end-to-end distributed tracing

Why strict payloads instead of Record<string, unknown>?

The loose payload type allowed consumers to access arbitrary fields without narrowing, leading to runtime errors when event shapes changed. The strict type (Record<string, never> default) forces narrowing on event.type first, which is the correct pattern for discriminated unions and catches payload mismatches at compile time.

Why route tool: events to the message channel?

Tool results and approvals are part of the chat conversation flow: they happen between message:request and message:response. A subscriber monitoring chat interactions needs to see tool events to understand the full picture. A separate agents:tool channel would fragment this view.

Why keep console.error alongside error event emission?

Error events are the most time-sensitive: developers need to see them immediately during local development. Requiring a diagnostics_channel subscriber just to see errors would be a DX regression. The pattern is: console.error for humans, _emit("rpc:error", ...) for machines.

Breaking changes
- BaseEvent: id and displayMessage fields removed. payload type is now strict.
- Observability.emit(): Removed the optional ctx second parameter.
- AgentObservabilityEvent: Each event type now has its own discriminant (was a combined union with shared fields). This enables Extract<>-based type narrowing.

If you have a custom Observability implementation, update your emit signature to emit(event: ObservabilityEvent): void.

Notes for reviewers
- packages/agents/src/index.ts: The diff is large because every inline emit() block was replaced with a one-liner _emit() call. The logic is unchanged; it is purely mechanical cleanup. Focus on the _emit helper definition at line ~739.
- packages/agents/src/observability/index.ts: This is the core of the change. Review getChannel() routing, ChannelEventMap type mapping, and the subscribe() helper.
- packages/agents/src/observability/base.ts: The Record<string, never> default payload is intentional. It means events with no declared payload fields will error if you try to access payload.foo. This is the desired behavior.
- packages/ai-chat/src/index.ts: 8 emit sites total. The emit calls use this.observability?.emit() directly (not _emit) because AIChatAgent does not have the private helper; it inherits observability from the base Agent class.
- src/tests/agents/*.ts: All observability = undefined lines were removed. This is safe because genericObservability with no subscribers does zero work.
- mcp/client.ts line 849: New mcp:client:authorize emit. Was a dead type before this change.
- observability.test.ts: Covers subscribe/unsubscribe lifecycle, channel routing for all 7 channels, and integration tests for rpc:error and queue:error emission via real WebSocket connections.

Testing