fix: prevent schedule() row accumulation across DO restarts #1154

Merged
threepointone merged 3 commits into main from schedule-warnings
Mar 23, 2026

@threepointone threepointone commented Mar 22, 2026

We hope this addresses the issues raised in #1153 — looking for feedback on the approach before we ship.

Context

#1153 reported that schedule() called during initialization creates a new row on every Durable Object restart. In crash-loop scenarios, hundreds of stale rows accumulate in cf_agents_schedules. When the alarm fires, it processes every stale row individually — 200 rows = 200 callback executions in a single cycle. The reporter's cost: ~$1,650 in DO SQL row reads from a system with 1 user.

The root cause is two-fold:

  1. schedule() always generates a fresh nanoid(9), so INSERT OR REPLACE never replaces — it always inserts
  2. alarm() processes all stale rows without any awareness of duplicates
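The two causes compound, which a small in-memory model can make concrete. This is only a sketch: `scheduleTable`, `freshId`, `insertOrReplace`, and `scheduleWithFreshId` are hypothetical stand-ins for the cf_agents_schedules table, nanoid(9), and the pre-fix schedule() path, not the library's actual code.

```typescript
// Hypothetical in-memory stand-in for cf_agents_schedules, keyed by row id.
type ScheduleRow = { id: string; callback: string; payload: string };

const scheduleTable = new Map<string, ScheduleRow>();
let idCounter = 0;

// Stand-in for nanoid(9): every call yields an id that has never been used.
function freshId(): string {
  idCounter += 1;
  return `id-${idCounter}`;
}

// Mimics INSERT OR REPLACE: a row is replaced only if its key already exists.
function insertOrReplace(row: ScheduleRow): void {
  scheduleTable.set(row.id, row);
}

// Assumed shape of the pre-fix behavior: a fresh id on every call.
function scheduleWithFreshId(callback: string, payload: string): ScheduleRow {
  const row: ScheduleRow = { id: freshId(), callback, payload };
  insertOrReplace(row);
  return row;
}

// Simulate 200 restarts, each calling schedule() once during initialization.
for (let restart = 0; restart < 200; restart++) {
  scheduleWithFreshId("maintenance", "{}");
}

// The alarm then runs the callback once per row.
let callbackExecutions = 0;
scheduleTable.forEach(() => {
  callbackExecutions += 1;
});

console.log(scheduleTable.size, callbackExecutions); // 200 200
```

Because the key is never reused, the "replace" half of INSERT OR REPLACE is unreachable, and the execution count tracks the restart count.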

scheduleEvery() already got idempotency in #1049. This PR extends similar protection to schedule().

Changes

1. Cron schedules via schedule() are idempotent by default

Calling schedule("0 * * * *", "tick") multiple times with the same (callback, cron, payload) now returns the existing row instead of creating a duplicate. This mirrors scheduleEvery()'s behavior.

This is the most important fix — cron rows created via schedule() are never deleted (they're updated with the next execution time), so duplicates accumulate permanently. After N cold starts, you'd get N executions per cron tick, forever.

Set { idempotent: false } to opt out if needed.
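A minimal sketch of the default-on cron behavior, assuming rows are matched on (callback, cron, payload) and that payloads are compared by their JSON serialization. `CronTable` and `CronRow` are hypothetical names for illustration; the real implementation queries cf_agents_schedules.

```typescript
type CronRow = { id: number; callback: string; cron: string; payload: string };

class CronTable {
  private rows: CronRow[] = [];
  private nextId = 1;

  schedule(
    cron: string,
    callback: string,
    payload: unknown = undefined,
    options: { idempotent?: boolean } = {}
  ): CronRow {
    // Cron schedules are idempotent unless the caller opts out.
    const idempotent = options.idempotent ?? true;
    // Assumption: payload equality is decided on the serialized form.
    const serialized = JSON.stringify(payload ?? null);

    if (idempotent) {
      const existing = this.rows.find(
        (r) => r.callback === callback && r.cron === cron && r.payload === serialized
      );
      if (existing) return existing; // early return: no new row
    }

    const row: CronRow = { id: this.nextId++, callback, cron, payload: serialized };
    this.rows.push(row);
    return row;
  }

  count(): number {
    return this.rows.length;
  }
}

const cronTable = new CronTable();
const firstCron = cronTable.schedule("0 * * * *", "tick");
const secondCron = cronTable.schedule("0 * * * *", "tick");
const optedOut = cronTable.schedule("0 * * * *", "tick", undefined, { idempotent: false });

console.log(firstCron.id === secondCron.id, cronTable.count()); // true 2
```

The second call returns the first row, while the opt-out call creates a second one.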

2. Opt-in idempotent option for delayed/scheduled types

// Safe to call in onStart() — only creates one row
await this.schedule(60, "maintenance", undefined, { idempotent: true });

Dedup key is (type, callback, payload). Different payloads create separate rows. Default remains false for delayed/scheduled types to preserve backward compatibility — there are legitimate reasons to schedule multiple one-shots for the same callback (e.g., different payloads representing different work items).
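The opt-in rule for one-shot types can be modeled the same way. The sketch below is an assumption-laden illustration (`oneShotRows` and `scheduleOneShot` are hypothetical, and payloads are compared as JSON strings); it shows a crash loop collapsing to one row, distinct payloads staying separate, and the default still inserting every time.

```typescript
type OneShotType = "delayed" | "scheduled";
type OneShotRow = { id: number; type: OneShotType; callback: string; payload: string };

const oneShotRows: OneShotRow[] = [];
let nextOneShotId = 1;

function scheduleOneShot(
  type: OneShotType,
  callback: string,
  payload?: unknown,
  options: { idempotent?: boolean } = {}
): OneShotRow {
  const serialized = JSON.stringify(payload ?? null);

  // Dedup only when the caller explicitly asks for it.
  if (options.idempotent === true) {
    const existing = oneShotRows.find(
      (r) => r.type === type && r.callback === callback && r.payload === serialized
    );
    if (existing) return existing;
  }

  const row: OneShotRow = { id: nextOneShotId++, type, callback, payload: serialized };
  oneShotRows.push(row);
  return row;
}

// Crash-loop simulation: ten restarts, one row.
for (let i = 0; i < 10; i++) {
  scheduleOneShot("delayed", "maintenance", undefined, { idempotent: true });
}
const rowsAfterCrashLoop = oneShotRows.length; // 1

// Different payloads are different work items, so both rows are kept.
scheduleOneShot("delayed", "process", { item: 1 }, { idempotent: true });
scheduleOneShot("delayed", "process", { item: 2 }, { idempotent: true });

// Default (no option): every call inserts, as before.
scheduleOneShot("delayed", "legacy");
scheduleOneShot("delayed", "legacy");

console.log(rowsAfterCrashLoop, oneShotRows.length); // 1 5
```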

3. onStart() warning

When schedule() is called inside onStart() without the idempotent option (for non-cron types), a console.warn fires:

schedule("maintenance") called inside onStart() without { idempotent: true }. This creates a new row on every Durable Object restart...

The warning is:

  • Once per callback per onStart() cycle (no log spam)
  • Skipped for cron (already idempotent by default)
  • Skipped when idempotent is explicitly set (true or false — user knows what they're doing)
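These three rules can be sketched as a small guard. The field names `_insideOnStart` and `_warnedScheduleInOnStart` come from the commit message; their types, the reset at the start of each onStart() cycle, and the `OnStartWarningModel` class itself are assumptions made for illustration.

```typescript
type ScheduleKind = "cron" | "delayed" | "scheduled";

class OnStartWarningModel {
  private _insideOnStart = false;
  private _warnedScheduleInOnStart = new Set<string>();
  readonly warnings: string[] = [];

  // Wraps the user's onStart() body so schedule() can tell where it was called from.
  runOnStart(body: () => void): void {
    this._insideOnStart = true;
    this._warnedScheduleInOnStart.clear(); // assumption: tracking resets each cycle
    try {
      body();
    } finally {
      this._insideOnStart = false;
    }
  }

  schedule(kind: ScheduleKind, callback: string, options: { idempotent?: boolean } = {}): void {
    const explicitChoice = options.idempotent !== undefined;
    const shouldWarn =
      this._insideOnStart &&
      kind !== "cron" && // cron is idempotent by default
      !explicitChoice && // true or false both count as a deliberate choice
      !this._warnedScheduleInOnStart.has(callback); // once per callback

    if (shouldWarn) {
      this._warnedScheduleInOnStart.add(callback);
      this.warnings.push(
        `schedule("${callback}") called inside onStart() without { idempotent: true }.`
      );
    }
  }
}

const warningModel = new OnStartWarningModel();
warningModel.runOnStart(() => {
  warningModel.schedule("delayed", "maintenance"); // warns
  warningModel.schedule("delayed", "maintenance"); // same callback: no repeat
  warningModel.schedule("cron", "tick"); // cron: skipped
  warningModel.schedule("delayed", "cleanup", { idempotent: false }); // explicit: skipped
});
warningModel.schedule("delayed", "report"); // outside onStart(): no warning

console.log(warningModel.warnings.length); // 1
```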

4. Alarm-time observability

When alarm() is about to process ≥10 stale one-shot rows for the same callback, it emits:

  • A console.warn with actionable guidance
  • A schedule:duplicate_warning event via diagnostics_channel

This catches the amplification problem at the point of impact without changing execution semantics — all rows are still processed.
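As a rough sketch of the threshold check, assuming the alarm groups due one-shot rows by callback before running them: `DueRow`, `DUPLICATE_WARNING_THRESHOLD`, and `findDuplicateWarnings` are hypothetical names, and the real code additionally publishes the schedule:duplicate_warning event.

```typescript
type DueRow = { callback: string; type: "cron" | "delayed" | "scheduled" };

// The PR describes a threshold of 10 stale one-shot rows per callback.
const DUPLICATE_WARNING_THRESHOLD = 10;

function findDuplicateWarnings(dueRows: DueRow[]): { callback: string; count: number }[] {
  const counts = new Map<string, number>();
  for (const row of dueRows) {
    if (row.type === "cron") continue; // only one-shot rows are counted
    counts.set(row.callback, (counts.get(row.callback) ?? 0) + 1);
  }

  const warnings: { callback: string; count: number }[] = [];
  counts.forEach((count, callback) => {
    if (count >= DUPLICATE_WARNING_THRESHOLD) warnings.push({ callback, count });
  });
  return warnings;
}

// 12 stale "maintenance" rows, 9 "sync" rows (below threshold), 15 cron rows (ignored).
const dueRows: DueRow[] = [];
for (let i = 0; i < 12; i++) dueRows.push({ callback: "maintenance", type: "delayed" });
for (let i = 0; i < 9; i++) dueRows.push({ callback: "sync", type: "scheduled" });
for (let i = 0; i < 15; i++) dueRows.push({ callback: "tick", type: "cron" });

const duplicateWarnings = findDuplicateWarnings(dueRows);
console.log(duplicateWarnings.length, dueRows.length); // 1 36
```

The check only reports; the input array is untouched, which mirrors the point that all rows are still processed.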

What we deliberately did NOT do

  • No dedup in alarm(). Silently dropping scheduled work is worse than executing duplicates. The reporter's scenario involved callbacks with the same payload, but in general schedule() rows with different payloads represent different work items. Deduping at execution time would cause silent data loss.
  • No batch cap in alarm(). A cap doesn't fix the amplification when callbacks reschedule themselves (the row count stays constant across cycles), and it delays legitimate batch processing.
  • No stale row purging. Impossible to pick a safe age threshold — a user might schedule something days in advance.
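The batch-cap argument can be checked with a simplified count-only simulation. It assumes every callback reschedules itself exactly once and that all rows are due at each alarm; `simulateCappedAlarms` is a hypothetical helper, not part of the library.

```typescript
// Counts rows and executions when each alarm processes at most `cap` due rows.
function simulateCappedAlarms(
  initialRows: number,
  cap: number,
  cycles: number
): { rowsLeft: number; executions: number } {
  let rows = initialRows;
  let executions = 0;

  for (let cycle = 0; cycle < cycles; cycle++) {
    const processed = Math.min(rows, cap);
    rows -= processed; // processed one-shot rows are deleted
    rows += processed; // each callback reschedules itself, adding a row back
    executions += processed;
  }

  return { rowsLeft: rows, executions };
}

const cappedRun = simulateCappedAlarms(200, 50, 4); // four alarms, cap of 50
const uncappedRun = simulateCappedAlarms(200, 200, 1); // one alarm, no effective cap

console.log(cappedRun.rowsLeft, cappedRun.executions); // 200 200
console.log(uncappedRun.rowsLeft, uncappedRun.executions); // 200 200
```

Under these assumptions the cap only spreads the same 200 executions over more alarms; the table never shrinks.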

Breaking changes

The only behavioral change is cron idempotency becoming the default. This is technically a semver concern, but the previous behavior (N duplicate cron rows → N executions per tick) was almost certainly always a bug. The escape hatch is { idempotent: false }.

All other changes are additive: new option, new warnings, new observability event.

Test plan

  • Cron idempotency: same args return existing, repeated calls don't duplicate, different cron/payload create new rows, opt-out with idempotent: false
  • Delayed/scheduled idempotency: opt-in dedup, crash-loop simulation (10x), different payloads stay separate, default behavior unchanged
  • onStart warning: fires without idempotent, suppressed with { idempotent: true }, suppressed with { idempotent: false }
  • Alarm warning: fires at ≥10 stale rows, doesn't fire below threshold, all rows still processed and deleted
  • Full test suite passes (358 tests), build, typecheck, lint, format all green

Closes #1153

Made with Cursor


Introduce idempotent scheduling to avoid duplicate schedule rows and add safety warnings.

- Add options.idempotent?: boolean to schedule(); cron schedules are idempotent by default, delayed/scheduled types are opt-in.
- Deduplicate by querying existing cf_agents_schedules (callback+payload, and cron for recurring) and return existing schedule when found.
- Warn when schedule() is called inside onStart() without { idempotent: true } (track _insideOnStart and _warnedScheduleInOnStart to avoid spam); explicit idempotent:false suppresses the warning.
- Emit schedule:create as before and add new observability event schedule:duplicate_warning when many stale one-shot rows for the same callback are processed.
- Add SQL checks and early returns to prevent creating duplicates; keep existing behavior when idempotent is not set.
- Add comprehensive tests and test agents covering onStart warnings, cron/delayed/scheduled idempotency, and alarm duplicate warnings; update wrangler test config and test worker env types.

changeset-bot bot commented Mar 22, 2026

🦋 Changeset detected

Latest commit: d0acbd0

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package:

Name    Type
agents  Minor

@devin-ai-integration devin-ai-integration bot left a comment

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 5 additional findings.

pkg-pr-new bot commented Mar 22, 2026

agents

npm i https://pkg.pr.new/agents@1154

@cloudflare/ai-chat

npm i https://pkg.pr.new/@cloudflare/ai-chat@1154

@cloudflare/codemode

npm i https://pkg.pr.new/@cloudflare/codemode@1154

hono-agents

npm i https://pkg.pr.new/hono-agents@1154

@cloudflare/shell

npm i https://pkg.pr.new/@cloudflare/shell@1154

@cloudflare/think

npm i https://pkg.pr.new/@cloudflare/think@1154

@cloudflare/voice

npm i https://pkg.pr.new/@cloudflare/voice@1154

@cloudflare/worker-bundler

npm i https://pkg.pr.new/@cloudflare/worker-bundler@1154

commit: d0acbd0

@threepointone threepointone merged commit 74a018a into main Mar 23, 2026
2 checks passed
@threepointone threepointone deleted the schedule-warnings branch March 23, 2026 11:17
@github-actions github-actions bot mentioned this pull request Mar 23, 2026

Development

Successfully merging this pull request may close these issues.

schedule() delayed rows accumulate across OOM crashes, alarm() processes all without dedup

1 participant