Skip to content

Gracefully drain process on SIGTERM.#244

Merged
kentonv merged 3 commits intomainfrom
kenton/drain
Jan 5, 2023
Merged

Gracefully drain process on SIGTERM.#244
kentonv merged 3 commits intomainfrom
kenton/drain

Conversation

@kentonv
Copy link
Member

@kentonv kentonv commented Jan 2, 2023

We will wait for all in-flight requests to complete, without accepting any new requests.

Depends on: capnproto/capnproto#1598

Addresses part of #101.

We will wait for all in-flight requests to complete, without accepting any new requests.

Depends on: capnproto/capnproto#1598
auto [ fatalPromise, fatalFulfiller ] = kj::newPromiseAndFulfiller<void>();
this->fatalFulfiller = kj::mv(fatalFulfiller);

auto forkedDrainWhen = drainWhen
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the behavior correct if drainWhen is rejected for some reason? I guess that would just let the server keep running.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, yeah I suppose you could argue that drain() should be called on HttpServers in the failure case too.

However, in practice, if drainWhen fails, then by virtue of it being exclusiveJoin()ed with the listen tasks, those tasks will fail, causing Server::taskFailed() to be invoked, which fulfills the "fatal fulfiller", which in turn causes Server::run() itself to fail with the given error. So the effect is, the drain is skipped and the server shuts down immediately.

I think that works out just fine so I think I'm OK with the code as it is.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works for me, too.

auto promise = kj::evalNow([&]() {
return conn->listedHttp.httpServer.listenHttp(kj::mv(stream.stream))
.attach(kj::mv(conn))
.attach(kj::addRef(*this)); // two attach()es because `this` must outlive `conn`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought .attach(a, b) is always equivalent to .attach(a).attach(b). But, from looking at the code it looks like that's only true for kj::Own<T>::attach(), and with kj::Promise<T>::attach() the destruction order is reversed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, unfortunately, I didn't think hard enough when creating kj::Promise::attach() and it's probably too late to safely reverse the ordering. To avoid any confusion I use multiple attaches when the ordering matters.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see. I suppose we could deprecate both attach()es in favor of a fixed attach2() for a couple capnproto releases, then deprecate-and-rename back to attach().

@kentonv
Copy link
Member Author

kentonv commented Jan 4, 2023

(Waiting for capnproto/capnproto#1598 to be approved before merging this.)

@harrishancock
Copy link
Collaborator

I approved the capnproto PR.

@kentonv kentonv merged commit 6d9cafa into main Jan 5, 2023
@kentonv kentonv deleted the kenton/drain branch January 5, 2023 23:26
mrbbot added a commit to cloudflare/miniflare that referenced this pull request Feb 27, 2023
`kill()` uses `SIGTERM` by default. In `workerd`, this waits for HTTP
connections to close before exiting. Notably, Chrome sometimes keeps
connections open for about 10s, blocking exit. We'd like `dispose()`/
`setOptions()` to immediately terminate the existing process.
Therefore, use `SIGINT` which force closes all connections.

Ref: cloudflare/workerd#244
mrbbot added a commit to cloudflare/workers-sdk that referenced this pull request Oct 31, 2023
`kill()` uses `SIGTERM` by default. In `workerd`, this waits for HTTP
connections to close before exiting. Notably, Chrome sometimes keeps
connections open for about 10s, blocking exit. We'd like `dispose()`/
`setOptions()` to immediately terminate the existing process.
Therefore, use `SIGINT` which force closes all connections.

Ref: cloudflare/workerd#244
mrbbot added a commit to cloudflare/workers-sdk that referenced this pull request Nov 1, 2023
`kill()` uses `SIGTERM` by default. In `workerd`, this waits for HTTP
connections to close before exiting. Notably, Chrome sometimes keeps
connections open for about 10s, blocking exit. We'd like `dispose()`/
`setOptions()` to immediately terminate the existing process.
Therefore, use `SIGINT` which force closes all connections.

Ref: cloudflare/workerd#244
mrbbot added a commit to cloudflare/workers-sdk that referenced this pull request Nov 1, 2023
`kill()` uses `SIGTERM` by default. In `workerd`, this waits for HTTP
connections to close before exiting. Notably, Chrome sometimes keeps
connections open for about 10s, blocking exit. We'd like `dispose()`/
`setOptions()` to immediately terminate the existing process.
Therefore, use `SIGINT` which force closes all connections.

Ref: cloudflare/workerd#244
mrbbot added a commit to cloudflare/workers-sdk that referenced this pull request Nov 1, 2023
`kill()` uses `SIGTERM` by default. In `workerd`, this waits for HTTP
connections to close before exiting. Notably, Chrome sometimes keeps
connections open for about 10s, blocking exit. We'd like `dispose()`/
`setOptions()` to immediately terminate the existing process.
Therefore, use `SIGINT` which force closes all connections.

Ref: cloudflare/workerd#244
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants