Put Work at Its Right Altitude

The principle: Decide for every unit of work whether it runs in-band (cheap, must be durable for the response) or out-of-band (slow, flaky, fan-out), and make the seam between them the thinnest possible thread boundary — a two-line job that only exists to be on another thread.

① First principles: the fan-out that blocks the sender

Someone posts one message in a fifty-person room. That one save has to do a surprising amount of work: mark everyone else's membership unread, and send an OS push notification to every member who is offline and wants one. Both of those are "consequences of the message existing," so the instinct from P1 (The Model Owns Its Consequences) is right: they belong on the model. Here's the version you'd vibe-code first, hanging it all on a callback:

class Message < ApplicationRecord
  after_save :notify_room      # ⚠ fires INSIDE the transaction, on every save

  private
    def notify_room
      room.memberships.where.not(user: creator).each do |membership|
        membership.update!(unread_at: Time.current)   # N UPDATEs, one per member
        next if membership.user.online?                # presence re-checked in Ruby
        PushNotifier.deliver(membership.user, self)    # ...and BLOCK on each push
      end
    end
end

This isn't a strawman — it's the honest shape of the first thing you'd build before the conventions clicked. It "works" in development with three users. Then it meets production and grows four separate bugs, and they are all the same bug wearing different hats: work running at the wrong altitude.

Walk them in order. First, the loop fires fifty UPDATE statements where one would do, and re-checks presence by loading every user into Ruby. Cheap work done expensively, but still in-band — survivable. Second, and worse: PushNotifier.deliver is a synchronous HTTPS call to Apple's and Google's push gateways. Fifty members means up to fifty sequential network round-trips to flaky third-party servers, while the sender's browser spins. Post in a busy room and your own request hangs on someone else's slow gateway. Third: it's after_save, which fires inside the database transaction. If anything later in that transaction rolls back, you've already pinged fifty phones about a message that no longer exists — the ghost row. Fourth: after_save also fires on every update, so editing a typo re-notifies the whole room.

Here is the first-principles cut that dissolves all four at once. Stop asking "where does this code go?" and start asking "at what altitude does each unit of work belong?" There are exactly two answers, and the boundary between them is a real, drawable line:

In-band. Cheap, and it must be durable before you return the HTTP response. The unread marking is in-band: it's one bulk UPDATE, and if it didn't happen the response would be lying about the app's state.
Out-of-band. Slow, flaky, or a fan-out that the sender should never wait on. The push notifications are out-of-band: fifty calls to gateways you don't control, none of which the sender's request has any business blocking on.

That split is the sync/async line.

So the derived shape is forced, not chosen:

The cheap, must-be-durable work runs in-band, synchronously, before the response.
The slow fan-out is handed out-of-band to a background job.
The trigger is after_create_commit — fires once, only after the row survives the transaction, so a rolled-back message can never reach a worker.
The seam between in-band and out-of-band is the thinnest thread boundary: a job whose entire job is being on another thread. The logic stays on the model where it's synchronously testable; the job is a two-line thunk.
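
To make that shape concrete without booting Rails, here is a minimal plain-Ruby sketch. Everything in it is a stand-in: JOB_QUEUE plays the part of a job backend, and ToyRoom, ToyPushJob, and ToyMembership are hypothetical names, not Campfire's classes. It only shows the timing: the unread marking is finished when receive returns, and the push has merely been queued.

```ruby
# Toy stand-ins, not Rails: JOB_QUEUE plays the part of the job backend,
# and every Toy* name is hypothetical.
JOB_QUEUE = []

ToyMembership = Struct.new(:user, :unread_at)

class ToyPushJob
  # The seam: nothing happens here except deferring the call.
  def self.perform_later(room, message)
    JOB_QUEUE << -> { room.push(message) }
  end
end

class ToyRoom
  attr_reader :memberships, :pushed

  def initialize(users)
    @memberships = users.map { |user| ToyMembership.new(user, nil) }
    @pushed = []
  end

  def receive(message)
    mark_unread(message)                     # in-band: finished before we return
    ToyPushJob.perform_later(self, message)  # out-of-band: only enqueued here
  end

  # The slow part. It lives on the model so it can be called and tested directly.
  def push(message)
    others(message).each { |m| @pushed << [m.user, message[:body]] }
  end

  private

  def mark_unread(message)
    others(message).each { |m| m.unread_at = message[:created_at] }
  end

  def others(message)
    @memberships.reject { |m| m.user == message[:creator] }
  end
end

room = ToyRoom.new(%w[ann bob cy])
room.receive({ creator: "ann", body: "hi", created_at: Time.now })
puts "pushed before the worker runs: #{room.pushed.size}"
JOB_QUEUE.shift.call                         # what a worker would do later
puts "pushed after the worker runs: #{room.pushed.size}"
```

The job class holds no logic of its own. Calling push directly in a test exercises the same code the worker would run.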

② The beauty in combination

Altitude is the principle that finishes several others. On its own, "use background jobs for slow things" is advice your framework's README already gave you. Held against the other principles, it becomes the thing that decides where the seam goes and how thin it can be.

With P1: the model owns the consequence, the seam owns the altitude. P1 (The Model Owns Its Consequences) settles whose fact this is: marking unread and pushing both belong to the room, because they're true of every message. P9 settles something P1 deliberately leaves open: when and on which thread each consequence runs. The model says what (room.receive(self)); the seam says where (one part in-band, one part deferred). That's why room.receive can be two lines that read like intent and still be correct: the altitude decision lives one layer down, at the _commit trigger and the _later boundary, not smeared into the consequence itself.

With P8: the guard lives on the wrapper, so the job stays dumb. P8 (Give Behavior a Home) files each consequence into the trait's concern. P9 adds a rule about the pairing inside that concern: the public do_thing_later method owns the precondition and the enqueue; the plain do_thing method does the work, and stays public because the job calls it from outside. The job in between never re-checks anything; it's a thunk. You'll see this in deliver_webhook_later guarding if webhook before enqueuing, so Bot::WebhookJob can be three lines that assume the work is safe to do. The guard at the right altitude (on the wrapper, in the request) means the defensive if never has to be duplicated inside the worker.

With P2 and P3: pass the record, do AR reads first, fail closed. Active Job serializes Active Record arguments as GlobalIDs, so you hand a job the record, not an id to re-find. That is a small instance of P2 (Derive, Don't Store): the worker re-derives the row from a global identity instead of you storing and re-fetching a bare integer. And for the rare work that must escape Rails entirely, such as a raw thread pool firing thousands of pushes, the discipline is to do all Active Record reads before posting to threads, because outside the Rails executor there's no connection management to lean on. The same altitude discipline shows up on the app's other external-fetch surface, the OpenGraph link-preview fetcher, where the SSRF guard (RestrictedHTTP::PrivateNetworkGuard) fails closed: an unparseable address is treated as a private IP and blocked, which is P3 (Security Is the Shape of Your Data Access) applied at the lowest altitude. The payoff: the entire async surface of the app, every place work crosses a thread boundary, is readable in about a dozen lines, because each boundary is thin and each guard sits at the altitude where it belongs.

The compounding insight: altitude isn't a performance tactic, it's a correctness boundary. Each unit of work placed at its right altitude absorbs a production edge case for free — the ghost row, the blocking request, the redundant re-notify, the connection leak — and the seam between altitudes stays so thin you can read the whole map at a glance.

③ How 37signals did it

The trigger altitude: `after_create_commit`, and the fast/slow split

Here is the entire fan-out, at message.rb:11-12:

  before_create -> { self.client_message_id ||= Random.uuid } # Bots don't care
  after_create_commit -> { room.receive(self) }

Line 12 is the trigger at the right altitude. Not after_save (fires on edits, fires inside the transaction); not after_create (fires inside the transaction). after_create_commit fires once, only after the row is durably committed; _commit means after-durable. We don't re-derive the callback lifecycle here; that timeline belongs to F1 (The Rails Model & Active Record), which covers the callback lifecycle. Here we only need the consequence of the altitude: a message that gets rolled back can never reach a worker, so the ghost row (a push to a phone for a message that no longer exists) is a bug that cannot occur.
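
If the "after-durable" timing feels abstract, a toy model helps. The sketch below is plain Ruby and mimics only the ordering, not Rails internals: ToyTransaction, create_message, and NOTIFIED are hypothetical names, and a raised exception stands in for a rollback. Hooks registered during the block run only if the block finishes cleanly.

```ruby
# Toy transaction: after-commit hooks are queued during the block and run
# only if the block finishes without raising. Hypothetical names throughout.
class ToyTransaction
  def self.run
    tx = new
    yield tx
    tx.commit!
    true
  rescue StandardError
    false   # rolled back: queued hooks are dropped without running
  end

  def initialize
    @after_commit = []
  end

  def after_commit(&hook)
    @after_commit << hook
  end

  def commit!
    @after_commit.each(&:call)
  end
end

NOTIFIED = []

def create_message(body, fail_later: false)
  ToyTransaction.run do |tx|
    tx.after_commit { NOTIFIED << body }      # plays the role of after_create_commit
    raise "later step failed" if fail_later   # anything else in the same transaction
  end
end

create_message("hello")
create_message("ghost", fail_later: true)
p NOTIFIED   # prints ["hello"]
```

The "ghost" message registered its hook just like the first one did, but the hook never ran, because the transaction it belonged to never committed.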

This isn't a Campfire one-off. Fizzy reaches for the identical seam: after_create_commit :deliver_later on a webhook delivery (webhook/delivery.rb:23), where deliver_later is a one-line Webhook::DeliveryJob.perform_later(self) (webhook/delivery.rb:29-30). It is the same _commit trigger handing off to the same thinnest thread boundary, in an unrelated product. This is a 37signals habit, not an idiom of one chat app.

Now follow room.receive (room.rb:46-49), which is where the sync/async line is drawn explicitly:

  def receive(message)
    unread_memberships(message)
    push_later(message)
  end

Two named intents, two altitudes, one line each. The in-band half (room.rb:68-70) is a single bulk statement, not a loop:

    def unread_memberships(message)
      memberships.visible.disconnected.where.not(user: message.creator).update_all(unread_at: message.created_at, updated_at: Time.current)
    end

The entire "who needs an unread badge?" decision is one update_all over composable scopes — no each, no N+1, no loading rows into Ruby to filter them. That's the cheap-and-durable work, done at the in-band altitude where the response needs it.
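
To see the difference in statement count, here is a rough plain-Ruby sketch. ToyMemberships is a hypothetical in-memory table that counts how many "statements" it was asked to run; it is not Active Record, and the connected flag is an assumed stand-in for whatever the real disconnected scope checks.

```ruby
# Toy table that counts "statements" so the two approaches can be compared.
# Hypothetical names; the :connected flag is an assumption for illustration.
class ToyMemberships
  attr_reader :rows, :statements

  def initialize(rows)
    @rows = rows
    @statements = 0
  end

  # Naive shape: filter in Ruby, then one statement per remaining row.
  def update_each(except:, unread_at:)
    @rows.each do |row|
      next if row[:user] == except || row[:connected]
      @statements += 1
      row[:unread_at] = unread_at
    end
  end

  # Bulk shape: one statement, with the filter as part of it (like update_all).
  def update_all(except:, unread_at:)
    @statements += 1
    @rows.each do |row|
      row[:unread_at] = unread_at unless row[:user] == except || row[:connected]
    end
  end
end

def sample_rows(count)
  (1..count).map { |i| { user: "u#{i}", connected: i.even?, unread_at: nil } }
end

looped = ToyMemberships.new(sample_rows(50))
looped.update_each(except: "u1", unread_at: Time.now)

bulk = ToyMemberships.new(sample_rows(50))
bulk.update_all(except: "u1", unread_at: Time.now)

puts "loop: #{looped.statements} statements, bulk: #{bulk.statements} statement"
```

Both versions leave the rows in the same state. Only the number of round-trips differs, and that number grows with the room in the loop version.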

The out-of-band half (room.rb:72-74) is the seam itself — the thinnest thread boundary:

    def push_later(message)
      Room::PushMessageJob.perform_later(self, message)
    end

Notice the arguments: self and message, the records, not their ids. Active Job serializes them as GlobalIDs; the worker re-derives them. You never write find(id) at the top of a perform.
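
Here is a toy version of that round trip, to show what "serializes as a GlobalID" amounts to. The gid://app/Model/id layout follows the GlobalID URI format; everything else (ToyRecord, ToyMessage, the in-memory STORE, the toy-app name) is hypothetical, and a real app gets this from Active Job rather than writing it.

```ruby
# Toy record -> identifier -> record round trip. Hypothetical names; the
# in-memory STORE stands in for the database.
class ToyRecord
  STORE = Hash.new { |hash, model| hash[model] = {} }

  attr_reader :id

  def initialize(id)
    @id = id
    STORE[self.class.name][id] = self
  end

  # What gets written into the queue instead of the object itself.
  def to_gid
    "gid://toy-app/#{self.class.name}/#{id}"
  end

  # What runs on the worker side before perform is called.
  def self.locate(gid)
    _app, model, id = gid.delete_prefix("gid://").split("/")
    STORE[model].fetch(Integer(id)) { raise KeyError, "no #{model} with id #{id}" }
  end
end

class ToyMessage < ToyRecord; end

serialized = ToyMessage.new(42).to_gid      # what the queue would store
restored   = ToyRecord.locate(serialized)   # what happens before perform runs
puts serialized
puts restored.id
```

One consequence worth knowing: if the record has been deleted by the time the job runs, the lookup fails. In Active Job that surfaces as ActiveJob::DeserializationError, which is usually the behavior you want for a message that no longer exists.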

"Why split it at all — isn't one callback simpler than a callback plus a job plus a pusher class?" Count the files, then count the edge cases each absorbs. The split costs you one three-line job class. In return: the sender's request never blocks on a push gateway, a rolled-back message never notifies anyone, and editing a message never re-pings the room. The naive single callback is fewer files but more bugs — it pays for its brevity in production incidents. Altitude is the trade that buys correctness with one thin boundary.

The jobs are thunks; the guard lives on the wrapper

Look at all three background jobs in the app side by side. room/push_message_job.rb:1-5:

class Room::PushMessageJob < ApplicationJob
  def perform(room, message)
    Room::MessagePusher.new(room:, message:).push
  end
end

bot/webhook_job.rb:1-5:

class Bot::WebhookJob < ApplicationJob
  def perform(bot, message)
    bot.deliver_webhook(message)
  end
end

remove_banned_content_job.rb:1-5:

class RemoveBannedContentJob < ApplicationJob
  def perform(user)
    user.remove_banned_content
  end
end

Every one is the same shape: receive records, call a model method, return.

And this isn't Campfire's taste leaking into three job classes — 37signals wrote the rule down as law. Fizzy ships a STYLE.md that states it outright: "we write shallow job classes that delegate the logic itself to domain models," with the _later suffix flagging the enqueue and _now the synchronous worker method (STYLE.md:185-213). The example in the doc is the same thunk you're looking at — Event::RelayJob#perform is just event.relay_now. The altitude discipline you've been deriving is documented house doctrine, obeyed in two products.

And the guard? It lives one altitude up, on the _later wrapper. From user/bot.rb:51-57:

  def deliver_webhook_later(message)
    Bot::WebhookJob.perform_later(self, message) if webhook
  end

  def deliver_webhook(message)
    webhook.deliver(message)
  end

The if webhook check happens before enqueuing, synchronously, at the call site (messages_controller.rb:76 fans these out: bots_eligible_for_webhook.excluding(@message.creator).each { |bot| bot.deliver_webhook_later(@message) }). So Bot::WebhookJob#perform never re-checks defensively — it can assume a webhook exists, because a bot without one was never enqueued. The guard at the right altitude means it's written once, not duplicated inside every worker. The same _later/plain-method pairing appears in user/bannable.rb:19-28 (remove_banned_content_later enqueues; remove_banned_content does the destroy-and-broadcast loop).
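
The wrapper-owns-the-guard pairing can be sketched in plain Ruby. This is an illustration under assumptions, not Campfire's code: ToyBot, ToyWebhookJob, and WEBHOOK_QUEUE are hypothetical, the queue is an array, and "delivery" just records what would have been sent.

```ruby
# Hypothetical stand-ins: an array as the queue, a recorded list as "delivery".
WEBHOOK_QUEUE = []

class ToyWebhookJob
  def self.perform_later(bot, message)
    WEBHOOK_QUEUE << [bot, message]
  end

  # Plays the worker: run everything currently queued.
  def self.drain
    WEBHOOK_QUEUE.shift(WEBHOOK_QUEUE.size).each { |bot, message| new.perform(bot, message) }
  end

  # No guard here: anything that was enqueued is assumed deliverable.
  def perform(bot, message)
    bot.deliver_webhook(message)
  end
end

class ToyBot
  attr_reader :delivered

  def initialize(webhook_url = nil)
    @webhook_url = webhook_url
    @delivered = []
  end

  # The precondition and the enqueue live together, in the request.
  def deliver_webhook_later(message)
    ToyWebhookJob.perform_later(self, message) if @webhook_url
  end

  # Public, because the job calls it from outside.
  def deliver_webhook(message)
    @delivered << [@webhook_url, message]
  end
end

bots = [ToyBot.new("https://example.test/hook"), ToyBot.new]
bots.each { |bot| bot.deliver_webhook_later("hello") }
puts "enqueued: #{WEBHOOK_QUEUE.size}"
ToyWebhookJob.drain
```

Two bots were asked, one job was enqueued. The bot without a webhook never costs a queue slot, and perform never needs an if.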

"Why is broadcasting NOT a callback like room.receive, when both are consequences of a message?" Because altitude is decided per consequence, and these two have different ones. Marking-unread-and-pushing is true of every message however it's born, so it rides after_create_commit. Broadcasting differs by call path (a normal send appends, an edit replaces, a seed shouldn't broadcast at all), so it's a plain method (broadcast_create) called explicitly at each site (messages_controller.rb:24, webhook.rb:60). It is the same principle as in P1 (The Model Is the Truth — and It Owns Its Consequences; the callback that refused to be a callback), now read through altitude: the seam is placed where the decision about timing-and-transport actually lives.

The lowest altitude: work that escapes the Rails executor lives in `lib/`

Active Job is the right tool for "off the request thread." But the push fan-out itself — potentially thousands of HTTPS deliveries — needs a real thread pool, and a thread pool running outside the Rails executor doesn't get automatic Active Record connection management. So that code drops to the lowest altitude and moves to lib/. The file says so itself (web_push/pool.rb:1):

# This is in lib so we can use it in a thread pool without the Rails executor
class WebPush::Pool

And the discipline that makes it safe is explicit (web_push/pool.rb:25-31):

    def deliver_later(payload, subscription)
      # Ensure any AR operations happen before we post to the thread pool
      notification = subscription.notification(**payload)
      subscription_id = subscription.id

      delivery_pool.post do
        deliver(notification, subscription_id)

Read the comment as a rule of altitude: do all your Active Record reads before posting to threads. By the time the work crosses into the pool, it carries a plain notification object and an integer subscription_id — no live AR connection required on the other side. The reads happen at the altitude that has the database; the delivery happens at the altitude that has the threads; the boundary between them is, again, as thin as a single .post.
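
The same reads-first discipline can be sketched with nothing but the standard library. This is an assumption-heavy stand-in, not the real WebPush::Pool: ToyPool is a hypothetical worker pool built from Thread and Queue, ToySubscription is a Struct pretending to be a database-backed record, and "delivery" only records what would have been sent.

```ruby
# Stdlib-only stand-in for a worker pool. Hypothetical names throughout.
class ToyPool
  def initialize(size)
    @jobs = Queue.new
    @workers = Array.new(size) do
      Thread.new do
        while (job = @jobs.pop)   # a nil job tells the worker to stop
          job.call
        end
      end
    end
  end

  def post(&job)
    @jobs << job
  end

  def shutdown
    @workers.size.times { @jobs << nil }
    @workers.each(&:join)
  end
end

ToySubscription = Struct.new(:id, :endpoint) do
  # Stands in for a read that would need a database connection.
  def notification(title:)
    { endpoint: endpoint, title: title }.freeze
  end
end

def deliver_later(pool, results, subscription, title:)
  # Do every "record" read here, on the calling thread.
  notification = subscription.notification(title: title)
  subscription_id = subscription.id

  # Only plain values cross into the pool.
  pool.post { results << [subscription_id, notification[:title]] }
end

pool = ToyPool.new(2)
results = Queue.new   # thread-safe collector for the demo
subscriptions = (1..3).map { |i| ToySubscription.new(i, "https://push.example.test/#{i}") }
subscriptions.each { |s| deliver_later(pool, results, s, title: "New message") }
pool.shutdown
puts "delivered: #{results.size}"
```

The block handed to post closes over a frozen hash and an integer, never the subscription itself, so the worker threads have no reason to reach back toward the record.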

That's the whole async surface of a production chat app: one _commit trigger, one receive that draws the sync/async line, three thunk jobs, and one lib/ pool that does its reads first. Count the edge cases this arrangement absorbs for free — the ghost row, the blocked sender, the re-notify on edit, the connection leak outside the executor — and you see why it's small. It isn't doing less. Each boundary is placed at the altitude where its bug class disappears.

Key Takeaways — Patterns to Steal

Before you ask "what file does this code go in?", ask "at what altitude does this work belong?" — split every consequence into in-band (cheap and must be durable before you return the HTTP response) versus out-of-band (slow, flaky, or a fan-out the sender should never wait on). The naive move is to hang all of it on one callback that runs everything synchronously while the sender's browser spins. Campfire draws the line out loud in room.rb:46-49, where receive is just unread_memberships(message) then push_later(message) — two lines, two altitudes, the decision made visible.
When a consequence can't be taken back — a push, an email, anything that touches the world — trigger it from after_create_commit, not after_save and not after_create. The plain ones fire inside the transaction, so a row that later rolls back has already pinged fifty phones about a message that no longer exists, and after_save fires again on every edit so fixing a typo re-notifies the whole room. Campfire's message.rb:12 is after_create_commit -> { room.receive(self) } — the _commit suffix means after-durable, so a ghost row simply can't reach a worker.
For the in-band half — the "who needs an unread badge?" work — reach for one bulk statement over composable scopes, not a loop that fires an UPDATE per member and re-checks presence by loading every user into Ruby. That each is cheap work done expensively, an N+1 hiding inside a callback. Campfire does the whole decision in a single update_all at room.rb:68-70: memberships.visible.disconnected.where.not(user: message.creator).update_all(...).
Hand the slow fan-out to a background job, and make that job the thinnest thread boundary you can: a perform of two or three lines that just calls a model method. The temptation is to stuff the real logic into the worker, where it is trapped behind a queue you have to boot before you can test it. Campfire's room/push_message_job.rb, bot/webhook_job.rb, and remove_banned_content_job.rb are all the same ~3-line shape: receive the records, call the model method, return. The work lives on the model, where it stays synchronously testable.
Put the precondition on the _later wrapper, before the enqueue, not inside the job. If the guard lives in the worker you'll end up duplicating that defensive if in every job and enqueuing work that can never happen. Campfire's user/bot.rb:51-52 is Bot::WebhookJob.perform_later(self, message) if webhook — the check runs once at the call site, synchronously, so Bot::WebhookJob#perform can assume a webhook exists because a bot without one was never enqueued.
For the rare work that must escape the Rails executor — a raw thread pool firing thousands of pushes — drop it to lib/ and do every Active Record read before you post to the pool. Outside the executor there's no connection management to lean on, so reaching back for an AR row from inside a pool thread leaks connections or crashes. Campfire's web_push/pool.rb:25-31 reads subscription.notification(...) and grabs subscription.id first, then delivery_pool.post carries only a plain notification object and an integer across the boundary — the reads happen at the altitude that has the database, the delivery at the altitude that has the threads.
Let the model own what the consequence is and let the seam own where and when it runs — don't smear the timing decision into the consequence itself. That separation is exactly what lets room.receive stay two readable lines that look like pure intent while still being correct under rollback, edit, and fan-out: the altitude logic lives one layer down, at the _commit trigger and the _later boundary. Treat altitude as a correctness boundary, not a performance tactic — each unit placed at its right altitude absorbs a production edge case (the ghost row, the blocked sender, the re-notify on edit, the connection leak) for free, which is why the whole async surface reads in about a dozen lines.