Stripe sent the same purchase webhook 4 times. Idempotency saved us 4 emails.

Dave (author)
5 min read

Stripe sent the same checkout.session.completed event to our webhook four times in a row. Each delivery was a real HTTP request from Stripe's infrastructure, each signed correctly, each with the same evt_* id.

If our fulfillment handler had treated each one as a new purchase, we'd have sent the buyer four license emails, granted them four GitHub repo invites, and credited four sales in PostHog. We didn't, because our handler is idempotent. This is the post we wish we'd read before writing that handler.

Why duplicates happen

Stripe's webhook delivery is at-least-once, not exactly-once. The docs are direct about it: webhook endpoints might occasionally receive the same event more than once. It is not a bug. It is the contract.

You get duplicates when:

  • Your endpoint takes longer than 30s to respond. Stripe's delivery times out and re-fires.
  • Your endpoint returns anything other than a 2xx, even a transient 5xx blip during a deploy. Re-fired.
  • Stripe's own infrastructure retries because an ack didn't make it back to their queue. Re-fired with no signal at your end that the first one made it.

In our case it was the third one. We didn't deploy mid-webhook, we didn't time out — Stripe just decided to retry. There's no log in the Stripe dashboard that explains it because, from their perspective, retrying is the correct behaviour.
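The lost-ack case is the easiest to misread, so here is a minimal sketch of it. It models a sender that keeps re-delivering until it sees an acknowledgement. `deliverAtLeastOnce` and the `ackArrives` array are hypothetical names for illustration; this is not Stripe's actual retry logic or schedule.

```typescript
// Hypothetical model of an at-least-once sender. This is NOT Stripe's real
// retry logic; it only shows why a lost ack produces a duplicate delivery.
function deliverAtLeastOnce(
  eventId: string,
  handler: (id: string) => void,
  ackArrives: boolean[], // per attempt: did the ack make it back to the sender?
): number {
  let attempts = 0
  for (const acked of ackArrives) {
    attempts++
    handler(eventId) // the receiver processes the request either way
    if (acked) break // the sender only stops once it sees an ack
  }
  return attempts
}

// Three acks get lost, the fourth arrives: the handler runs four times
// with the same id, and nothing on the receiving side marks the repeats.
const received: string[] = []
const attempts = deliverAtLeastOnce(
  'evt_example',
  (id) => { received.push(id) },
  [false, false, false, true],
)
```

From the receiver's side, every one of those calls looks like a valid first delivery. The only thing that ties them together is the shared id.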

The naïve handler that loses you money

A handler that looks correct and isn't:

// DON'T DO THIS — duplicates will double-fulfill
app.post('/api/stripe/webhook', async (req, res) => {
  const event = stripe.webhooks.constructEvent(
    req.body, req.headers['stripe-signature']!, WEBHOOK_SECRET
  )

  if (event.type === 'checkout.session.completed') {
    const session = event.data.object
    await sendLicenseEmail(session.customer_email)
    await inviteToRepo(session.customer_email)
    await db.insert(purchases).values({ stripeSessionId: session.id, ... })
  }

  res.json({ received: true })
})

Every duplicate event runs the entire if branch. The INSERT will fail on the second delivery (assuming you have a unique index on stripeSessionId), but by then sendLicenseEmail and inviteToRepo have already fired. The buyer gets two emails, two invites, and a confused inbox. Worse, the failed INSERT throws before the handler can respond with a 2xx, which invites yet another retry.

The fix: dedupe on event.id

The pattern Stripe's own webhook docs recommend for handling duplicate events: record the event.id of every event you process, and skip events whose id is already recorded.

// apps/landing/lib/db/schema.ts
// (import assumes Drizzle ORM, which the helpers below match)
import { pgTable, text, timestamp } from 'drizzle-orm/pg-core'

export const webhookEvents = pgTable('webhook_events', {
  id:         text('id').primaryKey(),        // Stripe's event.id (evt_*)
  type:       text('type').notNull(),
  receivedAt: timestamp('received_at').notNull().defaultNow(),
})

// apps/landing/app/api/stripe/webhook/route.ts
export async function POST(req: Request) {
  const event = stripe.webhooks.constructEvent(
    await req.text(), req.headers.get('stripe-signature')!, WEBHOOK_SECRET
  )

  // ATOMIC dedupe — INSERT or skip
  const inserted = await db
    .insert(webhookEvents)
    .values({ id: event.id, type: event.type })
    .onConflictDoNothing()
    .returning({ id: webhookEvents.id })

  if (inserted.length === 0) {
    // Already processed. Stripe sees the 200 and stops retrying.
    return Response.json({ received: true, deduped: true })
  }

  // First time seeing this event. Process it.
  if (event.type === 'checkout.session.completed') {
    await fulfillPurchase(event.data.object)
  }

  return Response.json({ received: true })
}

The INSERT ... ON CONFLICT DO NOTHING ... RETURNING is the load-bearing line. It tells Postgres to insert the row and return it if the id is new, or do nothing and return zero rows if the id already exists. The check-then-write pattern (SELECT then INSERT) is a race condition: two concurrent deliveries can both pass the SELECT before either INSERT lands. The atomic insert-or-skip is not.
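The race is easier to see in a toy model. The sketch below is an in-memory simulation, not our handler: `seen` stands in for the webhook_events table, `roundTrip` fakes a database round trip, and the `claim...` and `countWinners` functions are hypothetical names. It also assumes nothing stops the second write in the check-then-write version; with a real primary key that second INSERT would error instead, which is still not a clean skip.

```typescript
// In-memory stand-in for the webhook_events table (illustration only).
const seen = new Set<string>()

// Fakes one network round trip to the database.
const roundTrip = (): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, 5))

// Check-then-write: the SELECT and the INSERT are separate round trips,
// so concurrent deliveries can all pass the check before any write lands.
async function claimWithSelectThenInsert(eventId: string): Promise<boolean> {
  await roundTrip() // SELECT
  const exists = seen.has(eventId)
  await roundTrip() // INSERT
  if (exists) return false
  seen.add(eventId)
  return true
}

// Atomic claim: check and write happen in one step with no gap between
// them, which is what INSERT ... ON CONFLICT DO NOTHING gives you.
async function claimAtomically(eventId: string): Promise<boolean> {
  await roundTrip() // single statement
  if (seen.has(eventId)) return false
  seen.add(eventId)
  return true
}

// Fires four concurrent deliveries of the same event and counts how many
// of them believe they are the first.
async function countWinners(
  claim: (id: string) => Promise<boolean>,
  eventId: string,
): Promise<number> {
  const results = await Promise.all([
    claim(eventId), claim(eventId), claim(eventId), claim(eventId),
  ])
  return results.filter(Boolean).length
}
```

Running `countWinners` with the check-then-write version lets all four deliveries through; the atomic version lets exactly one through.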

What to keep inside the transaction

The webhookEvents insert and the business work need to live in the same transaction. Without it, a crash between the two leaves the event recorded but the purchase unfulfilled, and every retry is then skipped as a duplicate, so the buyer never gets their license.

await db.transaction(async (tx) => {
  const inserted = await tx
    .insert(webhookEvents)
    .values({ id: event.id, type: event.type })
    .onConflictDoNothing()
    .returning({ id: webhookEvents.id })

  if (inserted.length === 0) return // already processed

  if (event.type === 'checkout.session.completed') {
    await tx.insert(purchases).values({ ... })
    // Email + repo invite are external — see below
  }
})

External side effects (email, GitHub API calls) cannot live inside a DB transaction. The pattern that works: write the purchase row inside the transaction with a fulfillmentStatus = 'pending' column, then run the side effects after commit, then update to fulfilled. A separate worker reconciles pending rows older than 5 minutes.
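As a rough sketch of that pending-then-fulfilled flow, here is an in-memory version. Everything in it is illustrative: `purchaseRows` stands in for the purchases table, and `recordPending`, `fulfill`, `reconcile`, and the `SideEffects` type are hypothetical names, not the functions from our codebase.

```typescript
type FulfillmentStatus = 'pending' | 'fulfilled'

interface PurchaseRow {
  stripeSessionId: string
  email: string
  fulfillmentStatus: FulfillmentStatus
  createdAt: number // epoch milliseconds
}

// Hypothetical stand-in for the email + repo invite calls.
type SideEffects = (row: PurchaseRow) => Promise<void>

// In-memory stand-in for the purchases table, keyed like a unique index.
const purchaseRows = new Map<string, PurchaseRow>()

// Step 1 (inside the transaction in a real handler): write the row as pending.
function recordPending(stripeSessionId: string, email: string, now: number): PurchaseRow {
  const existing = purchaseRows.get(stripeSessionId)
  if (existing) return existing // unique key: keep the first row
  const row: PurchaseRow = { stripeSessionId, email, fulfillmentStatus: 'pending', createdAt: now }
  purchaseRows.set(stripeSessionId, row)
  return row
}

// Step 2 (after commit): run the side effects, then flip to fulfilled.
// If the effects throw, the row simply stays pending.
async function fulfill(row: PurchaseRow, effects: SideEffects): Promise<void> {
  if (row.fulfillmentStatus === 'fulfilled') return
  await effects(row)
  row.fulfillmentStatus = 'fulfilled'
}

// Step 3 (separate worker): retry rows that have been pending too long.
async function reconcile(now: number, effects: SideEffects, maxAgeMs = 5 * 60 * 1000): Promise<number> {
  let retried = 0
  for (const row of purchaseRows.values()) {
    if (row.fulfillmentStatus === 'pending' && now - row.createdAt > maxAgeMs) {
      await fulfill(row, effects)
      retried++
    }
  }
  return retried
}
```

One caveat with this shape: a crash after the side effects run but before the status flips means the reconciler will run them again, so the side effects themselves should tolerate a repeat.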

This is more code than the naïve handler. It is also the difference between a customer getting one license email and getting four.

Real numbers

Our webhook_events table currently has 1,847 rows. Over the same period, Stripe sent 63 duplicate deliveries: requests carrying an evt_* id we had already recorded. (Because id is the primary key, the duplicates themselves never land in the table.) If our handler weren't idempotent, each of those could have meant duplicate fulfillment emails, duplicate repo invites, and a support ticket we didn't have to answer.

The TTL on the webhook_events table is a year. Stripe retries for up to three days, so a year is generous — but rows are 60 bytes each. A year of webhook_events is single-digit megabytes. The cost is nothing; the protection is total.
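The arithmetic and the TTL are simple enough to sketch. The helpers below are hypothetical (`estimateTableBytes`, `pruneExpired`) and work on an in-memory array rather than Postgres; the 60-byte figure is the one quoted above, and real on-disk size will differ with index and row overhead.

```typescript
const BYTES_PER_ROW = 60 // figure quoted above; actual on-disk size will vary
const DAY_MS = 24 * 60 * 60 * 1000

// Back-of-envelope table size for a given row count.
function estimateTableBytes(rowCount: number): number {
  return rowCount * BYTES_PER_ROW
}

interface WebhookEventRow {
  id: string
  receivedAt: Date
}

// Keeps only the rows received within the TTL window ending at `now`.
function pruneExpired(rows: WebhookEventRow[], now: Date, ttlDays = 365): WebhookEventRow[] {
  const cutoff = now.getTime() - ttlDays * DAY_MS
  return rows.filter((row) => row.receivedAt.getTime() >= cutoff)
}
```

At the current 1,847 rows, `estimateTableBytes(1847)` comes to 110,820 bytes, roughly a tenth of a megabyte.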

What this means if you're shipping a kit

If you're building a kit (or buying ours): the difference between a hobby project and a production codebase is right here. Hobby projects assume the happy path. Production codebases assume Stripe will send the same event four times, the user will reload the checkout page mid-redirect, the network will drop, and your code still has to behave.

Our SaaS Dashboard Kit ships with the idempotency pattern wired by default — webhook_events table, atomic dedupe, transactional fulfillment, pending-then-fulfilled state machine. It's not a checkbox we advertise on the pricing page. It's the unsexy production hygiene that earns the "ship to production with you" tagline.

Tags: backend, kits, architecture
