whatbroke.dev

What broke. What surprised. What we shipped anyway.

Cron Timezone Drift Made Scheduled Jobs Fight Over the Same Row

May 24, 2026 · note

Scheduled jobs were silently dropping every other run. The pattern was clean: one run would succeed, the next would vanish, and the cycle repeated. It took two days to find.

The first afternoon went to the wrong layer entirely. The assumption was a race condition in the worker code, somewhere in how jobs claimed work. Elaborate locking got added around the claim step. None of it changed the drop pattern.

The actual cause: the host was supposed to be on UTC, but cron follows the system timezone, and that had drifted to PST after a recent dist-upgrade. Two separate cron entries on the same machine, both written as */5 * * * *, were now resolving to different absolute minutes. When they happened to align on the same minute and the same second, both would try to claim the same job row in MySQL. Row-level locking would let one through. The loser would fail the claim, see no work, and exit silently. No error, no retry, just gone.
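The silent-loser behavior described above can be sketched in a few lines. This is a minimal illustration, not the actual worker code: it uses Python's built-in sqlite3 as a stand-in for MySQL, and the `jobs` table, its columns, and the `claim_job` helper are all hypothetical names. The point is only that the losing contender can leave a log line instead of exiting with no trace.

```python
import logging
import sqlite3

log = logging.getLogger("claim")


def claim_job(conn: sqlite3.Connection, job_id: int, worker: str) -> bool:
    """Try to claim one pending job; return True only if this worker won."""
    # The WHERE clause makes the claim atomic: only a row that is still
    # 'pending' can be updated, so at most one contender changes it.
    cur = conn.execute(
        "UPDATE jobs SET status = 'claimed', owner = ? "
        "WHERE id = ? AND status = 'pending'",
        (worker, job_id),
    )
    conn.commit()
    if cur.rowcount == 1:
        return True
    # Losing is expected under contention, but it should leave a trace.
    log.warning("worker %s lost the claim on job %s", worker, job_id)
    return False


def demo() -> tuple:
    """Two hypothetical cron-launched workers race for the same row."""
    conn = sqlite3.connect(":memory:")  # stand-in for the MySQL database
    conn.execute(
        "CREATE TABLE jobs (id INTEGER PRIMARY KEY, status TEXT, owner TEXT)"
    )
    conn.execute("INSERT INTO jobs (id, status) VALUES (1, 'pending')")
    conn.commit()
    first = claim_job(conn, 1, "cron-a")
    second = claim_job(conn, 1, "cron-b")
    return first, second
```

Here `demo()` has the second caller lose the claim and log a warning, which is the signal that was missing during the two days of debugging.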

The "every other run" pattern came from the two crons colliding in the same minute often enough to create a visible cadence.

The fix was a one-line addition to /etc/cron.d/whatbroke: pin TZ=UTC explicitly.
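For illustration, the file might look roughly like the fragment below. Only the TZ=UTC line and the */5 schedule come from the fix described here; the user name and command path are hypothetical placeholders.

```
# /etc/cron.d/whatbroke
# Pin the timezone so it no longer depends on the system setting.
TZ=UTC

# user and command are placeholders, not the real entries
*/5 * * * * appuser /usr/local/bin/run-jobs
```

Which variable controls the schedule itself depends on the cron implementation. Cronie, for example, documents CRON_TZ for that purpose, so it is worth checking crontab(5) on the host before relying on this.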

The broader lesson: when something is silently dropping work, check whether two processes are politely fighting over the same resource. Row locks, file locks, and PID files can all cause one contender to back off quietly instead of failing loudly.
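The same idea applies to file locks. The sketch below is one way to make the loser fail loudly, assuming a Unix host (it relies on `fcntl.flock`). The `AlreadyRunning` exception, the `acquire_or_fail` helper, and the lock path are hypothetical names chosen for the example, not anything from the real setup.

```python
import fcntl
import os
import tempfile


class AlreadyRunning(RuntimeError):
    """Raised when another run already holds the lock."""


def acquire_or_fail(path: str):
    """Take an exclusive, non-blocking lock on path or raise AlreadyRunning."""
    handle = open(path, "a")  # append mode: never truncates an existing file
    try:
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError as exc:
        handle.close()
        # Surface the contention instead of quietly doing nothing.
        raise AlreadyRunning(f"lock {path} is held by another run") from exc
    return handle  # keep this open; closing it releases the lock


def demo() -> bool:
    """Return True if a second contender is rejected loudly."""
    path = os.path.join(tempfile.mkdtemp(), "job.lock")
    holder = acquire_or_fail(path)
    try:
        acquire_or_fail(path)
    except AlreadyRunning:
        return True
    finally:
        holder.close()
    return False
```

A cron wrapper built this way exits with a traceback and a non-zero status when it loses, so the collision shows up in cron's mail or logs rather than as a missing run.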
