Concurrency policies

When a task fires while a previous run is still going, RunWisp has to decide what to do. That decision is on_overlap. Three policies cover nearly every realistic case: queue, skip, and terminate.

The shape of the question

There are really two knobs:

parallelism — the maximum number of concurrent runs allowed for a task. Default 1. Most cron-style work wants 1.
on_overlap — what to do when a new run is requested and parallelism is already saturated.

If len(active) < parallelism, the new run always starts immediately, regardless of policy. on_overlap only kicks in when the task is at its concurrency limit.
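
The rule above can be sketched as a single pure function. This is a minimal illustration, not RunWisp's implementation: the `Overlap` enum, the `decide` function, and the returned action strings are all hypothetical names chosen for this example.

```python
from enum import Enum


class Overlap(Enum):
    """The three on_overlap values described in this page."""
    QUEUE = "queue"
    SKIP = "skip"
    TERMINATE = "terminate"


def decide(active: int, parallelism: int, on_overlap: Overlap) -> str:
    """Return the action for a newly requested run (action names are illustrative).

    `active` is the number of runs currently in flight for the task.
    """
    if active < parallelism:
        # A free slot exists, so the policy is never consulted.
        return "start"
    if on_overlap is Overlap.QUEUE:
        return "enqueue"
    if on_overlap is Overlap.SKIP:
        return "reject"
    return "cancel-oldest-then-start"
```

Note that the policy argument is ignored entirely on the first branch, which is the point of the paragraph above: a task below its limit behaves the same under all three policies.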

The three policies

`queue` (the default for tasks)

The new run goes onto a FIFO queue. As soon as a slot opens — an active run finishes — the next queued run starts. Order is preserved.

[tasks.process-uploads]
cron       = "*/5 * * * *"
on_overlap = "queue"          # default — could be omitted
run        = "/usr/local/bin/process"

Use when:

Each tick must eventually run; missing one means missing work.
Runs are short relative to the schedule, so the queue normally stays empty.
Runs operate on disjoint inputs (a per-tick batch, a queue of jobs).

Watch out for:

Queue growth under back-pressure. If runs reliably take longer than the schedule interval, the queue grows without bound. RunWisp does not cap the queue, so observability is your safety net. If you see the queue depth climbing, the task is over its time budget: lower the cron frequency or switch to skip or terminate.
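
To get a feel for how fast a backlog builds, the following toy simulation models a task with parallelism = 1 and on_overlap = "queue". It assumes fixed run durations and perfectly regular ticks, which real workloads will not have; `queue_depth_after` is a hypothetical helper, not part of RunWisp.

```python
def queue_depth_after(ticks: int, interval_s: int, run_s: int) -> int:
    """Return how many runs are waiting just after the last tick fires.

    A tick fires every `interval_s` seconds and every run takes `run_s`
    seconds. One slot only (parallelism = 1), FIFO queue, no cap.
    """
    waiting = 0       # queued runs that have not started yet
    busy_until = 0    # time at which the single slot becomes free
    for n in range(ticks):
        now = n * interval_s
        # Drain: queued runs start back to back as the slot frees up.
        while waiting and busy_until <= now:
            busy_until += run_s
            waiting -= 1
        if busy_until <= now:
            busy_until = now + run_s   # slot is free, start immediately
        else:
            waiting += 1               # slot is busy, enqueue this tick
    return waiting
```

With a 5-minute schedule and 1-minute runs, `queue_depth_after(12, 300, 60)` returns 0. With 7.5-minute runs on the same schedule, `queue_depth_after(12, 300, 450)` returns 4 after one hour, and the depth keeps climbing for as long as the task stays over budget.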

`skip`

The new run is rejected — it gets persisted to history with end_reason = "skipped" and exit code -1, with the message “task already running, skipping (policy: skip)” recorded against it. The in-flight run keeps going untouched.

[tasks.health-probe]
cron       = "* * * * *"
on_overlap = "skip"
run        = "curl -sf https://example.com/health"

Use when:

The work is idempotent over time — a probe, a poll, a status check. Missing a tick is fine; what matters is that the next one runs.
Stacking would be actively harmful (e.g. you only want one pg_dump process at a time, and a missed tick is preferable to two competing dumps).

The rejected runs are visible in history — that’s the prime directive. You’ll see them in the Web UI under the skipped status (a distinct end-reason, not “failed”), so a chronically-overlapping task surfaces as a pattern, not a silence.

Crucially, end_reason = "skipped" is not a failure:

notify_on_failure does not fire for skipped runs, so a * * * * * health probe with overlap doesn’t spam Slack.
The retry policy never retries a skip: the original run is still going, so a retry would only race it.
Stats counters separate it from failed/crashed/timeout, so your failure rate stays honest.
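
The three points above come down to classifying a run by its end reason rather than by its exit code. The sketch below illustrates that idea under stated assumptions: `RunRecord`, `skipped`, `is_failure`, `should_notify_on_failure`, `should_retry`, and `tally` are hypothetical names, and treating exactly failed, crashed, and timeout as failures is inferred from the list above rather than taken from RunWisp's source.

```python
from collections import Counter
from dataclasses import dataclass

# End reasons this page groups as failures (assumed to be the complete set).
FAILURE_REASONS = frozenset({"failed", "crashed", "timeout"})


@dataclass(frozen=True)
class RunRecord:
    task: str
    end_reason: str
    exit_code: int
    message: str = ""


def skipped(task: str) -> RunRecord:
    """Build the history record for a run rejected by the skip policy."""
    return RunRecord(task, "skipped", -1,
                     "task already running, skipping (policy: skip)")


def is_failure(run: RunRecord) -> bool:
    # Classify on end_reason: a skipped run has exit code -1 but is not a failure.
    return run.end_reason in FAILURE_REASONS


def should_notify_on_failure(run: RunRecord) -> bool:
    return is_failure(run)


def should_retry(run: RunRecord) -> bool:
    return is_failure(run)


def tally(runs) -> Counter:
    """Count runs per bucket, keeping skips separate from failures."""
    return Counter(
        "failure" if is_failure(r) else r.end_reason for r in runs
    )
```

Keying every decision on `end_reason` means the non-zero exit code of a skipped run cannot leak into alerting or retry logic by accident.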

`terminate`

The oldest active run is cancelled: RunWisp cancels its context, which typically translates to SIGTERM followed by SIGKILL after the grace period. Once it exits, the new run starts.

[tasks.deploy-hook]
on_overlap = "terminate"
run        = "/usr/local/bin/deploy.sh"
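
The SIGTERM-then-SIGKILL escalation is a common pattern and can be sketched with the standard library alone. This is an illustration of the pattern, not RunWisp's code: `stop_with_grace` and its `grace_s` parameter are hypothetical names, and the signal behaviour noted in the comments applies to POSIX systems.

```python
import subprocess
import sys


def stop_with_grace(proc: subprocess.Popen, grace_s: float) -> int:
    """Ask `proc` to exit, then force it if it has not exited in time.

    Returns the process's exit status once it has terminated.
    """
    proc.terminate()                     # SIGTERM on POSIX
    try:
        return proc.wait(timeout=grace_s)
    except subprocess.TimeoutExpired:
        proc.kill()                      # SIGKILL on POSIX
        return proc.wait()
```

A caller would start the replacement run only after `stop_with_grace` returns, which mirrors the "once it exits, the new run starts" ordering described above.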

Use when:

Latest wins — for a deploy hook fired twice in quick succession, the second invocation is the one you actually want.
Long-running work that becomes obsolete the moment a new request arrives (rebuild a search index, regenerate a cached report).

Watch out for:

Make sure the script handles SIGTERM cleanly. A pg_dump that’s been killed mid-write leaves a corrupt file. Use a temp file + atomic rename so an interrupted run doesn’t poison the next one.
The cancelled run shows up in history with end_reason = "stopped", so you can audit how often the policy actually fires.
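
The temp file plus atomic rename advice can be sketched as follows. This is one possible shape, assuming the task script is written in Python and that the temp file and the target live on the same filesystem; `atomic_write` and `exit_on_sigterm` are hypothetical helper names.

```python
import os
import signal
import sys
import tempfile


def exit_on_sigterm() -> None:
    """Turn SIGTERM into SystemExit so except/finally cleanup gets a chance to run.

    Must be called from the main thread.
    """
    signal.signal(signal.SIGTERM, lambda signum, frame: sys.exit(128 + signum))


def atomic_write(path: str, chunks) -> None:
    """Write an iterable of bytes to `path` without ever exposing a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the rename stays on one filesystem.
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            for chunk in chunks:
                f.write(chunk)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)   # atomic on POSIX when both paths share a filesystem
    except BaseException:
        # Interrupted or failed: remove the partial temp file, leave `path` untouched.
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise
```

If the process is killed outright with SIGKILL, a stale `.tmp-` file may be left behind, but the target path still holds either the previous complete file or the new complete file, never a mix.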

Decision matrix

| Situation | Policy |
| --- | --- |
| Each tick processes a slice of work; ordering matters | `queue` |
| Idempotent probe; missing a tick is fine | `skip` |
| Work supersedes prior work; latest wins | `terminate` |
| Two things compete for a single resource | `skip` |
| Backlog of small jobs that should drain | `queue` |
| Long rebuild that re-runs from scratch each time | `terminate` |

`parallelism > 1`

Setting parallelism = N lets up to N runs of the same task execute at once. The on_overlap policy doesn’t fire until you have N active runs.

[tasks.thumbnail-render]
cron        = "* * * * *"
parallelism = 4
on_overlap  = "queue"
run         = "/usr/local/bin/render"

Now the queue policy only engages once 4 renders are in flight. With parallelism = 4 and on_overlap = "skip", you'd get a hard cap of 4 concurrent renders, and any tick that fires while all 4 slots are busy is rejected.
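
The slot accounting can be modelled in a few lines. The class below is a toy model for illustration only: `TaskSlots`, its method names, and the string run ids are hypothetical, and it covers just the queue and skip policies.

```python
from collections import deque


class TaskSlots:
    """Toy model of one task's concurrency slots; run ids are plain strings."""

    def __init__(self, parallelism: int, on_overlap: str):
        self.parallelism = parallelism
        self.on_overlap = on_overlap   # "queue" or "skip" in this sketch
        self.active = []               # runs currently in flight, oldest first
        self.queued = deque()          # FIFO of runs waiting for a slot
        self.skipped = []              # runs rejected by the skip policy

    def request(self, run_id: str) -> str:
        if len(self.active) < self.parallelism:
            self.active.append(run_id)
            return "started"
        if self.on_overlap == "skip":
            self.skipped.append(run_id)
            return "skipped"
        self.queued.append(run_id)
        return "queued"

    def finish(self, run_id: str) -> None:
        self.active.remove(run_id)
        # A slot opened: start the oldest queued run, if there is one.
        if self.queued:
            self.active.append(self.queued.popleft())
```

With `TaskSlots(4, "skip")`, the first four requests return "started" and the fifth returns "skipped". With `TaskSlots(4, "queue")`, the fifth returns "queued" and starts as soon as any of the four active runs calls `finish`.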

For most cron-style work parallelism = 1 is right. Bump it only when:

The runs are genuinely independent (no shared state, no shared resource).
You have a bursty schedule (* * * * * against work that usually finishes in seconds but occasionally spikes).

For long-running parallel workers, prefer [services.<name>] with instances = N — the supervision model is built for it.

Manual triggers, retries, and the queue

A few interactions worth knowing:

A manual trigger (CLI, REST, UI) goes through the same evaluation. Trigger a skip task while it’s running and your trigger gets the skipped/failed result back, with a clear “task already running” reason.
Retries don’t count against parallelism. Each retry is a fresh run, and the previous run is already terminal by the time the retry fires — there’s no overlap.
The queue drains in FIFO order, irrespective of trigger source. Cron and manual triggers compete fairly.

Single-writer guarantee

Exactly one goroutine inside the daemon owns each task’s run lifecycle. Other code observes state through the in-memory event bus (internal/events/) or by reading the database — never by reaching into the run manager. That’s why parallelism > 1 is safe: the manager serialises the policy decision, even though the runs themselves execute in parallel.
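
The single-writer idea is language-independent, so here is a rough analogue using a Python thread in place of a goroutine. It only illustrates the pattern and makes no claim about RunWisp's internals: `run_owner`, the message tuples, and the `decisions` list are hypothetical.

```python
import queue
import threading


def run_owner(requests: queue.Queue, parallelism: int, decisions: list) -> None:
    """Sole owner of `active`; every other thread talks to it through `requests`."""
    active = 0
    while True:
        msg = requests.get()
        if msg is None:                  # shutdown sentinel
            return
        kind, run_id = msg
        if kind == "trigger":
            # Decisions are made one message at a time, so no lock is needed.
            if active < parallelism:
                active += 1
                decisions.append((run_id, "start"))
            else:
                decisions.append((run_id, "overlap"))
        elif kind == "finished":
            active -= 1


def demo(parallelism: int = 2, triggers: int = 8) -> list:
    """Fire `triggers` concurrent requests at one owner and return its decisions."""
    requests, decisions = queue.Queue(), []
    owner = threading.Thread(target=run_owner, args=(requests, parallelism, decisions))
    owner.start()
    senders = [
        threading.Thread(target=requests.put, args=(("trigger", "run-%d" % i),))
        for i in range(triggers)
    ]
    for t in senders:
        t.start()
    for t in senders:
        t.join()
    requests.put(None)
    owner.join()
    return decisions
```

However the sender threads interleave, exactly `parallelism` of the triggers are started, because only the owner ever reads or writes the counter.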

Services and `on_overlap`

Services default to on_overlap = "skip" and are usually fine with that default. The service supervisor keeps the configured number of replicas (instances = N) alive, each in its own slot, so overlap doesn't really happen unless you trigger a service manually while it's already running, which skip correctly refuses.

Where to next

How scheduling works — how cron firings turn into runs that the policy evaluates.
Retries & timeouts — how a failing run gets retried, and why retries don’t conflict with on_overlap.
[tasks.*] reference — the on_overlap and parallelism fields in the schema.
