# Logs & retention
Every run’s stdout and stderr are captured to a per-run log file under the data directory. SQLite stores run metadata — exit code, duration, timestamps — but not the log bodies. The daemon is not a log aggregator. It’s a log archive for “what did this task print on 2026-04-12 at 03:15?”.
## Where logs live on disk

```
{data_dir}/logs/{task-name}/{YYYYMMDD}_{HHMMSS}_{run-id-suffix}.log
{data_dir}/logs/{task-name}/{YYYYMMDD}_{HHMMSS}_{run-id-suffix}.log.idx
{data_dir}/logs/{task-name}/{YYYYMMDD}_{HHMMSS}_{run-id-suffix}.log.tidx
{data_dir}/logs/{task-name}/{YYYYMMDD}_{HHMMSS}_{run-id-suffix}.log.meta
```

- One file per run, not one per task. Each run gets its own isolated log, named after the run’s start time and ULID suffix so files are sortable and unique even at sub-second cadences.
- The `.idx` and `.tidx` sidecars are byte and timestamp indices used by the Web UI to scrub long logs efficiently.
- The `.meta` sidecar tracks rotation accounting and a `finalized` flag written when the run cleanly ends.
- When a run is deleted by retention, the daemon removes the log, its sidecars, rotation artifacts (`.prev`, `.idx.prev`), and any newly empty parent directory.
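The naming scheme can be sketched as a small helper. This is a hypothetical Python illustration, not the daemon’s actual code — the suffix length and the `data_dir` value are assumptions:

```python
from datetime import datetime, timezone
from pathlib import Path

def run_log_path(data_dir: str, task: str, started: datetime, run_id: str) -> Path:
    # {YYYYMMDD}_{HHMMSS}_{run-id-suffix}.log — sortable by start time,
    # unique via the ULID suffix even for sub-second start cadences.
    stamp = started.strftime("%Y%m%d_%H%M%S")
    suffix = run_id[-6:]  # hypothetical suffix length
    return Path(data_dir) / "logs" / task / f"{stamp}_{suffix}.log"

p = run_log_path("/var/lib/runwisp", "bulky-job",
                 datetime(2026, 4, 12, 3, 15, 0, tzinfo=timezone.utc),
                 "01HZX3Q8K9V2N5T7W0ABCXYZ12")
print(p)  # /var/lib/runwisp/logs/bulky-job/20260412_031500_CXYZ12.log
```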
## log_max_size

```toml
[tasks.bulky-job]
cron = "0 4 * * *"
run = "/usr/local/bin/bulky.sh"
log_max_size = "50MB"
```

- Default: `100MB`.
- Scope: per run. A task that runs hourly accumulates one fresh capped file per run, not one shared file growing forever.
- Units: `b`, `kb`, `mb`, `gb`, `tb` (case-insensitive). Bare numbers are parsed as bytes. `"100MB"`, `"1.5gb"`, `"4096"` are all valid.
- `log_max_size = 0` means unbounded: no cap is enforced and no lines are dropped. Use this only when you’ve thought about your disk.
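A parser matching those rules might look like this sketch. It assumes binary units (`1kb` = 1024 bytes) — whether the daemon uses binary or decimal multipliers is not stated here:

```python
import re

_UNITS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4}

def parse_size(value) -> int:
    # Bare numbers are bytes; unit suffixes are case-insensitive;
    # fractional values like "1.5gb" are allowed.
    if isinstance(value, (int, float)):
        return int(value)
    m = re.fullmatch(r"\s*([0-9]*\.?[0-9]+)\s*([a-zA-Z]*)\s*", value)
    if not m:
        raise ValueError(f"bad size: {value!r}")
    num, unit = float(m.group(1)), m.group(2).lower() or "b"
    if unit not in _UNITS:
        raise ValueError(f"unknown unit: {unit!r}")
    return int(num * _UNITS[unit])

print(parse_size("100MB"))   # 104857600
print(parse_size("1.5gb"))   # 1610612736
print(parse_size("4096"))    # 4096
```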
## log_on_full

What happens when a run’s output hits `log_max_size`:
| Value | Behaviour |
|---|---|
| `drop_old` | Default. Rotates the current log to `.prev`, keeps writing fresh output. Earlier output is lost. |
| `drop_new` | Stops accepting new lines; the process keeps running. A system line records the truncation. |
| `kill_task` | Cancels the run’s context, terminating the process. The run records as `log_overflow` (exit code from the kill) — a dedicated end reason that’s still treated as a failure for retries and notifications, so the cause is visible without inspecting logs. |
Whichever policy fires, the daemon writes a synthetic line at the truncation point, so a reader scrubbing the log can see exactly when the limit was hit. There is no silent drop.
`log_on_full` also controls what happens when `[storage] min_free_space`
trips during a run: `kill_task` cancels the run on disk pressure;
`drop_new` and `drop_old` silently stop accepting lines (but the daemon
always raises a `log.disk_pressure` notification so the operator
discovers the silenced output). See
storage configuration.
`drop_old` is the default because it preserves the end of the log —
which is usually where the failure is. `drop_new` is right when the
start is the interesting part (a daemon’s startup banner, a long
batch’s preamble) and the rest is repetitive noise.
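The two buffer-affecting policies can be sketched with in-memory lists standing in for the `.log` and `.log.prev` files. This is an illustrative model only — the marker text is invented, since the real synthetic line’s wording isn’t specified here:

```python
class CappedLog:
    """Sketch of log_max_size + log_on_full for drop_old / drop_new."""
    def __init__(self, max_size: int, on_full: str = "drop_old"):
        self.max_size, self.on_full = max_size, on_full
        self.current, self.prev = [], []   # stand-ins for .log / .log.prev
        self.size, self.truncated = 0, False

    def write(self, line: str) -> None:
        if self.truncated and self.on_full == "drop_new":
            return  # new lines dropped after the marker; process keeps running
        if self.size + len(line) > self.max_size:
            if self.on_full == "drop_old":
                # rotate: current becomes .prev, start fresh, note the rotation
                self.prev, self.current, self.size = self.current, [], 0
                self.current.append("[runwisp] log rotated: size limit reached")
            elif self.on_full == "drop_new":
                self.current.append("[runwisp] output truncated: size limit reached")
                self.truncated = True
                return
        self.current.append(line)
        self.size += len(line)

log = CappedLog(12, "drop_old")
log.write("abcdef")
log.write("ghijklm")   # overflows: "abcdef" rotates to .prev, tail preserved
```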
## Retention: keep_runs and keep_for

Retention controls how long old run rows and their log files stick around.

```toml
[tasks.metrics]
cron = "* * * * *"
run = "/usr/local/bin/metrics.sh"
keep_runs = 500
keep_for = "7d"
```

- `keep_runs` — keep the N most recent runs for this task. Older runs are deleted (row + log files).
- `keep_for` — delete runs whose `created_at` is older than the given duration. Accepts extended units, including days and weeks: `"7d"`, `"2w"`, `"36h"`, `"30m"`.
- Both at once: both criteria contribute to deletion. A run is removed if either rule says so. The stricter rule wins in practice — set both if you want a hard floor and a hard ceiling.
- Per task: each task’s retention is evaluated independently using that task’s own settings (or the inherited `[defaults]`).
Cleanup runs in the background — by default once an hour — so you may see slightly more than `keep_runs` rows briefly between sweeps. That’s fine. In-flight runs are never deleted.
When retention triggers, both the SQLite row and the log file (with all its sidecars) are removed. There’s no orphaning: a row never points at a missing log, and a log file never lingers without a row.
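The either-rule-wins semantics amount to a union of two doomed sets. A rough sketch — the tuple shape and field names are hypothetical, not the real schema:

```python
from datetime import datetime, timedelta, timezone

def runs_to_delete(runs, keep_runs=None, keep_for=None, now=None):
    """A run is removed if EITHER rule says so.
    `runs` is (run_id, created_at, state), newest first."""
    now = now or datetime.now(timezone.utc)
    doomed = set()
    for i, (run_id, created_at, state) in enumerate(runs):
        if state in ("running", "pending"):
            continue  # in-flight runs are never deleted
        if keep_runs is not None and i >= keep_runs:
            doomed.add(run_id)  # beyond the N most recent
        if keep_for is not None and now - created_at > keep_for:
            doomed.add(run_id)  # older than the cutoff
    return doomed

now = datetime(2026, 4, 12, tzinfo=timezone.utc)
runs = [
    ("r3", now - timedelta(hours=1), "success"),
    ("r2", now - timedelta(days=3), "running"),   # in flight: always kept
    ("r1", now - timedelta(days=10), "failed"),   # too old AND beyond keep_runs
]
doomed = runs_to_delete(runs, keep_runs=2, keep_for=timedelta(days=7), now=now)
print(doomed)  # {'r1'}
```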
## Live streaming

The Web UI’s run page tails logs in real time. The endpoint:

```
GET /api/tasks/{task-name}/runs/{run-id}/log/stream
```

It’s a Server-Sent Events stream pushed from the daemon’s in-memory event bus, not a polling tail. New lines arrive within milliseconds of being written.
A few details worth knowing:
- The stream first replays the on-disk history, then transitions to live events. You don’t miss the start.
- The event-bus buffer is bounded (4096 events). If a producer outpaces a slow client, the oldest queued lines are dropped and a `LogDroppedEvent` is sent so the client can show “N lines dropped”.
- Connections idle out after 10 minutes. The browser auto-resumes via the `Last-Event-ID` header.
- The bus is in-process and best-effort. It is not a durability mechanism — the source of truth for “what did this run print” is always the on-disk file.
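On the wire this is ordinary Server-Sent Events framing; a minimal parser sketch follows. The `log_dropped` event name and the payloads are invented for illustration — only `LogDroppedEvent` and `Last-Event-ID` are named above:

```python
def parse_sse(stream_text):
    """Yield (event, id, data) per blank-line-terminated SSE block.
    Field names (event:, id:, data:) follow the SSE spec."""
    event, eid, data = "message", None, []
    for line in stream_text.splitlines():
        if not line:  # blank line ends an event block
            if data:
                yield event, eid, "\n".join(data)
            event, eid, data = "message", None, []
        elif line.startswith("event:"):
            event = line[6:].strip()
        elif line.startswith("id:"):
            eid = line[3:].strip()  # client echoes this back as Last-Event-ID
        elif line.startswith("data:"):
            data.append(line[5:].strip())

raw = ("id: 41\ndata: building index...\n\n"
       "event: log_dropped\nid: 42\ndata: 17\n\n")
events = list(parse_sse(raw))
print(events)  # [('message', '41', 'building index...'), ('log_dropped', '42', '17')]
```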
## Crash safety

Log writes flush and close cleanly on `Close()`, which fires when the run ends or on graceful daemon shutdown. The `.meta` sidecar’s `finalized = true` flag is written at that point.
If the daemon is killed mid-write (SIGKILL, power loss):
- The log file is not truncated — partial writes survive on disk. Readers tolerate a partial last line.
- On next startup, any run that was in `running` or `pending` state in SQLite is reconciled to `crashed` with exit code `-2`. The log file for that run is left as-is — it’s the last thing that process said before the daemon disappeared, and it’s the most useful artifact for debugging.
Combined with the boot semantics for in-flight runs, this means: every run row in your history reaches a terminal state, and every terminal run has a log file you can read.
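The boot-time reconciliation is essentially one UPDATE over non-terminal rows. A sketch against a throwaway schema — the table and column names here are assumptions, not the daemon’s real schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE runs (id TEXT, state TEXT, exit_code INTEGER)")
conn.executemany("INSERT INTO runs VALUES (?, ?, ?)", [
    ("r1", "success", 0),
    ("r2", "running", None),   # daemon died while this was in flight
    ("r3", "pending", None),
])
# Anything still running/pending at boot becomes crashed with exit code -2.
conn.execute(
    "UPDATE runs SET state = 'crashed', exit_code = -2 "
    "WHERE state IN ('running', 'pending')"
)
rows = conn.execute("SELECT id, state, exit_code FROM runs ORDER BY id").fetchall()
print(rows)  # [('r1', 'success', 0), ('r2', 'crashed', -2), ('r3', 'crashed', -2)]
```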
## What the daemon deliberately doesn’t do

- No log aggregation. RunWisp captures logs for the runs it supervises. It does not collect, index, or search across hosts. If you need that, ship the per-run files (or your task’s stdout) to Loki / ELK / CloudWatch / wherever — that’s their job.
- No remote sinks built in. No Fluent Bit, no syslog forwarder, no S3 upload. The TOML schema doesn’t have a place to configure them.
- No log-content notifications. Notifications fire on run lifecycle events (failed, timeout, crashed, etc.), not on patterns matched against captured output. If you want “alert when stderr contains ERROR”, grep in your script and exit non-zero.
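The grep-and-exit-non-zero pattern, sketched as a wrapper script — a hypothetical example, not a RunWisp feature:

```python
import subprocess
import sys

def run_and_flag(cmd) -> int:
    """Run cmd, pass its output through, and return non-zero if it
    failed OR printed ERROR to stderr — so the daemon's normal
    failure notification fires on the pattern."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    sys.stdout.write(proc.stdout)
    sys.stderr.write(proc.stderr)
    if proc.returncode == 0 and "ERROR" in proc.stderr:
        return 1
    return proc.returncode

# Child exits 0 but writes ERROR to stderr, so the wrapper flags it.
code = run_and_flag([sys.executable, "-c",
                     "import sys; print('ok'); print('ERROR: oops', file=sys.stderr)"])
print(code)  # 1
```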
## Where to next

- Notifications model — how a failed run ends up in your inbox or chat.
- How scheduling works — what creates the run rows that retention later trims.
- `[tasks.*]` reference — `log_max_size`, `log_on_full`, `keep_runs`, `keep_for`.