The Dashboard

The dashboard is what you arrive at after the QuickStart: the running container’s control surface, in your browser. It is where you run pipeline steps, inspect outputs, attest that you have looked at them, push results to GitHub or Overleaf, and (optionally) let an AI coding agent work alongside you.

This page is a tour of every panel.

Dashboard overview

Layout

The dashboard has a fixed layout:

  • Top toolbar — container name, active workflow (if any), sync buttons, and the Admin menu.

  • Left panel — the Repos panel for sandbox / toolkit projects, or the Pipeline panel for workflow projects.

  • Top panels — Two “Viewing Windows” to display plots and files.

  • Bottom panel(s) - Terminal window(s)/tab(s) for work inside the container.

Terminal

Click in a terminal section to access a shell session inside the container. The terminal runs in your browser over WebSocket and behaves like a standard terminal emulator. Multiple sessions can run concurrently — open as many as you like.

Terminal

If Claude Code is enabled for the project, run claude from a terminal session to start an in-container coding agent. The agent can in turn ask the dashboard to run steps, generate tests, push to GitHub, and so on — see Agent actions below.

Viewing Window

The Viewing Windows above the terminal(s) display plots and ASCII text files in the container. Supported formats include PDF, PNG, SVG, and JPG. In Workflow mode, the log is displayed in a window.

Figure viewer

Repos panel

The Repos panel replaces the Pipeline panel for sandbox and toolkit projects (the templates without a workflow.json). It lists the git repositories inside the container with their branch, dirty status, and push controls.

Repos panel

When you first open a container, repositories already present in the workspace (cloned by the entrypoint from vaibify.yml) are tracked automatically. If you clone additional repositories from the terminal, vaibify detects them within a few seconds and prompts you to Track or Ignore them.

Dirty detection reflects whether you have made source-level changes. Build artifacts that package managers and compilers leave behind (Python __pycache__/, C *.o, LaTeX *.aux, *.egg-info/, and so on) are filtered out, so a freshly installed repository shows as clean unless you have edited its source files.

Push commits and pushes whatever you have staged in the terminal — git add, git commit, git push rolled into a single button. A secondary Push files… option in the gear menu opens a file picker for selecting specific files to commit.

Pipeline panel

The pipeline panel lists the steps defined in your workflow. Each step shows its name, working directory, and current status.

Pipeline panel

From the top of the panel you can:

  • Run All — execute every enabled step in order.

  • Run Selected — execute only the checked steps.

  • Run From Step — execute from a chosen step to the end.

  • Enable / Disable — toggle individual steps without removing them.

  • Reorder — drag steps to change the execution order.

Adding a step

Click Add Step to open the step editor. Fill in the step name, working directory, and the commands to run. The editor separates data commands (heavy computation) from plot commands (figure generation), so you can re-run just the plotting after tweaking a script without re-running the simulation.

Interactive steps

Mark a step as interactive and the pipeline pauses there, waiting for you to confirm before continuing. Useful when you want to eyeball an intermediate result, adjust a parameter, or hand control to an agent for a specific stage.

Step verification status

Vaibify’s core job is to track what has happened on disk and flag any drift between a step’s current filesystem state and the last time it was validated. Two visual indicators communicate this state: a coloured status dot on the right of each step row and an optional pencil icon (✏) next to the step name.

Status dot

Each step row ends with one of three indicators:

  • Nothing — the step has not yet produced any output, so there is nothing to verify.

  • Orange dot — the step is partially verified. Some of user attestation, unit tests, and dependency analysis pass, or one or more on-disk changes have been detected since the last full pass.

  • Accent-coloured badge (the vaibify favicon) — the step is fully verified: user attestation, unit tests, and dependencies all pass and no on-disk drift has been detected.

When every enabled step is fully verified, the workflow name and the “Workflow” label in the top toolbar shift to the accent colour, giving an at-a-glance sign that the whole pipeline is in a trusted state.

Pencil icon

A pencil next to a step’s name means the step’s on-disk state has moved out of sync with its last validation. The pencil is derived entirely from filesystem timestamps, so it catches changes vaibify did not directly observe — a git pull, a manual pytest at the terminal, an edit from a Claude Code session.

The pencil lights up when any of the following are true:

Drift

Meaning

A data script is newer than the test marker

Tests need to be rerun against the updated script.

A data file is newer than the test marker

Tests need to be rerun against the updated output.

A data script or data file is newer than the last user attestation

Re-inspect the outputs and re-attest.

A plot script or plot file is newer than the last user attestation

Re-inspect the figures and re-attest.

Expanding the step shows a human-readable list of the specific files causing each condition.

Clearing a pencil

  • Re-run tests for the step (from the dashboard or a terminal). Successful completion clears any “newer than the test marker” drift.

  • Re-attest by clicking the verification badge after inspecting the outputs. This clears any “newer than your last attestation” drift.

Expanded-step timestamps

Clicking a step expands its detail view. Four timestamps appear; three are read directly from disk and one is your own attestation.

Row

What it records

Last tests

When pytest for this step last finished, regardless of who ran it.

Data files last modified

When any of the step’s data outputs was last written.

Plot files last modified

When any of the step’s plot outputs was last written.

Last verified

When you last clicked the verification badge to attest the outputs.

Sync status

Symbols before files indicate the current status of a file’s publication on a remote repository.

Sync panel

  • Push to GitHub — commit and push the project repository to its configured remote.

  • Push to Overleaf — sync figures and any selected files to the configured Overleaf project.

  • Archive to Zenodo — upload outputs and receive a DOI.

  • Generate LaTeX — produce ready-to-paste \includegraphics commands for the current figures.

Credentials for these services are resolved from your host’s keychain at request time. They are never written into the container or into vaibify.yml. See Connecting external services for the per-service setup.

Remote-sync panel

The sync panel surfaces one row per configured remote (GitHub, Overleaf, Zenodo). Each row shows the same four pieces of information:

Field

Meaning

Status pill

Green / yellow / red, semantics below.

Summary

<matching>/<total> files match SHA-256, optionally listing the first diverged path.

Last verified

Age of the most recent authoritative SHA-256 verify (e.g. “12m ago”). Empty when the remote has never been authoritatively verified.

Re-verify

A button that runs an authoritative SHA-256 verify against the remote’s current bytes (downloads the files, recomputes hashes, compares against MANIFEST.sha256).

Pill semantics:

  • Green — the most recent SHA-256 authoritative verify reported every file matching the manifest.

  • Yellow — drift detected since the last authoritative verify (the remote’s cheap-poll change-detection layer fired). The remote may or may not actually be out of sync; click Re-verify to find out.

  • Red — an authoritative SHA-256 verify confirmed at least one file’s hash does not match MANIFEST.sha256.

A scheduled background loop re-verifies every configured remote on a configurable cadence (default 6 hours), so the panel reflects recently-validated state even if the user never clicks Re-verify.

Aggregate consistency banner

Above the per-remote rows, a single line summarises the union of all remote states. The three forms produced by scriptSyncManager.js are:

  • Remote consistency: not yet verified — no remote has been verified yet (e.g. immediately after opening a workflow for the first time).

  • Remote consistency: all <K> configured remote(s) in sync — every configured remote authoritatively matched on its last verify; <K> is the count of remotes that reported a verified status. The trailing noun is singularised to remote when <K> is 1, otherwise remotes (e.g. Remote consistency: all 1 configured remote in sync vs. Remote consistency: all 3 configured remotes in sync).

  • Remote consistency: <N> file(s) drifted across <M> of <K> remote(s) — at least one remote reports drift; <N> is the total diverged-file count across all remotes, <M> is the number of remotes with at least one diverged file, and <K> is the count of configured-and-verified remotes. Both the file noun and the remote noun singularise independently when their count is 1 (e.g. Remote consistency: 1 file drifted across 1 of 1 remote vs. Remote consistency: 5 files drifted across 2 of 3 remotes).

Hash-aware step badges

Per-step status indicators distinguish content drift from cosmetic mtime drift. After a fresh clone, file mtimes are reset to checkout time, which historically caused every step badge to render as “stale” even though the bytes had not changed. The dashboard now consults the per-file SHA-256 recorded in the test marker before declaring a step stale:

Indicator

Meaning

Filled badge (the favicon glyph)

Fully verified: tests pass, attestation current, content matches.

Orange dot

Partially verified — at least one of tests, attestation, or dependency analysis is stale.

Hollow circle (◌)

Mtime drifted since the test marker, but the file’s SHA-256 still matches the recorded hash. The step is treated as content-clean; the indicator just signals that a tool touched the file without changing its bytes.

Pencil (✏)

True content drift — the file’s hash differs from the recorded hash. Re-run tests or re-attest.

The hollow-circle case replaces a noisy false-positive class: a post-clone or post-touch mtime bump no longer inflates verification state, so the badges reflect what is actually on disk.

Verification

Steps are verified three ways: 1) unit tests, 2) dependnency checks (if applicable), and 3) user attestation. These three controls are displayed in each Step’s expanded view.

The Unit Tests text is expandable to show detailed information about the step’s unit tests, including generating and running them. Three categories of unit tests exist:

  1. Integrity tests (test_integrity.py) — output files exist, are non-empty, load in their expected format, have the correct shape, and contain no NaN or infinity values.

  2. Qualitative tests (test_qualitative.py) — column names, JSON keys, parameter names, and other categorical content match expectations.

  3. Quantitative tests (test_quantitative.py plus quantitative_standards.json) — numerical output values match stored benchmarks at full double precision, with configurable relative and absolute tolerances.

Test generation is deterministic by default: a Python introspection script runs inside the container, reads each data file, and writes the tests mechanically. No language model is involved on the default path. An LLM-based path is available as a fallback for formats the introspection script cannot read.

See Supported Data Formats for the full list of file types the test generator can read.

vaibify monitors the steps for dependency violations, such as a dependent step not being fully verified or a dependent file being created after a subsequent step was marked verified.

Finally, a button for the (human) records that user’s assessment.

Agent actions

When an AI coding agent is running inside the container (typically Claude Code, started by typing claude in the terminal), it can ask the dashboard to perform named operations on the user’s behalf. These agent actions are the bridge between the agent’s text-only world and the dashboard’s verified state. This scheme enforces deterministic behavior.

Every state-changing operation in the dashboard — running a step, generating tests, pushing to GitHub, archiving to Zenodo — is registered in a single catalog. Each action carries a stable name, the arguments it accepts, and the verification it triggers when it finishes. The agent never invents an action; it picks one from the catalog or it falls back to plain shell commands.

From inside a container terminal, you can list the available actions:

vaibify-do --list

Or describe a single action’s arguments:

vaibify-do --describe run-step

Or invoke one directly (the agent does this for you):

vaibify-do run-step A03

Step labels (A03, I01) come from the pipeline panel — labels are per-type sequential, so A03 is the third automated step and I01 is the first interactive step. The dashboard updates as the action runs; if it produces new files, the affected step’s pencil and status dot react automatically.

Why this matters

Without the catalog, an agent that wants to “run unit tests on step A03” would have to guess at HTTP endpoints or shell out blindly, and the dashboard would silently drift out of sync with what the agent actually did. The catalog makes the agent’s intentions explicit and verifiable: every action it takes is one the user could have taken through the dashboard, and every action triggers the same verification state machine.

Hub mode

The hub is a separate page from the dashboard, but you visit it every time you launch vaibify with no subcommand:

vaibify

The hub lists every container vaibify knows about on the host, with quick-launch buttons to start, stop, or open the dashboard for any of them. From here you can also create a new container (the setup wizard from the QuickStart) or add an existing project that already has a vaibify.yml.

One session per container

Each container managed by vaibify can be open in only one dashboard at a time. This prevents two browsers from issuing conflicting commands to the same container. When a session is already attached, the container appears greyed out on other hubs.

The New vaibify window icon (⧉) in the hub, the workflow picker, and the dashboard’s Admin menu launches a new vaibify session in a new browser tab — useful for working on two projects side by side.