deterministic workflow builder

Deterministic, auditable workflow execution for Codex and Claude — typed manifests, approval gates, audit trails, and live HTML visualization.

2 · by @googlarz · 28d ago · MIT · GitHub

Compatibility

Claude Code, Codex

Description

Deterministic Workflow Builder

Build deterministic, auditable, repeatable workflows for Codex and Claude.

This skill turns vague "make it deterministic" requests into a workflow package with:

  • a typed workflow.json manifest
  • explicit steps/*.sh
  • approval gates
  • machine-checkable contracts
  • replayable audits
  • rollback hooks
  • doctor and repair flows
  • bounded AI sidecars that stay advisory
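
To make the idea of a typed manifest concrete, here is a minimal sketch of what such a file and a consistency check could look like. It is not the skill's real schema: apart from `type` and `success_gate`, which this README mentions, every field name (`schema_version`, `depends_on`, `script`, the shape of the gate) and the helper `check_manifest` are assumptions made for illustration. See examples/hello-world/ for the actual format.

```python
import json

# Hypothetical manifest shape. Only "type" and "success_gate" appear in the
# README; all other field names here are illustrative assumptions.
manifest = {
    "schema_version": 4,
    "name": "demo-flow",
    "steps": [
        {
            "id": "01-collect",
            "type": "shell",
            "script": "steps/01-collect.sh",
            "depends_on": [],
            "success_gate": {"file_exists": "artifacts/01-collect.done"},
        },
        {
            "id": "02-review",
            "type": "approval",
            "depends_on": ["01-collect"],
            "success_gate": {"approved": True},
        },
    ],
}


def check_manifest(doc: dict) -> list[str]:
    """Return a list of problems; an empty list means the sketch is consistent."""
    problems = []
    seen = set()
    for step in doc.get("steps", []):
        step_id = step.get("id")
        if not step_id:
            problems.append("step without id")
            continue
        if step_id in seen:
            problems.append(f"duplicate step id: {step_id}")
        if "success_gate" not in step:
            problems.append(f"{step_id}: missing success_gate")
        # In this sketch, dependencies must point at steps declared earlier.
        for dep in step.get("depends_on", []):
            if dep not in seen:
                problems.append(f"{step_id}: unknown dependency {dep}")
        seen.add(step_id)
    return problems


print(json.dumps(manifest, indent=2))
print(check_manifest(manifest))
```

Checking the manifest before anything runs is the point of the design: mistakes surface as a list of named problems rather than as a failure halfway through execution.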

Why

Most agent workflows fail in production because runtime behavior is too implicit:

  • the model decides at execution time
  • approvals are informal
  • outputs are not contract-checked
  • retries and rollback are ad hoc
  • state corruption is unrecoverable

This skill pushes in the opposite direction: deterministic runtime, explicit state, and observable evidence.

Prerequisites

  • Python 3.9+ and Bash (macOS/Linux)
  • Claude CLI or Anthropic SDK — only needed for type: "claude" steps and --generate / compile_workflow.py; pure shell workflows work with no AI tooling at all

Try it now

The fastest way to see it work is to run the included example:

git clone https://github.com/googlarz/deterministic-workflow-builder
cd deterministic-workflow-builder

# Preview the two-step hello-world workflow
python3 scripts/run_workflow.py examples/hello-world --list
python3 scripts/run_workflow.py examples/hello-world --dry-run

# Run it
python3 scripts/run_workflow.py examples/hello-world

Output:

  → running  01-greet
  ✓ complete 01-greet  (0.0s)
  → running  02-verify
  ✓ complete 02-verify  (0.0s)

See examples/hello-world/ for the full workflow with annotated workflow.json, step scripts, and rollback hooks.

What It Builds

                user request
                     |
                     v
          +----------------------+
          | compile_workflow.py  |
          +----------------------+
                     |
                     v
    +----------------------------------------+
    | workflow.json + steps/*.sh + prompts   |
    +----------------------------------------+
                     |
          +----------+----------+
          |                     |
          v                     v
   verify_workflow.py     security_audit.py
          |                     |
          +----------+----------+
                     |
                     v
             run_workflow.sh
                     |
   +-----------------+------------------+
   |                 |                  |
   v                 v                  v
 approvals      DAG execution      rollback/repair
   |                 |                  |
   +-----------------+------------------+
                     |
                     v
             audit runs + replay
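
The rollback branch of the diagram can be illustrated with a small sketch. This is an assumption about the general pattern, not the behavior of the skill's runner: the function `run_with_rollback`, the `(id, run, rollback)` tuple shape, and the choice to undo only the steps that had already completed are all hypothetical.

```python
from typing import Callable

# (step id, run callable, rollback callable); a hypothetical shape for this sketch.
Step = tuple[str, Callable[[], None], Callable[[], None]]


def run_with_rollback(steps: list[Step]) -> list[str]:
    """Run steps in order; on failure, undo completed steps in reverse order."""
    log: list[str] = []
    completed: list[tuple[str, Callable[[], None]]] = []
    for step_id, run, rollback in steps:
        try:
            run()
        except Exception as exc:
            log.append(f"failed {step_id}: {exc}")
            # Undo in reverse so later steps are reverted before earlier ones.
            for done_id, undo in reversed(completed):
                undo()
                log.append(f"rolled back {done_id}")
            return log
        completed.append((step_id, rollback))
        log.append(f"complete {step_id}")
    return log


state: list[str] = []


def greet() -> None:
    state.append("greeting")


def undo_greet() -> None:
    state.remove("greeting")


def verify() -> None:
    raise RuntimeError("greeting not verified")


def undo_verify() -> None:
    pass


log = run_with_rollback(
    [("01-greet", greet, undo_greet), ("02-verify", verify, undo_verify)]
)
print(log)
print(state)
```

Returning the log instead of printing inside the loop keeps the record of what happened available as data, which is the property an audit trail needs.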

Core Capabilities

  • Typed schema v4 with migration support
  • Runtime enforcement of produced and consumed artifact contracts
  • Structured approvals with approver, reason, and change reference
  • Parallel DAG execution with deterministic dependency handling
  • Rollback hooks for failure recovery
  • Doctor and repair commands for corrupted or interrupted state
  • Security audit for workflow packages
  • Prompt-asset pinning and sidecar output schemas
  • Benchmarks and tests for regression checking
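
One way to get parallel DAG execution with a deterministic order is to group steps into batches whose members do not depend on each other, and to sort each batch by step ID. The sketch below uses Python's standard `graphlib` module (available from Python 3.9). It illustrates the technique under that assumption and is not taken from the skill's scheduler; the step IDs and the function `deterministic_batches` are hypothetical.

```python
from graphlib import TopologicalSorter


def deterministic_batches(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group steps into batches; steps inside one batch could run in parallel."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    batches: list[list[str]] = []
    while sorter.is_active():
        # Sorting removes any dependence on dict or set iteration order.
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


# Maps each hypothetical step to the steps it depends on.
deps = {
    "01-collect": set(),
    "02-lint": {"01-collect"},
    "02-test": {"01-collect"},
    "03-publish": {"02-lint", "02-test"},
}

print(deterministic_batches(deps))
# → [['01-collect'], ['02-lint', '02-test'], ['03-publish']]
```

The same input always yields the same batches, so two runs of the same manifest can be compared step by step.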

Workflow Visualization

Every run automatically generates workflow-graph.html — an n8n-style interactive DAG viewer — in the workflow directory. Open it in any browser, no server required.

# Generate or refresh the visualization without running the workflow
python3 scripts/run_workflow.py <workflow-dir> --visualize

Features:

  • Live-updating status via XHR poll (every 3 s) with live / static indicator
  • Color-coded nodes by step type (shell, test, python, json-validate, http-check, approval, …)
  • Bezier edges colored by source step status — green for complete, orange for waiting-approval
  • GATE badge on manual-approval steps
  • Sidecar AI advisor nodes anchored below their consumer
  • Click any node → inspector panel: type, status, script, dependencies, runtime metrics
  • F = fit-to-screen, / = search/filter, minimap, Export SVG
  • Progress bar: N/total complete · M approvals

Repository Layout

deterministic-workflow-builder/
├── SKILL.md
├── README.md
├── VERSION
├── CHANGELOG.md
├── COMPATIBILITY.md
├── acceptance.md
├── assets/
│   ├── policies/          # strict-prod, ai-sidecar-safe, offline-only, …
│   ├── prompts/           # pinned sidecar prompt assets
│   └── sidecar-registry.json
├── benchmarks/            # expected outputs for smoke-testing compile_workflow.py
├── examples/
│   └── hello-world/       # complete runnable two-step example
├── references/
├── scripts/               # all tooling — init, run, verify, audit, migrate, …
└── tests/

Quick Start

Option A — run the example (zero setup):

python3 scripts/run_workflow.py examples/hello-world

Option B — scaffold your own workflow:

python3 scripts/init_deterministic_workflow.py demo-flow --path . --steps collect,review,publish

This creates ./demo-flow/ with a workflow.json, a WORKFLOW_SPEC.md, and stub step scripts in steps/. The stubs fail on purpose until you fill them in.

Open steps/01-collect.sh and replace the placeholder body:

#!/usr/bin/env bash
set -euo pipefail

ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"

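# Your commands here. Example (mkdir -p is a harmless guard in case artifacts/ does not exist yet):
mkdir -p "$ROOT_DIR/artifacts"
git log --oneline -20 > "$ROOT_DIR/artifacts/01-collect.txt"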

# Every step must produce its declared artifact so the success gate passes.
touch "$ROOT_DIR/artifacts/01-collect.done"

Also update success_gate in workflow.json for each step to match what the script actually produces. See examples/hello-world/ for a complete working reference.
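
As an illustration of what a success gate does, the sketch below checks that every file a step claims to produce actually exists. The gate format used here (a plain list of relative paths) and the function `gate_passes` are assumptions for this example. The real `success_gate` syntax is defined by the skill's schema and shown in examples/hello-world/.

```python
import tempfile
from pathlib import Path


def gate_passes(root: Path, required: list[str]) -> tuple[bool, list[str]]:
    """Return (ok, missing) for a hypothetical list of required artifact paths."""
    missing = [rel for rel in required if not (root / rel).is_file()]
    return (not missing, missing)


required = ["artifacts/01-collect.txt", "artifacts/01-collect.done"]

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "artifacts").mkdir()

    # The step wrote its output but has not yet created the marker file.
    (root / "artifacts" / "01-collect.txt").write_text("abc123 initial commit\n")
    before = gate_passes(root, required)

    # Creating the marker satisfies the gate.
    (root / "artifacts" / "01-collect.done").touch()
    after = gate_passes(root, required)

print(before)
print(after)
```

Reporting the missing paths, rather than only a boolean, tells the author of the step script exactly which declared artifact was not produced.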

Once the scripts are real:

python3 scripts/verify_workflow.py ./demo-flow --simulate
python3 scripts/security_audit.py ./demo-flow
python3 scripts/run_workflow.py ./demo-flow --dry-run
python3 scripts/run_workflow.py ./demo-flow
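
The four commands above can also be chained from Python so that the sequence stops at the first failure. This sketch only builds and prints the command lines. The script names and flags are the ones shown in this README, while the helpers `preflight_commands` and `run_in_order` are hypothetical, and the assumption that each script exits non-zero on failure is not confirmed by the source.

```python
import subprocess
import sys


def preflight_commands(workflow_dir: str) -> list[list[str]]:
    """Return the verify, audit, dry-run, and run commands in order."""
    py = sys.executable
    return [
        [py, "scripts/verify_workflow.py", workflow_dir, "--simulate"],
        [py, "scripts/security_audit.py", workflow_dir],
        [py, "scripts/run_workflow.py", workflow_dir, "--dry-run"],
        [py, "scripts/run_workflow.py", workflow_dir],
    ]


def run_in_order(commands: list[list[str]], runner=subprocess.run) -> int:
    """Run each command; return the first non-zero exit code, or 0 if all pass.

    Assumes a failing script signals failure through its exit code.
    """
    for argv in commands:
        result = runner(argv)
        if result.returncode != 0:
            return result.returncode
    return 0


# Print the commands instead of executing them, so the sketch has no side effects.
for argv in preflight_commands("./demo-flow"):
    print(" ".join(argv))
```

To execute for real, call `run_in_order(preflight_commands("./demo-flow"))` from the repository root. The `runner` parameter exists so the sequence can be tested with a stand-in that does not spawn processes.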

Option C — compile from natural language (requires Claude CLI):

python3 scripts/compile_workflow.py "Fix the failing CI test and make it deterministic." --path .

This calls Claude to generate the full workflow.json and step scripts from the description.

Common Operations

# Inspect
python3 scripts/run_workflow.py <workflow-dir> --list
python3 scripts/run_workflow.py <workflow-dir> --dry-run
python3 scripts/run_workflow.py <workflow-dir> --doctor

# Run
python3 scripts/run_workflow.py <workflow-dir>
python3 scripts/run_workflow.py <workflow-dir> --step 02-review  # single step

# Approvals
python3 scripts/run_workflow.py <workflow-dir> --approve 02-review --approval-reason "checklist passed"

# Recovery
python3 scripts/run_workflow.py <workflow-dir> --repair
python3 scripts/run_workflow.py <workflow-dir> --replay run-0001

# Quality
python3 scripts/lint_determinism.py <workflow-dir>
python3 scripts/verify_workflow.py <workflow-dir> --simulate
python3 scripts/security_audit.py <workflow-dir>
python3 scripts/diff_workflows.py <workflow-dir-before> <workflow-dir-after>

# Improvement
python3 scripts/run_workflow.py <workflow-dir> --improve
python3 scripts/auto_harden_workflow.py <workflow-dir> --write
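
The --approve command above takes a step and a reason, and the capability list describes approvals as carrying an approver, a reason, and a change reference. A minimal sketch of such a record follows. The class `Approval`, its field names, and the sample values are assumptions for illustration, not the format the skill writes to disk.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Approval:
    """Hypothetical structured approval; field names are illustrative."""

    step_id: str
    approver: str
    reason: str
    change_ref: str

    def __post_init__(self) -> None:
        # Reject blank fields so an approval can never be "informal".
        for name, value in asdict(self).items():
            if not value.strip():
                raise ValueError(f"{name} must not be empty")


approval = Approval(
    step_id="02-review",
    approver="alice",          # sample value
    reason="checklist passed",
    change_ref="CHG-0001",     # sample value
)

# sort_keys gives a stable serialization, useful when records are diffed or replayed.
print(json.dumps(asdict(approval), sort_keys=True))
```

Making the record immutable (`frozen=True`) reflects the audit goal: once an approval is granted, it should be appended to the trail and never edited in place.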

Testing

python3 -m pip install "ruff>=0.14,<0.15" "pre-commit>=4.3,<5" pytest pytest-cov
ruff check scripts tests && ruff format --check scripts tests
python3 -m unittest discover -s tests -p 'test_*.py'
python3 -m pytest --cov --cov-fail-under=33
pre-commit run --all-files

Release

git tag "v$(cat VERSION)"
git push origin --tags

The release workflow packages the skill, uploads the zip artifact, and publishes a GitHub release.

Intended Use

Use this when you want:

  • "no AI in the loop" runtime behavior
  • reproducible workflow execution
  • approval checkpoints
  • auditable state transitions
  • safe hybrid workflows where AI only suggests and never decides

See SKILL.md for the full Codex skill behavior.
