Deterministic Workflow Builder
Deterministic, auditable workflow execution for Codex and Claude — typed manifests, approval gates, audit trails, and live HTML visualization.
Build deterministic, auditable, repeatable workflows for Codex and Claude.
This skill turns vague "make it deterministic" requests into a workflow package with:
- a typed workflow.json manifest
- explicit steps/*.sh
- approval gates
- machine-checkable contracts
- replayable audits
- rollback hooks
- doctor and repair flows
- bounded AI sidecars that stay advisory
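To make the package shape above concrete, here is a minimal sketch of what a manifest might look like and how it could be checked structurally. Every field name (schema_version, steps, id, type, script, depends_on, success_gate) is an illustrative assumption, not the project's actual schema; the annotated examples/hello-world/workflow.json is the authoritative reference.

```python
import json

# Hypothetical manifest. Field names are assumptions made for illustration,
# not the schema used by the real workflow.json.
MANIFEST = {
    "schema_version": 4,
    "name": "demo-flow",
    "steps": [
        {"id": "01-collect", "type": "shell", "script": "steps/01-collect.sh",
         "depends_on": [], "success_gate": ["artifacts/01-collect.done"]},
        {"id": "02-review", "type": "approval", "script": None,
         "depends_on": ["01-collect"], "success_gate": []},
        {"id": "03-publish", "type": "shell", "script": "steps/03-publish.sh",
         "depends_on": ["02-review"], "success_gate": ["artifacts/03-publish.done"]},
    ],
}

REQUIRED_STEP_KEYS = {"id", "type", "depends_on", "success_gate"}


def check_manifest(manifest: dict) -> list:
    """Return a list of structural problems; an empty list means none were found."""
    problems = []
    steps = manifest.get("steps", [])
    ids = [step.get("id") for step in steps]
    if len(ids) != len(set(ids)):
        problems.append("duplicate step ids")
    for step in steps:
        missing = REQUIRED_STEP_KEYS - set(step)
        if missing:
            problems.append(f"{step.get('id')}: missing keys {sorted(missing)}")
        for dep in step.get("depends_on", []):
            if dep not in ids:
                problems.append(f"{step.get('id')}: unknown dependency {dep}")
    return problems


if __name__ == "__main__":
    print(json.dumps(MANIFEST, indent=2))
    print(check_manifest(MANIFEST) or "no structural problems found")
```

The point of a typed manifest is that checks like these can run before any step executes, so nothing is left for the model to decide at runtime.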
Why
Most agent workflows fail in production because runtime behavior is too implicit:
- the model decides at execution time
- approvals are informal
- outputs are not contract-checked
- retries and rollback are ad hoc
- state corruption is unrecoverable
This skill pushes in the opposite direction: deterministic runtime, explicit state, and observable evidence.
Prerequisites
- Python 3.9+ and Bash (macOS/Linux)
- Claude CLI or Anthropic SDK: only needed for type: "claude" steps and --generate / compile_workflow.py; pure shell workflows work with no AI tooling at all
Try it now
The fastest way to see it work is to run the included example:
git clone https://github.com/googlarz/deterministic-workflow-builder
cd deterministic-workflow-builder
# Preview the two-step hello-world workflow
python3 scripts/run_workflow.py examples/hello-world --list
python3 scripts/run_workflow.py examples/hello-world --dry-run
# Run it
python3 scripts/run_workflow.py examples/hello-world
Output:
→ running 01-greet
✓ complete 01-greet (0.0s)
→ running 02-verify
✓ complete 02-verify (0.0s)
See examples/hello-world/ for the full workflow with annotated workflow.json, step scripts, and rollback hooks.
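The rollback hooks mentioned above follow a familiar pattern that can be sketched in a few lines. This is an illustration under stated assumptions, not the runner's actual logic: it assumes hooks of completed steps are invoked in reverse order when a later step fails, and the function name run_with_rollback and the (name, action, rollback) tuple shape are hypothetical.

```python
def run_with_rollback(steps):
    """Run steps in order; on the first failure, undo completed steps in reverse.

    steps is a list of (name, action, rollback) tuples whose callables take
    no arguments. Returns (succeeded, log). The failing step's own rollback
    is not called in this sketch, because the step never completed.
    """
    completed = []
    log = []
    for name, action, rollback in steps:
        try:
            action()
        except Exception as exc:
            log.append(f"failed {name}: {exc}")
            for done_name, undo in reversed(completed):
                undo()
                log.append(f"rolled back {done_name}")
            return False, log
        completed.append((name, rollback))
        log.append(f"complete {name}")
    return True, log


if __name__ == "__main__":
    def verify():
        raise RuntimeError("greeting file missing")

    ok, log = run_with_rollback([
        ("01-greet", lambda: None, lambda: None),
        ("02-verify", verify, lambda: None),
    ])
    print(ok)
    print("\n".join(log))
```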
What It Builds
                user request
                     |
                     v
          +----------------------+
          | compile_workflow.py  |
          +----------------------+
                     |
                     v
 +----------------------------------------+
 |  workflow.json + steps/*.sh + prompts  |
 +----------------------------------------+
                     |
          +----------+----------+
          |                     |
          v                     v
 verify_workflow.py     security_audit.py
          |                     |
          +----------+----------+
                     |
                     v
              run_workflow.sh
                     |
   +-----------------+------------------+
   |                 |                  |
   v                 v                  v
approvals      DAG execution     rollback/repair
   |                 |                  |
   +-----------------+------------------+
                     |
                     v
            audit runs + replay
Core Capabilities
- Typed schema v4 with migration support
- Runtime enforcement of produced and consumed artifact contracts
- Structured approvals with approver, reason, and change reference
- Parallel DAG execution with deterministic dependency handling
- Rollback hooks for failure recovery
- Doctor and repair commands for corrupted or interrupted state
- Security audit for workflow packages
- Prompt-asset pinning and sidecar output schemas
- Benchmarks and tests for regression checking
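One way to get parallel DAG execution with a reproducible order is to group steps into "waves" and sort each wave by step id. The sketch below shows that general technique only; whether the runner schedules this way is an assumption, and the function name execution_waves is hypothetical.

```python
def execution_waves(deps):
    """Group steps into waves that could run in parallel.

    deps maps each step id to the list of step ids it depends on. Every step
    in a wave has all of its dependencies in earlier waves. Sorting each wave
    makes the result independent of dictionary insertion order.
    """
    remaining = {step: set(requires) for step, requires in deps.items()}
    unknown = {d for reqs in remaining.values() for d in reqs} - set(remaining)
    if unknown:
        raise ValueError(f"unknown dependencies: {sorted(unknown)}")
    waves = []
    done = set()
    while remaining:
        ready = sorted(s for s, reqs in remaining.items() if reqs <= done)
        if not ready:
            # Nothing is runnable but steps remain, so they depend on each other.
            raise ValueError(f"dependency cycle among: {sorted(remaining)}")
        waves.append(ready)
        done.update(ready)
        for step in ready:
            del remaining[step]
    return waves


if __name__ == "__main__":
    print(execution_waves({
        "03-publish": ["02-lint", "02-test"],
        "02-test": ["01-collect"],
        "02-lint": ["01-collect"],
        "01-collect": [],
    }))
```

Because ties are broken by sorting, two runs over the same manifest produce the same wave layout, which is what makes audits and replays comparable.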
Workflow Visualization
Every run automatically generates workflow-graph.html — an n8n-style interactive DAG viewer — in the workflow directory. Open it in any browser, no server required.
# Generate or refresh the visualization without running the workflow
python3 scripts/run_workflow.py <workflow-dir> --visualize
Features:
- Live-updating status via XHR poll (every 3 s) with live / static indicator
- Color-coded nodes by step type (shell, test, python, json-validate, http-check, approval, …)
- Bezier edges colored by source step status — green for complete, orange for waiting-approval
- GATE badge on manual-approval steps
- Sidecar AI advisor nodes anchored below their consumer
- Click any node → inspector panel: type, status, script, dependencies, runtime metrics
- F = fit-to-screen, / = search/filter, minimap, Export SVG
- Progress bar: N/total complete · M approvals
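As a rough illustration of how the progress-bar text could be derived from per-step status, consider the sketch below. It assumes that M counts steps currently in the waiting-approval status and that statuses are plain strings; both are assumptions, and progress_line is a hypothetical name.

```python
def progress_line(statuses):
    """Build the 'N/total complete · M approvals' summary.

    statuses maps step id to a status string. Assumption: M is the number of
    steps whose status is 'waiting-approval'.
    """
    complete = sum(1 for status in statuses.values() if status == "complete")
    waiting = sum(1 for status in statuses.values() if status == "waiting-approval")
    return f"{complete}/{len(statuses)} complete · {waiting} approvals"


if __name__ == "__main__":
    print(progress_line({
        "01-collect": "complete",
        "02-review": "waiting-approval",
        "03-publish": "pending",
    }))
```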
Repository Layout
deterministic-workflow-builder/
├── SKILL.md
├── README.md
├── VERSION
├── CHANGELOG.md
├── COMPATIBILITY.md
├── acceptance.md
├── assets/
│ ├── policies/ # strict-prod, ai-sidecar-safe, offline-only, …
│ ├── prompts/ # pinned sidecar prompt assets
│ └── sidecar-registry.json
├── benchmarks/ # expected outputs for smoke-testing compile_workflow.py
├── examples/
│ └── hello-world/ # complete runnable two-step example
├── references/
├── scripts/ # all tooling — init, run, verify, audit, migrate, …
└── tests/
Quick Start
Option A — run the example (zero setup):
python3 scripts/run_workflow.py examples/hello-world
Option B — scaffold your own workflow:
python3 scripts/init_deterministic_workflow.py demo-flow --path . --steps collect,review,publish
This creates ./demo-flow/ with a workflow.json, stub step scripts in steps/, and a WORKFLOW_SPEC.md. The step scripts are stubs — they fail on purpose until you fill them in.
Open steps/01-collect.sh and replace the placeholder body:
#!/usr/bin/env bash
set -euo pipefail
ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
# Your commands here. Example:
mkdir -p "$ROOT_DIR/artifacts"  # make sure the output directory exists
git log --oneline -20 > "$ROOT_DIR/artifacts/01-collect.txt"
# Every step must produce its declared artifact so the success gate passes.
touch "$ROOT_DIR/artifacts/01-collect.done"
Also update success_gate in workflow.json for each step to match what the script actually produces. See examples/hello-world/ for a complete working reference.
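Conceptually, a success gate that checks for produced artifacts comes down to something like the sketch below. It assumes a gate can be expressed as a list of paths relative to the workflow directory; the real success_gate format may differ, and missing_artifacts is a hypothetical helper, not part of the tooling.

```python
import tempfile
from pathlib import Path


def missing_artifacts(workflow_dir, declared):
    """Return the declared artifact paths that do not exist as files.

    declared holds paths relative to workflow_dir. An empty result means the
    (assumed) gate passes.
    """
    root = Path(workflow_dir)
    return [rel for rel in declared if not (root / rel).is_file()]


if __name__ == "__main__":
    # Demo in a throwaway directory: one artifact is produced, one is not.
    with tempfile.TemporaryDirectory() as tmp:
        artifacts = Path(tmp) / "artifacts"
        artifacts.mkdir()
        (artifacts / "01-collect.done").touch()
        print(missing_artifacts(tmp, [
            "artifacts/01-collect.done",
            "artifacts/01-collect.txt",
        ]))
```

A mismatch between what the script writes and what the gate declares shows up here as a non-empty list, which is why the gate and the script need to be updated together.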
Once the scripts are real:
python3 scripts/verify_workflow.py ./demo-flow --simulate
python3 scripts/security_audit.py ./demo-flow
python3 scripts/run_workflow.py ./demo-flow --dry-run
python3 scripts/run_workflow.py ./demo-flow
Option C — compile from natural language (requires Claude CLI):
python3 scripts/compile_workflow.py "Fix the failing CI test and make it deterministic." --path .
This calls Claude to generate the full workflow.json and step scripts from the description.
Common Operations
# Inspect
python3 scripts/run_workflow.py <workflow-dir> --list
python3 scripts/run_workflow.py <workflow-dir> --dry-run
python3 scripts/run_workflow.py <workflow-dir> --doctor
# Run
python3 scripts/run_workflow.py <workflow-dir>
python3 scripts/run_workflow.py <workflow-dir> --step 02-review # single step
# Approvals
python3 scripts/run_workflow.py <workflow-dir> --approve 02-review --approval-reason "checklist passed"
# Recovery
python3 scripts/run_workflow.py <workflow-dir> --repair
python3 scripts/run_workflow.py <workflow-dir> --replay run-0001
# Quality
python3 scripts/lint_determinism.py <workflow-dir>
python3 scripts/verify_workflow.py <workflow-dir> --simulate
python3 scripts/security_audit.py <workflow-dir>
python3 scripts/diff_workflows.py <workflow-dir-before> <workflow-dir-after>
# Improvement
python3 scripts/run_workflow.py <workflow-dir> --improve
python3 scripts/auto_harden_workflow.py <workflow-dir> --write
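The approval command above records a reason, and the capability list mentions an approver and a change reference as well. A structured approval record could look roughly like the following sketch; the class, field names, the CHG-1234 reference, and the timestamp field are assumptions made for illustration, not the format the runner writes.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Approval:
    """Hypothetical approval record; frozen so it cannot be edited after creation."""
    step: str
    approver: str
    reason: str
    change_ref: str
    approved_at: str


def record_approval(step, approver, reason, change_ref, now=None):
    """Create an approval record, rejecting empty approver or reason."""
    if not approver.strip() or not reason.strip():
        raise ValueError("approver and reason are required")
    now = now or datetime.now(timezone.utc)
    return Approval(step, approver, reason, change_ref, now.isoformat())


if __name__ == "__main__":
    approval = record_approval("02-review", "alice", "checklist passed", "CHG-1234")
    # sort_keys keeps the serialized form stable across runs.
    print(json.dumps(asdict(approval), sort_keys=True))
```

Requiring the fields up front is what separates a structured approval from an informal "looks good" in chat: the record can be audited and replayed later.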
Testing
python3 -m pip install "ruff>=0.14,<0.15" "pre-commit>=4.3,<5" pytest pytest-cov
ruff check scripts tests && ruff format --check scripts tests
python3 -m unittest discover -s tests -p 'test_*.py'
python3 -m pytest --cov --cov-fail-under=33
pre-commit run --all-files
Release
git tag "v$(cat VERSION)"
git push origin --tags
The release workflow packages the skill, uploads the zip artifact, and publishes a GitHub release.
Intended Use
Use this when you want:
- "no AI in the loop" runtime behavior
- reproducible workflow execution
- approval checkpoints
- auditable state transitions
- safe hybrid workflows where AI only suggests and never decides
See SKILL.md for the full Codex skill behavior.
Project Policies
- License: MIT
- Contributing guide: CONTRIBUTING.md
- Security policy: SECURITY.md
- Compatibility policy: COMPATIBILITY.md