Zum Inhalt springen

Skills / audit flow

audit flow

Claude Code skill for interactive system flow tracing across CODE, API, AUTH, DATA, NETWORK layers. SQLite persistence, Mermaid export, git merge driver.

9von @ArunJRKvor 56d aktualisiertMITGitHub →

Installation

Kompatibilitaet

Claude CodeCodex

Beschreibung

An Agent Skill for thinking about systems before (and after) you build them.

Tell your AI agent to trace a flow — it walks through your codebase recording every step in SQLite, then exports Mermaid diagrams, Markdown reports, JSON, and YAML. Works with Claude Code, OpenAI Codex CLI, and any agent that supports the Agent Skills spec.

Why

Most teams jump straight to code. The architecture lives in someone's head, a stale Confluence page, or a whiteboard photo rotting in Slack.

audit-flow makes system thinking a first-class artifact:

  • Before building — sketch the flow as a DAG. Which layers? What triggers what? Where does data move? Your brainstorm session produces a queryable database and exportable diagrams, not ephemeral notes.
  • After building — trace the real implementation. Compare what you planned vs what you built. The ideation flow and the documentation flow live in the same database.
  • When things break — trace the bug path. Your incident post-mortem links back to the original design flow. Findings accumulate across sessions.
  • Over time — 10 audits later, you have a queryable map of your entire system. New engineer joins? audit.py list shows every flow ever traced.

It's TDD for architecture. Trace the flow first, implement, trace again to verify.

What It Looks Like

"Audit the auth login flow"

→ Agent traces through your codebase
→ Records 47 steps across CODE, API, AUTH, DATA, NETWORK layers
→ Flags 3 security concerns
→ Exports Mermaid diagram + Markdown report to docs/audits/
"Brainstorm the export feature before we build it"

→ Agent sketches the flow as a DAG (no code exists yet)
→ Records design questions as findings
→ You iterate on the flow interactively
→ Export the design doc, then build against it
flowchart TD
    subgraph CODE
        T1(["1. fetch API"]):::entryPoint
    end
    subgraph AUTH
        T2["2. check token"]
        T3["3. proceed"]
        T4["4. refresh token"]
    end
    subgraph API
        T5["5. call endpoint"]
    end

    T1 -->|"TRIGGERS"| T2
    T2 -->|"BRANCHES<br/>token valid"| T3
    T2 -->|"BRANCHES<br/>token expired"| T4
    T3 -->|"TRIGGERS"| T5
    T4 -->|"MERGES"| T5

    classDef entryPoint fill:#51cf66,stroke:#2b8a3e
    classDef concern fill:#ff6b6b,stroke:#c92a2a

Install

Via skills CLI (skills.sh)

npx skills add ArunJRK/audit-flow

Via git clone

# From your project root
git clone https://github.com/ArunJRK/audit-flow.git .claude/skills/audit-flow

# Run setup (initializes DB + configures git merge driver)
bash .claude/skills/audit-flow/setup.sh

Manual

Copy the audit-flow/ directory into .claude/skills/ in your project. Your agent discovers it automatically.

Requirements

  • Python 3.8+ (stdlib only — sqlite3, json, csv, argparse)
  • Git (for merge/diff drivers)
  • Optional: pyyaml for YAML export

Zero external dependencies.

How It Works

You say "audit the auth flow" or "brainstorm the payment feature." Your agent:

  1. Creates a session (audit container with git context)
  2. Creates a flow (named DAG with an entry point)
  3. Traces through your code (or sketches a design), inserting tuples — each one a step: which layer, what action, which file
  4. Connects tuples with edges — semantic relations like TRIGGERS, READS, WRITES, BRANCHES, MERGES
  5. Records findings — security concerns, design questions, things the analyst notices
  6. Everything persists in .audit/audit.db (SQLite)
  7. Exports to Mermaid flowcharts, Markdown reports, JSON, YAML
Session (audit container)
  └── Flow (named DAG with entry point)
       └── Tuple (node: layer + action + subject)
            └── Edge (relation + optional condition)
       └── Finding (severity + category + description)

Interactive Workflow

Your agent asks you at each decision point:

Agent: I'll trace the auth flow. Let me set up the session.
       Name: auth-login-audit
       Purpose: security-audit
       Granularity: fine or coarse?

You:   Fine — function-level detail

Agent: [traces through code, recording tuples and edges in SQLite]

       Found 3 concerns:
       - [HIGH] Token stored in localStorage (XSS risk)
       - [MEDIUM] No PKCE in OAuth flow
       - [LOW] Token visible in Redux DevTools

       Export format?

You:   Mermaid and markdown

Agent: [exports to docs/audits/auth-login-audit/]

Use Cases

| Purpose | Use For | |---------|---------| | security-audit | Trace auth flows, find vulnerabilities, compliance checks | | documentation | Document how systems work, onboarding material | | compliance | SOC2/HIPAA evidence, data flow documentation | | ideation | Design new features by sketching flows before code exists | | brainstorming | Free-form idea exploration, what-if scenarios, divergent thinking | | debugging | Trace bugs through the system, reproduce issue paths | | architecture-review | Evaluate system design, identify coupling, review boundaries | | incident-review | Post-mortem flow tracing, root cause analysis |

The Compound Effect

A single audit is useful. Many audits are powerful:

  • Ideation → sketch the flow → Documentation → trace what you built → Security Audit → find what's wrong → Incident Review → trace what broke
  • Same data model, same database, linked by sessions. Your architecture becomes queryable: "show me every AUTH-layer step across all flows" is a SQL query.
  • New engineer runs audit.py list and sees every flow ever traced — with entry points, findings, and Mermaid diagrams.

Layers & Relations

5 layers classify where each step happens:

| Layer | Examples | |-------|---------| | CODE | Function calls, event handlers, components | | API | HTTP endpoints, service boundaries | | AUTH | Authentication, authorization, token ops | | DATA | Database queries, cache reads, state mutations | | NETWORK | External HTTP calls, WebSocket, SSE |

7 relations define how steps connect:

| Relation | Arrow | Meaning | |----------|-------|---------| | TRIGGERS | --> solid | A causes B to execute | | READS | -.-> dotted | A consumes data from B | | WRITES | ==> thick | A mutates data in B | | VALIDATES | --> solid | A checks/verifies B | | TRANSFORMS | --> solid | A converts data for B | | BRANCHES | --> solid | Conditional paths (requires condition label) | | MERGES | --> solid | Multiple paths converge |

Key Features

Non-Linear Flows

Not just linear traces — supports branching and merging:

-- Branch: token check → two outcomes
INSERT INTO edges (from_tuple, to_tuple, relation, condition)
VALUES (5, 6, 'BRANCHES', 'token valid'),
       (5, 7, 'BRANCHES', 'token expired');

-- Merge: both paths converge at the API call
INSERT INTO edges (from_tuple, to_tuple, relation)
VALUES (6, 8, 'TRIGGERS'),
       (7, 8, 'MERGES');

Findings (Not Flow Steps)

What the system does → tuples. What the analyst notes → findings.

This distinction matters. Observations like "no cross-tab sync" or "possible replay attack" are not system actions — they're analyst insights. Recording them as findings keeps diagrams clean and reports useful.

INSERT INTO findings (session_id, flow_id, severity, category, description, tuple_refs)
VALUES ('my-session', 1, 'high', 'token-storage',
        'Access token in localStorage — vulnerable to XSS', '[7, 8]');

Mermaid Export

Auto-generated diagrams with:

  • BFS step numbering from entry point
  • Green entry point marker (stadium shape)
  • Layer-based subgraphs
  • Relation-specific arrow styles (solid/dotted/thick)
  • Observation separation (concern chains → dashed OBSERVATIONS subgraph)
  • HTML entity sanitization for safe labels
  • Configurable direction (TD or LR)

Validation

python scripts/audit.py validate my-session

| Check | Severity | |-------|----------| | BRANCHES without condition | ERROR | | Node count >= 60 | ERROR — must split | | Node count >= 40 | WARN — consider splitting | | Orphan nodes | WARN | | Duplicate labels | WARN | | No entry point | WARN |

Git Merge Driver

SQLite is binary — git merge can't resolve conflicts. This skill includes a custom merge driver:

python scripts/audit.py git-setup    # one-time

On conflict, git auto-calls the driver which:

  1. Opens both SQLite databases
  2. Merges sessions by name (later updated_at wins)
  3. Flows follow parent session winner
  4. Remaps all integer PKs sequentially
  5. Deduplicates findings by content

CSV Backup

python scripts/audit.py csv-export   # DB → .audit/csv/*.csv
python scripts/audit.py csv-import   # CSV → DB

CLI Reference

| Command | Purpose | |---------|---------| | audit.py init | Initialize SQLite database | | audit.py list | List all audit sessions | | audit.py show <session> | Show session overview | | audit.py show <session> <flow> | Show flow details | | audit.py export <session> | Export all formats | | audit.py export <session> -f <flow> | Export specific flow | | audit.py export <session> -F mermaid | Export specific format | | audit.py export <session> -d LR | Horizontal Mermaid layout | | audit.py validate <session> | Validate before export | | audit.py git-setup | Configure git merge/diff drivers | | audit.py csv-export | Backup DB to CSV | | audit.py csv-import | Restore DB from CSV |

Project Structure

audit-flow/
├── SKILL.md           # Agent skill definition (frontmatter + instructions)
├── COMMANDS.md        # SQL reference for manual use
├── EXAMPLES.md        # Full examples with branching flows
├── schema.sql         # SQLite schema (5 tables, 5 views, triggers)
├── scripts/
│   └── audit.py       # CLI tool (~1700 lines, zero dependencies)
├── setup.sh           # One-time setup script
├── LICENSE            # MIT
└── README.md

Data Model

5 tables, 5 views, 2 triggers. Full schema in schema.sql.

sessions  1──N  flows  1──N  tuples  ──  edges
                       1──N  findings

| Table | Purpose | Key | |-------|---------|-----| | sessions | Audit container with git context | name (unique) | | flows | Named DAG within a session | (session_id, name) | | tuples | Flow step: layer + action + subject | auto-increment | | edges | Relationship between tuples | (from_tuple, to_tuple) | | findings | Security/design observations | (session_id, category, description) |

Views: v_session_summary, v_flow_summary, v_layer_distribution, v_concerns, v_branch_merge_points

Design Principles

  • SQLite is the source of truth — never generate output from agent context/memory
  • DB-first — write each tuple before moving to next code location
  • Observations are findings, not flow steps — what the system DOES → tuples; what the analyst NOTES → findings
  • All diagrams generated by export — never hand-craft Mermaid
  • Zero dependencies — Python stdlib only

Also Listed On

Contributing

Issues and PRs welcome. The codebase is intentionally small and dependency-free.

License

MIT

Aehnliche Skills