
vulca

Agent-native image-editing SDK for Claude Code. 21 MCP tools + /decompose skill — semantic layer splits, L1–L5 cultural scoring, region inpaint. Powered by ComfyUI, Gemini, OpenAI, or mock.

by @vulca-org · GitHub →

Transport: sse · Tools (14)


VULCA

Agents can plan image edits but can't cut pixels. Vulca is the hands — semantic layer splits, cultural scoring, inpainting — as 21 MCP tools for Claude Code.

Below: Michelangelo's Creation of Adam → 5 semantic layers via /decompose (background · adam · god_and_angels · red_cloak · green_ground), decomposed locally on Apple Silicon (ComfyUI + Ollama) with zero cloud API calls. SDK total: 21 MCP tools · 1454 tests passing.

What happens when you run /decompose

> /decompose assets/showcase/originals/creation-of-adam.jpg

Claude: I'll inspect the fresco and plan a semantic decomposition.
[tool_use: view_image(path="assets/showcase/originals/creation-of-adam.jpg")]
[tool_use: layers_split(
    path="assets/showcase/originals/creation-of-adam.jpg",
    mode="orchestrated",
    plan={"domain": "fresco", "entities": [
        {"name": "background",     "prompt": "the pale cream sky of the ceiling",
         "semantic_path": "background"},
        {"name": "adam",           "prompt": "the reclining nude male figure on the left",
         "semantic_path": "subject.person[0]"},
        {"name": "god_and_angels", "prompt": "the bearded figure of God with the angel group",
         "semantic_path": "subject.person[1]"},
        {"name": "red_cloak",      "prompt": "the billowing dark red cloak around God",
         "semantic_path": "subject.drapery"},
        {"name": "green_ground",   "prompt": "the mossy green ground beneath Adam",
         "semantic_path": "foreground.ground"}]})]
[tool_result: {"layers": [
    {"name": "background",     "semantic_path": "background"},
    {"name": "adam",           "semantic_path": "subject.person[0]"},
    {"name": "god_and_angels", "semantic_path": "subject.person[1]"},
    {"name": "red_cloak",      "semantic_path": "subject.drapery"},
    {"name": "green_ground",   "semantic_path": "foreground.ground"}]}]

Claude: 5 layers extracted — each person figure isolated, drapery and ground separated from the pale fresco sky. Want to redraw God's cloak, composite with a different background, or run another image?

Try it in 60 seconds

Prerequisite: uv installed (provides uvx). Without uv, replace uvx --from vulca[mcp] vulca-mcp with python -m vulca.mcp_server in Path B.

Path A — plugin install (recommended)

pip install vulca[mcp]==0.17.5
claude plugin install vulca-org/vulca-plugin

Then in Claude Code: > /decompose /path/to/your_image.jpg

Path B — no plugin (power user)

pip install vulca[mcp]==0.17.5

# Register MCP server — add to ~/.claude/settings.json:
# {"mcpServers": {"vulca": {"command": "uvx", "args": ["--from", "vulca[mcp]==0.17.5", "vulca-mcp"]}}}

# Install the /decompose skill:
mkdir -p ~/.claude/skills/decompose
curl -o ~/.claude/skills/decompose/SKILL.md \
  "https://raw.githubusercontent.com/vulca-org/vulca/v0.17.5/.claude/skills/decompose/SKILL.md"

Why agent-native

Most image SDKs ship a "brain" — a VLM planner that decides what to generate, how to compose, when to stop. Claude Code already has a brain. What it can't do is cut pixels: run SAM + YOLO + DINO + SegFormer, diff masks, score against a cultural rubric, composite with alpha. Vulca is the hands, not another brain.

Practical consequences of this framing:

  • Tools return structured JSON + paths, not prose. The agent inspects, branches, retries.
  • No hidden LLM/planning decisions inside tools — every tool call surfaces its detection report; the agent sees what ran, can branch, and can roll back.
  • The skill we ship (.claude/skills/decompose/SKILL.md) is a declarative prompt, not a wrapper. The agent reads it directly. Future skills follow the same shape.
  • Vulca doesn't host a model — it drives yours (ComfyUI / Gemini / OpenAI / mock) with structured tooling. No overlap with hosted image APIs; the value is in the pixel-level work between "agent planned" and "image on disk."
  • Local-first is a first-class path — ComfyUI + Ollama + MPS tested end-to-end; no cloud key required.
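The layers_split result from the transcript above is plain JSON, so agent-side glue can branch on it with nothing but the standard library. A minimal sketch (not part of the SDK — just the payload shown in the transcript, parsed and grouped):

```python
import json

# The exact layers_split result payload from the /decompose transcript.
tool_result = json.loads("""
{"layers": [
    {"name": "background",     "semantic_path": "background"},
    {"name": "adam",           "semantic_path": "subject.person[0]"},
    {"name": "god_and_angels", "semantic_path": "subject.person[1]"},
    {"name": "red_cloak",      "semantic_path": "subject.drapery"},
    {"name": "green_ground",   "semantic_path": "foreground.ground"}]}
""")

# Group layers by the top-level segment of their semantic path — the kind
# of branching an agent does before deciding which layer to redraw.
groups: dict[str, list[str]] = {}
for layer in tool_result["layers"]:
    root = layer["semantic_path"].split(".")[0]
    groups.setdefault(root, []).append(layer["name"])

print(groups)
# → {'background': ['background'],
#    'subject': ['adam', 'god_and_angels', 'red_cloak'],
#    'foreground': ['green_ground']}
```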

What Vulca takes off your agent's hands

| Cluster | What your agent delegates to Vulca | Skill | Tools |
|---|---|:---:|---|
| Decompose | Extract 10–20 semantic layers from any image with real transparency. | ✅ /decompose | `layers_split` (orchestrated), `layers_list` |
| Edit | Redraw one region or one layer without touching the rest. Composite back. | Roadmap | `inpaint_artwork`, `layers_edit`, `layers_redraw`, `layers_transform`, `layers_composite`, `layers_export`, `layers_evaluate` |
| Evaluate | Judge a visual against L1–L5 cultural criteria over 13 traditions with citable rationale. | Roadmap | `evaluate_artwork`, `list_traditions`, `get_tradition_guide`, `search_traditions` |
| Create | Generate a new image from intent + tradition guidance, optionally in structured layers. | — | `create_artwork`, `generate_image` |
| Brief / Studio | Turn fuzzy visual intent into a reviewable proposal.md; concept sketches and iteration. | ✅ /visual-brainstorm | `brief_parse`, `generate_concepts` |
| Admin | Expose intermediate artifacts, unload models, archive sessions. | — | `view_image`, `unload_models`, `archive_session`, `sync_data` |

User intent ─▶ Claude Code (planning) ─▶ Vulca MCP tools ─▶ Image artifacts ─┐
       ▲                                                                    │
       └──────────── visible via view_image ◀───────────────────────────────┘

Roadmap — no promises, just honest order

  • Next skill: /evaluate — reactivates the EMNLP anchor for agent-driven cultural scoring
  • Then: /inpaint (region-level edit), /layered-create (structured generation)
  • Beyond: community-driven — file an issue with your workflow

See docs/agent-native-workflow.md for the deeper walkthrough.


Beyond decomposition, Vulca evaluates any image against a cultural tradition across 5 dimensions (L1 Visual Perception, L2 Technical Execution, L3 Cultural Context, L4 Critical Interpretation, L5 Philosophical Aesthetics) in three modes. The MCP tool is evaluate_artwork; the CLI is vulca evaluate. No agent skill yet — /evaluate is next on the roadmap.

Strict (binary cultural judgment)

$ vulca evaluate artwork.png -t chinese_xieyi

  Score:     90%    Tradition: chinese_xieyi    Risk: low
    L1 Visual Perception         90%  ✓
    L2 Technical Execution       85%  ✓
    L3 Cultural Context          90%  ✓
    L4 Critical Interpretation  100%  ✓
    L5 Philosophical Aesthetics  90%  ✓
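How the headline score relates to the per-dimension lines is easy to reconstruct if you assume a weighted average over the evolved L1–L5 weights shown in the self-evolution section — an assumption about the aggregation, not documented SDK behavior:

```python
# Hypothetical reconstruction: overall score as the weighted average of
# the L1–L5 dimension scores. Weights are the evolved chinese_xieyi
# weights from the self-evolution section; scores are the strict run above.
weights = {"L1": 0.10, "L2": 0.20, "L3": 0.35, "L4": 0.15, "L5": 0.20}
scores  = {"L1": 90, "L2": 85, "L3": 90, "L4": 100, "L5": 90}

overall = sum(weights[d] * scores[d] for d in weights)
print(f"{overall:.1f}%")  # → 90.5%, close to the reported Score of 90%
```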

Reference (mentor — professional terminology, not a verdict)

$ vulca evaluate artwork.png -t chinese_xieyi --mode reference

  L3 Cultural Context          95%  (traditional)
     To push further: adding a poem (题画诗) for poetry-calligraphy-
     painting-seal (诗书画印) harmony.

Fusion (cross-tradition comparison)

$ vulca evaluate artwork.png -t chinese_xieyi,japanese_traditional,western_academic --mode fusion

  Dimension                Chinese Xieyi  Japanese Trad  Western Acad
  Overall Alignment               93%            90%           8%

  Closest tradition: chinese_xieyi (93%)
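Picking the closest tradition from a fusion run is a plain argmax over the alignment column — a stdlib sketch using the numbers above:

```python
# Per-tradition alignment figures from the fusion run above.
alignment = {"chinese_xieyi": 93, "japanese_traditional": 90, "western_academic": 8}

# The "closest tradition" line is the tradition with maximum alignment.
closest = max(alignment, key=alignment.get)
print(closest, f"({alignment[closest]}%)")  # → chinese_xieyi (93%)
```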

Vulca can plan a layer structure from a tradition's knowledge and emit each layer as a separate transparent PNG, with the first layer serving as a style anchor for the rest (Defense 3, v0.14+).

vulca create "水墨山水,松间茅屋" -t chinese_xieyi --layered --provider comfyui
# → 5 layers: paper, distant_mountains, mountains_pines, hut_figure, calligraphy

Works across traditions — photography produces depth layers, gongbi produces line-art + wash layers, brand design produces logo + background + typography.

import vulca
result = vulca.create("水墨山水", provider="comfyui", tradition="chinese_xieyi", layered=True)
for layer in result.layers:
    print(layer.name, layer.path, layer.coverage)

From an agent, invoke via the create_artwork MCP tool (Path A/B above). The /layered-create skill is on the roadmap.

Two orthogonal flows for targeted change:

Region inpaint (no decomposition — pick a region, regenerate only that area):

vulca inpaint artwork.png --region "the sky in the upper portion" \
  --instruction "dramatic stormy clouds" -t chinese_xieyi --provider comfyui

Layer redraw (after /decompose — swap one layer without touching the rest):

vulca layers lock ./layers/ --layer calligraphy_and_seals
vulca layers redraw ./layers/ --layer background_sky \
  -i "warm golden sunset with orange and purple gradients"
vulca layers composite ./layers/ -o final.png

Layer operations available: add, remove, reorder, toggle, lock, merge, duplicate. All provider-agnostic (works with ComfyUI, Gemini, OpenAI, mock).
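A toy model of the lock/toggle semantics (illustrative only — not Vulca's internal layer representation): locked layers are excluded from redraw, hidden layers from compositing.

```python
from dataclasses import dataclass

# Illustrative data model, not the SDK's internals: shows how lock/toggle
# state gates which layers a redraw or composite pass may touch.
@dataclass
class Layer:
    name: str
    visible: bool = True
    locked: bool = False

stack = [
    Layer("background_sky"),
    Layer("hut_figure"),
    Layer("calligraphy_and_seals", locked=True),  # after `vulca layers lock`
]

redrawable = [l.name for l in stack if not l.locked]   # redraw skips locked
composited = [l.name for l in stack if l.visible]      # composite skips hidden
print(redrawable)  # → ['background_sky', 'hut_figure']
```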

From an agent, these map to inpaint_artwork, layers_edit, layers_redraw, layers_composite, layers_export. The /inpaint skill is on the roadmap.


Research

| Paper | Venue | Contribution |
|-------|-------|--------------|
| VULCA Framework | EMNLP 2025 Findings | 5-dimension evaluation framework for culturally-situated multimodal LLM tasks |
| VULCA-Bench | arXiv | L1–L5 definitions, 7,410 samples, 9 traditions |
| Art Critique | arXiv | Cross-cultural expert-level critique evaluation with VLMs |

Citation

@inproceedings{yu2025vulca,
  title     = {A Structured Framework for Evaluating and Enhancing Interpretive
               Capabilities of Multimodal LLMs in Culturally Situated Tasks},
  author    = {Yu, Haorui and Ruiz-Dolz, Ramon and Yi, Qiufeng},
  booktitle = {Findings of the Association for Computational Linguistics: EMNLP 2025},
  pages     = {1945--1971},
  year      = {2025}
}

@article{yu2026vulcabench,
  title   = {VULCA-Bench: A Benchmark for Culturally-Aware Visual Understanding at Five Levels},
  author  = {Yu, Haorui},
  journal = {arXiv preprint arXiv:2601.07986},
  year    = {2026}
}

13 cultural traditions

chinese_xieyi chinese_gongbi japanese_traditional western_academic islamic_geometric watercolor african_traditional south_asian contemporary_art photography brand_design ui_ux_design default

Custom traditions via YAML — vulca evaluate painting.jpg --tradition ./my_style.yaml.
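The custom-tradition YAML schema isn't documented in this README; assuming it carries L1–L5 weights shaped like the built-in ones, a quick sanity check before pointing --tradition at your file might look like this (hypothetical structure — only the weight shape is taken from the self-evolution section):

```python
# Hypothetical minimal custom-tradition structure; the real YAML schema
# is not shown in this README. Weights mirror the documented L1–L5 shape.
my_style = {
    "name": "my_style",
    "weights": {"L1": 0.10, "L2": 0.20, "L3": 0.35, "L4": 0.15, "L5": 0.20},
}

# L1–L5 weights should form a probability-style split summing to 1.0.
total = sum(my_style["weights"].values())
assert abs(total - 1.0) < 1e-9, "L1–L5 weights should sum to 1.0"
print("weights OK")
```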


Apple Silicon / MPS quickstart

pip install vulca[mcp,tools]==0.17.5
# Local stack: ComfyUI + Ollama, full MPS support

See docs/apple-silicon-mps-comfyui-guide.md for the full ComfyUI + Ollama setup tested on MPS.


# Create
vulca create "intent" -t tradition --provider mock|gemini|openai|comfyui
  --layered                    # structured layer generation
  --hitl                       # pause for human review
  --reference ref.png          # reference image
  --colors "#hex1,#hex2"       # color palette constraint
  -o output.png

# Evaluate
vulca evaluate image.png -t tradition --mode strict|reference|fusion
  --skills brand,audience,trend  # extra commercial scoring skills

# Layers (all 14 subcommands)
vulca layers analyze image.png
vulca layers split image.png -o dir --mode extract|regenerate|sam
vulca layers redraw dir --layer name -i "instruction"
vulca layers add dir --name name --content-type type
vulca layers toggle dir --layer name --visible true|false
vulca layers lock dir --layer name
vulca layers merge dir --layers a,b --name merged
vulca layers duplicate dir --layer name
vulca layers composite dir -o output.png
vulca layers export dir -o output.psd
vulca layers evaluate dir -t tradition
vulca layers regenerate dir --provider gemini

# Inpainting
vulca inpaint image.png --region "description or x,y,w,h"
  --instruction "what to change" -t tradition --count 4 --select 1

# Tools (algorithmic, no API cost)
vulca tools run brushstroke_analyze --image art.png -t chinese_xieyi
vulca tools run whitespace_analyze --image art.png -t chinese_xieyi
vulca tools run composition_analyze --image art.png -t chinese_xieyi
vulca tools run color_gamut_check --image art.png -t chinese_xieyi
vulca tools run color_correct --image art.png -t chinese_xieyi

# Utilities
vulca traditions                        # list all traditions
vulca tradition tradition_name          # detailed guide
vulca tradition --init my_style         # generate template YAML
vulca evolution tradition_name          # check evolved weights
vulca sync [--push-only|--pull-only]    # cloud sync

# Python SDK
import vulca
result = vulca.evaluate("artwork.png", tradition="chinese_xieyi")
print(result.score, result.suggestions, result.L3)

# Structured creation
result = vulca.create("水墨山水", provider="comfyui",
                      tradition="chinese_xieyi", layered=True)

# Layer operations
from vulca.layers import analyze_layers, split_extract, composite_layers
import asyncio
layers = asyncio.run(analyze_layers("artwork.png"))
results = split_extract("artwork.png", layers, output_dir="./layers")
composite_layers(results, width=1024, height=1024, output_path="composite.png")

# Self-evolution weights
weights = vulca.get_weights("chinese_xieyi")
# → {"L1": 0.10, "L2": 0.20, "L3": 0.35, "L4": 0.15, "L5": 0.20}

┌──────────────────────────────────────────────────────────────┐
│                         User Intent                          │
└──────┬───────────┬──────────────┬──────────────┬─────────────┘
       │           │              │              │
  ┌────▼──┐  ┌─────▼───┐  ┌──────▼─────┐  ┌─────▼─────┐
  │  CLI  │  │ Python  │  │    MCP     │  │  ComfyUI  │
  │       │  │   SDK   │  │  21 tools  │  │  11 nodes │
  └───┬───┘  └────┬────┘  └──────┬─────┘  └─────┬─────┘
      └───────────┴───────┬──────┴───────────────┘
                          │
                 vulca.pipeline.execute()
                          │
              ┌───────────▼───────────┐
              │    Image Providers    │
              │  ComfyUI │ Gemini     │
              │  OpenAI  │ Mock       │
              └───────────────────────┘

| Provider | Generate | Inpaint | Layered | Multilingual |
|----------|----------|---------|---------|--------------|
| ComfyUI | ✓ | ✓ | ✓ | English-only |
| Gemini | ✓ | ✓ | ✓ | CJK native |
| OpenAI | ✓ | — | — | English-only |
| Mock | ✓ | ✓ | ✓ | — |

All 8 end-to-end pipeline phases validated on the local stack (ComfyUI + Ollama, Apple Silicon MPS). See the MPS guide linked above.

Every session feeds back into the tradition's L1–L5 weights. Gating: minimum 5 sessions + 3 feedback sessions before weights shift. strict mode reinforces conformance, reference mode tracks exploration.
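The gating rule above reduces to a simple predicate — an illustrative sketch, not the SDK's implementation:

```python
# Gating thresholds as stated above: weights only shift after a minimum
# of 5 sessions AND 3 feedback sessions for a tradition.
MIN_SESSIONS = 5
MIN_FEEDBACK_SESSIONS = 3

def weights_may_shift(sessions: int, feedback_sessions: int) -> bool:
    """Illustrative gating predicate: evolved L1–L5 weights stay frozen
    until both minimums are met."""
    return sessions >= MIN_SESSIONS and feedback_sessions >= MIN_FEEDBACK_SESSIONS

print(weights_may_shift(4, 3))   # → False (too few sessions)
print(weights_may_shift(71, 3))  # → True
```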

$ vulca evolution chinese_xieyi

  Dim     Original    Evolved     Change
  L1        10.0%     10.0%      0.0%
  L2        15.0%     20.0%     +5.0%    ← Technical Execution strengthened
  L3        25.0%     35.0%    +10.0%    ← Cultural Context most evolved
  L4        20.0%     15.0%     -5.0%
  L5        30.0%     20.0%    -10.0%
  Sessions: 71

From an agent: the evaluate_artwork MCP tool returns evolved weights alongside scores; no separate skill needed.


Showcase — agent-produced layer separations


Support

  • Issues: github.com/vulca-org/vulca/issues — bug reports, feature requests, workflow needs that should become a skill
  • Plugin: vulca-org/vulca-plugin — version-tracked with the SDK; install via claude plugin install
  • Skill source: .claude/skills/decompose/SKILL.md in this repo — the only source of truth for the /decompose flow
  • Skill source: .claude/skills/visual-brainstorm/SKILL.md — /visual-brainstorm turns fuzzy visual intent (topic, optional sketch, optional references) into a reviewable proposal.md. Zero-pixel, Discovery-metadata only. Scoped to 2D illustrative/editorial imagery (poster, illustration, packaging, brand visual, cover art, photography brief, hero visuals for UI).

License

Apache 2.0. See LICENSE.

Issues and PRs welcome. Development syncs from a private monorepo via git subtree.

vulca | hub.ai-engineering.at