AI skills bank

AI Skills Bank is a unified, multi-tool platform designed to aggregate, manage, and route AI skills across various workflows and AI assistants (such as Antigravity, Claude Code, Cursor, and Copilot).

3 · by @abdulsamed1 · updated 44 days ago · GitHub

Compatibility

Claude Code, Codex, Gemini, Cursor, VS Code

Description

skills-bank

High-performance skill aggregation, classification & routing platform for AI agents.


Prerequisites

  • Rust 1.70+ (Install)
  • Git (for repository cloning)
  • ~2GB disk space (for aggregated skills cache)

📖 Overview

skills-bank aggregates skills (workflows, tasks, specialized agents) from 100+ distributed repositories and provides a unified routing system for AI agents to discover, load, and invoke them efficiently.

Core Design Principles

  • Source-of-Truth Loading: Agents load canonical SKILL.md files directly from source repositories, not from catalogs. This eliminates hallucination risks and optimizes token usage.
  • Hybrid Classification: A dual-stage pipeline combines fast keyword rules (Step A) with LLM-powered semantic classification (Step B) to route skills into 12 domain hubs and 40+ sub-hubs.
  • Smart Deduplication: Skills are deduplicated by name OR description — catching both exact collisions and cross-repo clones with different names but identical content.
  • Multi-Tool Support: Skills sync to major AI tools including GitHub Copilot, Claude-code, free-code (claude-code), Hermes, Cursor, Gemini, Antigravity, OpenCode, Codex, and Windsurf.
  • Token Efficiency: Agents load minimal metadata first and fetch source files on demand, instead of batch-loading entire catalogs.
  • Interactive TUI: A rich terminal UI (powered by Ratatui) provides real-time dashboard, skill explorer, and pipeline monitoring.
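The two-key deduplication described above can be sketched as follows. This is a minimal illustration, not the project's implementation: the `Skill` struct, the `dedup_two_key` function, and the trim-and-lowercase normalization are all assumptions made for the example.

```rust
use std::collections::HashSet;

/// Hypothetical minimal skill record; the real type likely carries more fields.
#[derive(Debug, Clone, PartialEq)]
pub struct Skill {
    pub name: String,
    pub description: String,
}

/// Assumed normalization: trim and lowercase, so trivial formatting
/// differences do not defeat the comparison.
fn normalize(s: &str) -> String {
    s.trim().to_lowercase()
}

/// Keeps the first skill seen for each name and each description.
/// A skill is dropped if EITHER of its keys has been seen before.
pub fn dedup_two_key(skills: Vec<Skill>) -> Vec<Skill> {
    let mut seen_names: HashSet<String> = HashSet::new();
    let mut seen_descriptions: HashSet<String> = HashSet::new();
    let mut unique = Vec::new();

    for skill in skills {
        let name = normalize(&skill.name);
        let description = normalize(&skill.description);
        if seen_names.contains(&name) || seen_descriptions.contains(&description) {
            continue; // exact name collision or cross-repo clone
        }
        seen_names.insert(name);
        seen_descriptions.insert(description);
        unique.push(skill);
    }
    unique
}
```

One consequence of the OR rule in this sketch is that skills with empty descriptions collide with each other, so a real pipeline would probably skip the description key when it is blank.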

🚀 Quick Start

1. Build the CLI

cd skills-bank/
cargo build --release

2. Run the Full Pipeline

# Interactive setup (first run)
cargo run --release

# Or run all steps in sequence
cargo run --release -- run

# Launch the interactive TUI
cargo run --release -- tui

Interactive Setup (First Time)

cargo run --release -- setup

Launches an interactive wizard to configure:

  • Where skills should be synced (global, workspace, or both)
  • Which AI tools to sync to
  • Repository URLs to clone and aggregate
  • Excluded categories

🎮 Commands Reference

Core Pipeline Commands

| Command | Purpose | When to Use |
|---------|---------|-------------|
| aggregate | Collect, deduplicate, classify, and route skills from configured repositories to skills-aggregated/ | First run or when repositories change |
| sync | Distribute aggregated skills to configured AI tool directories | After aggregation completes |
| run | Execute the full pipeline (aggregate → sync) in sequence | Daily updates or automated workflows |
| setup | Configure sync targets, repositories, and exclusions interactively | Initial setup only |
| add-repo <URL> | Add a new skill repository to the configuration | When onboarding new sources |
| doctor | Validate installation and report repository state | Troubleshooting or pre-cleanup inspection |
| release-gate | Validate aggregation output integrity | Before releases or production sync |
| cleanup-legacy-duplicates | Remove legacy repository folders from src/ or repos/ (only if matching lib/ exists) | Migration from older versions |
| tui | Launch interactive terminal dashboard with skill explorer and statistics | Real-time monitoring |

Example Workflows

First-time setup:

cargo run --release -- setup
cargo run --release -- run

Daily aggregation with monitoring:

cargo run --release -- aggregate  # shows a progress bar
cargo run --release -- tui        # run in a second terminal to monitor

Validate before production sync:

cargo run --release -- doctor
cargo run --release -- release-gate
cargo run --release -- sync

📁 Project Structure

Source Code & Configuration

  • src/ — Rust source code: TUI, fetcher, aggregator, sync engine, classification logic
  • Cargo.toml — Rust manifest (dependencies, metadata, build targets)
  • .skills-bank-cli-config.json — User configuration file (generated by setup, contains sync targets and repository URLs)
  • .env-example — Environment variable template

Generated Outputs (After Aggregation)

  • skills-aggregated/ — Single source of truth containing:
    • routing.csv — Skill-to-hub/sub-hub routing table
    • subhub-index.json — Hub and sub-hub registry
    • hub-manifests.csv — Master index of all skills
    • .skill-lock.json — Aggregation metadata and timestamps
    • Per-hub directories with skills-manifest.json files

Repository Cache

  • lib/ — Canonical cache for cloned skill repositories (populated by aggregate command)

Testing & Documentation

  • tests/ — Integration test suite for pipeline and TUI
  • archive/ — Legacy PowerShell scripts (original PoC phase)
  • package.json — Node.js manifest for npx distribution
  • readme.md — This file

📁 Repository Management

Cloning & Caching

Cache Location: lib/ (not src/) — This is the canonical directory for all cloned repositories.

Clone Strategy:

  • First clone: Shallow clone with git clone --depth 1 --single-branch --no-tags (faster, smaller disk footprint)
  • Subsequent runs: git pull in existing directories (avoid re-cloning)
  • Deduplication: Normalized remote URLs and repository names prevent duplicate clones
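The clone strategy above can be sketched as a function that picks the git arguments for a repository. The clone flags are the ones listed above; the function names and the normalization rules (strip a trailing slash and `.git`, then lowercase) are assumptions made for illustration.

```rust
use std::path::Path;

/// Normalizes a remote URL so equivalent spellings map to one cache key.
/// The exact rules here are an assumption, not the project's documented behavior.
pub fn normalize_remote(url: &str) -> String {
    let trimmed = url.trim().trim_end_matches('/');
    let trimmed = trimmed.strip_suffix(".git").unwrap_or(trimmed);
    trimmed.to_lowercase()
}

/// Returns the git arguments for one repository:
/// a shallow clone on first sight, a pull if the directory is already a clone.
pub fn git_args(url: &str, target: &Path) -> Vec<String> {
    let dir = target.to_string_lossy().into_owned();
    if target.join(".git").exists() {
        vec!["-C".to_string(), dir, "pull".to_string()]
    } else {
        let mut args: Vec<String> =
            ["clone", "--depth", "1", "--single-branch", "--no-tags"]
                .iter()
                .map(|s| s.to_string())
                .collect();
        args.push(url.to_string());
        args.push(dir);
        args
    }
}
```

The returned vector can be passed to `std::process::Command::new("git").args(...)`. Keeping argument construction separate from process spawning makes the decision testable without network access.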

Speed Optimization:

  • Parallel cloning via configurable PARALLEL_JOBS
  • Shallow clones reduce disk I/O by ~80% vs. full clones
  • Incremental updates via git pull

Legacy Repository Cleanup

If you have repositories in older locations (src/ or repos/), migrate them:

# Inspect current state
cargo run --release -- doctor

# Remove legacy folders (safe: only deletes if matching lib/ exists and Git remote matches)
cargo run --release -- cleanup-legacy-duplicates

⚠️ Warning: This is destructive. Always run doctor first to inspect repository state.

⚙️ Output Files & Configuration

Generated during aggregation into skills-aggregated/:

| File | Purpose |
|------|---------|
| routing.csv | Skill-to-hub/sub-hub mappings (name, hub, sub-hub, src_path) |
| subhub-index.json | Complete hub and sub-hub registry |
| hub-manifests.csv | Master index of all skills across all hubs |
| .skill-lock.json | Aggregation metadata (timestamps, repo revisions, dedup stats) |
| [hub]/[sub-hub]/skills-manifest.json | Per-sub-hub skill metadata and LLM classification triggers |

These files are used by agents and the TUI for discovery and routing.
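As an illustration of how a consumer might read routing.csv, here is a minimal sketch. It assumes a header line followed by rows with exactly the four columns listed above and no quoted or comma-containing fields; the `RoutingRow` type and function names are hypothetical.

```rust
/// One row of routing.csv; fields follow the columns name, hub, sub-hub, src_path.
#[derive(Debug, PartialEq)]
pub struct RoutingRow {
    pub name: String,
    pub hub: String,
    pub sub_hub: String,
    pub src_path: String,
}

/// Parses one data line. Assumes four plain comma-separated fields;
/// quoted fields would need a real CSV parser.
pub fn parse_routing_line(line: &str) -> Option<RoutingRow> {
    let fields: Vec<&str> = line.split(',').map(str::trim).collect();
    match fields.as_slice() {
        [name, hub, sub_hub, src_path] if !name.is_empty() => Some(RoutingRow {
            name: name.to_string(),
            hub: hub.to_string(),
            sub_hub: sub_hub.to_string(),
            src_path: src_path.to_string(),
        }),
        _ => None,
    }
}

/// Parses a whole file body, skipping the assumed header line and malformed rows.
pub fn parse_routing(contents: &str) -> Vec<RoutingRow> {
    contents.lines().skip(1).filter_map(parse_routing_line).collect()
}
```

An agent following the source-of-truth principle would then open the file at `src_path` rather than rely on any description stored in the catalog.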


🌐 Environment Variables

Copy .env-example to .env to override defaults:

cp .env-example .env

Common variables:

  • SKILLS_BANK_CONFIG — Path to CLI config file (default: .skills-bank-cli-config.json)
  • SKILLS_BANK_CACHE — Cache directory for repositories (default: lib/)
  • SKILLS_BANK_OUTPUT — Output directory for aggregated skills (default: skills-aggregated/)
  • LLM_BATCH_SIZE — Batch size for LLM classification (default: 50)
  • PARALLEL_JOBS — Number of parallel aggregation workers (default: auto-detect CPU count)

See .env-example for all available options.
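The defaults listed above could be resolved along these lines. This is a sketch under assumptions: the `Settings` struct and `load_settings` function are hypothetical, and the lookup is passed in as a closure so the logic can be exercised without touching the process environment.

```rust
use std::path::PathBuf;
use std::thread;

/// Hypothetical container for the variables listed above.
#[derive(Debug, PartialEq)]
pub struct Settings {
    pub config: PathBuf,
    pub cache: PathBuf,
    pub output: PathBuf,
    pub llm_batch_size: usize,
    pub parallel_jobs: usize,
}

/// Resolves each variable through `lookup`, falling back to the documented default.
pub fn load_settings(lookup: impl Fn(&str) -> Option<String>) -> Settings {
    let path = |key: &str, default: &str| {
        PathBuf::from(lookup(key).unwrap_or_else(|| default.to_string()))
    };
    // Auto-detect the CPU count; fall back to 1 if it cannot be determined.
    let cpu_count = thread::available_parallelism().map(|n| n.get()).unwrap_or(1);

    Settings {
        config: path("SKILLS_BANK_CONFIG", ".skills-bank-cli-config.json"),
        cache: path("SKILLS_BANK_CACHE", "lib/"),
        output: path("SKILLS_BANK_OUTPUT", "skills-aggregated/"),
        llm_batch_size: lookup("LLM_BATCH_SIZE")
            .and_then(|v| v.parse::<usize>().ok())
            .unwrap_or(50),
        // Non-numeric values such as "auto" fall through to the detected count.
        parallel_jobs: lookup("PARALLEL_JOBS")
            .and_then(|v| v.parse::<usize>().ok())
            .filter(|n| *n > 0)
            .unwrap_or(cpu_count),
    }
}
```

In a binary this would be called as `load_settings(|key| std::env::var(key).ok())`.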


🎯 Tool Integration Targets

Sync skills to any of these destinations:

| Tool | Project | Global |
|---|---|---|
| Claude | .claude/skills/ | ~/.claude/skills/ |
| free-code (claude-code) | .free-code-config/skills/ | ~/.free-code-config/skills/ |
| Hermes | .hermes/skills/ | ~/.hermes/skills/ |
| Code (Codex) | .agents/skills/ | ~/.agents/skills/ |
| GitHub Copilot | .github/skills/ | ~/.copilot/skills/ |
| Cursor | .cursor/skills/ | ~/.cursor/skills/ |
| Gemini | .gemini/skills/ | ~/.gemini/skills/ |
| Antigravity | .agent/skills/ | ~/.gemini/antigravity/skills/ |
| OpenCode | .opencode/skills/ | ~/.config/opencode/skills/ |
| Windsurf | .windsurf/skills/ | ~/.codeium/windsurf/skills/ |
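A sync engine needs to turn a tool name and scope into a concrete directory. The sketch below shows one way to do that for a subset of the rows above; the `Scope` enum, the `TARGETS` table, and `sync_dir` are hypothetical names, and the home directory is passed in explicitly rather than expanded from `~`.

```rust
use std::path::{Path, PathBuf};

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Scope {
    Project,
    Global,
}

/// (tool, project-relative dir, home-relative dir) for a subset of the table above.
const TARGETS: &[(&str, &str, &str)] = &[
    ("Claude", ".claude/skills", ".claude/skills"),
    ("GitHub Copilot", ".github/skills", ".copilot/skills"),
    ("Cursor", ".cursor/skills", ".cursor/skills"),
    ("Antigravity", ".agent/skills", ".gemini/antigravity/skills"),
    ("Windsurf", ".windsurf/skills", ".codeium/windsurf/skills"),
];

/// Resolves the sync directory for a tool, or None if the tool is not in the table.
pub fn sync_dir(tool: &str, scope: Scope, project_root: &Path, home: &Path) -> Option<PathBuf> {
    let (_, project, global) = TARGETS
        .iter()
        .find(|(name, _, _)| name.eq_ignore_ascii_case(tool))?;
    Some(match scope {
        Scope::Project => project_root.join(project),
        Scope::Global => home.join(global),
    })
}
```

Keeping project and global paths as separate columns matters because several tools (Copilot, Antigravity, OpenCode, Windsurf) use different directory names in the two scopes.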


🏗️ Classification Architecture

The aggregation pipeline processes 8000+ SKILL.md files through a multi-stage classification system:

 SKILL.md files (8000+)
        │
        ▼
 ┌──────────────┐
 │  YAML Parse   │  Extract name, description, triggers
 └──────┬───────┘
        │
        ▼
 ┌──────────────┐
 │  Keyword      │  Fast token-based routing to hub/sub-hub
 │  Rules        │  (fallback if LLM unavailable)
 └──────┬───────┘
        │
        ▼
 ┌──────────────┐
 │  Dedup        │  Name OR Description HashSet
 │  (two-key)    │  Catches cross-repo clones
 └──────┬───────┘
        │
        ▼
 ┌──────────────────────────────────┐
 │  Hybrid Exclusion + LLM Classify │
 │  Step A: Keyword pre-filter      │
 │  Step B: LLM semantic classify   │
 │         (can return "excluded")  │
 └──────┬───────────────────────────┘
        │
        ▼
 ┌──────────────┐
 │  Output       │  routing.csv, per-hub manifests,
 │  Artifacts    │  subhub-index.json
 └──────────────┘
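The first stage of the diagram extracts name, description, and triggers from each SKILL.md. A rough sketch of that step is shown below. It is not a YAML parser: it assumes front matter delimited by `---` lines, flat `key: value` pairs, and a comma-separated `triggers` value, all of which are assumptions about the file layout. `SkillMeta` and `parse_front_matter` are hypothetical names.

```rust
/// Metadata fields named in the diagram above.
#[derive(Debug, Default, PartialEq)]
pub struct SkillMeta {
    pub name: String,
    pub description: String,
    pub triggers: Vec<String>,
}

/// Extracts flat `key: value` pairs from a `---` delimited front matter block.
/// Returns None if the block is missing, unterminated, or has no name.
pub fn parse_front_matter(markdown: &str) -> Option<SkillMeta> {
    let mut lines = markdown.lines();
    if lines.next()?.trim() != "---" {
        return None;
    }
    let mut meta = SkillMeta::default();
    let mut closed = false;
    for line in lines {
        if line.trim() == "---" {
            closed = true;
            break;
        }
        // Lines without a colon (nested YAML, blank lines) are ignored.
        let Some((key, value)) = line.split_once(':') else { continue };
        let value = value.trim().trim_matches('"');
        match key.trim() {
            "name" => meta.name = value.to_string(),
            "description" => meta.description = value.to_string(),
            "triggers" => {
                meta.triggers = value
                    .split(',')
                    .map(|t| t.trim().to_string())
                    .filter(|t| !t.is_empty())
                    .collect()
            }
            _ => {}
        }
    }
    (closed && !meta.name.is_empty()).then_some(meta)
}
```

A production pipeline would more likely hand the block to a full YAML library, since real front matter can contain multi-line strings and nested lists.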

🔍 Classification Improvements (v2.0+)

The keyword-based classification system includes three critical enhancements to eliminate false negatives and resolve sub-hub conflicts:

1. Repository Name Extraction (Substring Matching)

Problem: Repository names like mukul975-anthropic-cybersecurity-skills were not being matched because the system used exact token matching (e.g., only matching the token "security", not the full repo name).

Solution: Introduced an infer_hub_from_repo_name() function that:

  • Extracts the repository directory name from the path (the segment right after lib/ or src/)
  • Uses substring matching to catch domain signals (e.g., "cybersecurity-skills" → matches "security")
  • Runs before other inference logic (highest priority)
  • Supports domain keywords:
    • Security: security, cybersecurity, pentest, vulnerability, vibesec, bluebook
    • AI: prompt, agent-skill, llm, ai-skills
    • Mobile (iOS): swiftui, ios-, -ios, swift-patterns, apple-hig, app-store
    • Mobile (Android): android, kotlin
    • Frontend/UI: ui-ux, ui-skills
    • Testing/QA: playwright, testdino

Confidence Score: 98% (near-deterministic, reflects author intent)
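The substring approach can be sketched as follows. The keywords are copied from the list above, but the domain labels on the left are illustrative stand-ins: the real function returns a (hub, sub-hub) pair, and the mapping for most domains is not given here. The first-match-wins ordering and the helper names are also assumptions.

```rust
/// Domain keywords from the list above; the labels are illustrative only.
const DOMAIN_KEYWORDS: &[(&str, &[&str])] = &[
    ("security", &["security", "cybersecurity", "pentest", "vulnerability", "vibesec", "bluebook"]),
    ("ai", &["prompt", "agent-skill", "llm", "ai-skills"]),
    ("mobile-ios", &["swiftui", "ios-", "-ios", "swift-patterns", "apple-hig", "app-store"]),
    ("mobile-android", &["android", "kotlin"]),
    ("frontend-ui", &["ui-ux", "ui-skills"]),
    ("testing-qa", &["playwright", "testdino"]),
];

/// Returns the path segment right after `lib/` or `src/`.
pub fn repo_dir_name(path: &str) -> Option<&str> {
    let mut segments = path.split(|c: char| c == '/' || c == '\\');
    while let Some(segment) = segments.next() {
        if segment == "lib" || segment == "src" {
            return segments.next().filter(|s| !s.is_empty());
        }
    }
    None
}

/// Substring-matches the repository name against the keyword table.
/// Returns the first matching domain label with the documented confidence of 98.
pub fn infer_domain_from_repo_name(path: &str) -> Option<(&'static str, u8)> {
    let repo = repo_dir_name(path)?.to_lowercase();
    DOMAIN_KEYWORDS
        .iter()
        .find(|(_, keywords)| keywords.iter().any(|k| repo.contains(*k)))
        .map(|(domain, _)| (*domain, 98))
}
```

Because `contains` is used instead of token equality, "cybersecurity-skills" matches the keyword "security" even though no hyphen-separated token equals it.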

2. Sub-Hub Conflict Resolution

Problem: When a skill matched multiple sub-hubs (e.g., python AND security simultaneously), language hubs often won due to their anchor keywords, defeating domain-specialist classification.

Solution: Introduced a conflict resolution table (CONFLICT_RESOLUTION) that:

  • Defines precedence rules when multiple sub-hubs match: (losing_hub, losing_sub_hub, winning_hub, winning_sub_hub)
  • Ensures domain specialists always win over languages:
    • security > python | javascript | typescript | rust | golang | java
    • testing-qa > python | javascript | typescript | rust
    • code-review > python | javascript
  • Applied in resolve_conflict() function when multiple candidates score within 5 points of the top score
  • Fallback: hub priority ordering if no explicit rule applies
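A minimal sketch of the precedence logic follows. The tuple layout and the 5-point window come from the description above; the `Candidate` type is hypothetical, the "languages" hub name is a placeholder, and the fallback here is simply the highest remaining score rather than the hub priority ordering, which is not specified in this document.

```rust
/// (losing_hub, losing_sub_hub, winning_hub, winning_sub_hub).
/// "code-quality"/"security" is taken from the example flow; "languages" is a placeholder.
const CONFLICT_RESOLUTION: &[(&str, &str, &str, &str)] = &[
    ("languages", "python", "code-quality", "security"),
    ("languages", "javascript", "code-quality", "security"),
    ("languages", "rust", "code-quality", "security"),
];

#[derive(Debug, Clone, PartialEq)]
pub struct Candidate {
    pub hub: &'static str,
    pub sub_hub: &'static str,
    pub score: u32,
}

/// Picks a winner among candidates scoring within 5 points of the top score.
pub fn resolve_conflict(candidates: &[Candidate]) -> Option<&Candidate> {
    let top_score = candidates.iter().map(|c| c.score).max()?;
    let contenders: Vec<&Candidate> = candidates
        .iter()
        .filter(|c| top_score - c.score <= 5)
        .collect();

    // A contender loses if a rule names it as the loser against another contender.
    let loses = |c: &Candidate| {
        CONFLICT_RESOLUTION.iter().any(|(lh, ls, wh, ws)| {
            c.hub == *lh
                && c.sub_hub == *ls
                && contenders.iter().any(|w| w.hub == *wh && w.sub_hub == *ws)
        })
    };

    // Simplified fallback: highest score among the survivors.
    contenders
        .iter()
        .copied()
        .filter(|c| !loses(*c))
        .max_by_key(|c| c.score)
}
```

With python at 90 and security at 87, both fall inside the window and the rule table hands the win to security; with security at 70 the rule never fires and python keeps the assignment.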

3. Confidence Boost for Path-Based Inference

Problem: Repository name signals (inferred from path) were scored 95%, allowing lower-confidence LLM results (80%) to potentially override them.

Solution: Raised the confidence score for path-based inference from 95 → 98%

  • Score 98 is now treated as near-deterministic (same tier as explicit canonicalize_assignment logic at 100)
  • Only scores ≥ 100 can override it
  • Prevents low-confidence LLM results from contradicting repository metadata

📊 Example Classification Flow

For a skill in lib/mukul975-anthropic-cybersecurity-skills/:

1. apply_rules() called
   ↓
2. canonicalize_assignment() → no match (0% confidence)
   ↓
3. infer_from_path() called
   ├─ infer_hub_from_repo_name() extracts "mukul975-anthropic-cybersecurity-skills"
   ├─ Finds substring match: "cybersecurity"
   └─ Returns ("code-quality", "security") with 98% confidence
   ↓
4. ✓ Final assignment: code-quality / security
   ✗ LLM classification skipped (98% > 80% threshold)
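The decision cascade in this flow can be sketched as below. The confidence values (98 for path inference, 80 as the LLM threshold) come from the flow above; the `Assignment` type and the `decide` and `path_assignment` functions are hypothetical and do not reflect the real signature of apply_rules().

```rust
/// Confidence values taken from the flow above.
const PATH_CONFIDENCE: u8 = 98;
const LLM_THRESHOLD: u8 = 80;

#[derive(Debug, Clone, PartialEq)]
pub struct Assignment {
    pub hub: String,
    pub sub_hub: String,
    pub confidence: u8,
}

/// Builds the assignment a path-based match would produce.
pub fn path_assignment(hub: &str, sub_hub: &str) -> Assignment {
    Assignment {
        hub: hub.to_string(),
        sub_hub: sub_hub.to_string(),
        confidence: PATH_CONFIDENCE,
    }
}

/// Takes the first rule-based result in priority order (canonical, then path)
/// and reports whether the LLM stage still needs to run.
pub fn decide(
    canonical: Option<Assignment>,
    from_path: Option<Assignment>,
) -> (Option<Assignment>, bool) {
    let chosen = canonical.or(from_path);
    // The LLM is skipped only when a rule result exceeds the threshold.
    let needs_llm = chosen
        .as_ref()
        .map_or(true, |a| a.confidence <= LLM_THRESHOLD);
    (chosen, needs_llm)
}
```

For the cybersecurity repository above, `decide(None, Some(path_assignment("code-quality", "security")))` yields the path-based assignment and marks the LLM stage as unnecessary.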

🔧 Troubleshooting

Issue: Skills not aggregating or taking too long

Check repository state:

cargo run --release -- doctor

This validates all repositories, checks Git remotes, and reports cache status.

Increase parallelism:

export PARALLEL_JOBS=16
cargo run --release -- aggregate

Issue: Sync failing with "junction or symlink" errors

Cause: Existing junctions in sync target directories.

Solution: The sync command automatically skips existing junctions. If conflicts persist:

# Inspect sync targets (Windows, cmd.exe)
dir "%USERPROFILE%\.claude\skills"
# Inspect sync targets (macOS/Linux)
ls ~/.claude/skills

# Remove a conflicting junction manually (Windows, cmd.exe)
rmdir /s "%USERPROFILE%\.claude\skills\[hub-name]"
# Remove a conflicting symlink or directory manually (macOS/Linux)
rm -rf ~/.claude/skills/[hub-name]

# Retry sync
cargo run --release -- sync

Issue: LLM classification appears stuck

Check TUI progress:

cargo run --release -- tui

The TUI shows real-time LLM batch progress. If stuck for >5 minutes:

# Check if LLM service (Ollama/Claude) is running
# Restart aggregation with keyword-only fallback
cargo run --release -- aggregate --skip-llm

Issue: "Release gate" validation fails

Check output integrity:

cargo run --release -- release-gate

This validates:

  • All SKILL.md files were processed
  • No orphaned or missing references in routing.csv
  • Deduplication stats match cache state
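To make the "orphaned or missing references" check concrete, here is a sketch of one possible interpretation: compare the src_path values in routing.csv against the SKILL.md paths found on disk. How release-gate actually defines these checks is not documented here, so `GateReport` and `check_routing` are hypothetical.

```rust
use std::collections::BTreeSet;

/// Result of comparing routed paths against files on disk.
#[derive(Debug, Default, PartialEq)]
pub struct GateReport {
    /// Routed in routing.csv but no longer present on disk.
    pub orphaned: Vec<String>,
    /// Present on disk but absent from routing.csv.
    pub unrouted: Vec<String>,
}

impl GateReport {
    pub fn passed(&self) -> bool {
        self.orphaned.is_empty() && self.unrouted.is_empty()
    }
}

/// Computes both set differences; results are sorted because BTreeSet is ordered.
pub fn check_routing<'a>(
    routed: impl IntoIterator<Item = &'a str>,
    on_disk: impl IntoIterator<Item = &'a str>,
) -> GateReport {
    let routed: BTreeSet<&str> = routed.into_iter().collect();
    let on_disk: BTreeSet<&str> = on_disk.into_iter().collect();
    GateReport {
        orphaned: routed.difference(&on_disk).map(|s| s.to_string()).collect(),
        unrouted: on_disk.difference(&routed).map(|s| s.to_string()).collect(),
    }
}
```

In this simplified form, skills that were intentionally deduplicated or excluded would show up as unrouted, so a real gate would need to subtract those before failing.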

If failures are reported, re-run aggregation:

rm -rf skills-aggregated/
cargo run --release -- aggregate

📈 Performance Characteristics

| Operation | Time | Dependencies |
|-----------|------|--------------|
| First aggregate (100+ repos, 8000+ skills) | 10-20 min | Network speed, CPU count, LLM latency |
| Incremental aggregate (repos already cached) | 2-5 min | LLM classification speed (can skip with --skip-llm) |
| Sync to tools (10 tools, all hubs) | 30-60 sec | Disk I/O, junction creation speed |
| TUI startup | <1 sec | Manifest parsing |
| LLM classification (8000 skills) | 3-8 min | Batch size, LLM throughput |

Optimization Tips:

  • Use PARALLEL_JOBS=auto for optimal CPU utilization
  • Set LLM_BATCH_SIZE=100 for faster LLM processing (requires more GPU/API quota)
  • Run on an SSD for 2-3x faster repository cloning
  • Use shallow clones (default) to reduce disk bandwidth

🤝 Contributing

Development Setup

# Clone and build
git clone <this-repo>
cd skills-bank
cargo build

# Run tests
cargo test

# Format code
cargo fmt

# Check for issues
cargo clippy

Reporting Issues

When reporting bugs, include:

  1. Output of cargo run --release -- doctor
  2. Contents of .skills-bank-cli-config.json (redact sensitive URLs if needed)
  3. Error message and stack trace (if any)
  4. Steps to reproduce

Extending Classification

To add new domain keywords or refine sub-hub routing:

  1. Edit the CONFLICT_RESOLUTION table or keyword rules in src/classify.rs
  2. Add test cases in tests/
  3. Run cargo test and cargo run --release -- aggregate
  4. Submit PR with classification examples

📄 License

MIT — See package.json for details.
