blink skill

Proactive screen awareness + Claude Vision assistance

★ 3 · by @zhangtianruiwork-droid · updated 29 days ago · MIT

Compatibility

Claude Code, Codex, VS Code

Description

👁️ Blink Skill · Let your AI "blink" to observe your computer

Teach your AI to blink: watch your screen, understand your context, help without being asked.
Low-cost proactive AI assistance: sense what you are doing → ask whether you need help → take a screenshot → analyze it with Claude Vision → assist.

One blink, and the AI knows what you are doing.
You don't have to describe anything or send it screenshots: it looks, asks, and helps on its own.


✨ What is Blink?

Most AI assistants are reactive: they wait for you to describe your situation, paste content, or upload screenshots. That's friction.

Blink removes that friction. It is a proactive AI skill that monitors your PC activity through a lightweight Python sentinel. When it detects that you are in a meeting, working on a document, watching a video, or coding, it asks whether you need help. One screenshot is all it takes.

PC Sentinel watches your active windows (Python, 1s interval)
        ↓
Detects: Tencent Meeting / WPS doc / Bilibili video / Claude Code
        ↓
AI asks proactively: "Hey, want me to help with that?"
        ↓
You say: "Yes" / "帮我整理一下" ("help me organize this") / "Summarize this"
        ↓
Blink takes ONE screenshot → Claude Vision analyzes it
        ↓
Result: meeting notes / document analysis / video summary / code review
        ↓
Need more? Scroll manually → say "继续" ("continue") → up to 3 screenshots total

🚀 Demo

| Scenario | What Blink Does |
|----------|-----------------|
| 🎥 Tencent Meeting opens | "Want me to record and summarize the meeting?" |
| 📄 WPS document opens | "Want me to help with this document?" → screenshots and analyzes it |
| 🎬 Bilibili/YouTube video | "Want me to summarize this video?" |
| 💻 Claude Code / Codex active | "Want a screenshot analysis of your coding context?" |


📐 Architecture

┌─────────────────────────────────────────────────────────────┐
│                        Windows PC                           │
│                                                             │
│  pc_sentinel.pyw ──writes──▶ pc-status.json (every 1s)      │
│  (Python watcher)            { status, detail, foreground } │
└──────────────────────────────┬──────────────────────────────┘
                               │ fs.watch / poll
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                  Node.js WebSocket Server                   │
│                                                             │
│  ┌─────────────────┐    ┌──────────────────────────────┐    │
│  │ Proactive Layer │    │      Blink Engine            │    │
│  │                 │    │                              │    │
│  │ fs.watch detects│───▶│ blinkOnce(task, opts)        │    │
│  │ meeting / WPS / │    │  ├─ CDP screenshot (browser) │    │
│  │ video / coding  │    │  └─ PowerShell screenshot    │    │
│  │                 │    │      (WPS / system)          │    │
│  │ _proactiveOn    │    │                              │    │
│  │  Connect()      │    │ analyzeAndArchive(task, opts)│    │
│  └─────────────────┘    │  └─ multi-screen + Word doc  │    │
│                         └──────────────┬───────────────┘    │
│                                        │ axios POST         │
└────────────────────────────────────────┼────────────────────┘
                                         ▼
                              ┌──────────────────┐
                              │  Claude Vision   │
                              │  API (Anthropic) │
                              └──────────┬───────┘
                                         │
                              ┌──────────▼─────────┐
                              │  Browser UI / TTS  │
                              │ (WebSocket client) │
                              └────────────────────┘
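
The "axios POST" edge in the diagram can be illustrated with a minimal sketch of the request body for one screenshot. This is not the skill's actual screen_vision.js: `buildVisionRequest` is a hypothetical helper, the `max_tokens` value is arbitrary, and the model ID is left to the caller. The body shape follows the Anthropic Messages API format for base64 image input.

```javascript
// Hypothetical helper (not from the Blink source): builds the JSON body for an
// Anthropic Messages API call that asks the model to analyze one PNG screenshot.
function buildVisionRequest(pngBase64, task, model) {
  return {
    model,            // assumption: caller supplies a vision-capable model ID
    max_tokens: 1024, // assumption: arbitrary cap for a short analysis
    messages: [
      {
        role: 'user',
        content: [
          // The screenshot goes first, as a base64-encoded image block
          { type: 'image', source: { type: 'base64', media_type: 'image/png', data: pngBase64 } },
          // The task text tells the model what to do with the screenshot
          { type: 'text', text: task },
        ],
      },
    ],
  };
}

// Sending it with axios (not executed here):
// await axios.post('https://api.anthropic.com/v1/messages', body, {
//   headers: {
//     'x-api-key': process.env.ANTHROPIC_API_KEY,
//     'anthropic-version': '2023-06-01',
//     'content-type': 'application/json',
//   },
// });
```

Keeping the body construction separate from the network call makes it possible to check the payload without an API key.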

📁 Project Structure

changzheng-blink/
├── README.md                        # This file
├── LICENSE                          # MIT
├── skill.md                         # Claude Code skill manifest
└── snippets/
    ├── screen_vision.js             # Core: screenshot + Claude Vision
    ├── proactive-detection.js       # Core: PC state detection + proactive trigger
    └── blink-functions.js           # Core: blink session management

Integration into 长征机 (pc-server/)

The full server integration lives in the main Changzheng repo:

pc-server/
├── server.js                        # WebSocket server + routing
│   ├── fs.watch(STATUS_FILE)        # ← proactive trigger (background apps)
│   ├── _proactiveOnConnect(ws)      # ← trigger on new connection
│   ├── _broadcastProactive(...)     # ← send question to all clients
│   ├── _blinkStart(ws, ...)         # ← start a blink session
│   └── _blinkContinue(ws, ...)      # ← handle "继续" ("continue") for next screenshot
└── lib/
    └── screen_vision.js             # ← screenshot + Vision (same as snippets/)

🛠️ Setup

Prerequisites

  • Windows 10/11 (screenshot uses PowerShell)
  • Node.js 18+
  • Python 3.8+ (for pc_sentinel.pyw)
  • Claude API key (from Anthropic)
  • Chrome/Edge browser (optional, for CDP-based browser screenshots)

Install

npm install ws axios docx

1. Run the PC Sentinel

The sentinel watches active windows and writes state every second:

python pc_sentinel.pyw
# Writes to: ~/.openclaw/workspace/pc-status.json

pc-status.json format:

{
  "status": "meeting",
  "detail": "bg:wemeetapp",
  "foreground": "msedge",
  "fg_title": "My Browser - Microsoft Edge",
  "idle_seconds": 0,
  "updated_at": "2026-04-08T14:30:00"
}

Possible status values: meeting, office, browsing, coding, video, unknown
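
A consumer of this file might normalize it before acting on it. The sketch below is an assumption about how that could look, not code from the skill: `parseStatus` and `KNOWN_STATUSES` are hypothetical names, and the field names come from the sample above. It returns null on unparseable input, since a file rewritten every second can be read mid-write.

```javascript
// Hypothetical helper (not from the Blink source): parse and normalize the
// contents of pc-status.json.
const KNOWN_STATUSES = new Set(['meeting', 'office', 'browsing', 'coding', 'video', 'unknown']);

function parseStatus(jsonText) {
  let raw;
  try {
    raw = JSON.parse(jsonText);
  } catch {
    return null; // the sentinel may be mid-write; the caller can retry on the next change
  }
  if (raw === null || typeof raw !== 'object') return null;
  return {
    // Anything outside the documented list is treated as "unknown"
    status: KNOWN_STATUSES.has(raw.status) ? raw.status : 'unknown',
    detail: raw.detail ?? null,
    foreground: raw.foreground ?? '',
    idleSeconds: Number(raw.idle_seconds) || 0,
    updatedAt: raw.updated_at ?? null,
  };
}
```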

2. Integrate into your WebSocket server

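const os = require('os');
const path = require('path');

const screenVision = require('./snippets/screen_vision');
const { watchProactiveTrigger, checkOnConnect } = require('./snippets/proactive-detection');
const { startBlink, handleBlinkContinue } = require('./snippets/blink-functions');

// Status file written by pc_sentinel.pyw (step 1); assumes "~" is the user's home directory
const STATUS_FILE = path.join(os.homedir(), '.openclaw', 'workspace', 'pc-status.json');

// wss, getContext, broadcastProactive, send, sendFn, sendTTS, sendTTSFn, addHistory,
// reply, and DENY_RE are expected to come from your own server code.

// Watch for proactive triggers
watchProactiveTrigger(STATUS_FILE, getContext, broadcastProactive);

// On new WebSocket connection
wss.on('connection', (ws) => {
  checkOnConnect(ws, getContext, sendFn, sendTTSFn);
});

// In your message router (highest priority)
async function routeMessage(ws, text) {
  // 1. Check for blink continuation
  const continued = await handleBlinkContinue(ws, text, { send, sendTTS, addHistory, reply });
  if (continued) return;

  // 2. Check for proactive confirmation/denial
  if (ws._proactivePending) {
    if (DENY_RE.test(text))    { /* dismiss */ return; }
    // Placeholder condition: replace `true` with your own confirmation check
    if (/* isConfirm */ true)  { await startBlink(ws, text, { send, sendTTS, addHistory }); return; }
  }

  // 3. Normal routing...
}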

⚙️ Configuration

| Option | Default | Description |
|--------|---------|-------------|
| BLINK_MAX | 3 | Maximum screenshots per session |
| PROBE_DISMISS_MS | 30 min | Cooldown after user dismissal |
| forceSystem | false | Use PowerShell screenshot instead of CDP |
| scrollCount | 5 | Screens for analyzeAndArchive (multi-page) |


🔑 Key State Variables

// Per-connection (on ws object)
ws._proactivePending  // { type, appKey, askedAt }: pending proactive question
ws._blinkSession      // { task, forceSystem, summaries[], count }: active blink session

// Global
_probeDismissed       // Map<key, timestamp>: 30-min cooldown per app
_probeLastKey         // string: last broadcast key (dedup)
_probeCooldown        // number: timestamp of last broadcast
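
The per-app dismissal cooldown could work roughly as follows. This is a sketch under assumptions, not the skill's implementation: `dismiss` and `isDismissed` are hypothetical names, the map mirrors `_probeDismissed`, and the 30-minute window is taken from the PROBE_DISMISS_MS default. Passing `now` explicitly keeps the logic testable without waiting.

```javascript
// Sketch of a per-key dismissal cooldown (hypothetical, modeled on _probeDismissed).
const PROBE_DISMISS_MS = 30 * 60 * 1000; // 30 minutes, per the documented default
const probeDismissed = new Map();        // key -> timestamp (ms) of the dismissal

// Record that the user declined help for this key, e.g. "probe:meeting"
function dismiss(key, now = Date.now()) {
  probeDismissed.set(key, now);
}

// True while the key is still inside its cooldown window
function isDismissed(key, now = Date.now()) {
  const dismissedAt = probeDismissed.get(key);
  if (dismissedAt === undefined) return false;
  if (now - dismissedAt >= PROBE_DISMISS_MS) {
    probeDismissed.delete(key); // cooldown expired: forget the dismissal
    return false;
  }
  return true;
}
```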

🧠 How Proactive Detection Works

// pc-status.json is watched for changes
fs.watch(STATUS_FILE, async () => {
  const ctx = await getContext();

  // Parse detail (can be a string or an object depending on the detection method)
  // String: "bg:wemeetapp" (background process)
  // Object: { matched_by, fg_process, fg_window } (foreground process)
  const { matchedBy, fgProcess } = parseDetail(ctx);

  // Detect trigger type
  let triggerType = null;
  if (ctx.status === 'meeting' && /wemeet|zoom/i.test(matchedBy + fgProcess)) {
    triggerType = 'meeting';   // → ask about recording
  } else if (ctx.status === 'office' && /wps/i.test(matchedBy + fgProcess)) {
    triggerType = 'document';  // → ask about document help
  }
  if (!triggerType) return;    // no supported app detected

  // Use unified key format: "probe:meeting", "probe:document"
  const key = `probe:${triggerType}`;
  if (!_isDismissed(key)) broadcastProactive(...);
});
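
The source does not show `parseDetail`. A minimal sketch of what it might do is given below, assuming only the two `detail` shapes described in the comments above; the fallback to `ctx.foreground` and the stripping of the `bg:` prefix are assumptions.

```javascript
// Hypothetical parseDetail: normalizes the two documented shapes of ctx.detail
// into { matchedBy, fgProcess } strings that are safe to run a regex against.
function parseDetail(ctx) {
  const detail = ctx.detail;
  const foreground = ctx.foreground || '';

  if (typeof detail === 'string') {
    // String form, e.g. "bg:wemeetapp": a background process matched
    const name = detail.startsWith('bg:') ? detail.slice(3) : detail;
    return { matchedBy: name, fgProcess: foreground };
  }
  if (detail && typeof detail === 'object') {
    // Object form: { matched_by, fg_process, fg_window }
    return {
      matchedBy: detail.matched_by || '',
      fgProcess: detail.fg_process || foreground,
    };
  }
  // Missing or unexpected detail: fall back to the foreground process only
  return { matchedBy: '', fgProcess: foreground };
}
```

Returning empty strings rather than undefined means `matchedBy + fgProcess` never produces the text "undefined", which could otherwise match a trigger pattern by accident.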

📸 Screenshot Methods

| Method | Used For | How |
|--------|----------|-----|
| CDP (Chrome DevTools Protocol) | Browser tabs | chrome.tabs.captureVisibleTab via CDP |
| PowerShell EncodedCommand | WPS, system-wide | System.Drawing.Graphics.CopyFromScreen via Base64-encoded PS |

PowerShell commands are Base64-encoded as UTF-16LE to avoid quoting issues and AMSI false positives.
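
The encoding step described above can be sketched in a few lines of Node.js. The helper names are hypothetical and this is not the skill's own code; `-NoProfile`, `-NonInteractive`, and `-EncodedCommand` are standard powershell.exe parameters, and `-EncodedCommand` expects Base64 over the UTF-16LE bytes of the script.

```javascript
// Hypothetical helpers: encode a PowerShell script for -EncodedCommand.
function encodePowerShell(script) {
  // PowerShell decodes the argument as Base64 of UTF-16LE text
  return Buffer.from(script, 'utf16le').toString('base64');
}

// Build the argument list for powershell.exe without any shell quoting
function buildPowerShellArgs(script) {
  return ['-NoProfile', '-NonInteractive', '-EncodedCommand', encodePowerShell(script)];
}

// Running it (Windows only, not executed here):
// const { execFile } = require('child_process');
// execFile('powershell.exe', buildPowerShellArgs('Get-Date'), (err, stdout) => { /* ... */ });
```

Because the script travels as a single Base64 token, quotes, dollar signs, and newlines inside it need no escaping.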


🔄 Blink Session Flow

User confirms help
      │
      ▼
[WPS? → AppActivate WPS window]
      │
      ▼
blinkOnce(task, { forceSystem })
      │
      ▼
Claude Vision analyzes screenshot
      │
      ▼
Output: "【第1张截图分析】..." ("[Screenshot 1 analysis] ...")
      │
      ▼
"翻页后说继续可以再看一屏 (max 3)" ("Scroll, then say 继续 to analyze one more screen, max 3")
      │
User scrolls manually
      │
User says: "继续" ("continue") / "下一页" ("next page") / "好了" ("ready")
      │
      ▼
blinkOnce(task, { screenIndex: 2, prevSummaries: [...] })
      │
      ▼
Output: "【第2张截图分析】..." ("[Screenshot 2 analysis] ...")  (new content only)
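
The bookkeeping behind this flow can be sketched as a small session object. This is an illustration under assumptions rather than the contents of blink-functions.js: the function names and `CONTINUE_RE` are hypothetical, the session fields follow `ws._blinkSession`, and the limit follows the BLINK_MAX default of 3.

```javascript
// Sketch of blink session bookkeeping (hypothetical names, not the Blink source).
const BLINK_MAX = 3;
// Continuation phrases from the flow above: "continue", "next page", "ready"
const CONTINUE_RE = /^(继续|下一页|好了)$/;

function startSession(task, forceSystem = false) {
  return { task, forceSystem, summaries: [], count: 0 };
}

// Call after each analyzed screenshot so later calls can pass prevSummaries
function recordSummary(session, summary) {
  session.summaries.push(summary);
  session.count += 1;
}

// Returns the 1-based index of the next screenshot, or null when the text is
// not a continuation phrase or the session has reached BLINK_MAX.
function nextScreenIndex(session, text) {
  if (!CONTINUE_RE.test(text.trim())) return null;
  if (session.count >= BLINK_MAX) return null;
  return session.count + 1;
}
```

A router would call `nextScreenIndex` first and only take another screenshot when it returns a number, passing `session.summaries` along so the model reports new content only.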

🤝 Contributing

PRs welcome! Key areas for improvement:

  • [ ] macOS support (replace PowerShell with screencapture)
  • [ ] Auto-scroll for WPS (currently manual due to window focus issues)
  • [ ] More trigger apps (Slack, Notion, VSCode)
  • [ ] Voice confirmation (STT integration)

📄 License

MIT © 2026 山而


👥 Contributors

| | Name | Role |
|---|------|------|
| 🧑‍💻 | ZHANG Tianrui (@zhangtianruiwork-droid) | Creator · Architecture · Integration |
| 🤖 | Claude (Anthropic) | Co-developer · Code generation · Vision analysis |

Built as part of 长征机 (Changzheng), a personal AI assistant that runs 24/7 on Windows, powered by Claude + DeepSeek.
